[jira] [Comment Edited] (MAHOUT-1177) GSOC 2013: Reform and simplify the clustering APIs

Yexi (JIRA) Thu, 25 Apr 2013 08:34:18 -0700

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641893#comment-13641893
 ]


Yexi edited comment on MAHOUT-1177 at 4/25/13 3:34 PM:
-------------------------------------------------------

Hi, 

I am a graduate student majoring in data mining, and I am very interested in 
this project.
I have some experience with distributed data mining using Hadoop, so I 
believe I can handle this project.

In order to work on this project, is it necessary for me to join the GSOC 
program?
GSOC requires international students studying in the US to apply for CPT, 
and I have almost used up my CPT on previous internships, so I am not able 
to apply for CPT for GSOC. 

Regards,
Yexi
                
                  
> GSOC 2013: Reform and simplify the clustering APIs
> --------------------------------------------------
>
>                 Key: MAHOUT-1177
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1177
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Dan Filimon
>              Labels: gsoc2013, mentor
>
> Clustering is one of the most used features in Mahout and has many 
> applications [http://en.wikipedia.org/wiki/Cluster_analysis#Applications].
> We have lots of clustering algorithms. There are:
> - basic k-means
> - canopy clustering
> - Dirichlet clustering
> - Fuzzy k-means
> - Spectral k-means
> - Streaming k-means [coming soon]
> We want to make them easier to use by updating the APIs and making sure 
> they all work in the same way, with consistent inputs, outputs, diagnostics 
> and documentation.
> This is a great way to gain an in-depth understanding of clustering 
> algorithms, familiarize yourself with Hadoop, Mahout clustering and good 
> software engineering principles.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
