[ https://issues.apache.org/jira/browse/MAHOUT-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642881#comment-13642881 ]
Yu Lee commented on MAHOUT-1177:
--------------------------------
Hello Guys,
I am also a graduate student with research interests in data mining and big
data analytics.
I am familiar with programming in Hadoop/Mahout for addressing large scale data
analysis problems. I think I can collaborate with Yexi on this project.
The only issue is that I cannot apply for CPT for GSOC either...
If it is ok for me to work with Yexi, what would be our next step?
Looking forward to your earliest replies. Thank you!
Best,
> GSOC 2013: Reform and simplify the clustering APIs
> --------------------------------------------------
>
> Key: MAHOUT-1177
> URL: https://issues.apache.org/jira/browse/MAHOUT-1177
> Project: Mahout
> Issue Type: Improvement
> Reporter: Dan Filimon
> Labels: gsoc2013, mentor
>
> Clustering is one of the most used features in Mahout and has many
> applications [http://en.wikipedia.org/wiki/Cluster_analysis#Applications].
> We have lots of clustering algorithms. There's:
> - basic k-means
> - canopy clustering
> - Dirichlet clustering
> - Fuzzy k-means
> - Spectral k-means
> - Streaming k-means [coming soon]
> We want to make them easier to use by updating the APIs and making sure they
> all work in the same way, with consistent inputs, outputs, diagnostics and
> documentation.
> This is a great way to gain an in-depth understanding of clustering
> algorithms and to familiarize yourself with Hadoop, Mahout clustering and good
> software engineering principles.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira