[ 
https://issues.apache.org/jira/browse/MAHOUT-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643718#comment-13643718
 ] 

Zhivko Lazarov edited comment on MAHOUT-1177 at 4/27/13 6:06 PM:
-----------------------------------------------------------------

Hello, 
I am an undergraduate student who is very interested in this project. I am 
already familiar with clustering algorithms, since I have taken classes such as 
Data Mining, Pattern Recognition, Machine Learning and Artificial Intelligence. 
The closest work to this project is a news aggregation project in which I was 
required to implement several clustering algorithms (K-Means, Hierarchical 
Clustering), to which I made some improvements of my own for better accuracy. 
The news articles were aggregated into a MongoDB (NoSQL) database. I competed 
in ACM ICPC 2011 and 2012 and have been a laboratory assistant for Algorithms 
and Data Structures at my faculty. I would like to be involved in this project 
as part of GSOC, if possible. Regards, Zivko.
                
      was (Author: lazzrov):
    Hello, 
I am an undergraduate student who is very interested in this project. I am 
already familiar with clustering algorithms, since I have taken classes such as 
Data Mining, Pattern Recognition, Machine Learning and Artificial Intelligence. 
The closest work to this project is a news aggregation project in which I was 
required to implement several clustering algorithms (K-Means, Hierarchical 
Clustering), to which I made some improvements of my own given the aggregated 
news database (NoSQL MongoDB). I competed in ACM ICPC 2011 and 2012 and have 
been a laboratory assistant for Algorithms and Data Structures at my faculty. I 
would like to be involved in this project as part of GSOC, if possible. 
Regards, Zivko.
                  
> GSOC 2013: Reform and simplify the clustering APIs
> --------------------------------------------------
>
>                 Key: MAHOUT-1177
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1177
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Dan Filimon
>              Labels: gsoc2013, mentor
>
> Clustering is one of the most used features in Mahout and has many 
> applications [http://en.wikipedia.org/wiki/Cluster_analysis#Applications].
> We have lots of clustering algorithms:
> - basic k-means
> - canopy clustering
> - Dirichlet clustering
> - Fuzzy k-means
> - Spectral k-means
> - Streaming k-means [coming soon]
> We want to make them easier to use by updating the APIs and making sure they 
> all work in the same way, with consistent inputs, outputs, diagnostics and 
> documentation.
> This is a great way to gain an in-depth understanding of clustering 
> algorithms and to familiarize yourself with Hadoop, Mahout clustering and 
> good software engineering principles.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
