Shannon,

Do you mean that we need to give a specific plan right now, or should we wait until you finish your work?

2013/5/23 Shannon Quinn (JIRA) <[email protected]>

>
>     [ https://issues.apache.org/jira/browse/MAHOUT-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665203#comment-13665203 ]
>
> Shannon Quinn commented on MAHOUT-1177:
> ---------------------------------------
>
> Yu Lee and Yexi: For the time being, I'd be on board with shelving the
> addition of any new clustering algorithms, and instead focusing on
> improving documentation and unifying the APIs for the existing ones. I
> think that would help scope your work a little more effectively, while
> still providing an extremely valuable body of work. Plus, it would greatly
> aid the development of new algorithms to have a specific interface to build
> into. Beyond that, I think your ideas are good and would encourage you to
> start laying out your specific plans.
>
> Ravi: I would suggest browsing the open JIRAs for Mahout and submitting a
> patch for one you think you can tackle. Please feel free to ping our email
> list if you have specific questions, though for general ones please submit
> them to the list rather than on JIRA.
>
> > GSOC 2013: Reform and simplify the clustering APIs
> > --------------------------------------------------
> >
> >                 Key: MAHOUT-1177
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1177
> >             Project: Mahout
> >          Issue Type: Improvement
> >            Reporter: Dan Filimon
> >              Labels: gsoc2013, mentor
> >
> > Clustering is one of the most used features in Mahout and has many
> > applications [http://en.wikipedia.org/wiki/Cluster_analysis#Applications].
> > We have lots of clustering algorithms. There's:
> > - basic k-means
> > - canopy clustering
> > - Dirichlet clustering
> > - Fuzzy k-means
> > - Spectral k-means
> > - Streaming k-means [coming soon]
> > We want to make them easier to use by updating the APIs and making sure
> > they all work in the same way and have consistent inputs, outputs,
> > diagnostics and documentation.
> > This is a great way to gain an in-depth understanding of clustering
> > algorithms, and to familiarize yourself with Hadoop, Mahout clustering
> > and good software engineering principles.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>

--
------
Yexi Jiang,
ECS 251, [email protected]
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/
