[ https://issues.apache.org/jira/browse/MAHOUT-5?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Eastman updated MAHOUT-5:
------------------------------
Attachment: MAHOUT-5b.diff
This patch integrates my refactoring of Point and the DistanceMeasures into
utils and adds two changes to improve unit test reliability: setup now removes
all test directories, and KMeansDriver.runJob no longer mv-s directories
inside its loop, since those operations are evidently asynchronous in Hadoop.
All tests now run, and more reliably than before.
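
A minimal sketch of the idea behind the runJob change, not the code in the
patch: each pass writes to a fresh per-iteration directory and the next pass
reads from it, so nothing is renamed inside the loop. The class name
IterationDirs, the "clusters-N" naming scheme, and the step callback (standing
in for one MapReduce pass that reports convergence) are all assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

public class IterationDirs {
    // Hypothetical naming scheme: one output directory per iteration.
    public static String clustersDir(String output, int iteration) {
        return output + "/clusters-" + iteration;
    }

    /**
     * Runs up to maxIterations passes. Each pass reads clusters from the
     * directory written by the previous pass and writes to a new one, so no
     * directory is moved or renamed. The step callback is a stand-in for a
     * single pass; it returns true once the clusters have converged.
     * Returns the directories written, in order.
     */
    public static List<String> run(String initialClusters, String output,
                                   int maxIterations,
                                   BiPredicate<String, String> step) {
        List<String> written = new ArrayList<>();
        String clustersIn = initialClusters;
        for (int i = 0; i < maxIterations; i++) {
            String clustersOut = clustersDir(output, i);
            boolean converged = step.test(clustersIn, clustersOut);
            written.add(clustersOut);
            if (converged) {
                break;
            }
            clustersIn = clustersOut; // next pass reads what this pass wrote
        }
        return written;
    }

    public static void main(String[] args) {
        // Stub step: log the directories and pretend to converge on pass 3.
        int[] calls = {0};
        List<String> dirs = run("input/canopies", "output", 10, (in, out) -> {
            System.out.println(in + " -> " + out);
            return ++calls[0] == 3;
        });
        System.out.println(dirs);
    }
}
```

The final clusters are then simply in the last directory returned, and a test
can clean up by deleting the whole output directory in setup.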
> Implement a k-means clustering prototype
> -----------------------------------------
>
> Key: MAHOUT-5
> URL: https://issues.apache.org/jira/browse/MAHOUT-5
> Project: Mahout
> Issue Type: New Feature
> Components: Clustering
> Affects Versions: 0.1
> Reporter: Jeff Eastman
> Assignee: Jeff Eastman
> Priority: Minor
> Attachments: kmeans.zip, MAHOUT-5a.diff, MAHOUT-5b.diff
>
>
> K-means clustering is closely related to Canopy clustering and often uses
> canopies to determine the initial clusters. I'd like to implement a k-means
> prototype and tests in the package org.apache.mahout.clustering.kmeans.