[ https://issues.apache.org/jira/browse/MAHOUT-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683312#action_12683312 ]
Pallavi Palleti commented on MAHOUT-99: --------------------------------------- I have used KeyValueLineRecordReader internally for my code and forgot to revert back to SequenceFileReader. Will that be sufficient to add another patch on the latest code and modify only KMeansDriver to use SequenceFileReader? Kindly let me know. Thanks Pallavi > Improving speed of KMeans > ------------------------- > > Key: MAHOUT-99 > URL: https://issues.apache.org/jira/browse/MAHOUT-99 > Project: Mahout > Issue Type: Improvement > Components: Clustering > Reporter: Pallavi Palleti > Assignee: Grant Ingersoll > Fix For: 0.1 > > Attachments: MAHOUT-99-1.patch, Mahout-99.patch, MAHOUT-99.patch > > > Improved the speed of KMeans by passing only cluster ID from mapper to > reducer. Previously, whole Cluster Info as formatted s`tring was being sent. > Also removed the implicit assumption of Combiner runs only once approach and > the code is modified accordingly so that it won't create a bug when combiner > runs zero or more than once. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.