[ https://issues.apache.org/jira/browse/HAMA-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863008#comment-13863008 ]
Hudson commented on HAMA-834:
-----------------------------

SUCCESS: Integrated in Hama-trunk #225 (See [https://builds.apache.org/job/Hama-trunk/225/])
HAMA-834: Fix KMeans example (millecker: rev 1555747)
* /hama/trunk/CHANGES.txt
* /hama/trunk/core/src/main/java/org/apache/hama/pipes/util/SequenceFileDumper.java
* /hama/trunk/examples/src/main/java/org/apache/hama/examples/Kmeans.java
* /hama/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java
* /hama/trunk/ml/src/test/java/org/apache/hama/ml/kmeans/TestKMeansBSP.java


> Fix KMeans example
> ------------------
>
>                 Key: HAMA-834
>                 URL: https://issues.apache.org/jira/browse/HAMA-834
>             Project: Hama
>          Issue Type: Bug
>          Components: examples, machine learning
>    Affects Versions: 0.6.3
>            Reporter: Martin Illecker
>            Assignee: Martin Illecker
>              Labels: example
>             Fix For: 0.7.0
>
>         Attachments: HAMA-834.patch, HAMA-834_v02.patch, HAMA-834_v03.patch
>
>
> Fix problems in the KMeans example and revise the test case.
> 1) Typo \[1] and input path issue
> 2) Wrong *summationCount* in assignCentersInternal
> *summationCount* should also be incremented in the following branch \[2]:
> {code}
> if (clusterCenter == null) {
>   newCenterArray[lowestDistantCenter] = key;
> }
> {code}
> Otherwise *summationCount* may stay zero when only one value is assigned.
> This zero is then propagated to *incrementSum* \[3] and might cause a
> division by zero in \[4].
> In addition, if three vectors are added but *summationCount* is only two,
> the results are wrong, because the summed vector is later divided by the
> number of increments.
> 3) Results depend on the value of *numBspTask*
> (results vary if *numBspTask* is changed)
> \[1] https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L518-519
> \[2] https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L249
> \[3] https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L161
> \[4] https://github.com/apache/hama/blob/trunk/ml/src/main/java/org/apache/hama/ml/kmeans/KMeansBSP.java#L172
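
The counting problem described in item 2 can be illustrated with a small standalone sketch. This is not the KMeansBSP code: the class `CenterAccumulator` and its methods `assign`, `count`, and `mean` are hypothetical names chosen for illustration only. The sketch assumes a per-center running sum that is empty until the first vector arrives, and shows the count being incremented on that first assignment as well as on later ones, so the final division uses the true number of summed vectors.

```java
import java.util.Arrays;

/**
 * Hypothetical illustration (not taken from KMeansBSP): keeps a running
 * vector sum and an assignment count for each cluster center.
 */
public class CenterAccumulator {
    // Running sum per center; an entry stays null until the first assignment.
    private final double[][] sums;
    // Number of vectors folded into each running sum.
    private final int[] counts;

    public CenterAccumulator(int numCenters) {
        this.sums = new double[numCenters][];
        this.counts = new int[numCenters];
    }

    /** Adds a vector to the given center and counts it in BOTH branches. */
    public void assign(int center, double[] vector) {
        if (sums[center] == null) {
            // First vector for this center: store a copy as the initial sum.
            sums[center] = Arrays.copyOf(vector, vector.length);
        } else {
            for (int i = 0; i < vector.length; i++) {
                sums[center][i] += vector[i];
            }
        }
        // Incrementing here, outside the if/else, covers the first assignment too.
        counts[center]++;
    }

    public int count(int center) {
        return counts[center];
    }

    /** Returns the mean of the assigned vectors, or null if none were assigned. */
    public double[] mean(int center) {
        if (counts[center] == 0) {
            return null; // avoids dividing by zero for an empty center
        }
        double[] result = new double[sums[center].length];
        for (int i = 0; i < result.length; i++) {
            result[i] = sums[center][i] / counts[center];
        }
        return result;
    }
}
```

With this sketch, assigning a single vector to a center gives a count of one rather than zero, and assigning three vectors divides their sum by three rather than two.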