[ https://issues.apache.org/jira/browse/MAHOUT-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844468#action_12844468 ]
Robin Anil commented on MAHOUT-326:
-----------------------------------

BTW, I hope you tried it with the latest trunk. I see that you have tagged this issue with version 0.2. If that's the case, I would strongly suggest that you move to the trunk.

> a possible bug with the isConverged() method in KMeansDriver.java
> -----------------------------------------------------------------
>
>                 Key: MAHOUT-326
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-326
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Chad Chen
>         Attachments: mahout_bug.png
>
>
> In one of today's test runs using the clustering example from the book
> "Mahout in Action", I noticed the following exception thrown by
> KMeansClusterMapper:
> ----------------------------
> java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.Child.main(Child.java:159)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 5 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>     ... 10 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
> ***
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 13 more
> Caused by: java.lang.NullPointerException: Cluster is empty!!!
> ***
>     at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.configure(KMeansClusterMapper.java:63)
> ---------------------------
> which indicates that the runClustering method didn't see the cluster output.
> The same map task did finally succeed after a few failed attempts.
> After looking into KMeansDriver.java, I think there may be a bug in the
> isConverged method. Basically, this method doesn't wait for the cluster
> output file to be fully populated. If the part-* file doesn't exist yet or
> has not been fully written, then this method can return true prematurely.
> I am not sure if this is a bug in Hadoop itself, because it may report a
> successful job before the mapred output file is fully written. Meanwhile, a
> possible way to fix this problem is to force the isConverged method to wait
> for the existence of the cluster output file and make sure the file contains
> the 'converged' values for all the clusters.
> Please note, I saw this problem only once in the many test runs I have had so
> far. It may be a little bit difficult to reproduce. If you need any further
> information, please let me know.
> Thanks.
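
The fix proposed in the description (wait for the cluster output to exist, then check the 'converged' value of every cluster) could be sketched roughly as below. This is only an illustration on the local filesystem using java.nio, not Mahout or Hadoop code: the class ConvergenceCheck, its method names, the timeout parameters, and the "<clusterId><TAB><true|false>" record format are all assumptions made for the example, and the real KMeansDriver reads its output differently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of Mahout or Hadoop.
class ConvergenceCheck {

    // Collects one flag per complete record from every part-* file in outputDir.
    // Assumed record format: "<clusterId>\t<true|false>". A missing directory,
    // missing part files, or half-written lines simply yield fewer flags.
    static List<Boolean> readFlags(Path outputDir) throws IOException {
        List<Boolean> flags = new ArrayList<>();
        if (!Files.isDirectory(outputDir)) {
            return flags;
        }
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path part : parts) {
                for (String line : Files.readAllLines(part)) {
                    String[] fields = line.split("\t");
                    if (fields.length != 2) {
                        continue; // incomplete record, do not count it
                    }
                    String flag = fields[1].trim();
                    if (flag.equals("true") || flag.equals("false")) {
                        flags.add(Boolean.valueOf(flag));
                    }
                }
            }
        }
        return flags;
    }

    // Polls until a record for every expected cluster is present, then reports
    // whether all of them are converged. Throws instead of guessing when the
    // output is still incomplete at the deadline.
    static boolean isConverged(Path outputDir, int expectedClusters,
                               long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            List<Boolean> flags = readFlags(outputDir);
            if (flags.size() >= expectedClusters) {
                return !flags.contains(Boolean.FALSE);
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new IllegalStateException("cluster output incomplete: found "
                        + flags.size() + " of " + expectedClusters + " clusters");
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("kmeans-out");
        Files.write(dir.resolve("part-00000"),
                "0\ttrue\n1\tfalse\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(isConverged(dir, 2, 1000, 50));
    }
}
```

The point of the sketch is that an absent or partially written part-* file is treated as "not ready yet" rather than as "converged", so the check cannot return true before every cluster has reported its flag.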