[ 
https://issues.apache.org/jira/browse/MAHOUT-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844597#action_12844597
 ] 

Chad Chen commented on MAHOUT-326:
----------------------------------

Hi Robin,
The bug does seem to be very difficult to reproduce. I have run the same test 
many more times and have not seen the problem again.
Unfortunately, I lost the original logging output to the console. Otherwise, I 
would be able to tell whether or not the jc.monitorAndPrintJob call in the 
following method returned true:

  public static RunningJob runJob(JobConf job) throws IOException {
    JobClient jc = new JobClient(job);
    RunningJob rj = jc.submitJob(job);
    try {
      if (!jc.monitorAndPrintJob(job, rj)) {
        throw new IOException("Job failed!");
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    return rj;
  }

However, I am a little confused by the following code block:
private static boolean runIteration(..) {
...
    try {
      JobClient.runJob(conf);
      FileSystem fs = FileSystem.get(outPath.toUri(), conf);
      return isConverged(clustersOut, conf, fs);
    } catch (IOException e) {
      log.warn(e.toString(), e);
      return true;
    }
}

So if the call to JobClient.runJob throws an IOException, runIteration will 
return true? In that case, the runClustering method may encounter the same 
problem I saw (i.e., the cluster output file was not ready). Is my 
understanding correct?
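
To make the concern concrete, here is a minimal, self-contained sketch of the 
control flow. It is not the Mahout code: the Iteration interface and both 
method names are hypothetical stand-ins I made up for illustration, and no 
Hadoop classes are involved. It only shows that catching the IOException and 
returning true makes a failed job indistinguishable from a converged one, 
whereas letting the exception propagate would stop the driver.

```java
import java.io.IOException;

public class IterationSketch {

  /** Hypothetical stand-in for one iteration; not a Hadoop or Mahout type. */
  public interface Iteration {
    /** Runs the job and reports whether the clusters converged. */
    boolean runAndCheckConverged() throws IOException;
  }

  /** Mirrors the pattern in question: a failed job is reported as converged. */
  public static boolean runIterationSwallowing(Iteration it) {
    try {
      return it.runAndCheckConverged();
    } catch (IOException e) {
      System.err.println("warn: " + e);
      return true; // the caller cannot tell failure from convergence
    }
  }

  /** Possible alternative: let the failure propagate so the driver stops. */
  public static boolean runIterationPropagating(Iteration it) throws IOException {
    return it.runAndCheckConverged();
  }

  public static void main(String[] args) {
    // Simulates a job that fails the way runJob does when monitoring reports failure.
    Iteration failing = () -> { throw new IOException("Job failed!"); };

    // The swallowing variant reports convergence even though the job failed.
    System.out.println("swallowing: converged=" + runIterationSwallowing(failing));

    // The propagating variant surfaces the failure to the caller instead.
    try {
      runIterationPropagating(failing);
    } catch (IOException e) {
      System.out.println("propagating: " + e.getMessage());
    }
  }
}
```

With the first variant, a driver loop of the form "iterate until converged" 
would exit after the failed iteration and move on to the clustering step, 
which is the situation I am asking about.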

Thanks.


> a possible bug with the isConverged() method in KMeansDriver.java
> -----------------------------------------------------------------
>
>                 Key: MAHOUT-326
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-326
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Chad Chen
>         Attachments: mahout_bug.png
>
>
> In one of today's test runs using the clustering example from the book 
> "Mahout in Action", I noticed the following exception thrown by 
> KMeansClusterMapper:
> ----------------------------
> java.lang.RuntimeException: Error in configuring object
>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>   at org.apache.hadoop.mapred.Child.main(Child.java:159)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 5 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>   ... 10 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>   ... 13 more
> Caused by: java.lang.NullPointerException: Cluster is empty!!!
>   at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.configure(KMeansClusterMapper.java:63)
> ---------------------------
> which says that the runClustering method didn't see the cluster output. The 
> same map task did finally succeed after a few failed attempts.
> After looking into KMeansDriver.java, I think there may be a bug in the 
> isConverged method. Basically, this method doesn't wait for the cluster 
> output file to be fully populated. If the part-* file doesn't exist yet or 
> has not been fully written, then this method can return true prematurely. I 
> am not sure whether this is a bug in Hadoop itself, because it may report a 
> successful job before the mapred output file is fully written. Meanwhile, a 
> possible way to fix this problem is to force the isConverged method to wait 
> for the existence of the cluster output file and make sure the file contains 
> the 'converged' values for all the clusters.
> Please note, I have seen this problem only once in the many test runs I have 
> done so far. It may be a little bit difficult to reproduce. If you need any 
> further information, please let me know.
> Thanks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
