Baoxu Shi created SPARK-2228:
--------------------------------

             Summary: onStageSubmitted does not properly called so 
NoSuchElement will throw in onStageCompleted
                 Key: SPARK-2228
                 URL: https://issues.apache.org/jira/browse/SPARK-2228
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.1.0
            Reporter: Baoxu Shi


We are using `saveAsObjectFile` and `objectFile` to cut off the lineage during 
iterative computation, but after several hundred iterations a 
`NoSuchElementException` is thrown. We checked the code and traced the problem 
to `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is 
called, the `stageId` cannot be found in `stageIdToPool`, although it does exist 
in the other HashMaps. So we think `onStageSubmitted` is not being called 
properly: Spark did add the stage but failed to send the submission message to 
the listeners, and the error occurs when the completion message is sent later. 
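
To illustrate the failure mode described above, here is a minimal sketch in 
plain Scala. It does not use Spark at all: `ToyStageListener` and its method 
signatures are hypothetical stand-ins, not the real `JobProgressListener`. It 
only assumes that the listener fills a mutable HashMap on submission and reads 
it with a plain key lookup on completion.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical stand-in for a listener that tracks stages by ID.
// This is NOT Spark's JobProgressListener; all names are illustrative.
class ToyStageListener {
  val stageIdToPool = mutable.HashMap[Int, String]()

  // Records the pool for a stage when the stage is submitted.
  def onStageSubmitted(stageId: Int, pool: String): Unit = {
    stageIdToPool(stageId) = pool
  }

  // Looks the pool up again when the stage completes.
  // HashMap.apply throws NoSuchElementException if the key is missing,
  // which is what happens when the submission event was never delivered.
  def onStageCompleted(stageId: Int): String = {
    val pool = stageIdToPool(stageId)
    stageIdToPool -= stageId
    pool
  }
}

object ToyStageListenerDemo {
  def main(args: Array[String]): Unit = {
    val listener = new ToyStageListener

    // Normal case: submission is seen before completion.
    listener.onStageSubmitted(1, "default")
    println(listener.onStageCompleted(1))

    // Faulty case: completion arrives for a stage that was never submitted.
    val result = Try(listener.onStageCompleted(2))
    println(result.isFailure)
  }
}
```

In this sketch, a lookup via `stageIdToPool.get(stageId)` would return an 
`Option` instead of throwing, but that would only hide the missing submission 
event rather than explain it.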

This problem causes a large number of stale `active stages` to be shown in the 
`SparkUI`, which is really annoying. According to my test code, however, it 
does not seem to affect the final result.

I'm willing to help solve this problem. Any idea which part I should change? 
I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do 
with it, but it looks fine to me.

FYI, here is the test code that reproduces the problem. I did not see a code 
field in the system, so I put the code in a gist.

https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd



--
This message was sent by Atlassian JIRA
(v6.2#6252)
