[ 
https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881525#comment-15881525
 ] 

Jisoo Kim commented on SPARK-19698:
-----------------------------------

[~kayousterhout] Thanks for linking the JIRA ticket. I agree that it describes a 
very similar problem to the one I had. However, I don't think that PR fixes the 
problem, because it only deals with the issue in ShuffleMapStage and doesn't 
check the attempt ID in the case of a ResultStage. In my case, it was a 
ResultStage that had the problem. I ran my test with the fix from 
https://github.com/apache/spark/pull/16620, but it still failed.
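
To illustrate the asymmetry I mean, here is a toy sketch. None of these names 
are Spark's (TaskEnd, ToyScheduler, etc. are made up); the point is only that a 
#16620-style check drops completion events from a stale shuffle-map attempt, 
while a result-task completion is counted no matter which attempt produced it:

    sealed trait TaskKind
    case object ShuffleMapKind extends TaskKind
    case object ResultKind extends TaskKind

    final case class TaskEnd(kind: TaskKind, partition: Int, stageAttemptId: Int)

    class ToyScheduler(latestAttemptId: Int) {
      def handleCompletion(event: TaskEnd, onPartitionDone: Int => Unit): Unit =
        event.kind match {
          case ShuffleMapKind =>
            // With a #16620-style fix, output from a stale attempt is ignored.
            if (event.stageAttemptId == latestAttemptId) {
              onPartitionDone(event.partition)
            }
          case ResultKind =>
            // No attempt check: a result from the stale 1st attempt still marks
            // the partition finished and can complete the job while the 2nd
            // attempt is running.
            onPartitionDone(event.partition)
        }
    }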

Could you point me to where the driver waits until all tasks finish? I tried to 
find that part but wasn't successful. I don't think the driver shuts down all 
tasks when a job is done; however, the DAGScheduler signals the JobWaiter every 
time it receives a completion event for a task that is responsible for an 
unfinished partition 
(https://github.com/apache/spark/blob/ba8912e5f3d5c5a366cb3d1f6be91f2471d048d2/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1171).
As a result, the JobWaiter will call success() on the job promise 
(https://github.com/jinxing64/spark/blob/6809d1ff5d09693e961087da35c8f6b3b50fe53c/core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala#L61)
before the 2nd task attempt finishes. This would not be a problem if the driver 
waited until all tasks finished and SparkContext didn't return results before 
all tasks finish, but I haven't found that it does (please correct me if I am 
missing something). I call SparkContext.stop() after I get the result from the 
application, to clean up and to upload event logs so I can view the Spark 
history from the history server. When SparkContext stops, AFAIK, it stops the 
driver as well, which shuts down the task scheduler and the executors, and I 
don't think an executor waits for its running tasks to finish before it shuts 
down. Hence, if this happens, the 2nd task attempt will be shut down as well.
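
To make the sequence concrete, here is a minimal, self-contained sketch. It 
uses plain scala.concurrent and a bare Thread rather than Spark's JobWaiter, 
DAGScheduler or executors, so it only models my understanding of the flow: the 
promise resolves as soon as every partition has been reported once, the caller 
gets its result back, and the still-running 2nd attempt is interrupted when the 
caller tears everything down, much like SparkContext.stop() would:

    import scala.concurrent.{Await, Promise}
    import scala.concurrent.duration._

    object RaceSketch extends App {
      val numPartitions = 2
      val jobPromise    = Promise[Unit]()
      var finished      = 0

      def taskSucceeded(partition: Int): Unit = synchronized {
        finished += 1
        if (finished == numPartitions) jobPromise.success(()) // like JobWaiter.success()
      }

      // The 2nd attempt of partition 1 is still doing its persistent state changes.
      val secondAttempt = new Thread(new Runnable {
        def run(): Unit =
          try { Thread.sleep(5000); println("2nd attempt finished cleanly") }
          catch { case _: InterruptedException => println("2nd attempt killed mid-write") }
      })
      secondAttempt.start()

      taskSucceeded(0)
      taskSucceeded(1)                          // completion comes from the stale 1st attempt
      Await.result(jobPromise.future, 1.second) // runJob()/collect() returns here
      secondAttempt.interrupt()                 // stop() tears down the executors
      secondAttempt.join()
    }

Running this prints "2nd attempt killed mid-write"; that interruption mid-task 
is the behavior I am worried about.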

> Race condition in stale attempt task completion vs current attempt task 
> completion when task is doing persistent state changes
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19698
>                 URL: https://issues.apache.org/jira/browse/SPARK-19698
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos, Spark Core
>    Affects Versions: 2.0.0
>            Reporter: Charles Allen
>
> We have encountered a strange scenario in our production environment. Below 
> is the best guess we have right now as to what's going on.
> Potentially, the final stage of a job has a failure in one of its tasks (such 
> as an OOME on the executor), which can cause tasks for that stage to be 
> relaunched in a second attempt.
> The code at 
> https://github.com/apache/spark/blob/v2.1.0/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1155
> keeps track of which tasks have been completed, but does NOT keep track of 
> which attempt those tasks were completed in. As such, we have encountered a 
> scenario where a particular task gets executed twice in different stage 
> attempts, and the DAGScheduler does not consider whether the second attempt 
> is still running. This means that if the first task attempt succeeded, the 
> second attempt can be cancelled part-way through its run cycle once all other 
> tasks (including the previously failed one) have completed successfully.
> What this means is that if a task is manipulating some persistent state 
> somewhere (for example: an upload to a temporary file location, then a 
> delete-then-move on an underlying s3n storage implementation), the driver can 
> improperly shut down the running (2nd attempt) task between state 
> manipulations, leaving the persistent state in a bad state, since the 2nd 
> attempt never got to complete its manipulations and was terminated 
> prematurely at some arbitrary point in its state-change logic (e.g. it 
> finished the delete but not the move).
> This is using the Mesos coarse-grained executor. It is unclear whether this 
> behavior is limited to the Mesos coarse-grained executor or not.
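
For what it's worth, the kind of non-atomic commit described above could look 
like the following hypothetical sketch (the paths and the commit() helper are 
made up, not the actual s3n code): if the task is killed between the delete and 
the move, neither the old nor the new data is left in place.

    import java.nio.file.{Files, Paths, StandardCopyOption}

    object CommitSketch {
      def commit(tmpLocation: String, destination: String): Unit = {
        val tmp  = Paths.get(tmpLocation)
        val dest = Paths.get(destination)
        Files.deleteIfExists(dest)          // step 1: delete the old output
        // If the 2nd attempt is terminated here, the old data is gone and the
        // new data never arrives -- the "bad state" described above.
        Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING) // step 2: move
      }
    }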



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
