[
https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881525#comment-15881525
]
Jisoo Kim commented on SPARK-19698:
-----------------------------------
[~kayousterhout] Thanks for linking the JIRA ticket; I agree that it describes
a problem very similar to the one I had. However, I don't think that fix
resolves this issue, because the PR only deals with a problem in
ShuffleMapStage and doesn't check the attempt id in the ResultStage case. In my
case it was a ResultStage that had the problem. I ran my test with the fix from
https://github.com/apache/spark/pull/16620, but it still failed.
Could you point me to where the driver waits until all tasks finish? I tried
to find that code but wasn't successful. I don't think the driver shuts down
all tasks when a job is done; however, the DAGScheduler signals the JobWaiter
every time it receives a completion event for a task that is responsible for an
unfinished partition
(https://github.com/apache/spark/blob/ba8912e5f3d5c5a366cb3d1f6be91f2471d048d2/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1171).
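
To make that bookkeeping concrete, here is a minimal, self-contained model of
what the linked branch does for a ResultStage (illustrative names, not Spark's
actual classes): completion is recorded per partition/output id only, with no
notion of which stage attempt produced it.

{code:scala}
// Minimal model of the ResultStage completion bookkeeping at the linked line
// (illustrative names, not Spark's actual classes).
object ResultStageBookkeeping {
  final case class TaskEnd(partition: Int, stageAttempt: Int)

  final class ActiveJob(numPartitions: Int, listener: Int => Unit) {
    private val finished = Array.fill(numPartitions)(false)
    private var numFinished = 0

    def handleTaskCompletion(event: TaskEnd): Unit = {
      if (!finished(event.partition)) {  // keyed by partition only; attempt id is never consulted
        finished(event.partition) = true
        numFinished += 1
        listener(event.partition)        // one signal per partition goes to the waiting job
        if (numFinished == numPartitions) {
          println("stage marked as finished; job considered done")
        }
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val job = new ActiveJob(2, p => println(s"partition $p reported successful"))
    job.handleTaskCompletion(TaskEnd(partition = 0, stageAttempt = 1))
    // A success from the *stale* attempt of partition 1 completes the job, even
    // though attempt 1 of partition 1 may still be running on an executor.
    job.handleTaskCompletion(TaskEnd(partition = 1, stageAttempt = 0))
  }
}
{code}
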
As a result, the JobWaiter will call success() on the job promise
(https://github.com/jinxing64/spark/blob/6809d1ff5d09693e961087da35c8f6b3b50fe53c/core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala#L61)
before the 2nd task attempt finishes. This would not be a problem if the driver
waited until all tasks finish and SparkContext didn't return results before
then, but I haven't found code that does that yet (please correct me if I am
missing something).
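
The waiter side can be modelled the same way (again illustrative, not Spark's
exact code): it counts one success per partition index and resolves a promise
when the count reaches the number of partitions, so nothing on this path knows
that a duplicate attempt of some partition is still in flight.

{code:scala}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Promise

// Minimal model of the waiter behaviour at the linked JobWaiter line
// (illustrative, not Spark's exact code).
final class WaiterModel(totalTasks: Int) {
  private val finishedTasks = new AtomicInteger(0)
  private val jobPromise = Promise[Unit]()

  def taskSucceeded(index: Int): Unit = {
    if (finishedTasks.incrementAndGet() == totalTasks) {
      jobPromise.success(())      // the blocking caller (e.g. collect()) returns here
    }
  }

  def isDone: Boolean = jobPromise.isCompleted
}

object WaiterModelDemo {
  def main(args: Array[String]): Unit = {
    val waiter = new WaiterModel(totalTasks = 2)
    waiter.taskSucceeded(0)       // partition 0, current attempt
    waiter.taskSucceeded(1)       // partition 1, success came from the stale attempt
    println(waiter.isDone)        // true; the retried copy of partition 1 may still be running
  }
}
{code}
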
I call SparkContext.stop() after I get the result from the application, to
clean up and upload event logs so I can view the Spark history from the history
server. When the SparkContext stops, AFAIK it stops the driver as well, which
shuts down the task scheduler and the executors, and I don't think an executor
waits for its running tasks to finish before it shuts down. Hence, if this
happens, I think the 2nd task attempt gets shut down as well.
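
Concretely, my application follows the usual driver-side pattern sketched below
(a minimal sketch, not my actual job; the RDD and its contents are
illustrative). This is where I believe a still-running 2nd attempt gets killed.

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object RaceSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("race-sketch"))
    try {
      // collect() returns once the JobWaiter has seen one successful completion
      // per partition, even if a re-attempted copy of some partition is still
      // running on an executor after a stage retry.
      val result = sc.parallelize(1 to 100, 10).map(_ * 2).collect()
      println(s"got ${result.length} results")
    } finally {
      // Tears down the scheduler backend and the executors; a duplicate attempt
      // that is still running is killed mid-flight rather than waited for.
      sc.stop()
    }
  }
}
{code}
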
> Race condition in stale attempt task completion vs current attempt task
> completion when task is doing persistent state changes
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-19698
> URL: https://issues.apache.org/jira/browse/SPARK-19698
> Project: Spark
> Issue Type: Bug
> Components: Mesos, Spark Core
> Affects Versions: 2.0.0
> Reporter: Charles Allen
>
> We have encountered a strange scenario in our production environment. Below
> is the best guess we have right now as to what's going on.
> Potentially, the final stage of a job has a failure in one of the tasks (such
> as OOME on the executor) which can cause tasks for that stage to be
> relaunched in a second attempt.
> https://github.com/apache/spark/blob/v2.1.0/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1155
> keeps track of which tasks have been completed, but does NOT keep track of
> which attempt those tasks were completed in. As such, we have encountered a
> scenario where a particular task gets executed twice in different stage
> attempts, and the DAGScheduler does not consider whether the second attempt
> is still running. This means that if the first task attempt succeeded, the
> second attempt can be cancelled partway through its run cycle once all other
> tasks (including the one that previously failed) have completed successfully.
> What this means is that if a task is manipulating some state somewhere (for
> example: an upload to a temporary file location, then a delete-then-move on
> an underlying s3n storage implementation), the driver can improperly shut
> down the running (2nd attempt) task between state manipulations, leaving the
> persistent state in a bad state, since the 2nd attempt never got to complete
> its manipulations and was terminated prematurely at some arbitrary point in
> its state-change logic (e.g. it finished the delete but not the move).
> This is using the mesos coarse grained executor. It is unclear if this
> behavior is limited to the mesos coarse grained executor or not.
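
As an illustration of the delete-then-move hazard described in the quoted
report above, a hypothetical commit routine might look like the sketch below (a
local filesystem stands in for the s3n-backed store; not code from the report).

{code:scala}
import java.nio.file.{Files, Paths, StandardCopyOption}

// Hypothetical illustration of the non-atomic "delete-then-move" commit described
// above. If the executor is shut down between the two steps, the destination is
// left missing: the old output was deleted but the new one was never published.
object DeleteThenMoveCommit {
  def commit(tmpPath: String, destPath: String): Unit = {
    Files.deleteIfExists(Paths.get(destPath))            // step 1: delete old output
    // <-- a shutdown here leaves no output at destPath at all
    Files.move(Paths.get(tmpPath), Paths.get(destPath),  // step 2: publish new output
      StandardCopyOption.ATOMIC_MOVE)
  }
}
{code}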