[
https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878959#comment-15878959
]
Charles Allen commented on SPARK-19698:
---------------------------------------
I *think* this is due to the driver not having any concept of a "critical
section" for the code being executed, meaning a task cannot declare a portion
of its code as "I'm in a non-idempotent command region, please let me finish
before killing me".
> Race condition in stale attempt task completion vs current attempt task
> completion when task is doing persistent state changes
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-19698
> URL: https://issues.apache.org/jira/browse/SPARK-19698
> Project: Spark
> Issue Type: Bug
> Components: Mesos, Spark Core
> Affects Versions: 2.0.0
> Reporter: Charles Allen
>
> We have encountered a strange scenario in our production environment. Below
> is the best guess we have right now as to what's going on.
> Potentially, the final stage of a job has a failure in one of its tasks (such
> as an OOME on the executor), which can cause the tasks for that stage to be
> relaunched in a second stage attempt.
> https://github.com/apache/spark/blob/v2.1.0/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1155
> keeps track of which tasks have been completed, but does NOT keep track of
> which stage attempt those tasks were completed in. As a result, we have hit a
> scenario where a particular task gets executed twice, in two different stage
> attempts, and the DAGScheduler does not check whether the second attempt is
> still running. This means that if the task from the first (stale) attempt
> succeeded, the task from the second attempt can be cancelled part-way through
> its run if all other tasks (including the previously failed one) complete
> successfully.
> What this means is that if a task is manipulating some persistent state
> somewhere (for example: an upload to a temporary location, then a
> delete-then-move on an underlying s3n storage implementation), the driver can
> improperly shut down the running (2nd attempt) task between state
> manipulations, leaving the persistent state corrupted, since the 2nd attempt
> never gets to complete its manipulations and is terminated prematurely at
> some arbitrary point in its state-change logic (ex: it finished the delete
> but not the move).
> This is using the Mesos coarse-grained executor. It is unclear whether this
> behavior is limited to the Mesos coarse-grained executor or not.
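For illustration, here is a minimal sketch (plain Scala, not the actual
DAGScheduler code) of the bookkeeping race described in the report above:
completions are tracked per partition with no check of which stage attempt
reported them, so a late success from the stale attempt can finish the stage
and get the still-running current attempt killed mid-commit. The names below
are made up for the example.
{code:scala}
// Minimal sketch of the race, under the assumption (from the report above)
// that completion is tracked per partition only. Not the real DAGScheduler.
import scala.collection.mutable

object StaleAttemptRaceSketch {
  case class TaskEnd(partition: Int, stageAttempt: Int)

  val currentStageAttempt = 1                 // never consulted below
  val pendingPartitions   = mutable.Set(0, 1, 2)

  def onTaskCompleted(ev: TaskEnd): Unit = {
    // No comparison of ev.stageAttempt against currentStageAttempt here,
    // so a success from stale attempt 0 still removes the partition.
    pendingPartitions -= ev.partition
    if (pendingPartitions.isEmpty) {
      // Stage is considered finished; any attempt-1 task still running for
      // one of these partitions gets killed, possibly between its delete
      // and move steps.
      println("stage complete -> remaining running tasks are killed")
    }
  }

  def main(args: Array[String]): Unit = {
    onTaskCompleted(TaskEnd(0, stageAttempt = 1))
    onTaskCompleted(TaskEnd(1, stageAttempt = 1))
    // Partition 2 is still running in attempt 1, but attempt 0's old copy
    // of the same task reports success first and finishes the stage:
    onTaskCompleted(TaskEnd(2, stageAttempt = 0))
  }
}
{code}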