[
https://issues.apache.org/jira/browse/SPARK-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273830#comment-15273830
]
Apache Spark commented on SPARK-14915:
--------------------------------------
User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/12950
> Tasks that fail due to CommitDeniedException (a side-effect of speculation)
> can cause job to never complete
> -----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-14915
> URL: https://issues.apache.org/jira/browse/SPARK-14915
> Project: Spark
> Issue Type: Bug
> Affects Versions: 1.5.3, 1.6.2, 2.0.0
> Reporter: Jason Moore
> Assignee: Jason Moore
> Priority: Critical
> Fix For: 1.6.2, 2.0.0
>
>
> In SPARK-14357, the code was corrected towards the originally intended behavior
> that a CommitDeniedException should not count towards the failure count for a
> job. After running with this fix for a few weeks, it's become apparent that
> this behavior has an unintended consequence: a speculative task will
> continuously receive a CommitDeniedException (CDE) from the driver, now
> causing it to fail and be retried over and over without limit.
> I'm thinking we could put a task that receives a CDE from the driver into
> TaskState.FINISHED, or some other state that indicates the task shouldn't
> be resubmitted by the TaskScheduler. I'd appreciate some opinions on
> whether there are other consequences of doing something like this.
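The proposal above can be sketched as follows. This is a simplified illustration only, not actual Spark internals: the `TaskOutcome` enum and `classify` helper are hypothetical names standing in for how a scheduler might treat a CommitDeniedException as a terminal, non-countable outcome rather than a retriable failure.

```java
// Hypothetical sketch of the proposed behavior: a speculative attempt whose
// commit is denied (because another attempt already committed) should finish
// without counting toward the failure limit and without being resubmitted.

class CommitDeniedException extends RuntimeException {
    CommitDeniedException(String msg) { super(msg); }
}

enum TaskOutcome {
    SUCCEEDED,        // task finished and its commit was authorized
    FAILED_RETRIABLE, // genuine failure: counts toward the limit, may retry
    DENIED_TERMINAL   // commit denied: another attempt won; do not resubmit
}

public class TaskOutcomeClassifier {
    // Decide what the scheduler should do with a finished attempt.
    static TaskOutcome classify(Throwable error) {
        if (error == null) {
            return TaskOutcome.SUCCEEDED;
        }
        if (error instanceof CommitDeniedException) {
            // The key change: denial is terminal, not a countable failure.
            return TaskOutcome.DENIED_TERMINAL;
        }
        return TaskOutcome.FAILED_RETRIABLE;
    }

    public static void main(String[] args) {
        System.out.println(classify(null));                                        // SUCCEEDED
        System.out.println(classify(new CommitDeniedException("lost commit race"))); // DENIED_TERMINAL
        System.out.println(classify(new RuntimeException("executor lost")));       // FAILED_RETRIABLE
    }
}
```

Under this classification, a speculative attempt that loses the commit race ends in a FINISHED-like state instead of cycling through fail-and-retry indefinitely.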
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]