[
https://issues.apache.org/jira/browse/SPARK-25211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590267#comment-16590267
]
Lijia Liu commented on SPARK-25211:
-----------------------------------
This issue may be resolved by https://issues.apache.org/jira/browse/SPARK-23948.
But the stage will still not finish when all of its outputs are ready. Why can't
we finish it immediately after all outputs are ready?
cc [[email protected]]
> speculation and fetch failed result in hang of job
> --------------------------------------------------
>
> Key: SPARK-25211
> URL: https://issues.apache.org/jira/browse/SPARK-25211
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.2.2
> Reporter: Lijia Liu
> Priority: Major
>
> In the current `DAGScheduler.handleTaskCompletion` code, when a shuffleMapStage
> is not in runningStages and its `pendingPartitions` set is empty, the job
> containing this shuffleMapStage will never complete.
> **Consider the following scenario**
> 1. Stage 0 runs and generates shuffle output data.
> 2. Stage 1 reads the output from stage 0 and generates more shuffle data. It
> has two task attempts for the same partition: ShuffleMapTask0 and its
> speculative copy ShuffleMapTask0.1.
> 3. ShuffleMapTask0 fails to fetch blocks and sends a FetchFailed to the
> driver. The driver resubmits stage 0 and stage 1. The driver will place stage
> 0 in runningStages and place stage 1 in waitingStages.
> 4. ShuffleMapTask0.1 finishes successfully and sends Success back to the
> driver. The driver adds its map status to the set of output locations of
> stage 1, but because stage 1 is not in runningStages, the stage is not marked
> as finished and the job does not complete.
> 5. Stage 0 completes and the driver resubmits stage 1. But because stage 1's
> output set is already complete, the driver submits no tasks and expects
> stage 1 to finish right away. Since job completion relies on a
> `CompletionEvent`, and no `CompletionEvent` will ever arrive, the job hangs.
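
The deadlock in steps 3-5 can be modeled outside of Spark. Below is a minimal Python sketch (not Spark code; class and method names are hypothetical) of just the bookkeeping involved: outputs are registered even for non-running stages, but completion is only checked for running stages, so a fully-populated waiting stage is resubmitted with zero tasks and never produces the `CompletionEvent` that would finish the job.

```python
class Stage:
    def __init__(self, sid, num_partitions):
        self.sid = sid
        self.num_partitions = num_partitions
        self.output_locs = set()                        # partitions with a registered map output
        self.pending_partitions = set(range(num_partitions))

class MiniScheduler:
    """Toy model of the DAGScheduler bookkeeping described in this issue."""

    def __init__(self):
        self.running_stages = set()
        self.waiting_stages = set()
        self.finished = []                              # stage ids marked complete

    def handle_fetch_failed(self, failed_stage, parent_stage):
        # Step 3: resubmit the parent as running, demote the failed stage to waiting.
        self.running_stages.discard(failed_stage)
        self.running_stages.add(parent_stage)
        self.waiting_stages.add(failed_stage)

    def handle_success(self, stage, partition):
        # Step 4: the map output is registered even though the stage is waiting.
        stage.output_locs.add(partition)
        stage.pending_partitions.discard(partition)
        # The bug: completion is only checked for stages in runningStages.
        if stage in self.running_stages and not stage.pending_partitions:
            self.finished.append(stage.sid)

    def submit_waiting(self, stage):
        # Step 5: parent done, resubmit the waiting stage; only missing
        # partitions become tasks.
        self.waiting_stages.discard(stage)
        self.running_stages.add(stage)
        missing = set(range(stage.num_partitions)) - stage.output_locs
        # If nothing is missing, no tasks run, so no CompletionEvent ever
        # arrives and the stage is never marked finished: the job hangs.
        return missing

stage0 = Stage(0, 1)
stage1 = Stage(1, 1)
sched = MiniScheduler()
sched.running_stages = {stage1}

sched.handle_fetch_failed(stage1, stage0)   # step 3
sched.handle_success(stage1, 0)             # step 4: speculative copy succeeds
tasks = sched.submit_waiting(stage1)        # step 5: resubmit after stage 0
print(tasks, sched.finished)                # empty task set, stage 1 never finishes
```

Running this prints an empty task set and an empty finished list: the scheduler has nowhere left to make progress, which is the hang the report describes. A fix along the lines suggested in the comment would be an extra completion check when the last output of a non-running stage is registered, or when a resubmitted stage turns out to have no missing partitions.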
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)