viirya commented on a change in pull request #25498: [SPARK-28699][Core] Fix a
corner case for aborting indeterminate stage
URL: https://github.com/apache/spark/pull/25498#discussion_r315298845
##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -1571,13 +1571,13 @@ private[spark] class DAGScheduler(
 // guaranteed to be determinate, so the input data of the reducers will not change
 // even if the map tasks are re-tried.
 if (mapStage.rdd.outputDeterministicLevel == DeterministicLevel.INDETERMINATE) {
- // It's a little tricky to find all the succeeding stages of `failedStage`, because
+ // It's a little tricky to find all the succeeding stages of `mapStage`, because
 // each stage only know its parents not children. Here we traverse the stages from
 // the leaf nodes (the result stages of active jobs), and rollback all the stages
- // in the stage chains that connect to the `failedStage`. To speed up the stage
+ // in the stage chains that connect to the `mapStage`. To speed up the stage
 // traversing, we collect the stages to rollback first. If a stage needs to
 // rollback, all its succeeding stages need to rollback to.
- val stagesToRollback = HashSet(failedStage)
+ val stagesToRollback = HashSet[Stage](mapStage)
Review comment:
Hmm, in the original logic, won't `mapStage` be added to `stagesToRollback` by
`collectStagesToRollback`?
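
One way to reason about this question is with a small standalone model of the traversal described in the code comment above. The sketch below is not the DAGScheduler code: `Stage` here is a hypothetical two-field case class, `RollbackSketch` and the four-stage DAG are made up for illustration, and `collectStagesToRollback` is written to follow the comment's description (walk from the leaf stages toward the parents, and when a chain reaches a stage already in the set, add every stage after it on that chain). Under those assumptions, a set seeded with `failedStage` never reaches `mapStage`, because the walk stops at `failedStage` before visiting its parent.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for Spark's Stage: a stage knows only its parents.
final case class Stage(id: Int, parents: List[Stage])

object RollbackSketch {
  // Walks from each leaf toward the parents. When a chain reaches a stage that is
  // already in `stagesToRollback`, every later stage on that chain is added as well.
  def collectStagesToRollback(
      leaves: List[Stage],
      stagesToRollback: mutable.HashSet[Stage]): Unit = {
    def visit(chain: List[Stage]): Unit = {
      val head = chain.head
      if (stagesToRollback.contains(head)) {
        chain.drop(1).foreach(s => stagesToRollback += s)
      } else {
        head.parents.foreach(p => visit(p :: chain))
      }
    }
    leaves.foreach(leaf => visit(List(leaf)))
  }

  // Assumed example DAG: mapStage feeds both failedStage and a sibling stage,
  // and both of those feed the result stage.
  val mapStage: Stage = Stage(0, Nil)
  val failedStage: Stage = Stage(1, List(mapStage))
  val siblingStage: Stage = Stage(2, List(mapStage))
  val resultStage: Stage = Stage(3, List(failedStage, siblingStage))

  // Returns the sorted ids of the stages collected when starting from `seed`.
  def rollbackIds(seed: Stage): List[Int] = {
    val stages = mutable.HashSet[Stage](seed)
    collectStagesToRollback(List(resultStage), stages)
    stages.map(_.id).toList.sorted
  }

  def main(args: Array[String]): Unit = {
    println(rollbackIds(failedStage)) // prints List(1, 3)
    println(rollbackIds(mapStage))    // prints List(0, 1, 2, 3)
  }
}
```

In this toy DAG, seeding with `failedStage` leaves out both `mapStage` and the sibling stage that also reads its output, while seeding with `mapStage` collects all four stages. Whether the same holds in the scheduler depends on details that this model does not capture.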