mridulm commented on code in PR #38371:
URL: https://github.com/apache/spark/pull/38371#discussion_r1009018499
##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -3089,13 +3089,14 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
submit(finalRdd, Array(0, 1), properties = new Properties())
// Finish the first 2 shuffle map stages.
- completeShuffleMapStageSuccessfully(0, 0, 2)
+ completeShuffleMapStageSuccessfully(0, 0, 2, Seq("hostA", "hostB"))
Review Comment:
This change is not required.
The fetch failure is caused by the stage 1 partition on hostB going missing.
By default, `completeShuffleMapStageSuccessfully` completes tasks
progressively on hostA, hostB, and so on, so the failure results in
recomputing stage 0 (it has two partitions, one on hostA and one on hostB)
as well as stage 1 (and stage 2, of course).
In this case, stage 0 is recomputed because it has output on hostB.
If this is confusing, we can document it in the Scaladoc of
`completeShuffleMapStageSuccessfully`.
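
To illustrate the default described above, here is a minimal standalone sketch of how hosts might be assigned per task index. It does not call Spark: `HostAssignmentSketch`, `hostFor`, and `hostsFor` are hypothetical names, and the `host` + letter naming scheme is an assumption inferred from the "hostA, hostB, etc" wording rather than a copy of the helper's code.

```scala
// Hypothetical stand-in for the host selection inside the test helper.
// This is NOT the Spark implementation, only a model of the described default.
object HostAssignmentSketch {

  // Use an explicitly supplied host when one exists for this task index;
  // otherwise fall back to hostA, hostB, ... (assumed naming scheme).
  def hostFor(taskIdx: Int, hostNames: Seq[String] = Seq.empty): String =
    if (taskIdx < hostNames.size) hostNames(taskIdx)
    else s"host${('A' + taskIdx).toChar}"

  // Hosts chosen for every task of a stage with `numTasks` tasks.
  def hostsFor(numTasks: Int, hostNames: Seq[String] = Seq.empty): Seq[String] =
    (0 until numTasks).map(idx => hostFor(idx, hostNames))

  def main(args: Array[String]): Unit = {
    val defaults = hostsFor(2)
    val explicit = hostsFor(2, Seq("hostA", "hostB"))
    println(defaults.mkString(", "))
    println(explicit.mkString(", "))
    // Under this model both calls choose the same hosts.
    assert(defaults == explicit)
  }
}
```

Under these assumptions, passing `Seq("hostA", "hostB")` for a two-task stage selects the same hosts as the default, which is why the extra argument in the diff would be redundant.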
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.