mridulm commented on code in PR #38371:
URL: https://github.com/apache/spark/pull/38371#discussion_r1009018499


##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -3089,13 +3089,14 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
     submit(finalRdd, Array(0, 1), properties = new Properties())
 
     // Finish the first 2 shuffle map stages.
-    completeShuffleMapStageSuccessfully(0, 0, 2)
+    completeShuffleMapStageSuccessfully(0, 0, 2, Seq("hostA", "hostB"))

Review Comment:
   This change is not required.
   The fetch failure is caused by the stage 1 partition on hostB going missing. 
By default, `completeShuffleMapStageSuccessfully` completes tasks progressively 
on hostA, hostB, and so on, so the failure results in recomputing stage 0 (its 
two partitions are on hostA and hostB), stage 1 (due to the fetch failure), and 
of course stage 2.
   
   In this case, since both stage 0 and stage 1 have output on hostB, both are 
recomputed.
   
   If this is confusing, we can add it to the javadoc of 
`completeShuffleMapStageSuccessfully`.
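
   To illustrate the default host assignment described above, here is a 
minimal standalone sketch. It assumes the helper uses an explicitly passed host 
name when one exists for a task index and otherwise falls back to hostA, hostB, 
and so on. `HostAssignmentSketch`, `hostFor`, and `hostsFor` are hypothetical 
names for illustration only, not part of `DAGSchedulerSuite`.

```scala
object HostAssignmentSketch {
  // Hypothetical helper: use the explicit host name for this task index if one
  // was supplied; otherwise derive hostA, hostB, ... from the index.
  def hostFor(idx: Int, hostNames: Seq[String] = Seq.empty): String =
    if (idx < hostNames.size) hostNames(idx)
    else s"host${('A' + idx).toChar}"

  // Hosts for every task of a stage with `numTasks` tasks.
  def hostsFor(numTasks: Int, hostNames: Seq[String] = Seq.empty): Seq[String] =
    (0 until numTasks).map(idx => hostFor(idx, hostNames))
}
```

   Under that assumption, `hostsFor(2)` and 
`hostsFor(2, Seq("hostA", "hostB"))` yield the same hosts, which is why passing 
the host names explicitly would not change the test's behavior.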



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

