squito commented on a change in pull request #22806: [SPARK-25250][CORE] : Late zombie task completions handled correctly even before new taskset launched
URL: https://github.com/apache/spark/pull/22806#discussion_r258690709
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
 ##########
 @@ -920,6 +923,9 @@ private[spark] class TaskSetManager(
         s" be re-executed (either because the task failed with a shuffle data fetch failure," +
         s" so the previous stage needs to be re-run, or because a different copy of the task" +
         s" has already succeeded).")
+    } else if (sched.stageIdToFinishedPartitions.get(stageId).exists(
+      partitions => partitions.contains(tasks(index).partitionId))) {
+      sched.markPartitionCompletedInAllTaskSets(stageId, tasks(index).partitionId, info)
 
 Review comment:
   HashMaps are totally unsafe to use from multiple threads. It's not just that
you can read inconsistent values; the HashMap itself may be left in an undefined
state because of a concurrent rehash. See e.g.
http://javabypatel.blogspot.com/2016/01/infinite-loop-in-hashmap.html (I just
skimmed this, but I think it has the right idea).
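
   To illustrate the concern, here is a minimal sketch of one way to keep such a
map safe: route every read and write through a single lock, so no thread can
observe the map in the middle of a rehash. The object `FinishedPartitions`, its
methods `markFinished` and `isFinished`, and the `Demo` driver are hypothetical
names chosen for this example; this is not Spark's actual code, and the right
fix for this PR may well be different (for example, only touching the map from
a thread that already holds the scheduler lock).

```scala
import scala.collection.mutable

// Hypothetical stand-in for the scheduler-side bookkeeping; not Spark's code.
object FinishedPartitions {
  // A plain mutable.HashMap is not thread-safe, so every access below is
  // wrapped in `synchronized` on this object.
  private val stageIdToFinishedPartitions =
    new mutable.HashMap[Int, mutable.BitSet]

  // Record that `partitionId` of `stageId` has completed.
  def markFinished(stageId: Int, partitionId: Int): Unit = synchronized {
    stageIdToFinishedPartitions
      .getOrElseUpdate(stageId, new mutable.BitSet) += partitionId
  }

  // Reads take the same lock as writes, so they never race with a rehash.
  def isFinished(stageId: Int, partitionId: Int): Boolean = synchronized {
    stageIdToFinishedPartitions.get(stageId).exists(_.contains(partitionId))
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Four writer threads insert many distinct keys, forcing the map to grow.
    val writers = (0 until 4).map { t =>
      new Thread(new Runnable {
        def run(): Unit =
          (0 until 1000).foreach(s => FinishedPartitions.markFinished(s, t))
      })
    }
    writers.foreach(_.start())
    writers.foreach(_.join())

    // Every (stage, partition) pair written above must be visible afterwards.
    val allVisible = (0 until 1000).forall { s =>
      (0 until 4).forall(t => FinishedPartitions.isFinished(s, t))
    }
    assert(allVisible)
  }
}
```

   The important part is that the read path (`isFinished`) takes the same lock
as the write path. Guarding only the writes would still let a reader walk the
table while another thread is resizing it.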

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

