venkata91 commented on a change in pull request #34122:
URL: https://github.com/apache/spark/pull/34122#discussion_r797872918



##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -2313,7 +2326,7 @@ private[spark] class DAGScheduler(
     // Register merge statuses if the stage is still running and shuffle merge 
is not finalized yet.
     // TODO: SPARK-35549: Currently merge statuses results which come after 
shuffle merge
     // TODO: is finalized is not registered.
-    if (runningStages.contains(stage) && 
!stage.shuffleDep.shuffleMergeFinalized) {
+    if (runningStages.contains(stage) && 
!stage.shuffleDep.isShuffleMergeFinalizedMarked) {

Review comment:
       @mridulm If `shuffleMergeId` is not same we are not finalizing the merge 
right? But there could be a `finalize` request for the correct `shuffleMergeId` 
but we could have messed up the `MergeStatuses` in that case coming from a 
`shuffleMergeId`. Ok makes sense. Will file a separate JIRA for this one as 
well.
   
   On a side note, this is also another reason why we can 
`setShuffleMergeAllowed(false)` for determinate stages in 
`handleShuffleMergeFinalized` because it is possible that we might not finalize 
due to different `shuffleMergeId` finalize request.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to