waitinfuture commented on code in PR #2609:
URL: https://github.com/apache/celeborn/pull/2609#discussion_r1677126884


##########
client-spark/spark-2/src/main/java/org/apache/spark/shuffle/celeborn/SparkUtils.java:
##########
@@ -136,6 +136,11 @@ public static int celebornShuffleId(
     }
   }
 
+  public static int getMapAttemptNumber(TaskContext context) {
+    assert (context.stageAttemptNumber() < (1 << 15) && 
context.attemptNumber() < (1 << 16));
+    return (context.stageAttemptNumber() << 16) | context.attemptNumber();
+  }
+

Review Comment:
   OK, so the failed task might have pushed some batches to Celeborn worker 
before it fails, say Batch0 and Batch1. The re-executed task also pushes its 
own Batch0 and Batch1 (and more) to Celeborn worker.
   
   For these two batches, the failed task and the re-executed task have the 
same (mapaId, attemptNumber, batchId), Celeborn client will treat them 
identical and will filter out either, for example it can filter out re-executed 
tasks' Batch0 and Batch1, and consume the failed tasks' Batch0 and Batch1.
   
   However, as I said previously, Celeborn's design does not guarantee that 
re-executed task generates the identical Batches as the failed one (for current 
implementation, I think Celeborn does generates the same Batches if the task 
generates exactly the same data with the same order, but it's not guaranteed by 
the design and the behavior might change in the future, and `with the same 
order` is usually not satisfied for shuffle read), so the correct behavior 
should be that the failed task's Batch0 and Batch1 be filtered out.
   
   When `throwsFetchFailure` is disabled and not barrier stage, this works fine 
because the re-executed task will not have the same `attemptNumber`. So I think 
the issue is essentially that stage-rerun can break this design consistency: 
`Batches with the same (mapId, attemptId, batchId) should contain the identical 
data`.
   
   cc @mridulm @jiang13021 @ErikFang @RexXiong 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to