waitinfuture commented on code in PR #2609:
URL: https://github.com/apache/celeborn/pull/2609#discussion_r1677126884


##########
client-spark/spark-2/src/main/java/org/apache/spark/shuffle/celeborn/SparkUtils.java:
##########
@@ -136,6 +136,11 @@ public static int celebornShuffleId(
     }
   }
 
+  public static int getMapAttemptNumber(TaskContext context) {
+    assert (context.stageAttemptNumber() < (1 << 15) && 
context.attemptNumber() < (1 << 16));
+    return (context.stageAttemptNumber() << 16) | context.attemptNumber();
+  }
+

Review Comment:
   OK, so the failed task might have pushed some batches to Celeborn worker 
before it fails, say Batch0 and Batch1. The re-executed task also pushes its 
own Batch0 and Batch1 (and more) to Celeborn worker.
   
   For these two batches, the failed task and the re-executed task have the 
same (mapaId, attemptId, batchId), Celeborn client will treat them identical 
and will filter out either, for example it can filter out re-executed tasks' 
Batch0 and Batch1, and consume the failed tasks' Batch0 and Batch1.
   
   However, as I said previously, Celeborn's design does not guarantee that 
re-executed task generates the identical Batches as the failed one (for current 
implementation, I think Celeborn does generates the same Batches if the task 
generates exactly the same data with the same order, but it's not guaranteed by 
the design and the behavior might change in the future, and `with the same 
order` is usually not satisfied), so the correct behavior should be that the 
failed task's Batch0 and Batch1 be filtered out.
   
   When `throwsFetchFailure` is disabled, this works fine because the 
re-executed task will not have the same `attemptNumber`. So I think the issue 
is essentially that stage-rerun can break this design consistency: `Batches 
with the same (mapId, attemptId, batchId) should contain the identical data`.
   
   This PR encodes `stageAttemptId` and `attemptNumber` together, making 
`attemptId` unique across stage reruns, so IMO it's necessary. WDYT? @mridulm 
Correct me if I missed something :)
   
   cc @jiang13021 @ErikFang @RexXiong 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to