Re: [PR] [WIP][CELEBORN-1319] Optimize skew partition logic for Reduce Mode to avoid sorting shuffle files [celeborn]

via GitHub Fri, 12 Apr 2024 00:48:51 -0700


wangshengjie123 commented on code in PR #2373:
URL: https://github.com/apache/celeborn/pull/2373#discussion_r1562166228



##########
client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java:
##########
@@ -1025,6 +1034,10 @@ public void onSuccess(ByteBuffer response) {
                       attemptId,
                       partitionId,
                       nextBatchId);
+                  if (dataPushFailureTrackingEnabled) {
+                    pushState.addFailedBatch(
+                        latest.getUniqueId(), new PushFailedBatch(mapId, 
attemptId, nextBatchId));

Review Comment:
   The initial design intention was that any Revive could potentially lead to 
duplicate data reads because the current task is unable to perform 
deduplication at the mapId-reduce level:
   
   Before this pr, one reduce task will read some determine mapId`s data, but 
current one mapId`s data maybe read by multi-sub-reduce tasks partitioned by 
PartitionLocation, for example:
   
   - map1 write reduce1 shuffle data with 2 batch: batch1 and batch2,
   - batch1 pushed to PartitionLocation1 suceess,
   - but batch2 failed due to only Primary peer succeed, then batch2 Revive to 
PartitionLocation2
   - sub-redcue1 will read PartitionLocation1 with batch1 and maybe read batch2
   - sub-reduce2 will read PartitionLocation2 with batch2 again
   
   Therefore, all Revive-related push batch will be record.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [WIP][CELEBORN-1319] Optimize skew partition logic for Reduce Mode to avoid sorting shuffle files [celeborn]

Reply via email to