Re: [PR] [CELEBORN-1319] Optimize skew partition logic for Reduce Mode to avoid sorting shuffle files [celeborn]

via GitHub Tue, 14 Jan 2025 04:41:39 -0800


RexXiong commented on code in PR #2373:
URL: https://github.com/apache/celeborn/pull/2373#discussion_r1914748980



##########
client/src/main/java/org/apache/celeborn/client/ShuffleClient.java:
##########
@@ -252,6 +257,8 @@ public abstract CelebornInputStream readPartition(
       ExceptionMaker exceptionMaker,
       ArrayList<PartitionLocation> locations,
       ArrayList<PbStreamHandler> streamHandlers,
+      Map<String, Set<PushFailedBatch>> failedBatchSetMap,

Review Comment:
   > For a non skewed stage, we handle this right
   
   In a non-skewed stage this is unnecessary: the reducer reads data by map 
range, so it can deduplicate identical batches while it processes the entire 
output of each map task. In a skewed stage, however, the reducer reads only 
part of the data, in chunks that may originate from any of the map tasks. 
Identical batches can then appear in different chunks, and the reducer cannot 
deduplicate them unless it knows which batches must not be read. That is why 
every map task should report its failedBatches, the batches that must not be 
read, to LifecycleManager.
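
   To illustrate the point, here is a minimal sketch of how a reducer could 
use such a map to skip batches while reading a single chunk. It is not the 
code in this PR: `BatchKey`, `readChunk`, and the `loc-0` key are hypothetical 
names, and keying the map by a location identifier is an assumption made only 
for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SkewedReadSketch {
    /** Hypothetical batch identity for illustration; not Celeborn's PushFailedBatch. */
    record BatchKey(int mapId, int attemptId, int batchId) {}

    /**
     * Returns the batches of one chunk that the reducer should consume.
     * Batches listed as failed for this location are skipped, so a copy
     * left behind by a failed push is never read alongside another copy
     * of the same batch that sits in a different chunk.
     */
    static List<BatchKey> readChunk(
            String locationId,
            List<BatchKey> chunk,
            Map<String, Set<BatchKey>> failedBatchSetMap) {
        // Assumption: the map is keyed by a location identifier.
        Set<BatchKey> failed = failedBatchSetMap.getOrDefault(locationId, Set.of());
        List<BatchKey> accepted = new ArrayList<>();
        for (BatchKey batch : chunk) {
            if (!failed.contains(batch)) {
                accepted.add(batch);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Batch 7 of map 0 was reported as failed on location "loc-0".
        Map<String, Set<BatchKey>> failedBatchSetMap =
                Map.of("loc-0", Set.of(new BatchKey(0, 0, 7)));
        List<BatchKey> chunk =
                List.of(new BatchKey(0, 0, 7), new BatchKey(1, 0, 3));
        System.out.println(readChunk("loc-0", chunk, failedBatchSetMap));
    }
}
```

   The reducer sees only its own chunk here, which is why the set of failed 
batches has to be supplied from outside rather than derived while reading.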



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
