CodingCat commented on code in PR #2358:
URL: https://github.com/apache/incubator-celeborn/pull/2358#discussion_r1522471770


##########
client-spark/spark-2/src/main/java/org/apache/spark/shuffle/celeborn/SortBasedShuffleWriter.java:
##########
@@ -211,7 +211,7 @@ private void fastWrite0(scala.collection.Iterator iterator) 
throws IOException {
 
   private void doPush() throws IOException {
     long start = System.nanoTime();
-    pusher.pushData();
+    pusher.pushData(false);

Review Comment:
   I didn't intend to backport this functionality to spark-2. Shall I?



##########
client-spark/common/src/main/java/org/apache/spark/shuffle/celeborn/SortBasedPusher.java:
##########
@@ -169,6 +238,7 @@ public long pushData() throws IOException {
                   numPartitions);
           mapStatusLengths[currentPartition].add(bytesWritten);
           afterPush.accept(bytesWritten);
+          memoryThresholdManager.updateStats(offSet, offSet == 
pushBufferMaxSize);

Review Comment:
   Actually, I tried comparing against `pushBufferMaxSize / (1 + factor)` and found that my unit test always failed in that case. The reason is that we triggered line 251, `memoryThresholdManager.updateStats(offSet, true);`, many times with a partially filled buffer.
   
   I agree that `offSet == pushBufferMaxSize` is rarely true, but to make a precise decision about whether to increase the buffer we still need to track "shouldPushedBytes" and "shouldPushedCount", so we still need a switch in the `updateStats` method controlling whether to update these two values.
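   To make that switch concrete, here is a minimal Java sketch of how `updateStats` could gate the "shouldPushed" counters on the buffer-full flag. This is an illustration only: apart from `updateStats` and the two counter names mentioned above, the class shape, the `bufferFullyUsed` parameter name, and the `shouldGrowBuffer` decision rule are assumptions, not the actual Celeborn implementation.
   
   ```java
   // Sketch of a MemoryThresholdManager (assumed shape): tracks every push,
   // but only counts toward buffer growth when the buffer was completely full.
   public class MemoryThresholdManager {
     private long pushedBytes = 0L;
     private long pushedCount = 0L;
     // Updated only when the push happened with a full buffer, i.e. callers
     // pass `offSet == pushBufferMaxSize` as the second argument.
     private long shouldPushedBytes = 0L;
     private long shouldPushedCount = 0L;
   
     public void updateStats(long bytesWritten, boolean bufferFullyUsed) {
       pushedBytes += bytesWritten;
       pushedCount += 1;
       if (bufferFullyUsed) {
         shouldPushedBytes += bytesWritten;
         shouldPushedCount += 1;
       }
     }
   
     // Example decision rule (an assumption): grow the buffer only when the
     // fraction of full-buffer pushes exceeds the given factor, so pushes of
     // partially filled buffers no longer trigger a spurious increase.
     public boolean shouldGrowBuffer(double factor) {
       return pushedCount > 0 && (double) shouldPushedCount / pushedCount > factor;
     }
   }
   ```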
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
