[GitHub] [spark] Ngone51 commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

GitBox Sun, 18 Oct 2020 22:25:06 -0700


Ngone51 commented on a change in pull request #30062:
URL: https://github.com/apache/spark/pull/30062#discussion_r507462453




##########
File path: 
common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
##########
@@ -363,4 +363,26 @@ public boolean useOldFetchProtocol() {
     return conf.getBoolean("spark.shuffle.useOldFetchProtocol", false);
   }
 
+  /**
+   * The minimum size of a chunk when dividing a merged shuffle file into 
multiple chunks during
+   * push-based shuffle.
+   * A merged shuffle file consists of multiple small shuffle blocks. Fetching 
the
+   * complete merged shuffle file in a single response increases the memory 
requirements for the

Review comment:
       I'm not sure if you're aware of the feature that Spark can save the 
shuffle data into the disk when the data is too large to hold in the memory.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] Ngone51 commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

Reply via email to