[GitHub] [spark] Ngone51 commented on a change in pull request #30312: [SPARK-32917][SHUFFLE][CORE] Adds support for executors to push shuffle blocks after successful map task completion

GitBox Thu, 03 Dec 2020 00:30:26 -0800


Ngone51 commented on a change in pull request #30312:
URL: https://github.com/apache/spark/pull/30312#discussion_r534913268




##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -1992,4 +1992,32 @@ package object config {
       .version("3.1.0")
       .doubleConf
       .createWithDefault(5)
+
+  private[spark] val SHUFFLE_NUM_PUSH_THREADS =
+    ConfigBuilder("spark.shuffle.push.numPushThreads")
+      .doc("Specify the number of threads in the block pusher pool. These threads assist " +
+        "in creating connections and pushing blocks to remote shuffle services when push based " +
+        "shuffle is enabled. By default, the threadpool size is equal to the number of cores.")
+      .version("3.1.0")
+      .intConf
+      .createOptional
+
+  private[spark] val SHUFFLE_MAX_BLOCK_SIZE_TO_PUSH =
+    ConfigBuilder("spark.shuffle.push.maxBlockSizeToPush")
+      .doc("The max size of an individual block to push to the remote shuffle services when push " +
+        "based shuffle is enabled. Blocks larger than this threshold are not pushed.")
+      .version("3.1.0")
+      .bytesConf(ByteUnit.KiB)
+      .createWithDefaultString("800k")

Review comment:
       @Victsm 
   For 1, I think 1 MiB is better (that's what Spark usually does) if you do
not see much performance difference between 800 KiB and 1 MiB. You should also
add more explanation to the conf doc, for example that the default value is
chosen to avoid potential disk throughput issues and that a small value could
lead to severe disk issues.
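
To make the threshold discussion concrete, here is a minimal, self-contained sketch of how a "max block size to push" cutoff splits blocks. It is not the actual ShuffleBlockPusher logic: the object name `PushThresholdSketch`, the method `splitBySize`, and the sample sizes are all hypothetical, and the 1 MiB default is only the value proposed in this comment (the PR currently uses 800k).

```scala
// Hypothetical sketch only; names and sizes are illustrative, not from the PR.
object PushThresholdSketch {
  // Assumed default of 1 MiB, as proposed in this review comment.
  val MaxBlockSizeToPush: Long = 1024L * 1024L

  // Splits block sizes (in bytes) into (eligible for push, not pushed).
  // Blocks larger than the threshold are not pushed, per the conf doc.
  def splitBySize(
      blockSizes: Seq[Long],
      threshold: Long = MaxBlockSizeToPush): (Seq[Long], Seq[Long]) =
    blockSizes.partition(_ <= threshold)

  def main(args: Array[String]): Unit = {
    val sizes = Seq(200L * 1024, 900L * 1024, 1024L * 1024, 4L * 1024 * 1024)
    val (pushed, skipped) = splitBySize(sizes)
    println(s"pushed=${pushed.size} skipped=${skipped.size}")
  }
}
```

With these sample sizes, moving the cutoff from 800 KiB to 1 MiB changes which blocks qualify (the 900 KiB and 1 MiB blocks become eligible), which is why the doc should explain how the default was chosen.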
   
   For 2, AFAIK, Spark doesn't know the details of AQE at the shuffle level
yet, so we don't even know whether there's a join operation in the SQL query.
How, then, can we decide inside ShuffleBlockPusher whether the skewed partition
needs to be calculated? Besides, if we want to calculate the skewed partition,
we should ensure it's the same as (or smaller than) the one calculated at the
AQE level, right? Can we guarantee that? (Maybe this is easy if both places use
the same algorithm.)
   
   It might become possible once AQE can pass more info through the task and
ShuffleBlockPusher can leverage it. But for now, I feel it's kind of hard to
interact with AQE.




