siknezevic commented on a change in pull request #27246:
URL: https://github.com/apache/spark/pull/27246#discussion_r441907490
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -1238,6 +1239,18 @@ package object config {
s"The value must be in allowed range [1,048,576,
${MAX_BUFFER_SIZE_BYTES}].")
.createWithDefault(1024 * 1024)
+ private[spark] val UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE_RATIO =
+ ConfigBuilder("spark.unsafe.sorter.spill.reader.buffer.size.ratio")
Review comment:
Perhaps it is my misunderstanding of your comment from 14 days ago. You mentioned the following:

    private byte[] arr = new byte[1024 * 1024];
    If this number is performance-sensitive, could we parameterize it?

So I created a parameter. Is my understanding incorrect?
As for "Why do you use ratio instead of size?": there is an existing parameter, "spark.unsafe.sorter.spill.reader.buffer.size", which is used for a different purpose, so I am not able to reuse it here. Also, I followed your recommendation to parameterize the ratio. I am OK with creating a parameter for the entire buffer size instead of the ratio. Please let me know what makes more sense to you.
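To make the two options concrete, here is a minimal sketch of how each could be declared in package.scala. This is only an illustration, not the PR's actual code: the defaults, the bounds, and the size-based key `spark.unsafe.sorter.spill.record.buffer.size` are hypothetical placeholders.

```scala
import org.apache.spark.internal.config.ConfigBuilder
import org.apache.spark.network.util.ByteUnit

// Both declarations would live inside `package object config` in
// core/src/main/scala/org/apache/spark/internal/config/package.scala.

// Option 1 (what this PR currently does): size the spill-reader buffer as a
// ratio of another size. Default and bounds here are illustrative only.
private[spark] val UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE_RATIO =
  ConfigBuilder("spark.unsafe.sorter.spill.reader.buffer.size.ratio")
    .doubleConf
    .checkValue(r => r > 0.0 && r <= 1.0,
      "The ratio must be in the range (0, 1].")
    .createWithDefault(0.5) // hypothetical default

// Option 2 (alternative): an absolute buffer size. The key name is
// hypothetical; it cannot reuse spark.unsafe.sorter.spill.reader.buffer.size,
// which already exists for a different purpose.
private[spark] val UNSAFE_SORTER_SPILL_RECORD_BUFFER_SIZE =
  ConfigBuilder("spark.unsafe.sorter.spill.record.buffer.size")
    .bytesConf(ByteUnit.BYTE)
    .checkValue(v => v > 0 && v <= Int.MaxValue,
      s"The value must be positive and at most ${Int.MaxValue} bytes.")
    .createWithDefault(1024 * 1024)
```

The trade-off, as I see it: a ratio scales automatically with the quantity it is relative to, while an absolute size is more explicit for operators but may need per-workload tuning.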