[GitHub] [kylin] zzcclp commented on a change in pull request #1495: KYLIN-4829 Support to use thread-level SparkSession to execute query

GitBox Tue, 08 Dec 2020 06:14:36 -0800


zzcclp commented on a change in pull request #1495:
URL: https://github.com/apache/kylin/pull/1495#discussion_r538418158




##########
File path: kylin-spark-project/kylin-spark-common/src/main/scala/org/apache/spark/sql/execution/datasource/ResetShufflePartition.scala
##########
@@ -17,25 +17,26 @@
  */
 package org.apache.spark.sql.execution.datasource
 
-import org.apache.kylin.common.{KylinConfig, QueryContext, QueryContextFacade}
+import org.apache.kylin.common.{KylinConfig, QueryContextFacade}
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.SparkSession
+import org.apache.spark.utils.SparderUtils
 
 trait ResetShufflePartition extends Logging {
+  val PARTITION_SPLIT_BYTES: Long = KylinConfig.getInstanceFromEnv.getQueryPartitionSplitSizeMB * 1024 * 1024 // 64MB
 
   def setShufflePartitions(bytes: Long, sparkSession: SparkSession): Unit = {
     QueryContextFacade.current().addAndGetSourceScanBytes(bytes)
-    val defaultParallelism = sparkSession.sparkContext.defaultParallelism
+    val defaultParallelism = SparderUtils.getTotalCore(sparkSession.sparkContext.getConf)
     val kylinConfig = KylinConfig.getInstanceFromEnv
     val partitionsNum = if (kylinConfig.getSparkSqlShufflePartitions != -1) {
       kylinConfig.getSparkSqlShufflePartitions
     } else {
-      Math.min(QueryContextFacade.current().getSourceScanBytes / (

Review comment:
       Using `Math.min` is correct here: it ensures that the number of shuffle partitions never exceeds the total number of cores.
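
To illustrate the capping behaviour described in this comment, here is a minimal standalone sketch. It is not the code in the pull request: `ShufflePartitionSketch` and `partitionsFor` are hypothetical names, and the size-based estimate (scanned bytes divided by a split size, plus one) is an assumption about the shape of the truncated expression in the diff.

```scala
object ShufflePartitionSketch {
  // Hypothetical helper, not part of Kylin.
  // Assumption: the size-based estimate is scanBytes / splitBytes + 1.
  def partitionsFor(scanBytes: Long, splitBytes: Long, totalCores: Int): Int = {
    require(splitBytes > 0, "splitBytes must be positive")
    require(totalCores > 0, "totalCores must be positive")
    // Estimate how many partitions the scanned data would need.
    val bySize = scanBytes / splitBytes + 1
    // math.min caps the estimate at the core count, so the result
    // can never be larger than totalCores.
    math.min(bySize, totalCores.toLong).toInt
  }
}
```

With a 64 MB split size, a 640 MB scan on 4 cores is capped at 4 partitions, while a 100 MB scan on 8 cores stays at the size-based estimate of 2.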




