[
https://issues.apache.org/jira/browse/KYLIN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245854#comment-17245854
]
ASF GitHub Bot commented on KYLIN-4829:
---------------------------------------
hit-lacus commented on a change in pull request #1495:
URL: https://github.com/apache/kylin/pull/1495#discussion_r538299640
##########
File path: kylin-spark-project/kylin-spark-common/src/main/scala/org/apache/spark/sql/execution/datasource/ResetShufflePartition.scala
##########
@@ -17,25 +17,26 @@
*/
package org.apache.spark.sql.execution.datasource
-import org.apache.kylin.common.{KylinConfig, QueryContext, QueryContextFacade}
+import org.apache.kylin.common.{KylinConfig, QueryContextFacade}
import org.apache.spark.internal.Logging
import org.apache.spark.sql.SparkSession
+import org.apache.spark.utils.SparderUtils
trait ResetShufflePartition extends Logging {
+ val PARTITION_SPLIT_BYTES: Long = KylinConfig.getInstanceFromEnv.getQueryPartitionSplitSizeMB * 1024 * 1024 // 64MB
def setShufflePartitions(bytes: Long, sparkSession: SparkSession): Unit = {
QueryContextFacade.current().addAndGetSourceScanBytes(bytes)
- val defaultParallelism = sparkSession.sparkContext.defaultParallelism
+ val defaultParallelism = SparderUtils.getTotalCore(sparkSession.sparkContext.getConf)
val kylinConfig = KylinConfig.getInstanceFromEnv
val partitionsNum = if (kylinConfig.getSparkSqlShufflePartitions != -1) {
kylinConfig.getSparkSqlShufflePartitions
} else {
- Math.min(QueryContextFacade.current().getSourceScanBytes / (
Review comment:
Looks like the original code is wrong: if
`sparkContext.defaultParallelism` is 1, `partitionsNum` will always be 1.
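A minimal sketch of the problem described above. The removed expression is truncated in the diff, so the formula here is an assumption: a size-based estimate (`scanBytes / splitBytes + 1`) capped by the parallelism value via `Math.min`. The object `ShufflePartitionSketch`, the helper `partitionsNum`, and the 10 GB scan size are hypothetical and only illustrate the arithmetic; they are not Kylin code.

```scala
object ShufflePartitionSketch {
  // Hypothetical helper (assumed formula, not the actual Kylin implementation):
  // estimate partitions from the scanned bytes, then cap at the parallelism value.
  def partitionsNum(scanBytes: Long, splitBytes: Long, parallelism: Int): Int =
    math.min(scanBytes / splitBytes + 1, parallelism.toLong).toInt

  def main(args: Array[String]): Unit = {
    val splitBytes = 64L * 1024 * 1024        // 64MB, matching the comment in the diff
    val scanBytes = 10L * 1024 * 1024 * 1024  // 10 GB scanned (made-up figure)

    // With a parallelism of 1, the cap wins no matter how much data is scanned.
    println(partitionsNum(scanBytes, splitBytes, parallelism = 1))  // 1

    // With a cap derived from the total cores (16 here), the result can grow.
    println(partitionsNum(scanBytes, splitBytes, parallelism = 16)) // 16
  }
}
```

Under this assumed formula, the size-based estimate for 10 GB is 161 partitions, so any cap below that determines the result. That is why the source of the cap matters.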
> Support to use thread-level SparkSession to execute query
> ----------------------------------------------------------
>
> Key: KYLIN-4829
> URL: https://issues.apache.org/jira/browse/KYLIN-4829
> Project: Kylin
> Issue Type: Improvement
> Components: Query Engine, Spark Engine
> Reporter: Zhichao Zhang
> Assignee: Zhichao Zhang
> Priority: Minor
> Fix For: v4.0.0-beta
>
>
> Currently, when executing a query, it is impossible to configure proper
> parameters (such as spark.sql.shuffle.partitions) for each query according to
> the data that will be scanned, which hurts query performance.