cloud-fan commented on a change in pull request #33655:
URL: https://github.com/apache/spark/pull/33655#discussion_r683612281
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -526,28 +527,26 @@ object SQLConf {
.booleanConf
.createWithDefault(true)
- private val MIN_PARTITION_SIZE_KEY = "spark.sql.adaptive.coalescePartitions.minPartitionSize"
-
val COALESCE_PARTITIONS_PARALLELISM_FIRST =
buildConf("spark.sql.adaptive.coalescePartitions.parallelismFirst")
- .doc("When true, Spark ignores the target size specified by " +
+ .doc("When true, Spark does not respect the target size specified by " +
s"'${ADVISORY_PARTITION_SIZE_IN_BYTES.key}' (default 64MB) when coalescing contiguous " +
- "shuffle partitions, and only respect the minimum partition size specified by " +
- s"'$MIN_PARTITION_SIZE_KEY' (default 1MB), to maximize the parallelism. " +
- "This is to avoid performance regression when enabling adaptive query execution. " +
- "It's recommended to set this config to false and respect the target size specified by " +
- s"'${ADVISORY_PARTITION_SIZE_IN_BYTES.key}'.")
+ "shuffle partitions, but adaptively calculate the target size according to the default " +
+ "parallelism of the Spark cluster. The calculated size is usually smaller than the " +
+ "configured target size. This is to maximize the parallelism and avoid performance " +
+ "regression when enabling adaptive query execution. It's recommended to set this config " +
+ "to false and respect the configured target size.")
.version("3.2.0")
.booleanConf
.createWithDefault(true)
val COALESCE_PARTITIONS_MIN_PARTITION_SIZE =
buildConf("spark.sql.adaptive.coalescePartitions.minPartitionSize")
- .doc("The minimum size of shuffle partitions after coalescing. Its value can be at most " +
- s"20% of '${ADVISORY_PARTITION_SIZE_IN_BYTES.key}'. This is useful when the target size " +
Review comment:
The limitation is not really necessary. Most likely no one would set it,
but if anyone needs to set it, we should make it flexible.
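For illustration only, here is a minimal sketch of how a user might set the configs discussed above. It assumes a `spark-shell` session, where `spark` is the predefined `SparkSession`. The sizes are arbitrary example values, not recommendations.

```scala
// Assumes spark-shell, where `spark` is already defined.
// All values below are illustrative assumptions.

// Enable adaptive query execution and partition coalescing.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

// Respect the configured target size instead of maximizing parallelism.
spark.conf.set("spark.sql.adaptive.coalescePartitions.parallelismFirst", "false")

// Target size that ADVISORY_PARTITION_SIZE_IN_BYTES refers to.
spark.conf.set("spark.sql.adaptive.advisoryPartitionSizeInBytes", "64MB")

// Minimum partition size after coalescing. 32MB is more than 20% of the
// 64MB target (12.8MB), so it exceeds the documented limit questioned here.
spark.conf.set("spark.sql.adaptive.coalescePartitions.minPartitionSize", "32MB")
```

If the 20% limitation is dropped as suggested, a value like the 32MB above would presumably be accepted as configured.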
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]