beliefer commented on a change in pull request #28096:
[SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
URL: https://github.com/apache/spark/pull/28096#discussion_r402073245
##########
File path: docs/sql-performance-tuning.md
##########
@@ -193,34 +200,38 @@ Adaptive Query Execution (AQE) is an optimization technique in Spark SQL that ma
### Coalescing Post Shuffle Partitions
This feature coalesces the post shuffle partitions based on the map output
statistics when both `spark.sql.adaptive.enabled` and
`spark.sql.adaptive.coalescePartitions.enabled` configurations are true. This
feature simplifies the tuning of shuffle partition number when running queries.
You do not need to set a proper shuffle partition number to fit your dataset.
Spark can pick the proper shuffle partition number at runtime once you set a
large enough initial number of shuffle partitions via
`spark.sql.adaptive.coalescePartitions.initialPartitionNum` configuration.
<table class="table">
- <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+ <tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>
<tr>
<td><code>spark.sql.adaptive.coalescePartitions.enabled</code></td>
<td>true</td>
<td>
When true and <code>spark.sql.adaptive.enabled</code> is true, Spark
will coalesce contiguous shuffle partitions according to the target size
(specified by <code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code>), to
avoid too many small tasks.
</td>
+ <td>3.0.0</td>
</tr>
<tr>
<td><code>spark.sql.adaptive.coalescePartitions.minPartitionNum</code></td>
<td>Default Parallelism</td>
<td>
The minimum number of shuffle partitions after coalescing. If not set,
the default value is the default parallelism of the Spark cluster. This
configuration only has an effect when <code>spark.sql.adaptive.enabled</code>
and <code>spark.sql.adaptive.coalescePartitions.enabled</code> are both enabled.
</td>
+ <td>3.0.0</td>
</tr>
<tr>
<td><code>spark.sql.adaptive.coalescePartitions.initialPartitionNum</code></td>
<td>200</td>
<td>
The initial number of shuffle partitions before coalescing. By default
it equals to <code>spark.sql.shuffle.partitions</code>. This configuration only
has an effect when <code>spark.sql.adaptive.enabled</code> and
<code>spark.sql.adaptive.coalescePartitions.enabled</code> are both enabled.
</td>
+ <td>3.0.0</td>
Review comment:
SPARK-31037, commit ID:
46b7f1796bd0b96977ce9b473601033f397a3b18#diff-9a6b543db706f1a90f790783d6930a13
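
For context on the properties in the hunk above, here is a minimal sketch of how the three settings fit together, assuming Spark 3.0.0 or later. The helper names `coalescing_conf` and `coalescing_active` are hypothetical and exist only for illustration; they are not part of Spark. The sketch only models the rule stated in the doc text (coalescing takes effect when both flags are true) using explicitly set values, and does not account for Spark's built-in defaults.

```python
# Illustrative sketch only: the helper functions are hypothetical, not Spark APIs.
# The property names are the ones documented in the table quoted above.
AQE_ENABLED = "spark.sql.adaptive.enabled"
COALESCE_ENABLED = "spark.sql.adaptive.coalescePartitions.enabled"
INITIAL_PARTITION_NUM = "spark.sql.adaptive.coalescePartitions.initialPartitionNum"


def coalescing_conf(initial_partitions):
    """Build the settings that turn on post-shuffle partition coalescing."""
    if initial_partitions <= 0:
        raise ValueError("initial_partitions must be positive")
    return {
        AQE_ENABLED: "true",
        COALESCE_ENABLED: "true",
        # A deliberately large starting point; Spark coalesces down at runtime.
        INITIAL_PARTITION_NUM: str(initial_partitions),
    }


def coalescing_active(conf):
    """Return True only when both flags are explicitly set to true."""
    return conf.get(AQE_ENABLED) == "true" and conf.get(COALESCE_ENABLED) == "true"


conf = coalescing_conf(1000)
# With a live SparkSession named `spark` (an assumption, not created here),
# each entry could be applied with spark.conf.set(key, value).
```

The value 1000 is an arbitrary example of a "large enough" initial partition number, not a recommendation.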