This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 4d409e6bd7 Publish built docs triggered by 161c6d32824fc87307341f942ffad7b4d452c82f
4d409e6bd7 is described below
commit 4d409e6bd772b5025298ade0755cbc436ee53f9e
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Wed Aug 9 16:14:13 2023 +0000
Publish built docs triggered by 161c6d32824fc87307341f942ffad7b4d452c82f
---
_sources/user-guide/configs.md.txt | 2 ++
searchindex.js | 2 +-
user-guide/configs.html | 8 ++++++++
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index bff7cb4da0..63c9c064bc 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -57,6 +57,8 @@ Environment variables are read during `SessionConfig` initialisation so they mus
| datafusion.execution.parquet.reorder_filters | false | If
true, filter expressions evaluated during the parquet decoding operation will
be reordered heuristically to minimize the cost of evaluation. If false, the
filters are applied in the same order as written in the query
[...]
| datafusion.execution.aggregate.scalar_update_factor | 10 |
Specifies the threshold for using `ScalarValue`s to update accumulators during
high-cardinality aggregations for each input batch. The aggregation is
considered high-cardinality if the number of affected groups is greater than or
equal to `batch_size / scalar_update_factor`. In such cases, `ScalarValue`s are
utilized for updating accumulators, rather than the default batch-slice
approach. This can lead to perform [...]
| datafusion.execution.planning_concurrency | 0 |
Fan-out during initial physical planning. This is mostly use to plan `UNION`
children in parallel. Defaults to the number of CPU cores on the system
[...]
+| datafusion.execution.sort_spill_reservation_bytes | 10485760 |
Specifies the reserved memory for each spillable sort operation to facilitate
an in-memory merge. When a sort operation spills to disk, the in-memory data
must be sorted and merged before being written to a file. This setting reserves
a specific amount of memory for that in-memory sort/merge process. Note: This
setting is irrelevant if the sort operation cannot spill (i.e., if there's no
`DiskManager` configured) [...]
+| datafusion.execution.sort_in_place_threshold_bytes | 1048576 |
When sorting, below what size should data be concatenated and sorted in a
single RecordBatch rather than sorted in batches and merged.
[...]
| datafusion.optimizer.enable_round_robin_repartition | true |
When set to true, the physical plan optimizer will try to add round robin
repartitioning to increase parallelism to leverage more CPU cores
[...]
| datafusion.optimizer.filter_null_join_keys | false |
When set to true, the optimizer will insert filters before a join between a
nullable and non-nullable column to filter out nulls on the nullable side. This
filter can add additional overhead when the file format does not fully support
predicate push down.
[...]
| datafusion.optimizer.repartition_aggregations | true |
Should DataFusion repartition data using the aggregate keys to execute
aggregates in parallel using the provided `target_partitions` level
[...]
diff --git a/searchindex.js b/searchindex.js
index f3a4227d72..e74b1d83ee 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["contributor-guide/architecture",
"contributor-guide/communication", "contributor-guide/index",
"contributor-guide/quarterly_roadmap", "contributor-guide/roadmap",
"contributor-guide/specification/index",
"contributor-guide/specification/invariants",
"contributor-guide/specification/output-field-name-semantic", "index",
"user-guide/cli", "user-guide/configs", "user-guide/dataframe",
"user-guide/example-usage", "user-guide/expressions", "user-guide/faq", "use
[...]
\ No newline at end of file
+Search.setIndex({"docnames": ["contributor-guide/architecture",
"contributor-guide/communication", "contributor-guide/index",
"contributor-guide/quarterly_roadmap", "contributor-guide/roadmap",
"contributor-guide/specification/index",
"contributor-guide/specification/invariants",
"contributor-guide/specification/output-field-name-semantic", "index",
"user-guide/cli", "user-guide/configs", "user-guide/dataframe",
"user-guide/example-usage", "user-guide/expressions", "user-guide/faq", "use
[...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index 0bd49c9b1d..0962850732 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -422,6 +422,14 @@ Environment variables are read during <code class="docutils literal notranslate"
<td><p>0</p></td>
<td><p>Fan-out during initial physical planning. This is mostly use to plan
<code class="docutils literal notranslate"><span
class="pre">UNION</span></code> children in parallel. Defaults to the number of
CPU cores on the system</p></td>
</tr>
+<tr class="row-even"><td><p>datafusion.execution.sort_spill_reservation_bytes</p></td>
+<td><p>10485760</p></td>
+<td><p>Specifies the reserved memory for each spillable sort operation to
facilitate an in-memory merge. When a sort operation spills to disk, the
in-memory data must be sorted and merged before being written to a file. This
setting reserves a specific amount of memory for that in-memory sort/merge
process. Note: This setting is irrelevant if the sort operation cannot spill
(i.e., if there’s no <code class="docutils literal notranslate"><span
class="pre">DiskManager</span></code> configu [...]
+</tr>
+<tr class="row-odd"><td><p>datafusion.execution.sort_in_place_threshold_bytes</p></td>
+<td><p>1048576</p></td>
+<td><p>When sorting, below what size should data be concatenated and sorted in
a single RecordBatch rather than sorted in batches and merged.</p></td>
+</tr>
<tr class="row-even"><td><p>datafusion.optimizer.enable_round_robin_repartition</p></td>
<td><p>true</p></td>
<td><p>When set to true, the physical plan optimizer will try to add round
robin repartitioning to increase parallelism to leverage more CPU cores</p></td>