This is an automated email from the ASF dual-hosted git repository.

github-actions[bot] pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 4c447b54ed Publish built docs triggered by 070d0135330a1d084c3b4d510c782079d2cf60f5
4c447b54ed is described below

commit 4c447b54edd70181e324a846ae4f24596f1835e8
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Wed May 27 22:09:02 2026 +0000

    Publish built docs triggered by 070d0135330a1d084c3b4d510c782079d2cf60f5
---
 _sources/user-guide/configs.md.txt | 2 +-
 searchindex.js                     | 2 +-
 user-guide/configs.html            | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index 576137bda2..9856a13f00 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -149,7 +149,7 @@ The following configuration settings are available:
 | datafusion.optimizer.enable_dynamic_filter_pushdown                     | 
true                      | When set to true attempts to push down dynamic 
filters generated by operators (TopK, Join & Aggregate) into the file scan 
phase. For example, for a query such as `SELECT * FROM t ORDER BY timestamp 
DESC LIMIT 10`, the optimizer will attempt to push down the current top 10 
timestamps that the TopK operator references into the file scans. This means 
that if we already have 10 timestamps  [...]
 | datafusion.optimizer.filter_null_join_keys                              | 
false                     | When set to true, the optimizer will insert filters 
before a join between a nullable and non-nullable column to filter out nulls on 
the nullable side. This filter can add additional overhead when the file format 
does not fully support predicate push down.                                     
                                                                                
                 [...]
 | datafusion.optimizer.repartition_aggregations                           | 
true                      | Should DataFusion repartition data using the 
aggregate keys to execute aggregates in parallel using the provided 
`target_partitions` level                                                       
                                                                                
                                                                                
                                    [...]
-| datafusion.optimizer.repartition_file_min_size                          | 
10485760                  | Minimum total files size in bytes to perform file 
scan repartitioning.                                                            
                                                                                
                                                                                
                                                                                
                   [...]
+| datafusion.optimizer.repartition_file_min_size                          | 
1048576                   | Minimum total file size in bytes for file-group 
byte-range splitting to fire. Files (or merged file groups) smaller than this 
stay as one partition. Lower values produce more, smaller partitions — better 
at filling `target_partitions` worth of cores when files are modestly sized, at 
the cost of slightly more per-partition open / metadata-load overhead.          
                         [...]
 | datafusion.optimizer.repartition_joins                                  | 
true                      | Should DataFusion repartition data using the join 
keys to execute joins in parallel using the provided `target_partitions` level  
                                                                                
                                                                                
                                                                                
                   [...]
 | datafusion.optimizer.allow_symmetric_joins_without_pruning              | 
true                      | Should DataFusion allow symmetric hash joins for 
unbounded data sources even when its inputs do not have any ordering or 
filtering If the flag is not enabled, the SymmetricHashJoin operator will be 
unable to prune its internal buffers, resulting in certain join types - such as 
Full, Left, LeftAnti, LeftSemi, Right, RightAnti, and RightSemi - being 
produced only at the end of the execut [...]
 | datafusion.optimizer.repartition_file_scans                             | 
true                      | When set to `true`, datasource partitions will be 
repartitioned to achieve maximum parallelism. This applies to both in-memory 
partitions and FileSource's file groups (1 group is 1 partition). For 
FileSources, only Parquet and CSV formats are currently supported. If set to 
`true` for a FileSource, all files will be repartitioned evenly (i.e., a single 
large file might be partitioned in [...]
diff --git a/searchindex.js b/searchindex.js
index 07fd70d560..42767edc73 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[74,"op-neq"]],"!~":[[74,"op-re-not-match"]],"!~*":[[74,"op-re-not-match-i"]],"!~~":[[74,"id19"]],"!~~*":[[74,"id20"]],"#":[[74,"op-bit-xor"]],"%":[[74,"op-modulo"]],"&":[[74,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[15,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[74,"op-multiply"]],"+":[[74,"op-plus"]],"-":[[74,"op-minus"]],"/":[[74,"op-divide"]],"1.
 Array Literal Con [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[74,"op-neq"]],"!~":[[74,"op-re-not-match"]],"!~*":[[74,"op-re-not-match-i"]],"!~~":[[74,"id19"]],"!~~*":[[74,"id20"]],"#":[[74,"op-bit-xor"]],"%":[[74,"op-modulo"]],"&":[[74,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[15,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[74,"op-multiply"]],"+":[[74,"op-plus"]],"-":[[74,"op-minus"]],"/":[[74,"op-divide"]],"1.
 Array Literal Con [...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index af879aa398..b59ef107b7 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -808,8 +808,8 @@ example, to configure <code class="docutils literal 
notranslate"><span class="pr
 <td><p>Should DataFusion repartition data using the aggregate keys to execute 
aggregates in parallel using the provided <code class="docutils literal 
notranslate"><span class="pre">target_partitions</span></code> level</p></td>
 </tr>
 <tr 
class="row-even"><td><p>datafusion.optimizer.repartition_file_min_size</p></td>
-<td><p>10485760</p></td>
-<td><p>Minimum total files size in bytes to perform file scan 
repartitioning.</p></td>
+<td><p>1048576</p></td>
+<td><p>Minimum total file size in bytes for file-group byte-range splitting to 
fire. Files (or merged file groups) smaller than this stay as one partition. 
Lower values produce more, smaller partitions — better at filling <code 
class="docutils literal notranslate"><span 
class="pre">target_partitions</span></code> worth of cores when files are 
modestly sized, at the cost of slightly more per-partition open / metadata-load 
overhead.</p></td>
 </tr>
 <tr class="row-odd"><td><p>datafusion.optimizer.repartition_joins</p></td>
 <td><p>true</p></td>
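
The changed row for `datafusion.optimizer.repartition_file_min_size` describes a size threshold below which a file (or merged file group) stays as one partition. The sketch below models that decision in plain Python to show the effect of the lower default. It is not DataFusion's implementation: the function `plan_byte_ranges` and its even-split policy are hypothetical, and only the two default values (10485760 and 1048576) come from the diff.

```python
OLD_MIN_SIZE = 10_485_760  # previous default shown in the diff (10 MiB)
NEW_MIN_SIZE = 1_048_576   # new default shown in the diff (1 MiB)


def plan_byte_ranges(file_sizes, target_partitions, min_size):
    """Hypothetical model of the threshold, not DataFusion's algorithm.

    Split the total bytes evenly into byte ranges only when the total
    reaches min_size; otherwise keep everything as one partition.
    """
    total = sum(file_sizes)
    if total == 0 or total < min_size or target_partitions <= 1:
        return [(0, total)]
    chunk = -(-total // target_partitions)  # ceiling division
    return [(start, min(start + chunk, total)) for start in range(0, total, chunk)]


sizes = [3 * 1024 * 1024, 2 * 1024 * 1024]  # 5 MiB of files in total
old_plan = plan_byte_ranges(sizes, 4, OLD_MIN_SIZE)
new_plan = plan_byte_ranges(sizes, 4, NEW_MIN_SIZE)
print(len(old_plan), len(new_plan))
```

Under these assumptions, 5 MiB of input stays as a single partition with the old threshold and is split into four byte ranges with the new one, which matches the trade-off the new description states: more, smaller partitions at the cost of some extra per-partition overhead.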

