(datafusion) branch asf-site updated: Publish built docs triggered by 84bc8761ac3a126e41658b6cd0ec6bd8cc34cda8

github-bot Fri, 05 Jun 2026 03:04:04 -0700

This is an automated email from the ASF dual-hosted git repository.

github-actions[bot] pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new 4b11a7e6fa Publish built docs triggered by 84bc8761ac3a126e41658b6cd0ec6bd8cc34cda8
4b11a7e6fa is described below

commit 4b11a7e6fac441b86a492d30c269534bf0d62337
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Fri Jun 5 10:03:52 2026 +0000

    Publish built docs triggered by 84bc8761ac3a126e41658b6cd0ec6bd8cc34cda8
---
 _sources/user-guide/configs.md.txt            |   3 +-
 _sources/user-guide/sql/format_options.md.txt |   1 +
 searchindex.js                                |   2 +-
 user-guide/configs.html                       | 218 +++++++++++++-------------
 user-guide/sql/format_options.html            |  50 +++---
 5 files changed, 143 insertions(+), 131 deletions(-)

diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index 88fbeb3de0..442b72ea9b 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -101,7 +101,8 @@ The following configuration settings are available:
 | datafusion.execution.parquet.dictionary_enabled                         | 
true                      | (writing) Sets if dictionary encoding is enabled. 
If NULL, uses default parquet writer setting                                    
                                                                                
                                                                                
                                                                                
                   [...]
 | datafusion.execution.parquet.dictionary_page_size_limit                 | 
1048576                   | (writing) Sets best effort maximum dictionary page 
size, in bytes                                                                  
                                                                                
                                                                                
                                                                                
                  [...]
 | datafusion.execution.parquet.statistics_enabled                         | 
page                      | (writing) Sets if statistics are enabled for any 
column Valid values are: "none", "chunk", and "page" These values are not case 
sensitive. If NULL, uses default parquet writer setting                         
                                                                                
                                                                                
                     [...]
-| datafusion.execution.parquet.max_row_group_size                         | 
1048576                   | (writing) Target maximum number of rows in each row 
group (defaults to 1M rows). Writing larger row groups requires more memory to 
write, but can get better compression and be faster to read.                    
                                                                                
                                                                                
                  [...]
+| datafusion.execution.parquet.max_row_group_size                         | 
1048576                   | (writing) Target maximum number of rows in each row 
group (defaults to 1M rows). Writing larger row groups requires more memory to 
write, but can get better compression and be faster to read. When 
`max_row_group_bytes` is also set, the writer flushes a row group when either 
limit is reached, whichever comes first.                                        
                                  [...]
+| datafusion.execution.parquet.max_row_group_bytes                        | 
NULL                      | (writing) Target maximum size of each row group in 
bytes. When set, the writer flushes whenever either this limit or 
`max_row_group_size` is reached, whichever comes first. Useful for bounding 
writer memory on wide schemas where a row-count limit can map to very different 
byte sizes. Matches the behavior of `parquet.block.size` in parquet-mr. If 
`None` (the default), only the row-count [...]
 | datafusion.execution.parquet.created_by                                 | 
datafusion version 53.1.0 | (writing) Sets "created by" property                
                                                                                
                                                                                
                                                                                
                                                                                
                 [...]
 | datafusion.execution.parquet.column_index_truncate_length               | 64 
                       | (writing) Sets column index truncate length            
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
 | datafusion.execution.parquet.statistics_truncate_length                 | 64 
                       | (writing) Sets statistics truncate length. If NULL, 
uses default parquet writer setting                                             
                                                                                
                                                                                
                                                                                
                 [...]
diff --git a/_sources/user-guide/sql/format_options.md.txt b/_sources/user-guide/sql/format_options.md.txt
index 46d251c18e..ca79858dae 100644
--- a/_sources/user-guide/sql/format_options.md.txt
+++ b/_sources/user-guide/sql/format_options.md.txt
@@ -142,6 +142,7 @@ The following options are available when reading or writing Parquet files. If an
 | BLOOM_FILTER_FPP                           | Yes                     | Sets 
bloom filter false positive probability (global or per column).                 
                                                                                
                                                                                
                                                                                
| `'bloom_filter_fpp'` or `'bloom_filter_fpp::col'`     | None                  
   |
 | BLOOM_FILTER_NDV                           | Yes                     | Sets 
bloom filter number of distinct values (global or per column).                  
                                                                                
                                                                                
                                                                                
| `'bloom_filter_ndv'` or `'bloom_filter_ndv::col'`     | None                  
   |
 | MAX_ROW_GROUP_SIZE                         | No                      | Sets 
the maximum number of rows per row group. Larger groups require more memory but 
can improve compression and scan efficiency.                                    
                                                                                
                                                                                
| `'max_row_group_size'`                                | 1048576               
   |
+| MAX_ROW_GROUP_BYTES                        | No                      | Sets 
the maximum size of each row group in bytes. When both this and 
`MAX_ROW_GROUP_SIZE` are set, the row group flushes whenever either limit is 
reached. Mirrors `parquet.block.size` from parquet-mr. Currently only honored 
when `allow_single_file_parallelism` is `false`; by default the parallel file 
writer ignores it.     | `'max_row_group_bytes'`                               
| None                     |
 | ENABLE_PAGE_INDEX                          | No                      | If 
true, reads the Parquet data page level metadata (the Page Index), if present, 
to reduce I/O and decoding.                                                     
                                                                                
                                                                                
   | `'enable_page_index'`                                 | true               
      |
 | PRUNING                                    | No                      | If 
true, enables row group pruning based on min/max statistics.                    
                                                                                
                                                                                
                                                                                
  | `'pruning'`                                           | true                
     |
 | SKIP_METADATA                              | No                      | If 
true, skips optional embedded metadata in the file schema.                      
                                                                                
                                                                                
                                                                                
  | `'skip_metadata'`                                     | true                
     |
diff --git a/searchindex.js b/searchindex.js
index 0ecd29960a..40581e1cbb 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[74,"op-neq"]],"!~":[[74,"op-re-not-match"]],"!~*":[[74,"op-re-not-match-i"]],"!~~":[[74,"id19"]],"!~~*":[[74,"id20"]],"#":[[74,"op-bit-xor"]],"%":[[74,"op-modulo"]],"&":[[74,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[15,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[74,"op-multiply"]],"+":[[74,"op-plus"]],"-":[[74,"op-minus"]],"/":[[74,"op-divide"]],"1.
 Array Literal Con [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[74,"op-neq"]],"!~":[[74,"op-re-not-match"]],"!~*":[[74,"op-re-not-match-i"]],"!~~":[[74,"id19"]],"!~~*":[[74,"id20"]],"#":[[74,"op-bit-xor"]],"%":[[74,"op-modulo"]],"&":[[74,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[15,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[74,"op-multiply"]],"+":[[74,"op-plus"]],"-":[[74,"op-minus"]],"/":[[74,"op-divide"]],"1.
 Array Literal Con [...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index 12549436d3..9220139b48 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -631,429 +631,433 @@ example, to configure <code class="docutils literal 
notranslate"><span class="pr
 </tr>
 <tr 
class="row-even"><td><p>datafusion.execution.parquet.max_row_group_size</p></td>
 <td><p>1048576</p></td>
-<td><p>(writing) Target maximum number of rows in each row group (defaults to 
1M rows). Writing larger row groups requires more memory to write, but can get 
better compression and be faster to read.</p></td>
+<td><p>(writing) Target maximum number of rows in each row group (defaults to 
1M rows). Writing larger row groups requires more memory to write, but can get 
better compression and be faster to read. When <code class="docutils literal 
notranslate"><span class="pre">max_row_group_bytes</span></code> is also set, 
the writer flushes a row group when either limit is reached, whichever comes 
first.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.execution.parquet.created_by</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.max_row_group_bytes</p></td>
+<td><p>NULL</p></td>
+<td><p>(writing) Target maximum size of each row group in bytes. When set, the 
writer flushes whenever either this limit or <code class="docutils literal 
notranslate"><span class="pre">max_row_group_size</span></code> is reached, 
whichever comes first. Useful for bounding writer memory on wide schemas where 
a row-count limit can map to very different byte sizes. Matches the behavior of 
<code class="docutils literal notranslate"><span 
class="pre">parquet.block.size</span></code> in parque [...]
+</tr>
+<tr class="row-even"><td><p>datafusion.execution.parquet.created_by</p></td>
 <td><p>datafusion version 53.1.0</p></td>
 <td><p>(writing) Sets “created by” property</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.column_index_truncate_length</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.column_index_truncate_length</p></td>
 <td><p>64</p></td>
 <td><p>(writing) Sets column index truncate length</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.parquet.statistics_truncate_length</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.parquet.statistics_truncate_length</p></td>
 <td><p>64</p></td>
 <td><p>(writing) Sets statistics truncate length. If NULL, uses default 
parquet writer setting</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.data_page_row_count_limit</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.data_page_row_count_limit</p></td>
 <td><p>20000</p></td>
 <td><p>(writing) Sets best effort maximum number of rows in data page</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.execution.parquet.encoding</p></td>
+<tr class="row-even"><td><p>datafusion.execution.parquet.encoding</p></td>
 <td><p>NULL</p></td>
 <td><p>(writing) Sets default encoding for any column. Valid values are: 
plain, plain_dictionary, rle, bit_packed, delta_binary_packed, 
delta_length_byte_array, delta_byte_array, rle_dictionary, and 
byte_stream_split. These values are not case sensitive. If NULL, uses default 
parquet writer setting</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.bloom_filter_on_write</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.bloom_filter_on_write</p></td>
 <td><p>false</p></td>
 <td><p>(writing) Write bloom filters for all columns when creating parquet 
files</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.parquet.bloom_filter_fpp</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.parquet.bloom_filter_fpp</p></td>
 <td><p>NULL</p></td>
 <td><p>(writing) Sets bloom filter false positive probability. If NULL, uses 
default parquet writer setting</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.bloom_filter_ndv</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.bloom_filter_ndv</p></td>
 <td><p>NULL</p></td>
 <td><p>(writing) Sets bloom filter number of distinct values. If NULL, uses 
default parquet writer setting</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.parquet.allow_single_file_parallelism</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.parquet.allow_single_file_parallelism</p></td>
 <td><p>true</p></td>
 <td><p>(writing) Controls whether DataFusion will attempt to speed up writing 
parquet files by serializing them in parallel. Each column in each row group in 
each output file are serialized in parallel leveraging a maximum possible core 
count of n_files<em>n_row_groups</em>n_columns.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.maximum_parallel_row_group_writers</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.maximum_parallel_row_group_writers</p></td>
 <td><p>1</p></td>
 <td><p>(writing) By default parallel parquet writer is tuned for minimum 
memory usage in a streaming execution plan. You may see a performance benefit 
when writing large parquet files by increasing 
maximum_parallel_row_group_writers and 
maximum_buffered_record_batches_per_stream if your system has idle cores and 
can tolerate additional memory usage. Boosting these values is likely 
worthwhile when writing out already in-memory data, such as from a cached data 
frame.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.parquet.maximum_buffered_record_batches_per_stream</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.parquet.maximum_buffered_record_batches_per_stream</p></td>
 <td><p>2</p></td>
 <td><p>(writing) By default parallel parquet writer is tuned for minimum 
memory usage in a streaming execution plan. You may see a performance benefit 
when writing large parquet files by increasing 
maximum_parallel_row_group_writers and 
maximum_buffered_record_batches_per_stream if your system has idle cores and 
can tolerate additional memory usage. Boosting these values is likely 
worthwhile when writing out already in-memory data, such as from a cached data 
frame.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.content_defined_chunking.enabled</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.content_defined_chunking.enabled</p></td>
 <td><p>false</p></td>
 <td><p>(writing) EXPERIMENTAL: Enable content-defined chunking (CDC) when 
writing parquet files. When enabled, parallel writing is automatically disabled 
since the chunker state must persist across row groups.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.parquet.content_defined_chunking.min_chunk_size</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.parquet.content_defined_chunking.min_chunk_size</p></td>
 <td><p>262144</p></td>
 <td><p>Minimum chunk size in bytes. The rolling hash will not trigger a split 
until this many bytes have been accumulated. Default is 256 KiB.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.parquet.content_defined_chunking.max_chunk_size</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.parquet.content_defined_chunking.max_chunk_size</p></td>
 <td><p>1048576</p></td>
 <td><p>Maximum chunk size in bytes. A split is forced when the accumulated 
size exceeds this value. Default is 1 MiB.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.parquet.content_defined_chunking.norm_level</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.parquet.content_defined_chunking.norm_level</p></td>
 <td><p>0</p></td>
 <td><p>Normalization level. Increasing this improves deduplication ratio but 
increases fragmentation. Recommended range is [-3, 3], default is 0.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.execution.planning_concurrency</p></td>
+<tr class="row-odd"><td><p>datafusion.execution.planning_concurrency</p></td>
 <td><p>0</p></td>
 <td><p>Fan-out during initial physical planning. This is mostly use to plan 
<code class="docutils literal notranslate"><span 
class="pre">UNION</span></code> children in parallel. Defaults to the number of 
CPU cores on the system</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.skip_physical_aggregate_schema_check</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.skip_physical_aggregate_schema_check</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, skips verifying that the schema produced by planning 
the input of <code class="docutils literal notranslate"><span 
class="pre">LogicalPlan::Aggregate</span></code> exactly matches the schema of 
the input plan. When set to false, if the schema does not match exactly 
(including nullability and metadata), a planning error will be raised. This is 
used to workaround bugs in the planner that are now caught by the new schema 
verification step.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.execution.spill_compression</p></td>
+<tr class="row-odd"><td><p>datafusion.execution.spill_compression</p></td>
 <td><p>uncompressed</p></td>
 <td><p>Sets the compression codec used when spilling data to disk. Since 
datafusion writes spill files using the Arrow IPC Stream format, only codecs 
supported by the Arrow IPC Stream Writer are allowed. Valid values are: 
uncompressed, lz4_frame, zstd. Note: lz4_frame offers faster (de)compression, 
but typically results in larger spill files. In contrast, zstd achieves higher 
compression ratios at the cost of slower (de)compression speed.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.sort_spill_reservation_bytes</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.sort_spill_reservation_bytes</p></td>
 <td><p>10485760</p></td>
 <td><p>Specifies the reserved memory for each spillable sort operation to 
facilitate an in-memory merge. When a sort operation spills to disk, the 
in-memory data must be sorted and merged before being written to a file. This 
setting reserves a specific amount of memory for that in-memory sort/merge 
process. Note: This setting is irrelevant if the sort operation cannot spill 
(i.e., if there’s no <code class="docutils literal notranslate"><span 
class="pre">DiskManager</span></code> configu [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.sort_in_place_threshold_bytes</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.sort_in_place_threshold_bytes</p></td>
 <td><p>1048576</p></td>
 <td><p>When sorting, below what size should data be concatenated and sorted in 
a single RecordBatch rather than sorted in batches and merged.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.sort_pushdown_buffer_capacity</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.sort_pushdown_buffer_capacity</p></td>
 <td><p>1073741824</p></td>
 <td><p>Maximum buffer capacity (in bytes) per partition for BufferExec 
inserted during sort pushdown optimization. When PushdownSort eliminates a 
SortExec under SortPreservingMergeExec, a BufferExec is inserted to replace 
SortExec’s buffering role. This prevents I/O stalls by allowing the scan to run 
ahead of the merge. This uses strictly less memory than the SortExec it 
replaces (which buffers the entire partition). The buffer respects the global 
memory pool limit. Setting this to a lar [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.max_spill_file_size_bytes</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.max_spill_file_size_bytes</p></td>
 <td><p>134217728</p></td>
 <td><p>Maximum size in bytes for individual spill files before rotating to a 
new file. When operators spill data to disk (e.g., RepartitionExec), they write 
multiple batches to the same file until this size limit is reached, then rotate 
to a new file. This reduces syscall overhead compared to one-file-per-batch 
while preventing files from growing too large. A larger value reduces file 
creation overhead but may hold more disk space. A smaller value creates more 
files but allows finer-grai [...]
 </tr>
-<tr class="row-odd"><td><p>datafusion.execution.meta_fetch_concurrency</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.meta_fetch_concurrency</p></td>
 <td><p>32</p></td>
 <td><p>Number of files to read in parallel when inferring schema and 
statistics</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.minimum_parallel_output_files</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.minimum_parallel_output_files</p></td>
 <td><p>4</p></td>
 <td><p>Guarantees a minimum level of output files running in parallel. 
RecordBatches will be distributed in round robin fashion to each parallel 
writer. Each writer is closed and a new file opened once 
soft_max_rows_per_output_file is reached.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.soft_max_rows_per_output_file</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.soft_max_rows_per_output_file</p></td>
 <td><p>50000000</p></td>
 <td><p>Target number of rows in output files when writing multiple. This is a 
soft max, so it can be exceeded slightly. There also will be one file smaller 
than the limit if the total number of rows written is not roughly divisible by 
the soft max</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.max_buffered_batches_per_output_file</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.max_buffered_batches_per_output_file</p></td>
 <td><p>2</p></td>
 <td><p>This is the maximum number of RecordBatches buffered for each output 
file being worked. Higher values can potentially give faster write performance 
at the cost of higher peak memory consumption</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.listing_table_ignore_subdirectory</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.listing_table_ignore_subdirectory</p></td>
 <td><p>true</p></td>
 <td><p>Should sub directories be ignored when scanning directories for data 
files. Defaults to true (ignores subdirectories), consistent with Hive. Note 
that this setting does not affect reading partitioned tables (e.g. <code 
class="docutils literal notranslate"><span 
class="pre">/table/year=2021/month=01/data.parquet</span></code>).</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.listing_table_factory_infer_partitions</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.listing_table_factory_infer_partitions</p></td>
 <td><p>true</p></td>
 <td><p>Should a <code class="docutils literal notranslate"><span 
class="pre">ListingTable</span></code> created through the <code 
class="docutils literal notranslate"><span 
class="pre">ListingTableFactory</span></code> infer table partitions from Hive 
compliant directories. Defaults to true (partition columns are inferred and 
will be represented in the table schema).</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.execution.enable_recursive_ctes</p></td>
+<tr class="row-even"><td><p>datafusion.execution.enable_recursive_ctes</p></td>
 <td><p>true</p></td>
 <td><p>Should DataFusion support recursive CTEs</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.split_file_groups_by_statistics</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.split_file_groups_by_statistics</p></td>
 <td><p>false</p></td>
 <td><p>Attempt to eliminate sorts by packing &amp; sorting files with 
non-overlapping statistics into the same file groups. Currently 
experimental</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.keep_partition_by_columns</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.keep_partition_by_columns</p></td>
 <td><p>false</p></td>
 <td><p>Should DataFusion keep the columns used for partition_by in the output 
RecordBatches</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.skip_partial_aggregation_probe_ratio_threshold</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.skip_partial_aggregation_probe_ratio_threshold</p></td>
 <td><p>0.8</p></td>
 <td><p>Aggregation ratio (number of distinct groups / number of input rows) 
threshold for skipping partial aggregation. If the value is greater then 
partial aggregation will skip aggregation for further input</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.skip_partial_aggregation_probe_rows_threshold</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.skip_partial_aggregation_probe_rows_threshold</p></td>
 <td><p>100000</p></td>
 <td><p>Number of input rows partial aggregation partition should process, 
before aggregation ratio check and trying to switch to skipping aggregation 
mode</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.use_row_number_estimates_to_optimize_partitioning</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.use_row_number_estimates_to_optimize_partitioning</p></td>
 <td><p>false</p></td>
 <td><p>Should DataFusion use row number estimates at the input to decide 
whether increasing parallelism is beneficial or not. By default, only exact row 
numbers (not estimates) are used for this decision. Setting this flag to <code 
class="docutils literal notranslate"><span class="pre">true</span></code> will 
likely produce better plans. if the source of statistics is accurate. We plan 
to make this the default in the future.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.execution.enforce_batch_size_in_joins</p></td>
+<tr 
class="row-even"><td><p>datafusion.execution.enforce_batch_size_in_joins</p></td>
 <td><p>false</p></td>
 <td><p>Should DataFusion enforce batch size in joins or not. By default, 
DataFusion will not enforce batch size in joins. Enforcing batch size in joins 
can reduce memory usage when joining large tables with a highly-selective join 
filter, but is also slightly slower.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.objectstore_writer_buffer_size</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.objectstore_writer_buffer_size</p></td>
 <td><p>10485760</p></td>
 <td><p>Size (bytes) of data buffer DataFusion uses when writing output files. 
This affects the size of the data chunks that are uploaded to remote object 
stores (e.g. AWS S3). If very large (&gt;= 100 GiB) output files are being 
written, it may be necessary to increase this size to avoid errors from the 
remote end point.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.execution.enable_ansi_mode</p></td>
+<tr class="row-even"><td><p>datafusion.execution.enable_ansi_mode</p></td>
 <td><p>false</p></td>
 <td><p>Whether to enable ANSI SQL mode. The flag is experimental and relevant 
only for DataFusion Spark built-in functions When <code class="docutils literal 
notranslate"><span class="pre">enable_ansi_mode</span></code> is set to <code 
class="docutils literal notranslate"><span class="pre">true</span></code>, the 
query engine follows ANSI SQL semantics for expressions, casting, and error 
handling. This means: - <strong>Strict type coercion rules:</strong> implicit 
casts between incompati [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.execution.hash_join_buffering_capacity</p></td>
+<tr 
class="row-odd"><td><p>datafusion.execution.hash_join_buffering_capacity</p></td>
 <td><p>0</p></td>
 <td><p>How many bytes to buffer in the probe side of hash joins while the 
build side is concurrently being built. Without this, hash joins will wait 
until the full materialization of the build side before polling the probe side. 
This is useful in scenarios where the query is not completely CPU bounded, 
allowing to do some early work concurrently and reducing the latency of the 
query. Note that when hash join buffering is enabled, the probe side will start 
eagerly polling data, not giving [...]
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_distinct_aggregation_soft_limit</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_distinct_aggregation_soft_limit</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will push a limit operation into 
grouped aggregations which have no aggregate expressions, as a soft limit, 
emitting groups once the limit is reached, before all rows in the group are 
read.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.enable_round_robin_repartition</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_round_robin_repartition</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the physical plan optimizer will try to add round 
robin repartitioning to increase parallelism to leverage more CPU cores</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_topk_aggregation</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_topk_aggregation</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will attempt to perform limit 
operations during aggregations, if possible</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.enable_window_limits</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.enable_window_limits</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will attempt to push limit operations 
past window functions, if possible</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.enable_window_topn</p></td>
+<tr class="row-even"><td><p>datafusion.optimizer.enable_window_topn</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the optimizer will replace Filter(rn&lt;=K) → 
Window(ROW_NUMBER) → Sort patterns with a PartitionedTopKExec that maintains 
per-partition heaps, avoiding a full sort of the input. When the window 
partition key has low cardinality, enabling this optimization can improve 
performance. However, for high cardinality keys, it may cause regressions in 
both memory usage and runtime.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.enable_topk_repartition</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_topk_repartition</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will push TopK (Sort with fetch) below 
hash repartition when the partition key is a prefix of the sort key, reducing 
data volume before the shuffle.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_topk_dynamic_filter_pushdown</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_topk_dynamic_filter_pushdown</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will attempt to push down TopK dynamic 
filters into the file scan phase.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.enable_physical_uncorrelated_scalar_subquery</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_physical_uncorrelated_scalar_subquery</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, uncorrelated scalar subqueries are left in the 
logical plan and executed by <code class="docutils literal notranslate"><span 
class="pre">ScalarSubqueryExec</span></code> during physical execution. When 
set to false, all scalar subqueries (including uncorrelated ones) are rewritten 
to left joins by the <code class="docutils literal notranslate"><span 
class="pre">ScalarSubqueryToJoin</span></code> optimizer rule. Note disabling 
this option is not recommended. It re [...]
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_join_dynamic_filter_pushdown</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_join_dynamic_filter_pushdown</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will attempt to push down Join dynamic 
filters into the file scan phase.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.enable_aggregate_dynamic_filter_pushdown</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_aggregate_dynamic_filter_pushdown</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will attempt to push down Aggregate 
dynamic filters into the file scan phase.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_dynamic_filter_pushdown</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_dynamic_filter_pushdown</p></td>
 <td><p>true</p></td>
 <td><p>When set to true attempts to push down dynamic filters generated by 
operators (TopK, Join &amp; Aggregate) into the file scan phase. For example, 
for a query such as <code class="docutils literal notranslate"><span 
class="pre">SELECT</span> <span class="pre">*</span> <span 
class="pre">FROM</span> <span class="pre">t</span> <span 
class="pre">ORDER</span> <span class="pre">BY</span> <span 
class="pre">timestamp</span> <span class="pre">DESC</span> <span 
class="pre">LIMIT</span> <span [...]
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.filter_null_join_keys</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.filter_null_join_keys</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the optimizer will insert filters before a join 
between a nullable and non-nullable column to filter out nulls on the nullable 
side. This filter can add additional overhead when the file format does not 
fully support predicate push down.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.repartition_aggregations</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.repartition_aggregations</p></td>
 <td><p>true</p></td>
 <td><p>Should DataFusion repartition data using the aggregate keys to execute 
aggregates in parallel using the provided <code class="docutils literal 
notranslate"><span class="pre">target_partitions</span></code> level</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.repartition_file_min_size</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.repartition_file_min_size</p></td>
 <td><p>1048576</p></td>
 <td><p>Minimum total file size in bytes for file-group byte-range splitting to 
fire. Files (or merged file groups) smaller than this stay as one partition. 
Lower values produce more, smaller partitions — better at filling <code 
class="docutils literal notranslate"><span 
class="pre">target_partitions</span></code> worth of cores when files are 
modestly sized, at the cost of slightly more per-partition open / metadata-load 
overhead.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.repartition_joins</p></td>
+<tr class="row-even"><td><p>datafusion.optimizer.repartition_joins</p></td>
 <td><p>true</p></td>
 <td><p>Should DataFusion repartition data using the join keys to execute joins 
in parallel using the provided <code class="docutils literal notranslate"><span 
class="pre">target_partitions</span></code> level</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.allow_symmetric_joins_without_pruning</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.allow_symmetric_joins_without_pruning</p></td>
 <td><p>true</p></td>
 <td><p>Should DataFusion allow symmetric hash joins for unbounded data sources 
even when its inputs do not have any ordering or filtering. If the flag is not 
enabled, the SymmetricHashJoin operator will be unable to prune its internal 
buffers, resulting in certain join types - such as Full, Left, LeftAnti, 
LeftSemi, Right, RightAnti, and RightSemi - being produced only at the end of 
the execution. This is not typical in stream processing. Additionally, without 
proper design for long runne [...]
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.repartition_file_scans</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.repartition_file_scans</p></td>
 <td><p>true</p></td>
 <td><p>When set to <code class="docutils literal notranslate"><span 
class="pre">true</span></code>, datasource partitions will be repartitioned to 
achieve maximum parallelism. This applies to both in-memory partitions and 
FileSource’s file groups (1 group is 1 partition). For FileSources, only 
Parquet and CSV formats are currently supported. If set to <code 
class="docutils literal notranslate"><span class="pre">true</span></code> for a 
FileSource, all files will be repartitioned evenly ( [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.preserve_file_partitions</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.preserve_file_partitions</p></td>
 <td><p>0</p></td>
 <td><p>Minimum number of distinct partition values required to group files by 
their Hive partition column values (enabling Hash partitioning declaration). 
How the option is used: - preserve_file_partitions=0: Disable it. - 
preserve_file_partitions=1: Always enable it. - preserve_file_partitions=N, 
actual file partitions=M: Only enable when M &gt;= N. This threshold preserves 
I/O parallelism when file partitioning is below it. Note: This may reduce 
parallelism, rooting from the I/O level, [...]
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.repartition_windows</p></td>
+<tr class="row-even"><td><p>datafusion.optimizer.repartition_windows</p></td>
 <td><p>true</p></td>
 <td><p>Should DataFusion repartition data using the partition keys to execute 
window functions in parallel using the provided <code class="docutils literal 
notranslate"><span class="pre">target_partitions</span></code> level</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.repartition_sorts</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.repartition_sorts</p></td>
 <td><p>true</p></td>
 <td><p>Should DataFusion execute sorts in a per-partition fashion and merge 
afterwards instead of coalescing first and sorting globally. When this flag is 
enabled, plans in the form below <code class="docutils literal 
notranslate"><span class="pre">text</span> <span 
class="pre">&quot;SortExec:</span> <span class="pre">[a&#64;0</span> <span 
class="pre">ASC]&quot;,</span> <span class="pre">&quot;</span> <span 
class="pre">CoalescePartitionsExec&quot;,</span> <span 
class="pre">&quot;</span>  [...]
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.subset_repartition_threshold</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.subset_repartition_threshold</p></td>
 <td><p>4</p></td>
 <td><p>Partition count threshold for subset satisfaction optimization. When 
the current partition count is &gt;= this threshold, DataFusion will skip 
repartitioning if the required partitioning expression is a subset of the 
current partition expression such as Hash(a) satisfies Hash(a, b). When the 
current partition count is &lt; this threshold, DataFusion will repartition to 
increase parallelism even when subset satisfaction applies. Set to 0 to always 
repartition (disable subset satisf [...]
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.prefer_existing_sort</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.prefer_existing_sort</p></td>
 <td><p>false</p></td>
 <td><p>When true, DataFusion will opportunistically remove sorts when the data 
is already sorted, (i.e. setting <code class="docutils literal 
notranslate"><span class="pre">preserve_order</span></code> to true on <code 
class="docutils literal notranslate"><span 
class="pre">RepartitionExec</span></code> and using <code class="docutils 
literal notranslate"><span class="pre">SortPreservingMergeExec</span></code>) 
When false, DataFusion will maximize plan parallelism using <code class="docut 
[...]
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.skip_failed_rules</p></td>
+<tr class="row-even"><td><p>datafusion.optimizer.skip_failed_rules</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the logical plan optimizer will produce warning 
messages if any optimization rules produce errors and then proceed to the next 
rule. When set to false, any rules that produce errors will cause the query to 
fail</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.max_passes</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.max_passes</p></td>
 <td><p>3</p></td>
 <td><p>Number of times that the optimizer will attempt to optimize the 
plan</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.top_down_join_key_reordering</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.top_down_join_key_reordering</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the physical plan optimizer will run a top down 
process to reorder the join keys</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.join_reordering</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.join_reordering</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the physical plan optimizer may swap join inputs 
based on statistics. When set to false, statistics-driven join input reordering 
is disabled and the original join order in the query is used.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.use_statistics_registry</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.use_statistics_registry</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the physical plan optimizer uses the pluggable <code 
class="docutils literal notranslate"><span 
class="pre">StatisticsRegistry</span></code> for statistics propagation across 
operators. This enables more accurate cardinality estimates compared to each 
operator’s built-in <code class="docutils literal notranslate"><span 
class="pre">partition_statistics</span></code>.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.optimizer.prefer_hash_join</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.prefer_hash_join</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the physical plan optimizer will prefer HashJoin over 
SortMergeJoin. HashJoin can work more efficiently than SortMergeJoin but 
consumes more memory</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_piecewise_merge_join</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_piecewise_merge_join</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, piecewise merge join is enabled. PiecewiseMergeJoin 
is currently experimental. Physical planner will opt for PiecewiseMergeJoin 
when there is only one range filter.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.hash_join_single_partition_threshold</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.hash_join_single_partition_threshold</p></td>
 <td><p>1048576</p></td>
 <td><p>The maximum estimated size in bytes at which one input side of a HashJoin 
will be collected into a single partition</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.hash_join_single_partition_threshold_rows</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.hash_join_single_partition_threshold_rows</p></td>
 <td><p>131072</p></td>
 <td><p>The maximum estimated size in rows at which one input side of a HashJoin 
will be collected into a single partition</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.hash_join_inlist_pushdown_max_size</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.hash_join_inlist_pushdown_max_size</p></td>
 <td><p>131072</p></td>
 <td><p>Maximum size in bytes for the build side of a hash join to be pushed 
down as an InList expression for dynamic filtering. Build sides larger than 
this will use hash table lookups instead. Set to 0 to always use hash table 
lookups. InList pushdown can be more efficient for small build sides because it 
can result in better statistics pruning as well as use any bloom filters 
present on the scan side. InList expressions are also more transparent and 
easier to serialize over the network [...]
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.hash_join_inlist_pushdown_max_distinct_values</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.hash_join_inlist_pushdown_max_distinct_values</p></td>
 <td><p>150</p></td>
 <td><p>Maximum number of distinct values (rows) in the build side of a hash 
join to be pushed down as an InList expression for dynamic filtering. Build 
sides with more rows than this will use hash table lookups instead. Set to 0 to 
always use hash table lookups. This provides an additional limit beyond <code 
class="docutils literal notranslate"><span 
class="pre">hash_join_inlist_pushdown_max_size</span></code> to prevent very 
large IN lists that might not provide much benefit over hash t [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.default_filter_selectivity</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.default_filter_selectivity</p></td>
 <td><p>20</p></td>
 <td><p>The default filter selectivity used by Filter Statistics when an exact 
selectivity cannot be determined. Valid values are between 0 (no selectivity) 
and 100 (all rows are selected).</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.prefer_existing_union</p></td>
+<tr class="row-even"><td><p>datafusion.optimizer.prefer_existing_union</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the optimizer will not attempt to convert Union to 
Interleave</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.expand_views_at_output</p></td>
+<tr class="row-odd"><td><p>datafusion.optimizer.expand_views_at_output</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, if the returned type is a view type then the output 
will be coerced to a non-view. Coerces <code class="docutils literal 
notranslate"><span class="pre">Utf8View</span></code> to <code class="docutils 
literal notranslate"><span class="pre">LargeUtf8</span></code>, and <code 
class="docutils literal notranslate"><span class="pre">BinaryView</span></code> 
to <code class="docutils literal notranslate"><span 
class="pre">LargeBinary</span></code>.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.optimizer.enable_sort_pushdown</p></td>
+<tr class="row-even"><td><p>datafusion.optimizer.enable_sort_pushdown</p></td>
 <td><p>true</p></td>
 <td><p>Enable sort pushdown optimization. When enabled, attempts to push sort 
requirements down to data sources that can natively handle them (e.g., by 
reversing file/row group read order). Returns <strong>inexact 
ordering</strong>: Sort operator is kept for correctness, but optimized input 
enables early termination for TopK queries (ORDER BY … LIMIT N), providing 
significant speedup. Memory: No additional overhead (only changes read order). 
Future: Will add option to detect perfectly so [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.optimizer.enable_leaf_expression_pushdown</p></td>
+<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_leaf_expression_pushdown</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the optimizer will extract leaf expressions (such as 
<code class="docutils literal notranslate"><span 
class="pre">get_field</span></code>) from filter/sort/join nodes into 
projections closer to the leaf table scans, and push those projections down 
towards the leaf nodes.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.optimizer.enable_unions_to_filter</p></td>
+<tr 
class="row-even"><td><p>datafusion.optimizer.enable_unions_to_filter</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the logical optimizer will rewrite <code 
class="docutils literal notranslate"><span class="pre">UNION</span> <span 
class="pre">DISTINCT</span></code> branches that read from the same source and 
differ only by filter predicates into a single branch with a combined filter. 
This optimization is conservative and only applies when the branches share the 
same source and compatible wrapper nodes such as identical projections or 
aliases.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.explain.logical_plan_only</p></td>
+<tr class="row-odd"><td><p>datafusion.explain.logical_plan_only</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the explain statement will only print logical 
plans</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.explain.physical_plan_only</p></td>
+<tr class="row-even"><td><p>datafusion.explain.physical_plan_only</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the explain statement will only print physical 
plans</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.explain.show_statistics</p></td>
+<tr class="row-odd"><td><p>datafusion.explain.show_statistics</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the explain statement will print operator statistics 
for physical plans</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.explain.show_sizes</p></td>
+<tr class="row-even"><td><p>datafusion.explain.show_sizes</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, the explain statement will print the partition 
sizes</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.explain.show_schema</p></td>
+<tr class="row-odd"><td><p>datafusion.explain.show_schema</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the explain statement will print schema 
information</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.explain.format</p></td>
+<tr class="row-even"><td><p>datafusion.explain.format</p></td>
 <td><p>indent</p></td>
 <td><p>Display format of explain. Default is “indent”. When set to “tree”, it 
will print the plan in a tree-rendered format.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.explain.tree_maximum_render_width</p></td>
+<tr 
class="row-odd"><td><p>datafusion.explain.tree_maximum_render_width</p></td>
 <td><p>240</p></td>
 <td><p>(format=tree only) Maximum total width of the rendered tree. When set 
to 0, the tree will have no width limit.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.explain.analyze_level</p></td>
+<tr class="row-even"><td><p>datafusion.explain.analyze_level</p></td>
 <td><p>dev</p></td>
 <td><p>Verbosity level for “EXPLAIN ANALYZE”. Default is “dev”. “summary” shows 
common metrics for high-level insights. “dev” provides deep operator-level 
introspection for developers.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.explain.analyze_categories</p></td>
+<tr class="row-odd"><td><p>datafusion.explain.analyze_categories</p></td>
 <td><p>all</p></td>
 <td><p>Which metric categories to include in “EXPLAIN ANALYZE” output. 
Comma-separated list of: “rows”, “bytes”, “timing”, “uncategorized”. Use “none” 
to show plan structure only, or “all” (default) to show everything. Metrics 
without a declared category are treated as “uncategorized”.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.sql_parser.parse_float_as_decimal</p></td>
+<tr 
class="row-even"><td><p>datafusion.sql_parser.parse_float_as_decimal</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, SQL parser will parse float as decimal type</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.sql_parser.enable_ident_normalization</p></td>
+<tr 
class="row-odd"><td><p>datafusion.sql_parser.enable_ident_normalization</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, SQL parser will normalize ident (convert ident to 
lowercase when not quoted)</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.sql_parser.enable_options_value_normalization</p></td>
+<tr 
class="row-even"><td><p>datafusion.sql_parser.enable_options_value_normalization</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, SQL parser will normalize options value (convert 
value to lowercase). Note that this option is ignored and will be removed in 
the future. All case-insensitive values are normalized automatically.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.sql_parser.dialect</p></td>
+<tr class="row-odd"><td><p>datafusion.sql_parser.dialect</p></td>
 <td><p>generic</p></td>
 <td><p>Configure the SQL dialect used by DataFusion’s parser; supported values 
include: Generic, MySQL, PostgreSQL, Hive, SQLite, Snowflake, Redshift, MsSQL, 
ClickHouse, BigQuery, Ansi, DuckDB and Databricks.</p></td>
 </tr>
-<tr 
class="row-odd"><td><p>datafusion.sql_parser.support_varchar_with_length</p></td>
+<tr 
class="row-even"><td><p>datafusion.sql_parser.support_varchar_with_length</p></td>
 <td><p>true</p></td>
 <td><p>If true, permit lengths for <code class="docutils literal 
notranslate"><span class="pre">VARCHAR</span></code> such as <code 
class="docutils literal notranslate"><span 
class="pre">VARCHAR(20)</span></code>, but ignore the length. If false, error 
if a <code class="docutils literal notranslate"><span 
class="pre">VARCHAR</span></code> with a length is specified. The Arrow type 
system does not have a notion of maximum string length and thus DataFusion can 
not enforce such limits.</p></td>
 </tr>
-<tr 
class="row-even"><td><p>datafusion.sql_parser.map_string_types_to_utf8view</p></td>
+<tr 
class="row-odd"><td><p>datafusion.sql_parser.map_string_types_to_utf8view</p></td>
 <td><p>true</p></td>
 <td><p>If true, string types (VARCHAR, CHAR, Text, and String) are mapped to 
<code class="docutils literal notranslate"><span 
class="pre">Utf8View</span></code> during SQL planning. If false, they are 
mapped to <code class="docutils literal notranslate"><span 
class="pre">Utf8</span></code>. Default is true.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.sql_parser.collect_spans</p></td>
+<tr class="row-even"><td><p>datafusion.sql_parser.collect_spans</p></td>
 <td><p>false</p></td>
 <td><p>When set to true, the source locations relative to the original SQL 
query (i.e. <a class="reference external" 
href="https://docs.rs/sqlparser/latest/sqlparser/tokenizer/struct.Span.html"><code
 class="docutils literal notranslate"><span class="pre">Span</span></code></a>) 
will be collected and recorded in the logical plan nodes.</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.sql_parser.recursion_limit</p></td>
+<tr class="row-odd"><td><p>datafusion.sql_parser.recursion_limit</p></td>
 <td><p>50</p></td>
 <td><p>Specifies the recursion depth limit when parsing complex SQL 
Queries</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.sql_parser.default_null_ordering</p></td>
+<tr 
class="row-even"><td><p>datafusion.sql_parser.default_null_ordering</p></td>
 <td><p>nulls_max</p></td>
 <td><p>Specifies the default null ordering for query results. There are 4 
options: - <code class="docutils literal notranslate"><span 
class="pre">nulls_max</span></code>: Nulls appear last in ascending order. - 
<code class="docutils literal notranslate"><span 
class="pre">nulls_min</span></code>: Nulls appear first in ascending order. - 
<code class="docutils literal notranslate"><span 
class="pre">nulls_first</span></code>: Nulls are always first in any order. - 
<code class="docutils litera [...]
 </tr>
-<tr 
class="row-even"><td><p>datafusion.sql_parser.enable_subquery_sort_elimination</p></td>
+<tr 
class="row-odd"><td><p>datafusion.sql_parser.enable_subquery_sort_elimination</p></td>
 <td><p>true</p></td>
 <td><p>When set to true, DataFusion may remove <code class="docutils literal 
notranslate"><span class="pre">ORDER</span> <span class="pre">BY</span></code> 
clauses from subqueries or CTEs during SQL planning when their ordering cannot 
affect the result, such as when no <code class="docutils literal 
notranslate"><span class="pre">LIMIT</span></code> or other order-sensitive 
operator depends on them. Disable this option to preserve explicit subquery 
ordering in the planned query.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.format.safe</p></td>
+<tr class="row-even"><td><p>datafusion.format.safe</p></td>
 <td><p>true</p></td>
 <td><p>If set to <code class="docutils literal notranslate"><span 
class="pre">true</span></code> any formatting errors will be written to the 
output instead of being converted into a [<code class="docutils literal 
notranslate"><span class="pre">std::fmt::Error</span></code>]</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.format.null</p></td>
+<tr class="row-odd"><td><p>datafusion.format.null</p></td>
 <td><p></p></td>
 <td><p>Format string for nulls</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.format.date_format</p></td>
+<tr class="row-even"><td><p>datafusion.format.date_format</p></td>
 <td><p>%Y-%m-%d</p></td>
 <td><p>Date format for date arrays</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.format.datetime_format</p></td>
+<tr class="row-odd"><td><p>datafusion.format.datetime_format</p></td>
 <td><p>%Y-%m-%dT%H:%M:%S%.f</p></td>
 <td><p>Format for DateTime arrays</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.format.timestamp_format</p></td>
+<tr class="row-even"><td><p>datafusion.format.timestamp_format</p></td>
 <td><p>%Y-%m-%dT%H:%M:%S%.f</p></td>
 <td><p>Timestamp format for timestamp arrays</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.format.timestamp_tz_format</p></td>
+<tr class="row-odd"><td><p>datafusion.format.timestamp_tz_format</p></td>
 <td><p>NULL</p></td>
 <td><p>Timestamp format for timestamp with timezone arrays. When <code 
class="docutils literal notranslate"><span class="pre">None</span></code>, ISO 
8601 format is used.</p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.format.time_format</p></td>
+<tr class="row-even"><td><p>datafusion.format.time_format</p></td>
 <td><p>%H:%M:%S%.f</p></td>
 <td><p>Time format for time arrays</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.format.duration_format</p></td>
+<tr class="row-odd"><td><p>datafusion.format.duration_format</p></td>
 <td><p>pretty</p></td>
 <td><p>Duration format. Can be either <code class="docutils literal 
notranslate"><span class="pre">&quot;pretty&quot;</span></code> or <code 
class="docutils literal notranslate"><span 
class="pre">&quot;ISO8601&quot;</span></code></p></td>
 </tr>
-<tr class="row-odd"><td><p>datafusion.format.types_info</p></td>
+<tr class="row-even"><td><p>datafusion.format.types_info</p></td>
 <td><p>false</p></td>
 <td><p>Show types in visual representation batches</p></td>
 </tr>
-<tr class="row-even"><td><p>datafusion.spark.map_key_dedup_policy</p></td>
+<tr class="row-odd"><td><p>datafusion.spark.map_key_dedup_policy</p></td>
 <td><p>EXCEPTION</p></td>
 <td><p>Policy for handling duplicate keys in Spark-compatible map-construction 
functions (<code class="docutils literal notranslate"><span 
class="pre">map_from_arrays</span></code>, <code class="docutils literal 
notranslate"><span class="pre">map_from_entries</span></code>, <code 
class="docutils literal notranslate"><span 
class="pre">str_to_map</span></code>). Mirrors Spark’s <a class="reference 
external" 
href="https://github.com/apache/spark/blob/cf3a34e19dfcf70e2d679217ff1ba21302212472
 [...]
 </tr>
diff --git a/user-guide/sql/format_options.html 
b/user-guide/sql/format_options.html
index 8f07a8ed0c..7c82a684ad 100644
--- a/user-guide/sql/format_options.html
+++ b/user-guide/sql/format_options.html
@@ -672,133 +672,139 @@ a<span class="p">;</span>b
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'max_row_group_size'</span></code></p></td>
 <td><p>1048576</p></td>
 </tr>
-<tr class="row-even"><td><p>ENABLE_PAGE_INDEX</p></td>
+<tr class="row-even"><td><p>MAX_ROW_GROUP_BYTES</p></td>
+<td><p>No</p></td>
+<td><p>Sets the maximum size of each row group in bytes. When both this and 
<code class="docutils literal notranslate"><span 
class="pre">MAX_ROW_GROUP_SIZE</span></code> are set, the row group flushes 
whenever either limit is reached. Mirrors <code class="docutils literal 
notranslate"><span class="pre">parquet.block.size</span></code> from 
parquet-mr. Currently only honored when <code class="docutils literal 
notranslate"><span class="pre">allow_single_file_parallelism</span></code> is 
<c [...]
+<td><p><code class="docutils literal notranslate"><span 
class="pre">'max_row_group_bytes'</span></code></p></td>
+<td><p>None</p></td>
+</tr>
+<tr class="row-odd"><td><p>ENABLE_PAGE_INDEX</p></td>
 <td><p>No</p></td>
 <td><p>If true, reads the Parquet data page level metadata (the Page Index), 
if present, to reduce I/O and decoding.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'enable_page_index'</span></code></p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-odd"><td><p>PRUNING</p></td>
+<tr class="row-even"><td><p>PRUNING</p></td>
 <td><p>No</p></td>
 <td><p>If true, enables row group pruning based on min/max statistics.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'pruning'</span></code></p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-even"><td><p>SKIP_METADATA</p></td>
+<tr class="row-odd"><td><p>SKIP_METADATA</p></td>
 <td><p>No</p></td>
 <td><p>If true, skips optional embedded metadata in the file schema.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'skip_metadata'</span></code></p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-odd"><td><p>METADATA_SIZE_HINT</p></td>
+<tr class="row-even"><td><p>METADATA_SIZE_HINT</p></td>
 <td><p>No</p></td>
 <td><p>Sets the size hint (in bytes) for fetching Parquet file 
metadata.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'metadata_size_hint'</span></code></p></td>
 <td><p>None</p></td>
 </tr>
-<tr class="row-even"><td><p>PUSHDOWN_FILTERS</p></td>
+<tr class="row-odd"><td><p>PUSHDOWN_FILTERS</p></td>
 <td><p>No</p></td>
 <td><p>If true, enables filter pushdown during Parquet decoding.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'pushdown_filters'</span></code></p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-odd"><td><p>REORDER_FILTERS</p></td>
+<tr class="row-even"><td><p>REORDER_FILTERS</p></td>
 <td><p>No</p></td>
 <td><p>If true, enables heuristic reordering of filters during Parquet 
decoding.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'reorder_filters'</span></code></p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>SCHEMA_FORCE_VIEW_TYPES</p></td>
+<tr class="row-odd"><td><p>SCHEMA_FORCE_VIEW_TYPES</p></td>
 <td><p>No</p></td>
 <td><p>If true, reads Utf8/Binary columns as view types.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'schema_force_view_types'</span></code></p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-odd"><td><p>BINARY_AS_STRING</p></td>
+<tr class="row-even"><td><p>BINARY_AS_STRING</p></td>
 <td><p>No</p></td>
 <td><p>If true, reads Binary columns as strings.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'binary_as_string'</span></code></p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>DATA_PAGESIZE_LIMIT</p></td>
+<tr class="row-odd"><td><p>DATA_PAGESIZE_LIMIT</p></td>
 <td><p>No</p></td>
 <td><p>Sets best effort maximum size of data page in bytes.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'data_pagesize_limit'</span></code></p></td>
 <td><p>1048576</p></td>
 </tr>
-<tr class="row-odd"><td><p>DATA_PAGE_ROW_COUNT_LIMIT</p></td>
+<tr class="row-even"><td><p>DATA_PAGE_ROW_COUNT_LIMIT</p></td>
 <td><p>No</p></td>
 <td><p>Sets best effort maximum number of rows in data page.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'data_page_row_count_limit'</span></code></p></td>
 <td><p>20000</p></td>
 </tr>
-<tr class="row-even"><td><p>DICTIONARY_PAGE_SIZE_LIMIT</p></td>
+<tr class="row-odd"><td><p>DICTIONARY_PAGE_SIZE_LIMIT</p></td>
 <td><p>No</p></td>
 <td><p>Sets best effort maximum dictionary page size, in bytes.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'dictionary_page_size_limit'</span></code></p></td>
 <td><p>1048576</p></td>
 </tr>
-<tr class="row-odd"><td><p>WRITE_BATCH_SIZE</p></td>
+<tr class="row-even"><td><p>WRITE_BATCH_SIZE</p></td>
 <td><p>No</p></td>
 <td><p>Sets write_batch_size in rows.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'write_batch_size'</span></code></p></td>
 <td><p>1024</p></td>
 </tr>
-<tr class="row-even"><td><p>WRITER_VERSION</p></td>
+<tr class="row-odd"><td><p>WRITER_VERSION</p></td>
 <td><p>No</p></td>
 <td><p>Sets the Parquet writer version (<code class="docutils literal 
notranslate"><span class="pre">1.0</span></code> or <code class="docutils 
literal notranslate"><span class="pre">2.0</span></code>).</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'writer_version'</span></code></p></td>
 <td><p>1.0</p></td>
 </tr>
-<tr class="row-odd"><td><p>SKIP_ARROW_METADATA</p></td>
+<tr class="row-even"><td><p>SKIP_ARROW_METADATA</p></td>
 <td><p>No</p></td>
 <td><p>If true, skips writing Arrow schema information into the Parquet file 
metadata.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'skip_arrow_metadata'</span></code></p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>CREATED_BY</p></td>
+<tr class="row-odd"><td><p>CREATED_BY</p></td>
 <td><p>No</p></td>
 <td><p>Sets the “created by” string in the Parquet file metadata.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'created_by'</span></code></p></td>
 <td><p>datafusion version X.Y.Z</p></td>
 </tr>
-<tr class="row-odd"><td><p>COLUMN_INDEX_TRUNCATE_LENGTH</p></td>
+<tr class="row-even"><td><p>COLUMN_INDEX_TRUNCATE_LENGTH</p></td>
 <td><p>No</p></td>
 <td><p>Sets the length (in bytes) to truncate min/max values in column 
indexes.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'column_index_truncate_length'</span></code></p></td>
 <td><p>64</p></td>
 </tr>
-<tr class="row-even"><td><p>STATISTICS_TRUNCATE_LENGTH</p></td>
+<tr class="row-odd"><td><p>STATISTICS_TRUNCATE_LENGTH</p></td>
 <td><p>No</p></td>
 <td><p>Sets statistics truncate length.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'statistics_truncate_length'</span></code></p></td>
 <td><p>None</p></td>
 </tr>
-<tr class="row-odd"><td><p>BLOOM_FILTER_ON_WRITE</p></td>
+<tr class="row-even"><td><p>BLOOM_FILTER_ON_WRITE</p></td>
 <td><p>No</p></td>
 <td><p>Sets whether bloom filters should be written for all columns by default 
(can be overridden per column).</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'bloom_filter_on_write'</span></code></p></td>
 <td><p>false</p></td>
 </tr>
-<tr class="row-even"><td><p>ALLOW_SINGLE_FILE_PARALLELISM</p></td>
+<tr class="row-odd"><td><p>ALLOW_SINGLE_FILE_PARALLELISM</p></td>
 <td><p>No</p></td>
 <td><p>Enables parallel serialization of columns in a single file.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'allow_single_file_parallelism'</span></code></p></td>
 <td><p>true</p></td>
 </tr>
-<tr class="row-odd"><td><p>MAXIMUM_PARALLEL_ROW_GROUP_WRITERS</p></td>
+<tr class="row-even"><td><p>MAXIMUM_PARALLEL_ROW_GROUP_WRITERS</p></td>
 <td><p>No</p></td>
 <td><p>Maximum number of parallel row group writers.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'maximum_parallel_row_group_writers'</span></code></p></td>
 <td><p>1</p></td>
 </tr>
-<tr class="row-even"><td><p>MAXIMUM_BUFFERED_RECORD_BATCHES_PER_STREAM</p></td>
+<tr class="row-odd"><td><p>MAXIMUM_BUFFERED_RECORD_BATCHES_PER_STREAM</p></td>
 <td><p>No</p></td>
 <td><p>Maximum number of buffered record batches per stream.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'maximum_buffered_record_batches_per_stream'</span></code></p></td>
 <td><p>2</p></td>
 </tr>
-<tr class="row-odd"><td><p>KEY_VALUE_METADATA</p></td>
+<tr class="row-even"><td><p>KEY_VALUE_METADATA</p></td>
 <td><p>No (Key is specific)</p></td>
 <td><p>Adds custom key-value pairs to the file metadata. Use the format <code 
class="docutils literal notranslate"><span 
class="pre">'metadata::your_key_name'</span> <span 
class="pre">'your_value'</span></code>. Multiple entries allowed.</p></td>
 <td><p><code class="docutils literal notranslate"><span 
class="pre">'metadata::key_name'</span></code></p></td>

