This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 7c3ba2380c Publish built docs triggered by d103d8886fcef989b0465a4a7dba28114869431c
7c3ba2380c is described below
commit 7c3ba2380c9c5688d96b56d4fb75f31017a3b436
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Jan 12 02:32:15 2026 +0000
Publish built docs triggered by d103d8886fcef989b0465a4a7dba28114869431c
---
_sources/user-guide/configs.md.txt | 2 +-
_sources/user-guide/sql/format_options.md.txt | 64 +++++++++++++--------------
searchindex.js | 2 +-
user-guide/configs.html | 2 +-
user-guide/sql/format_options.html | 2 +-
5 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/_sources/user-guide/configs.md.txt b/_sources/user-guide/configs.md.txt
index b59af0c13d..99c94b2c78 100644
--- a/_sources/user-guide/configs.md.txt
+++ b/_sources/user-guide/configs.md.txt
@@ -96,7 +96,7 @@ The following configuration settings are available:
| datafusion.execution.parquet.write_batch_size | 1024 | (writing) Sets write_batch_size in bytes [...]
| datafusion.execution.parquet.writer_version | 1.0 | (writing) Sets parquet writer version valid values are "1.0" and "2.0" [...]
| datafusion.execution.parquet.skip_arrow_metadata | false | (writing) Skip encoding the embedded arrow metadata in the KV_meta This is analogous to the `ArrowWriterOptions::with_skip_arrow_metadata`. Refer to <https://docs.rs/parquet/53.3.0/parquet/arrow/arrow_writer/struct.ArrowWriterOptions.html#method.with_skip_arrow_metadata> [...]
-| datafusion.execution.parquet.compression | zstd(3) | (writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting. [...]
+| datafusion.execution.parquet.compression | zstd(3) | (writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting. [...]
| datafusion.execution.parquet.dictionary_enabled | true | (writing) Sets if dictionary encoding is enabled. If NULL, uses default parquet writer setting [...]
| datafusion.execution.parquet.dictionary_page_size_limit | 1048576 | (writing) Sets best effort maximum dictionary page size, in bytes [...]
| datafusion.execution.parquet.statistics_enabled | page | (writing) Sets if statistics are enabled for any column Valid values are: "none", "chunk", and "page" These values are not case sensitive. If NULL, uses default parquet writer setting [...]
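For context on the keys in this hunk: these settings can be changed per session with a `SET` statement in DataFusion SQL. A minimal sketch (key names taken from the table above; the `SHOW` form is as supported by datafusion-cli):

```sql
-- Sketch: override the default Parquet compression codec for this session.
-- Note 'lzo' is removed from the documented valid values by this commit.
SET datafusion.execution.parquet.compression = 'zstd(3)';

-- Inspect the current value of the setting.
SHOW datafusion.execution.parquet.compression;
```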
diff --git a/_sources/user-guide/sql/format_options.md.txt b/_sources/user-guide/sql/format_options.md.txt
index d349bc1c98..c04a6b5d52 100644
--- a/_sources/user-guide/sql/format_options.md.txt
+++ b/_sources/user-guide/sql/format_options.md.txt
@@ -132,38 +132,38 @@ OPTIONS('DELIMITER' '|', 'HAS_HEADER' 'true', 'NEWLINES_IN_VALUES' 'true');
The following options are available when reading or writing Parquet files. If
any unsupported option is specified, an error will be raised and the query will
fail. If a column-specific option is specified for a column that does not
exist, the option will be ignored without error.
-| Option | Can be Column Specific? | Description | OPTIONS Key | Default Value |
-| ------------------------------------------ | ----------------------- | ----------- | ----------- | ------------- |
-| COMPRESSION | Yes | Sets the internal Parquet **compression codec** for data pages, optionally including the compression level. Applies globally if set without `::col`, or specifically to a column if set using `'compression::column_name'`. Valid values: `uncompressed`, `snappy`, `gzip(level)`, `lzo`, `brotli(level)`, `lz4`, `zstd(level)`, `lz4_raw`. | `'compression'` or `'compression::col'` | zstd(3) |
-| ENCODING | Yes | Sets the **encoding** scheme for data pages. Valid values: `plain`, `plain_dictionary`, `rle`, `bit_packed`, `delta_binary_packed`, `delta_length_byte_array`, `delta_byte_array`, `rle_dictionary`, `byte_stream_split`. Use key `'encoding'` or `'encoding::col'` in OPTIONS. | `'encoding'` or `'encoding::col'` | None |
-| DICTIONARY_ENABLED | Yes | Sets whether dictionary encoding should be enabled globally or for a specific column. | `'dictionary_enabled'` or `'dictionary_enabled::col'` | true |
-| STATISTICS_ENABLED | Yes | Sets the level of statistics to write (`none`, `chunk`, `page`). | `'statistics_enabled'` or `'statistics_enabled::col'` | page |
-| BLOOM_FILTER_ENABLED | Yes | Sets whether a bloom filter should be written for a specific column. | `'bloom_filter_enabled::column_name'` | None |
-| BLOOM_FILTER_FPP | Yes | Sets bloom filter false positive probability (global or per column). | `'bloom_filter_fpp'` or `'bloom_filter_fpp::col'` | None |
-| BLOOM_FILTER_NDV | Yes | Sets bloom filter number of distinct values (global or per column). | `'bloom_filter_ndv'` or `'bloom_filter_ndv::col'` | None |
-| MAX_ROW_GROUP_SIZE | No | Sets the maximum number of rows per row group. Larger groups require more memory but can improve compression and scan efficiency. | `'max_row_group_size'` | 1048576 |
-| ENABLE_PAGE_INDEX | No | If true, reads the Parquet data page level metadata (the Page Index), if present, to reduce I/O and decoding. | `'enable_page_index'` | true |
-| PRUNING | No | If true, enables row group pruning based on min/max statistics. | `'pruning'` | true |
-| SKIP_METADATA | No | If true, skips optional embedded metadata in the file schema. | `'skip_metadata'` | true |
-| METADATA_SIZE_HINT | No | Sets the size hint (in bytes) for fetching Parquet file metadata. | `'metadata_size_hint'` | None |
-| PUSHDOWN_FILTERS | No | If true, enables filter pushdown during Parquet decoding. | `'pushdown_filters'` | false |
-| REORDER_FILTERS | No | If true, enables heuristic reordering of filters during Parquet decoding. | `'reorder_filters'` | false |
-| SCHEMA_FORCE_VIEW_TYPES | No | If true, reads Utf8/Binary columns as view types. | `'schema_force_view_types'` | true |
-| BINARY_AS_STRING | No | If true, reads Binary columns as strings. | `'binary_as_string'` | false |
-| DATA_PAGESIZE_LIMIT | No | Sets best effort maximum size of data page in bytes. | `'data_pagesize_limit'` | 1048576 |
-| DATA_PAGE_ROW_COUNT_LIMIT | No | Sets best effort maximum number of rows in data page. | `'data_page_row_count_limit'` | 20000 |
-| DICTIONARY_PAGE_SIZE_LIMIT | No | Sets best effort maximum dictionary page size, in bytes. | `'dictionary_page_size_limit'` | 1048576 |
-| WRITE_BATCH_SIZE | No | Sets write_batch_size in bytes. | `'write_batch_size'` | 1024 |
-| WRITER_VERSION | No | Sets the Parquet writer version (`1.0` or `2.0`). | `'writer_version'` | 1.0 |
-| SKIP_ARROW_METADATA | No | If true, skips writing Arrow schema information into the Parquet file metadata. | `'skip_arrow_metadata'` | false |
-| CREATED_BY | No | Sets the "created by" string in the Parquet file metadata. | `'created_by'` | datafusion version X.Y.Z |
-| COLUMN_INDEX_TRUNCATE_LENGTH | No | Sets the length (in bytes) to truncate min/max values in column indexes. | `'column_index_truncate_length'` | 64 |
-| STATISTICS_TRUNCATE_LENGTH | No | Sets statistics truncate length. | `'statistics_truncate_length'` | None |
-| BLOOM_FILTER_ON_WRITE | No | Sets whether bloom filters should be written for all columns by default (can be overridden per column). | `'bloom_filter_on_write'` | false |
-| ALLOW_SINGLE_FILE_PARALLELISM | No | Enables parallel serialization of columns in a single file. | `'allow_single_file_parallelism'` | true |
-| MAXIMUM_PARALLEL_ROW_GROUP_WRITERS | No | Maximum number of parallel row group writers. | `'maximum_parallel_row_group_writers'` | 1 |
-| MAXIMUM_BUFFERED_RECORD_BATCHES_PER_STREAM | No | Maximum number of buffered record batches per stream. | `'maximum_buffered_record_batches_per_stream'` | 2 |
-| KEY_VALUE_METADATA | No (Key is specific) | Adds custom key-value pairs to the file metadata. Use the format `'metadata::your_key_name' 'your_value'`. Multiple entries allowed. | `'metadata::key_name'` | None |
+| Option | Can be Column Specific? | Description | OPTIONS Key | Default Value |
+| ------------------------------------------ | ----------------------- | ----------- | ----------- | ------------- |
+| COMPRESSION | Yes | Sets the internal Parquet **compression codec** for data pages, optionally including the compression level. Applies globally if set without `::col`, or specifically to a column if set using `'compression::column_name'`. Valid values: `uncompressed`, `snappy`, `gzip(level)`, `brotli(level)`, `lz4`, `zstd(level)`, `lz4_raw`. | `'compression'` or `'compression::col'` | zstd(3) |
+| ENCODING | Yes | Sets the **encoding** scheme for data pages. Valid values: `plain`, `plain_dictionary`, `rle`, `bit_packed`, `delta_binary_packed`, `delta_length_byte_array`, `delta_byte_array`, `rle_dictionary`, `byte_stream_split`. Use key `'encoding'` or `'encoding::col'` in OPTIONS. | `'encoding'` or `'encoding::col'` | None |
+| DICTIONARY_ENABLED | Yes | Sets whether dictionary encoding should be enabled globally or for a specific column. | `'dictionary_enabled'` or `'dictionary_enabled::col'` | true |
+| STATISTICS_ENABLED | Yes | Sets the level of statistics to write (`none`, `chunk`, `page`). | `'statistics_enabled'` or `'statistics_enabled::col'` | page |
+| BLOOM_FILTER_ENABLED | Yes | Sets whether a bloom filter should be written for a specific column. | `'bloom_filter_enabled::column_name'` | None |
+| BLOOM_FILTER_FPP | Yes | Sets bloom filter false positive probability (global or per column). | `'bloom_filter_fpp'` or `'bloom_filter_fpp::col'` | None |
+| BLOOM_FILTER_NDV | Yes | Sets bloom filter number of distinct values (global or per column). | `'bloom_filter_ndv'` or `'bloom_filter_ndv::col'` | None |
+| MAX_ROW_GROUP_SIZE | No | Sets the maximum number of rows per row group. Larger groups require more memory but can improve compression and scan efficiency. | `'max_row_group_size'` | 1048576 |
+| ENABLE_PAGE_INDEX | No | If true, reads the Parquet data page level metadata (the Page Index), if present, to reduce I/O and decoding. | `'enable_page_index'` | true |
+| PRUNING | No | If true, enables row group pruning based on min/max statistics. | `'pruning'` | true |
+| SKIP_METADATA | No | If true, skips optional embedded metadata in the file schema. | `'skip_metadata'` | true |
+| METADATA_SIZE_HINT | No | Sets the size hint (in bytes) for fetching Parquet file metadata. | `'metadata_size_hint'` | None |
+| PUSHDOWN_FILTERS | No | If true, enables filter pushdown during Parquet decoding. | `'pushdown_filters'` | false |
+| REORDER_FILTERS | No | If true, enables heuristic reordering of filters during Parquet decoding. | `'reorder_filters'` | false |
+| SCHEMA_FORCE_VIEW_TYPES | No | If true, reads Utf8/Binary columns as view types. | `'schema_force_view_types'` | true |
+| BINARY_AS_STRING | No | If true, reads Binary columns as strings. | `'binary_as_string'` | false |
+| DATA_PAGESIZE_LIMIT | No | Sets best effort maximum size of data page in bytes. | `'data_pagesize_limit'` | 1048576 |
+| DATA_PAGE_ROW_COUNT_LIMIT | No | Sets best effort maximum number of rows in data page. | `'data_page_row_count_limit'` | 20000 |
+| DICTIONARY_PAGE_SIZE_LIMIT | No | Sets best effort maximum dictionary page size, in bytes. | `'dictionary_page_size_limit'` | 1048576 |
+| WRITE_BATCH_SIZE | No | Sets write_batch_size in bytes. | `'write_batch_size'` | 1024 |
+| WRITER_VERSION | No | Sets the Parquet writer version (`1.0` or `2.0`). | `'writer_version'` | 1.0 |
+| SKIP_ARROW_METADATA | No | If true, skips writing Arrow schema information into the Parquet file metadata. | `'skip_arrow_metadata'` | false |
+| CREATED_BY | No | Sets the "created by" string in the Parquet file metadata. | `'created_by'` | datafusion version X.Y.Z |
+| COLUMN_INDEX_TRUNCATE_LENGTH | No | Sets the length (in bytes) to truncate min/max values in column indexes. | `'column_index_truncate_length'` | 64 |
+| STATISTICS_TRUNCATE_LENGTH | No | Sets statistics truncate length. | `'statistics_truncate_length'` | None |
+| BLOOM_FILTER_ON_WRITE | No | Sets whether bloom filters should be written for all columns by default (can be overridden per column). | `'bloom_filter_on_write'` | false |
+| ALLOW_SINGLE_FILE_PARALLELISM | No | Enables parallel serialization of columns in a single file. | `'allow_single_file_parallelism'` | true |
+| MAXIMUM_PARALLEL_ROW_GROUP_WRITERS | No | Maximum number of parallel row group writers. | `'maximum_parallel_row_group_writers'` | 1 |
+| MAXIMUM_BUFFERED_RECORD_BATCHES_PER_STREAM | No | Maximum number of buffered record batches per stream. | `'maximum_buffered_record_batches_per_stream'` | 2 |
+| KEY_VALUE_METADATA | No (Key is specific) | Adds custom key-value pairs to the file metadata. Use the format `'metadata::your_key_name' 'your_value'`. Multiple entries allowed. | `'metadata::key_name'` | None |
**Example:**
diff --git a/searchindex.js b/searchindex.js
index 3fda0dc305..75f097d9ba 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[61,"op-neq"]],"!~":[[61,"op-re-not-match"]],"!~*":[[61,"op-re-not-match-i"]],"!~~":[[61,"id19"]],"!~~*":[[61,"id20"]],"#":[[61,"op-bit-xor"]],"%":[[61,"op-modulo"]],"&":[[61,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[61,"op-multiply"]],"+":[[61,"op-plus"]],"-":[[61,"op-minus"]],"/":[[61,"op-divide"]],"<":[[61,"op-lt"]],"<
[...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[61,"op-neq"]],"!~":[[61,"op-re-not-match"]],"!~*":[[61,"op-re-not-match-i"]],"!~~":[[61,"id19"]],"!~~*":[[61,"id20"]],"#":[[61,"op-bit-xor"]],"%":[[61,"op-modulo"]],"&":[[61,"op-bit-and"]],"(relation,
name) tuples in logical fields and logical columns are
unique":[[13,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[61,"op-multiply"]],"+":[[61,"op-plus"]],"-":[[61,"op-minus"]],"/":[[61,"op-divide"]],"<":[[61,"op-lt"]],"<
[...]
\ No newline at end of file
diff --git a/user-guide/configs.html b/user-guide/configs.html
index 14b68779fa..d7b70ac962 100644
--- a/user-guide/configs.html
+++ b/user-guide/configs.html
@@ -572,7 +572,7 @@ example, to configure <code class="docutils literal notranslate"><span class="pr
</tr>
<tr class="row-odd"><td><p>datafusion.execution.parquet.compression</p></td>
<td><p>zstd(3)</p></td>
-<td><p>(writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting.</p></td>
+<td><p>(writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting.</p></td>
</tr>
<tr class="row-even"><td><p>datafusion.execution.parquet.dictionary_enabled</p></td>
<td><p>true</p></td>
diff --git a/user-guide/sql/format_options.html b/user-guide/sql/format_options.html
index c1aad6ee75..0823f1b56b 100644
--- a/user-guide/sql/format_options.html
+++ b/user-guide/sql/format_options.html
@@ -587,7 +587,7 @@ a<span class="p">;</span>b
<tbody>
<tr class="row-even"><td><p>COMPRESSION</p></td>
<td><p>Yes</p></td>
-<td><p>Sets the internal Parquet <strong>compression codec</strong> for data pages, optionally including the compression level. Applies globally if set without <code class="docutils literal notranslate"><span class="pre">::col</span></code>, or specifically to a column if set using <code class="docutils literal notranslate"><span class="pre">'compression::column_name'</span></code>. Valid values: <code class="docutils literal notranslate"><span class="pre">uncompressed</span></code>, <co [...]
+<td><p>Sets the internal Parquet <strong>compression codec</strong> for data pages, optionally including the compression level. Applies globally if set without <code class="docutils literal notranslate"><span class="pre">::col</span></code>, or specifically to a column if set using <code class="docutils literal notranslate"><span class="pre">'compression::column_name'</span></code>. Valid values: <code class="docutils literal notranslate"><span class="pre">uncompressed</span></code>, <co [...]
<td><p><code class="docutils literal notranslate"><span class="pre">'compression'</span></code> or <code class="docutils literal notranslate"><span class="pre">'compression::col'</span></code></p></td>
<td><p>zstd(3)</p></td>
</tr>
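For context on the COMPRESSION format option changed in the diffs above: such options are passed through `OPTIONS` when writing, for example in a `COPY` statement. A minimal sketch, assuming a local output path and a hypothetical column name `id`:

```sql
-- Sketch: write a Parquet file with zstd level 3 compression globally,
-- overriding to snappy for the (hypothetical) column `id`.
COPY (SELECT 1 AS id, 'a' AS val)
TO 'output.parquet'
STORED AS PARQUET
OPTIONS ('compression' 'zstd(3)', 'compression::id' 'snappy');
```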
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
