This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new d103d8886f chore: remove LZO Parquet compression (#19726)
d103d8886f is described below
commit d103d8886fcef989b0465a4a7dba28114869431c
Author: Kumar Ujjawal <[email protected]>
AuthorDate: Mon Jan 12 08:01:20 2026 +0530
chore: remove LZO Parquet compression (#19726)
## Which issue does this PR close?
- Closes #19720.
## Rationale for this change
- Choosing LZO compression currently results in an error, and LZO support
seems unlikely to ever land, so the best option moving forward is to remove
it altogether and update the docs.
## What changes are included in this PR?
- Removed the `lzo` arm from the `parse_compression_string()` function
- Removed LZO from the documentation
- Updated expected test output
## Are these changes tested?
Yes
## Are there any user-facing changes?
Users choosing LZO as the compression codec now get a clear error message:
```
Unknown or unsupported parquet compression: lzo. Valid values are:
uncompressed, snappy, gzip(level), brotli(level), lz4, zstd(level), and lz4_raw.
```
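For illustration, a minimal sketch of hitting that error through the SQL interface. This is a hedged example, not part of the change: it assumes the `datafusion` and `tokio` crates, a throwaway output path, and does not pin down whether the error surfaces at planning or at execution time.

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() {
    let ctx = SessionContext::new();
    // Hypothetical write using the removed codec. The statement is
    // rejected with the error shown above; depending on the stage it
    // may surface from `sql()` or from `collect()`, so both are checked.
    let result = match ctx
        .sql(
            "COPY (SELECT 1 AS a) TO '/tmp/out.parquet' \
             STORED AS PARQUET OPTIONS ('compression' 'lzo')",
        )
        .await
    {
        Ok(df) => df.collect().await.map(|_| ()),
        Err(e) => Err(e),
    };
    assert!(result.is_err(), "lzo should be rejected");
    println!("{}", result.unwrap_err());
}
```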
---
datafusion/common/src/config.rs | 4 +-
.../common/src/file_options/parquet_writer.rs | 6 +-
.../sqllogictest/test_files/information_schema.slt | 2 +-
docs/source/user-guide/configs.md | 2 +-
docs/source/user-guide/sql/format_options.md | 64 +++++++++++-----------
5 files changed, 37 insertions(+), 41 deletions(-)
diff --git a/datafusion/common/src/config.rs b/datafusion/common/src/config.rs
index b7a7841593..87344914d2 100644
--- a/datafusion/common/src/config.rs
+++ b/datafusion/common/src/config.rs
@@ -772,7 +772,7 @@ config_namespace! {
         /// (writing) Sets default parquet compression codec.
         /// Valid values are: uncompressed, snappy, gzip(level),
-        /// lzo, brotli(level), lz4, zstd(level), and lz4_raw.
+        /// brotli(level), lz4, zstd(level), and lz4_raw.
         /// These values are not case sensitive. If NULL, uses
         /// default parquet writer setting
         ///
@@ -2499,7 +2499,7 @@ config_namespace_with_hashmap! {
         /// Sets default parquet compression codec for the column path.
         /// Valid values are: uncompressed, snappy, gzip(level),
-        /// lzo, brotli(level), lz4, zstd(level), and lz4_raw.
+        /// brotli(level), lz4, zstd(level), and lz4_raw.
         /// These values are not case-sensitive. If NULL, uses
         /// default parquet options
         pub compression: Option<String>, transform = str::to_lowercase,
         default = None
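As a quick illustration of the option documented above, a hedged sketch of setting the session default codec (assuming current `datafusion` crate APIs; the `gzip(6)` value is just an example level):

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    // Set the default writer codec when building the session...
    let config = SessionConfig::new()
        .set_str("datafusion.execution.parquet.compression", "zstd(3)");
    let ctx = SessionContext::new_with_config(config);
    // ...or change it later for the same session via SQL.
    ctx.sql("SET datafusion.execution.parquet.compression = 'gzip(6)'")
        .await?
        .collect()
        .await?;
    Ok(())
}
```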
diff --git a/datafusion/common/src/file_options/parquet_writer.rs b/datafusion/common/src/file_options/parquet_writer.rs
index 196cb96f38..f6608d16c1 100644
--- a/datafusion/common/src/file_options/parquet_writer.rs
+++ b/datafusion/common/src/file_options/parquet_writer.rs
@@ -341,10 +341,6 @@ pub fn parse_compression_string(
                 level,
             )?))
         }
-        "lzo" => {
-            check_level_is_none(codec, &level)?;
-            Ok(parquet::basic::Compression::LZO)
-        }
         "brotli" => {
             let level = require_level(codec, level)?;
             Ok(parquet::basic::Compression::BROTLI(BrotliLevel::try_new(
@@ -368,7 +364,7 @@ pub fn parse_compression_string(
         _ => Err(DataFusionError::Configuration(format!(
             "Unknown or unsupported parquet compression: \
             {str_setting}. Valid values are: uncompressed, snappy, gzip(level), \
-            lzo, brotli(level), lz4, zstd(level), and lz4_raw."
+            brotli(level), lz4, zstd(level), and lz4_raw."
         ))),
     }
 }
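For readers skimming the diff, a minimal standalone sketch of the resulting match shape. This is a hypothetical simplification: the real `parse_compression_string()` returns `parquet::basic::Compression` and uses helpers such as `require_level`; here plain strings stand in for the codec type.

```rust
/// Simplified stand-in for the parser after this change: with the "lzo"
/// arm removed, that value falls through to the catch-all error.
fn parse_codec(setting: &str) -> Result<String, String> {
    let lower = setting.to_lowercase();
    // Split an optional "(level)" suffix, e.g. "zstd(3)" -> ("zstd", Some("3")).
    let (codec, _level) = match lower.split_once('(') {
        Some((name, rest)) => (name, rest.strip_suffix(')')),
        None => (lower.as_str(), None),
    };
    match codec {
        "uncompressed" | "snappy" | "gzip" | "brotli" | "lz4" | "zstd" | "lz4_raw" => {
            Ok(codec.to_string())
        }
        // "lzo" used to be matched here; it now hits the error arm below.
        _ => Err(format!(
            "Unknown or unsupported parquet compression: {setting}. Valid values are: \
             uncompressed, snappy, gzip(level), brotli(level), lz4, zstd(level), and lz4_raw."
        )),
    }
}

fn main() {
    assert!(parse_codec("zstd(3)").is_ok());
    assert!(parse_codec("LZO").is_err());
    println!("{}", parse_codec("lzo").unwrap_err());
}
```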
diff --git a/datafusion/sqllogictest/test_files/information_schema.slt b/datafusion/sqllogictest/test_files/information_schema.slt
index 860d81b098..2039ee93df 100644
--- a/datafusion/sqllogictest/test_files/information_schema.slt
+++ b/datafusion/sqllogictest/test_files/information_schema.slt
@@ -373,7 +373,7 @@ datafusion.execution.parquet.bloom_filter_on_read true (reading) Use any availab
 datafusion.execution.parquet.bloom_filter_on_write false (writing) Write bloom filters for all columns when creating parquet files
 datafusion.execution.parquet.coerce_int96 NULL (reading) If true, parquet reader will read columns of physical type int96 as originating from a different resolution than nanosecond. This is useful for reading data from systems like Spark which stores microsecond resolution timestamps in an int96 allowing it to write values with a larger date range than 64-bit timestamps with nanosecond resolution.
 datafusion.execution.parquet.column_index_truncate_length 64 (writing) Sets column index truncate length
-datafusion.execution.parquet.compression zstd(3) (writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting.
+datafusion.execution.parquet.compression zstd(3) (writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting.
 datafusion.execution.parquet.created_by datafusion (writing) Sets "created by" property
 datafusion.execution.parquet.data_page_row_count_limit 20000 (writing) Sets best effort maximum number of rows in data page
 datafusion.execution.parquet.data_pagesize_limit 1048576 (writing) Sets best effort maximum size of data page in bytes
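To inspect the effective value the way this sqllogictest does, one can query it from a session. A small hedged sketch, assuming `with_information_schema` is enabled (which `SHOW` requires):

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let config = SessionConfig::new().with_information_schema(true);
    let ctx = SessionContext::new_with_config(config);
    // SHOW reads from the same information_schema.df_settings view
    // that the test output above asserts against.
    ctx.sql("SHOW datafusion.execution.parquet.compression")
        .await?
        .show()
        .await?;
    Ok(())
}
```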
diff --git a/docs/source/user-guide/configs.md b/docs/source/user-guide/configs.md
index b59af0c13d..99c94b2c78 100644
--- a/docs/source/user-guide/configs.md
+++ b/docs/source/user-guide/configs.md
@@ -96,7 +96,7 @@ The following configuration settings are available:
 | datafusion.execution.parquet.write_batch_size | 1024 | (writing) Sets write_batch_size in bytes |
 | datafusion.execution.parquet.writer_version | 1.0 | (writing) Sets parquet writer version valid values are "1.0" and "2.0" |
 | datafusion.execution.parquet.skip_arrow_metadata | false | (writing) Skip encoding the embedded arrow metadata in the KV_meta This is analogous to the `ArrowWriterOptions::with_skip_arrow_metadata`. Refer to <https://docs.rs/parquet/53.3.0/parquet/arrow/arrow_writer/struct.ArrowWriterOptions.html#method.with_skip_arrow_metadata> |
-| datafusion.execution.parquet.compression | zstd(3) | (writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting. |
+| datafusion.execution.parquet.compression | zstd(3) | (writing) Sets default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses default parquet writer setting Note that this default setting is not the same as the default parquet writer setting. |
 | datafusion.execution.parquet.dictionary_enabled | true | (writing) Sets if dictionary encoding is enabled. If NULL, uses default parquet writer setting |
 | datafusion.execution.parquet.dictionary_page_size_limit | 1048576 | (writing) Sets best effort maximum dictionary page size, in bytes |
 | datafusion.execution.parquet.statistics_enabled | page | (writing) Sets if statistics are enabled for any column Valid values are: "none", "chunk", and "page" These values are not case sensitive. If NULL, uses default parquet writer setting |
diff --git a/docs/source/user-guide/sql/format_options.md b/docs/source/user-guide/sql/format_options.md
index d349bc1c98..c04a6b5d52 100644
--- a/docs/source/user-guide/sql/format_options.md
+++ b/docs/source/user-guide/sql/format_options.md
@@ -132,38 +132,38 @@ OPTIONS('DELIMITER' '|', 'HAS_HEADER' 'true', 'NEWLINES_IN_VALUES' 'true');
 The following options are available when reading or writing Parquet files. If any unsupported option is specified, an error will be raised and the query will fail. If a column-specific option is specified for a column that does not exist, the option will be ignored without error.
-| Option | Can be Column Specific? | Description | OPTIONS Key | Default Value |
-| ------ | ----------------------- | ----------- | ----------- | ------------- |
-| COMPRESSION | Yes | Sets the internal Parquet **compression codec** for data pages, optionally including the compression level. Applies globally if set without `::col`, or specifically to a column if set using `'compression::column_name'`. Valid values: `uncompressed`, `snappy`, `gzip(level)`, `lzo`, `brotli(level)`, `lz4`, `zstd(level)`, `lz4_raw`. | `'compression'` or `'compression::col'` | zstd(3) |
-| ENCODING | Yes | Sets the **encoding** scheme for data pages. Valid values: `plain`, `plain_dictionary`, `rle`, `bit_packed`, `delta_binary_packed`, `delta_length_byte_array`, `delta_byte_array`, `rle_dictionary`, `byte_stream_split`. Use key `'encoding'` or `'encoding::col'` in OPTIONS. | `'encoding'` or `'encoding::col'` | None |
-| DICTIONARY_ENABLED | Yes | Sets whether dictionary encoding should be enabled globally or for a specific column. | `'dictionary_enabled'` or `'dictionary_enabled::col'` | true |
-| STATISTICS_ENABLED | Yes | Sets the level of statistics to write (`none`, `chunk`, `page`). | `'statistics_enabled'` or `'statistics_enabled::col'` | page |
-| BLOOM_FILTER_ENABLED | Yes | Sets whether a bloom filter should be written for a specific column. | `'bloom_filter_enabled::column_name'` | None |
-| BLOOM_FILTER_FPP | Yes | Sets bloom filter false positive probability (global or per column). | `'bloom_filter_fpp'` or `'bloom_filter_fpp::col'` | None |
-| BLOOM_FILTER_NDV | Yes | Sets bloom filter number of distinct values (global or per column). | `'bloom_filter_ndv'` or `'bloom_filter_ndv::col'` | None |
-| MAX_ROW_GROUP_SIZE | No | Sets the maximum number of rows per row group. Larger groups require more memory but can improve compression and scan efficiency. | `'max_row_group_size'` | 1048576 |
-| ENABLE_PAGE_INDEX | No | If true, reads the Parquet data page level metadata (the Page Index), if present, to reduce I/O and decoding. | `'enable_page_index'` | true |
-| PRUNING | No | If true, enables row group pruning based on min/max statistics. | `'pruning'` | true |
-| SKIP_METADATA | No | If true, skips optional embedded metadata in the file schema. | `'skip_metadata'` | true |
-| METADATA_SIZE_HINT | No | Sets the size hint (in bytes) for fetching Parquet file metadata. | `'metadata_size_hint'` | None |
-| PUSHDOWN_FILTERS | No | If true, enables filter pushdown during Parquet decoding. | `'pushdown_filters'` | false |
-| REORDER_FILTERS | No | If true, enables heuristic reordering of filters during Parquet decoding. | `'reorder_filters'` | false |
-| SCHEMA_FORCE_VIEW_TYPES | No | If true, reads Utf8/Binary columns as view types. | `'schema_force_view_types'` | true |
-| BINARY_AS_STRING | No | If true, reads Binary columns as strings. | `'binary_as_string'` | false |
-| DATA_PAGESIZE_LIMIT | No | Sets best effort maximum size of data page in bytes. | `'data_pagesize_limit'` | 1048576 |
-| DATA_PAGE_ROW_COUNT_LIMIT | No | Sets best effort maximum number of rows in data page. | `'data_page_row_count_limit'` | 20000 |
-| DICTIONARY_PAGE_SIZE_LIMIT | No | Sets best effort maximum dictionary page size, in bytes. | `'dictionary_page_size_limit'` | 1048576 |
-| WRITE_BATCH_SIZE | No | Sets write_batch_size in bytes. | `'write_batch_size'` | 1024 |
-| WRITER_VERSION | No | Sets the Parquet writer version (`1.0` or `2.0`). | `'writer_version'` | 1.0 |
-| SKIP_ARROW_METADATA | No | If true, skips writing Arrow schema information into the Parquet file metadata. | `'skip_arrow_metadata'` | false |
-| CREATED_BY | No | Sets the "created by" string in the Parquet file metadata. | `'created_by'` | datafusion version X.Y.Z |
-| COLUMN_INDEX_TRUNCATE_LENGTH | No | Sets the length (in bytes) to truncate min/max values in column indexes. | `'column_index_truncate_length'` | 64 |
-| STATISTICS_TRUNCATE_LENGTH | No | Sets statistics truncate length. | `'statistics_truncate_length'` | None |
-| BLOOM_FILTER_ON_WRITE | No | Sets whether bloom filters should be written for all columns by default (can be overridden per column). | `'bloom_filter_on_write'` | false |
-| ALLOW_SINGLE_FILE_PARALLELISM | No | Enables parallel serialization of columns in a single file. | `'allow_single_file_parallelism'` | true |
-| MAXIMUM_PARALLEL_ROW_GROUP_WRITERS | No | Maximum number of parallel row group writers. | `'maximum_parallel_row_group_writers'` | 1 |
-| MAXIMUM_BUFFERED_RECORD_BATCHES_PER_STREAM | No | Maximum number of buffered record batches per stream. | `'maximum_buffered_record_batches_per_stream'` | 2 |
-| KEY_VALUE_METADATA | No (Key is specific) | Adds custom key-value pairs to the file metadata. Use the format `'metadata::your_key_name' 'your_value'`. Multiple entries allowed. | `'metadata::key_name'` | None |
+| Option | Can be Column Specific? | Description | OPTIONS Key | Default Value |
+| ------ | ----------------------- | ----------- | ----------- | ------------- |
+| COMPRESSION | Yes | Sets the internal Parquet **compression codec** for data pages, optionally including the compression level. Applies globally if set without `::col`, or specifically to a column if set using `'compression::column_name'`. Valid values: `uncompressed`, `snappy`, `gzip(level)`, `brotli(level)`, `lz4`, `zstd(level)`, `lz4_raw`. | `'compression'` or `'compression::col'` | zstd(3) |
+| ENCODING | Yes | Sets the **encoding** scheme for data pages. Valid values: `plain`, `plain_dictionary`, `rle`, `bit_packed`, `delta_binary_packed`, `delta_length_byte_array`, `delta_byte_array`, `rle_dictionary`, `byte_stream_split`. Use key `'encoding'` or `'encoding::col'` in OPTIONS. | `'encoding'` or `'encoding::col'` | None |
+| DICTIONARY_ENABLED | Yes | Sets whether dictionary encoding should be enabled globally or for a specific column. | `'dictionary_enabled'` or `'dictionary_enabled::col'` | true |
+| STATISTICS_ENABLED | Yes | Sets the level of statistics to write (`none`, `chunk`, `page`). | `'statistics_enabled'` or `'statistics_enabled::col'` | page |
+| BLOOM_FILTER_ENABLED | Yes | Sets whether a bloom filter should be written for a specific column. | `'bloom_filter_enabled::column_name'` | None |
+| BLOOM_FILTER_FPP | Yes | Sets bloom filter false positive probability (global or per column). | `'bloom_filter_fpp'` or `'bloom_filter_fpp::col'` | None |
+| BLOOM_FILTER_NDV | Yes | Sets bloom filter number of distinct values (global or per column). | `'bloom_filter_ndv'` or `'bloom_filter_ndv::col'` | None |
+| MAX_ROW_GROUP_SIZE | No | Sets the maximum number of rows per row group. Larger groups require more memory but can improve compression and scan efficiency. | `'max_row_group_size'` | 1048576 |
+| ENABLE_PAGE_INDEX | No | If true, reads the Parquet data page level metadata (the Page Index), if present, to reduce I/O and decoding. | `'enable_page_index'` | true |
+| PRUNING | No | If true, enables row group pruning based on min/max statistics. | `'pruning'` | true |
+| SKIP_METADATA | No | If true, skips optional embedded metadata in the file schema. | `'skip_metadata'` | true |
+| METADATA_SIZE_HINT | No | Sets the size hint (in bytes) for fetching Parquet file metadata. | `'metadata_size_hint'` | None |
+| PUSHDOWN_FILTERS | No | If true, enables filter pushdown during Parquet decoding. | `'pushdown_filters'` | false |
+| REORDER_FILTERS | No | If true, enables heuristic reordering of filters during Parquet decoding. | `'reorder_filters'` | false |
+| SCHEMA_FORCE_VIEW_TYPES | No | If true, reads Utf8/Binary columns as view types. | `'schema_force_view_types'` | true |
+| BINARY_AS_STRING | No | If true, reads Binary columns as strings. | `'binary_as_string'` | false |
+| DATA_PAGESIZE_LIMIT | No | Sets best effort maximum size of data page in bytes. | `'data_pagesize_limit'` | 1048576 |
+| DATA_PAGE_ROW_COUNT_LIMIT | No | Sets best effort maximum number of rows in data page. | `'data_page_row_count_limit'` | 20000 |
+| DICTIONARY_PAGE_SIZE_LIMIT | No | Sets best effort maximum dictionary page size, in bytes. | `'dictionary_page_size_limit'` | 1048576 |
+| WRITE_BATCH_SIZE | No | Sets write_batch_size in bytes. | `'write_batch_size'` | 1024 |
+| WRITER_VERSION | No | Sets the Parquet writer version (`1.0` or `2.0`). | `'writer_version'` | 1.0 |
+| SKIP_ARROW_METADATA | No | If true, skips writing Arrow schema information into the Parquet file metadata. | `'skip_arrow_metadata'` | false |
+| CREATED_BY | No | Sets the "created by" string in the Parquet file metadata. | `'created_by'` | datafusion version X.Y.Z |
+| COLUMN_INDEX_TRUNCATE_LENGTH | No | Sets the length (in bytes) to truncate min/max values in column indexes. | `'column_index_truncate_length'` | 64 |
+| STATISTICS_TRUNCATE_LENGTH | No | Sets statistics truncate length. | `'statistics_truncate_length'` | None |
+| BLOOM_FILTER_ON_WRITE | No | Sets whether bloom filters should be written for all columns by default (can be overridden per column). | `'bloom_filter_on_write'` | false |
+| ALLOW_SINGLE_FILE_PARALLELISM | No | Enables parallel serialization of columns in a single file. | `'allow_single_file_parallelism'` | true |
+| MAXIMUM_PARALLEL_ROW_GROUP_WRITERS | No | Maximum number of parallel row group writers. | `'maximum_parallel_row_group_writers'` | 1 |
+| MAXIMUM_BUFFERED_RECORD_BATCHES_PER_STREAM | No | Maximum number of buffered record batches per stream. | `'maximum_buffered_record_batches_per_stream'` | 2 |
+| KEY_VALUE_METADATA | No (Key is specific) | Adds custom key-value pairs to the file metadata. Use the format `'metadata::your_key_name' 'your_value'`. Multiple entries allowed. | `'metadata::key_name'` | None |
**Example:**
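The diff excerpt ends before the file's own example. As a stand-in, here is a hedged sketch of the global and column-specific key forms from the table above; the query, the column name `b`, and the output path are hypothetical:

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let ctx = SessionContext::new();
    // Global default zstd(3), snappy for the hypothetical column `b`.
    // Note that `lzo` is no longer a valid value for either key form.
    ctx.sql(
        "COPY (SELECT 1 AS a, 'x' AS b) TO '/tmp/out.parquet' \
         STORED AS PARQUET \
         OPTIONS ('compression' 'zstd(3)', 'compression::b' 'snappy')",
    )
    .await?
    .collect()
    .await?;
    Ok(())
}
```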
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
