This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 065a6da852 GH-41748: [Python][Parquet] Update BYTE_STREAM_SPLIT description in write_table() docstring (#41759)
065a6da852 is described below

commit 065a6da8520bd65fb4f59b2e3e496fe1124ac685
Author: Antoine Pitrou <[email protected]>
AuthorDate: Wed May 22 10:37:52 2024 +0200

    GH-41748: [Python][Parquet] Update BYTE_STREAM_SPLIT description in write_table() docstring (#41759)
    
    ### Rationale for this change
    
    In PR #40094 (issue GH-39978), we forgot to update the `write_table` docstring with an accurate description of the supported data types for BYTE_STREAM_SPLIT.
    
    ### Are these changes tested?
    
    No (only a doc change).
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41748
    
    Authored-by: Antoine Pitrou <[email protected]>
    Signed-off-by: Joris Van den Bossche <[email protected]>
---
 python/pyarrow/parquet/core.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/python/pyarrow/parquet/core.py b/python/pyarrow/parquet/core.py
index f54a203c87..81798b1544 100644
--- a/python/pyarrow/parquet/core.py
+++ b/python/pyarrow/parquet/core.py
@@ -797,8 +797,9 @@ use_byte_stream_split : bool or list, default False
     Specify if the byte_stream_split encoding should be used in general or
     only for some columns. If both dictionary and byte_stream_stream are
     enabled, then dictionary is preferred.
-    The byte_stream_split encoding is valid only for floating-point data types
-    and should be combined with a compression codec.
+    The byte_stream_split encoding is valid for integer, floating-point
+    and fixed-size binary data types (including decimals); it should be
+    combined with a compression codec so as to achieve size reduction.
 column_encoding : string or dict, default None
     Specify the encoding scheme on a per column basis.
     Can only be used when ``use_dictionary`` is set to False, and
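
The updated docstring says that byte_stream_split should be combined with a
compression codec to achieve size reduction. The sketch below illustrates why,
using only the Python standard library. It is not pyarrow's implementation: the
helper names `byte_stream_split` and `byte_stream_join` are hypothetical, and
zlib merely stands in for whichever codec is configured. The transform only
reorders bytes (byte k of every value goes to stream k), so its output is
exactly as large as its input; any size reduction comes from the codec applied
afterwards.

```python
import struct
import zlib


def byte_stream_split(data: bytes, width: int) -> bytes:
    """Scatter byte k of every `width`-byte value into stream k,
    then concatenate the streams. Illustrative helper only."""
    if len(data) % width != 0:
        raise ValueError("data length must be a multiple of the value width")
    return b"".join(data[k::width] for k in range(width))


def byte_stream_join(encoded: bytes, width: int) -> bytes:
    """Inverse of byte_stream_split: interleave the streams back into values."""
    if len(encoded) % width != 0:
        raise ValueError("encoded length must be a multiple of the value width")
    count = len(encoded) // width
    out = bytearray(len(encoded))
    for k in range(width):
        out[k::width] = encoded[k * count:(k + 1) * count]
    return bytes(out)


# Assumed sample data: slowly varying 32-bit floats (width 4).
values = [20.0 + 0.01 * i for i in range(10_000)]
raw = struct.pack(f"<{len(values)}f", *values)
encoded = byte_stream_split(raw, 4)

# The split alone does not shrink anything.
print("raw bytes:", len(raw), "split bytes:", len(encoded))

# Compare how well a general-purpose codec does on each layout.
print("zlib(raw):", len(zlib.compress(raw)))
print("zlib(split):", len(zlib.compress(encoded)))
```

The same reordering applies to any fixed-width value, which is consistent with
the docstring now listing integer and fixed-size binary types alongside
floating-point ones.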
