jorisvandenbossche commented on code in PR #38070:
URL: https://github.com/apache/arrow/pull/38070#discussion_r1350392364
##########
python/pyarrow/parquet/core.py:
##########
@@ -821,7 +823,10 @@ def _sanitize_table(table, new_schema, flavor):
and should be combined with a compression codec.
column_encoding : string or dict, default None
Specify the encoding scheme on a per column basis.
- Currently supported values: {'PLAIN', 'BYTE_STREAM_SPLIT'}.
+ Only if "use_dictionary" and "use_byte_stream_split" is False,
+ the following encodings are supported.
+ Currently supported values: {'PLAIN', 'BYTE_STREAM_SPLIT',
+ 'DELTA_BINARY_PACKED', 'DELTA_LENGTH_BYTE_ARRAY', 'DELTA_BYTE_ARRAY'}.
Review Comment:
I would propose moving this discussion elsewhere (a new issue, or the
original issue: https://github.com/apache/arrow/issues/36882).
> Rle is default enabled on Page V2, however, write page v2 is not recommended.

No, we also enabled it by default for DataPage V1. There is a separate
"parquet v2" setting, which is enabled by default, and we decided to write
this encoding whenever that setting is enabled.