mapleFU commented on code in PR #38070:
URL: https://github.com/apache/arrow/pull/38070#discussion_r1350380917
##########
python/pyarrow/parquet/core.py:
##########
@@ -821,7 +823,10 @@ def _sanitize_table(table, new_schema, flavor):
and should be combined with a compression codec.
column_encoding : string or dict, default None
Specify the encoding scheme on a per column basis.
- Currently supported values: {'PLAIN', 'BYTE_STREAM_SPLIT'}.
+ Only if "use_dictionary" and "use_byte_stream_split" is False,
+ the following encodings are supported.
+ Currently supported values: {'PLAIN', 'BYTE_STREAM_SPLIT',
+ 'DELTA_BINARY_PACKED', 'DELTA_LENGTH_BYTE_ARRAY', 'DELTA_BYTE_ARRAY'}.
Review Comment:
As this blog post says
(https://blog.getdaft.io/p/working-with-the-apache-parquet-file), "Parquet V2"
is an ambiguous name. Although we (arrow and arrow-rs) use format 2.x and some
of its properties, most implementations can still decode page v1.
So I think that if users know what they are doing, it is OK to expose RLE to
them. However, I think we can just hide it here until PARQUET-2222 reaches a
conclusion about this.
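
To make the suggestion concrete, here is a minimal sketch of the rule the docstring describes, with RLE left out of the exposed set. The names `EXPOSED_ENCODINGS` and `check_column_encoding` are hypothetical and this is not pyarrow's actual validation code; it also treats `use_dictionary` as a plain truthy/falsy flag, which is a simplification.

```python
# Hypothetical sketch, not pyarrow's real implementation.
# Encodings listed in the proposed docstring; RLE is deliberately absent.
EXPOSED_ENCODINGS = frozenset({
    "PLAIN",
    "BYTE_STREAM_SPLIT",
    "DELTA_BINARY_PACKED",
    "DELTA_LENGTH_BYTE_ARRAY",
    "DELTA_BYTE_ARRAY",
})


def check_column_encoding(column_encoding, use_dictionary=True,
                          use_byte_stream_split=False):
    """Validate a column_encoding value (a string or a dict of column -> encoding)."""
    if column_encoding is None:
        return None
    # Per the docstring, column_encoding only applies when both flags are off.
    if use_dictionary or use_byte_stream_split:
        raise ValueError(
            "column_encoding requires use_dictionary and "
            "use_byte_stream_split to be False"
        )
    if isinstance(column_encoding, str):
        encodings = [column_encoding]
    else:
        encodings = list(column_encoding.values())
    for encoding in encodings:
        if encoding not in EXPOSED_ENCODINGS:
            raise ValueError(f"Unsupported encoding: {encoding!r}")
    return None
```

With this shape, a request for 'RLE' is rejected like any other unlisted value, so exposing it later would only mean adding one entry to the set.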
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.