etseidl commented on code in PR #250:
URL: https://github.com/apache/parquet-format/pull/250#discussion_r1617731016
##########
src/main/thrift/parquet.thrift:
##########
@@ -537,6 +537,39 @@ enum Encoding {
Support for INT32, INT64 and FIXED_LEN_BYTE_ARRAY added in 2.11.
*/
BYTE_STREAM_SPLIT = 9;
+
+ /** Encoding for variable length binary data that allows random access of values.
Review Comment:
Off topic, but I really like this idea as an alternative to PLAIN encoding
for BYTE_ARRAY data. It has the advantages of DELTA_LENGTH_BYTE_ARRAY without
having to do delta encoding for the offsets. Random access (in the absence of
nulls) is just an added bonus.
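
To illustrate the point about random access, here is a minimal sketch comparing PLAIN with an offset-based layout. The offset layout shown (n + 1 offsets kept alongside one contiguous data buffer) is an assumption made for illustration and is not taken from the PR text; all function names are hypothetical.

```python
import struct


def encode_plain(values):
    """PLAIN BYTE_ARRAY: each value is a 4-byte little-endian length, then its bytes."""
    return b"".join(struct.pack("<I", len(v)) + v for v in values)


def plain_get(buf, i):
    """Fetch value i from a PLAIN buffer; every earlier length prefix must be walked."""
    pos = 0
    for _ in range(i):
        (n,) = struct.unpack_from("<I", buf, pos)
        pos += 4 + n
    (n,) = struct.unpack_from("<I", buf, pos)
    return buf[pos + 4 : pos + 4 + n]


def encode_offsets(values):
    """Assumed layout: n + 1 absolute offsets plus the concatenated value bytes."""
    offsets = [0]
    for v in values:
        offsets.append(offsets[-1] + len(v))
    return offsets, b"".join(values)


def offsets_get(offsets, data, i):
    """Fetch value i with two offset lookups and one slice; no scan is needed."""
    return data[offsets[i] : offsets[i + 1]]


values = [b"a", b"", b"parquet"]
plain = encode_plain(values)
offsets, data = encode_offsets(values)
```

With PLAIN, reaching value `i` costs `i` length reads. With stored offsets it is a constant-time slice, and the offsets need no prefix sum to reconstruct, unlike delta-encoded lengths. As noted above, this only maps directly to row positions when there are no nulls.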
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]