Re: [PR] [DO NOT MERGE] PARQUET-2414: Add test file for additional BYTE_STREAM_SPLIT types [parquet-testing]

via GitHub Tue, 20 Feb 2024 08:17:13 -0800


pitrou commented on code in PR #46:
URL: https://github.com/apache/parquet-testing/pull/46#discussion_r1496105130



##########
data/README.md:
##########
@@ -351,3 +353,37 @@ pq.write_table(
 
 This is a practical case where `BYTE_STREAM_SPLIT` encoding obtains a smaller file size than `PLAIN` or dictionary.
 Since the distributions are random normals centered at 0, each byte has nontrivial behavior.
+
+# Additional types
+
+`byte_stream_split_extended.gzip.parquet` is generated by pyarrow 16.0.0.
+It contains 7 pairs of columns, each in two variants containing the same
+values: one `PLAIN`-encoded and one `BYTE_STREAM_SPLIT`-encoded:
+```
+Version: 2.6

Review Comment:
   That will have to be part of the final C++ PR. If we want to, though, we
can fix this README without waiting for the C++ changes.



##########
data/README.md:
##########
@@ -351,3 +353,37 @@ pq.write_table(
 
 This is a practical case where `BYTE_STREAM_SPLIT` encoding obtains a smaller file size than `PLAIN` or dictionary.
 Since the distributions are random normals centered at 0, each byte has nontrivial behavior.
+
+# Additional types
+
+`byte_stream_split_extended.gzip.parquet` is generated by pyarrow 16.0.0.
+It contains 7 pairs of columns, each in two variants containing the same
+values: one `PLAIN`-encoded and one `BYTE_STREAM_SPLIT`-encoded:
+```
+Version: 2.6

Review Comment:
   (also see https://github.com/apache/arrow/issues/40096)
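
   For context on the encoding named in the README excerpt above, here is a
minimal pure-Python sketch of the byte-splitting idea: byte k of every
fixed-width value goes into stream k, and the streams are concatenated. The
function names are hypothetical and this is not the pyarrow or Parquet C++
implementation; it only illustrates the layout, assuming little-endian
fixed-width values.

```python
import struct


def byte_stream_split_encode(plain: bytes, width: int) -> bytes:
    """Illustrative only: scatter byte k of each `width`-byte value into stream k."""
    if width <= 0 or len(plain) % width != 0:
        raise ValueError("input length must be a multiple of the value width")
    # plain[k::width] picks byte k of every value; streams are stored back to back.
    return b"".join(plain[k::width] for k in range(width))


def byte_stream_split_decode(encoded: bytes, width: int) -> bytes:
    """Illustrative only: inverse of byte_stream_split_encode."""
    if width <= 0 or len(encoded) % width != 0:
        raise ValueError("input length must be a multiple of the value width")
    count = len(encoded) // width  # number of values
    out = bytearray(len(encoded))
    for k in range(width):
        # Stream k occupies bytes [k * count, (k + 1) * count) of the encoded buffer.
        out[k::width] = encoded[k * count:(k + 1) * count]
    return bytes(out)


# Three little-endian float32 values, laid out as PLAIN would store them.
plain = struct.pack("<3f", 1.0, 2.0, 3.0)
encoded = byte_stream_split_encode(plain, 4)
decoded = byte_stream_split_decode(encoded, 4)
```

   The transformation only reorders bytes, so the encoded buffer has the same
length as the plain one; any size reduction comes from the compressor applied
afterwards (gzip, for the file discussed here).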



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

