nkaki commented on code in PR #552:
URL: https://github.com/apache/parquet-format/pull/552#discussion_r2744430182
##########
Encodings.md:
##########
@@ -25,6 +25,20 @@ This file contains the specification of all supported
encodings.
Unless otherwise stated in page or encoding documentation, any encoding can be
used with any page type.
+### Supported Encodings
+
+| Encoding type | Encoding enum | Encoding Targets <br> (Parquet 2.0.0+) | Encoding Targets <br> (Parquet 1.0.0+) |
Review Comment:
@alamb
Thank you for the review!
> I think we have been trying to avoid the nomenclature of "parquet 2.0" as
> its definition is not universally agreed upon.
> I recommend we remove the separate columns and instead focus on helping
> people navigate the current version of the spec
I agree on focusing on the current version of the spec. At some point it would
be great if the Parquet site made previous versions easy to view. For
the table, I will remove the last column and rename the third one.
One question: would "Data Page V2" (header?) be a better term in
this case?
> I am also not sure about the differences in different encoding targets
> (e.g. PLAIN_DICTIONARY) --- maybe we can simply not include that in the table
> as it has been deprecated?
For PLAIN_DICTIONARY and RLE_DICTIONARY, I will merge the rows and mark the
PLAIN_DICTIONARY enum as deprecated.
For BIT_PACKED, since the deprecated encoding is still described in the
document and is linked from other encodings, I thought it should be in the
table and linked to its details. I think there are a few options:
1. Remove the BIT_PACKED encoding from the table (your suggestion).
2. Remove the BIT_PACKED encoding description from the page and from the table
   (this may break links).
3. Separate currently supported and deprecated encodings into separate tables,
   and change the layout of the page.
   - Layout A:
     - supported encodings table
     - deprecated encodings table (only BIT_PACKED)
     - supported + deprecated encodings descriptions (current order)
   - Layout B:
     - supported encodings table
     - supported encodings descriptions (current order without BIT_PACKED)
     - deprecated encodings table (only BIT_PACKED)
     - deprecated encodings descriptions (only BIT_PACKED)
   - Layout C:
     - supported encodings table
     - deprecated encodings table (only BIT_PACKED)
     - supported encodings descriptions (current order without BIT_PACKED)
     - deprecated encodings descriptions (only BIT_PACKED)
Also, for the Encoding Targets column, should I list only the physical types
and remove the other encoding targets (e.g. repetition and definition levels)?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]