asfimport commented on issue #405: URL: https://github.com/apache/parquet-format/issues/405#issuecomment-2184154121
[Gabor Szadovszky](https://issues.apache.org/jira/browse/PARQUET-2222?#comment-17730904) / @gszadovszky:

@pitrou, @wgtmac, it seems my review was not deep enough. Sorry for that.

parquet-mr does not use RLE encoding for boolean values in V1, only bit packing:

- [V1](https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/values/factory/DefaultV1ValuesWriterFactory.java#L53) -> ... -> [Bit packing](https://github.com/apache/parquet-mr/blob/9d80330ae4948787ac0bf4e4b0d990917f106440/parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ByteBitPackingValuesWriter.java) (encoding written to page header: PLAIN)
- [V2](https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/values/factory/DefaultV2ValuesWriterFactory.java#L57) -> ... -> [RLE](https://github.com/apache/parquet-mr/blob/9d80330ae4948787ac0bf4e4b0d990917f106440/parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridValuesWriter.java) (encoding written to page header: RLE)

@pitrou, could you please confirm that it is the same for parquet cpp?

So the table we added in this PR about prepending the length is misleading. Also, the link in the PLAIN encoding section for boolean is dead and misleading; it should point to BIT_PACKED. The definition of BIT_PACKED also wrongly states that it is valid only for repetition and definition levels (RL/DL). I think the deprecation is still valid: the "BIT_PACKED" encoding should not be written anywhere as an encoding value, but the actual bit-packed encoding is what is used under PLAIN for boolean values.

Would you guys like to work on this? We probably want to include this in the current format release.
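For illustration only, the bit packing that ends up under PLAIN for boolean values can be sketched roughly as below. This assumes the LSB-first layout that the format spec describes for PLAIN booleans (eight values per byte, first value in the lowest bit). The helper names `pack_booleans` and `unpack_booleans` are hypothetical and are not parquet-mr or parquet cpp APIs.

```python
def pack_booleans(values):
    """Pack booleans into bytes, 8 per byte, least significant bit first.

    Hypothetical helper for illustration; assumes the LSB-first layout
    of PLAIN-encoded booleans. Unused bits in the last byte stay zero.
    """
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def unpack_booleans(data, count):
    """Read back `count` booleans from LSB-first packed bytes."""
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]


values = [True, False, True, True, False, False, False, True, True]
packed = pack_booleans(values)
restored = unpack_booleans(packed, len(values))
```

Note that nothing in the packed bytes records how many values they hold or that they are bit packed; in this sketch the reader has to get the value count from elsewhere, just as the page header only says PLAIN.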
