asfimport commented on issue #405:
URL: https://github.com/apache/parquet-format/issues/405#issuecomment-2184154121

   [Gabor Szadovszky](https://issues.apache.org/jira/browse/PARQUET-2222?#comment-17730904) / @gszadovszky:
   @pitrou, @wgtmac,
   
   It seems my review was not deep enough. Sorry for that. So, for boolean values, parquet-mr does not use RLE encoding in the V1 case, only bit packing:
   - [V1](https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/values/factory/DefaultV1ValuesWriterFactory.java#L53) -> ... -> [Bit packing](https://github.com/apache/parquet-mr/blob/9d80330ae4948787ac0bf4e4b0d990917f106440/parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ByteBitPackingValuesWriter.java) (encoding written to page header: PLAIN)
   - [V2](https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/values/factory/DefaultV2ValuesWriterFactory.java#L57) -> ... -> [RLE](https://github.com/apache/parquet-mr/blob/9d80330ae4948787ac0bf4e4b0d990917f106440/parquet-column/src/main/java/org/apache/parquet/column/values/rle/RunLengthBitPackingHybridValuesWriter.java) (encoding written to page header: RLE)
     
     @pitrou, could you please confirm whether parquet-cpp behaves the same way?
     
     So the table we added in this PR about prepending the length is misleading. Also, the link in the PLAIN encoding section for boolean is dead and misleading; it should point to BIT_PACKED. The definition of BIT_PACKED also wrongly states that it is valid only for RL/DL. I think the deprecation is still valid, since "BIT_PACKED" should not be written anywhere as the encoding, but the actual bit-packing encoding is what is used under PLAIN for boolean.
     Would you guys like to work on this? We probably want to add this to the current format release.
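
For illustration only, here is a minimal Python sketch of the LSB-first bit packing that the format describes for PLAIN-encoded boolean values. The function names are hypothetical and this is not the parquet-mr code linked above; it only shows the bit layout, assuming one bit per value with the first value in the least significant bit of the first byte.

```python
def pack_booleans_lsb_first(values):
    """Pack booleans one bit each, first value in the lowest bit of byte 0.

    Hypothetical helper for illustration; not part of any Parquet library.
    """
    out = bytearray((len(values) + 7) // 8)  # round up to whole bytes
    for i, value in enumerate(values):
        if value:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def unpack_booleans_lsb_first(data, count):
    """Inverse of pack_booleans_lsb_first; `count` is the number of values."""
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]


values = [True, False, True, True, False, False, False, False, True]
packed = pack_booleans_lsb_first(values)
restored = unpack_booleans_lsb_first(packed, len(values))
```

Note that the value count has to come from elsewhere (in Parquet, from the page header), because the padding bits in the last byte are indistinguishable from `False` values.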


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

