[
https://issues.apache.org/jira/browse/PARQUET-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218330#comment-15218330
]
Ryan Blue commented on PARQUET-574:
-----------------------------------
Java uses [RLE for boolean|https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/ParquetProperties.java#L210].
Bit packing is still supported for reads, but it is deprecated, and Parquet MR
no longer uses it when writing 2.0 encodings. I believe we should be able to
read bit-packed booleans just fine, but RLE output will be much smaller if
there are runs in the data, which is fairly common.
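
To make the size difference concrete, here is a rough Python sketch (not
Parquet MR or parquet-cpp code). It assumes plain booleans are bit-packed
LSB first, and it simplifies the RLE side by storing every run as a varint
header (run length shifted left by one) plus one value byte. A real hybrid
encoder would also emit bit-packed groups for short runs and may add a length
prefix, both of which are left out. All function names are made up for
illustration.

```python
def pack_bits_lsb_first(values):
    """Pack booleans 8 per byte, least significant bit first."""
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def uleb128(n):
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def rle_runs_only(values):
    """Encode each run as varint(run_length << 1) followed by one value byte.

    Simplification (assumption): every run is written as an RLE run; no
    bit-packed groups and no length prefix are produced.
    """
    out = bytearray()
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        out += uleb128((j - i) << 1)
        out.append(1 if values[i] else 0)
        i = j
    return bytes(out)


# Two long runs: 1000 trues followed by 1000 falses.
data = [True] * 1000 + [False] * 1000
packed = pack_bits_lsb_first(data)
rle = rle_runs_only(data)
print(len(packed), len(rle))  # 250 6
```

With alternating values the picture flips: this runs-only sketch would spend
two bytes per value, which is why a practical encoder mixes in bit-packed
groups.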
> Boolean format in Plain Decoder
> --------------------------------
>
> Key: PARQUET-574
> URL: https://issues.apache.org/jira/browse/PARQUET-574
> Project: Parquet
> Issue Type: Improvement
> Reporter: Fabrizio Milo
> Priority: Trivial
>
> The Encodings.md document says that the plain encoder for boolean uses
> [RLE/BitPacking](https://github.com/apache/parquet-format/blob/master/Encodings.md#plain-plain--0),
> while the cpp implementation seems to just use [simple bit decoding back
> to
> back](https://github.com/apache/parquet-cpp/blob/master/src/parquet/encodings/plain-encoding.h#L151).
> Which one is the right format?