[
https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114022#comment-15114022
]
Wes McKinney commented on PARQUET-459:
--------------------------------------
The value decoders already buffer arrays of values internally. The idea
with PARQUET-435 is to expose an array-oriented reader interface. So, rather
than decoding values into an internal buffer, values from the data page could
be decoded directly into memory allocated elsewhere. This way,
application developers can use threads to pipeline decoding and
application-level processing of the decoded data.
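
To make the array-oriented idea concrete, here is a minimal sketch of a decoder that writes into caller-owned memory. PlainInt32Decoder and DecodeBatch are hypothetical names for illustration, not the parquet-cpp API, and the sketch assumes PLAIN-encoded int32 values on a little-endian host:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical decoder for PLAIN-encoded int32 values (illustration only,
// not the parquet-cpp API). Assumes a little-endian host, so the page bytes
// can be copied straight into the output array.
class PlainInt32Decoder {
 public:
  PlainInt32Decoder(const uint8_t* data, size_t num_values)
      : data_(data), remaining_(num_values) {}

  // Decodes up to max_values directly into caller-owned memory and
  // returns the number of values actually written.
  size_t DecodeBatch(int32_t* out, size_t max_values) {
    assert(out != nullptr);
    const size_t n = std::min(max_values, remaining_);
    std::memcpy(out, data_, n * sizeof(int32_t));
    data_ += n * sizeof(int32_t);
    remaining_ -= n;
    return n;
  }

 private:
  const uint8_t* data_;  // current position in the decompressed page
  size_t remaining_;     // values not yet decoded
};
```

Because the caller owns the output buffer, one thread could fill a buffer while another processes a previously filled one.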
This of course also extends to the RleDecoder, which currently decodes one
value at a time (see RleDecoder::Get); instead you could retrieve a batch of
levels at once.
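
A rough sketch of what batch retrieval of levels might look like follows. SimpleRunDecoder and GetBatch are hypothetical names, and the input is simplified to a list of (level, run length) pairs; the real Parquet level encoding is an RLE/bit-packed hybrid, which this does not implement:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified run decoder (illustration only). Input is a list
// of (level, run length) pairs rather than the actual RLE/bit-packed hybrid.
class SimpleRunDecoder {
 public:
  explicit SimpleRunDecoder(std::vector<std::pair<int16_t, size_t>> runs)
      : runs_(std::move(runs)) {}

  // One value per call, analogous to a per-value Get.
  bool Get(int16_t* out) { return GetBatch(out, 1) == 1; }

  // Fills up to max_levels entries of out; returns the number written.
  size_t GetBatch(int16_t* out, size_t max_levels) {
    assert(out != nullptr);
    size_t written = 0;
    while (written < max_levels && run_index_ < runs_.size()) {
      const auto& run = runs_[run_index_];
      const size_t left_in_run = run.second - used_in_run_;
      const size_t n = std::min(left_in_run, max_levels - written);
      std::fill(out + written, out + written + n, run.first);
      written += n;
      used_in_run_ += n;
      if (used_in_run_ == run.second) {  // run exhausted, move to the next
        ++run_index_;
        used_in_run_ = 0;
      }
    }
    return written;
  }

 private:
  std::vector<std::pair<int16_t, size_t>> runs_;
  size_t run_index_ = 0;    // index of the run currently being consumed
  size_t used_in_run_ = 0;  // values already emitted from that run
};
```

With a batch call, a whole run can be written with a single fill instead of paying the per-call overhead for every level.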
This vectorized reader interface is restricted to the context of a single
data page that has been decompressed in memory.
> Improve handling of null values
> -------------------------------
>
> Key: PARQUET-459
> URL: https://issues.apache.org/jira/browse/PARQUET-459
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: Deepak Majeti
>
> Currently, the default value of the type is returned for NULL values, which
> is incorrect.
> This JIRA will correctly identify a NULL value with the help of an additional
> variable that will be set for NULL values.
> This feature depends on reading the repetition level (PARQUET-169).