[ 
https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114022#comment-15114022
 ] 

Wes McKinney commented on PARQUET-459:
--------------------------------------

The value decoders already buffer arrays of values internally. The idea 
with PARQUET-435 is to expose an array-oriented reader interface. Rather 
than decoding values into an internal buffer, values from the data page could 
be decoded directly into memory allocated elsewhere. This way, 
application developers can use threads to pipeline decoding and 
application-level processing of the decoded data. 

This also extends to the RleDecoder, which currently decodes one value 
at a time (see RleDecoder::Get); with a batch interface you could instead 
retrieve a batch of levels at once. 
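
To make the difference between value-at-a-time and batch decoding concrete, 
here is a minimal sketch. It is not the parquet-cpp RleDecoder and does not 
implement Parquet's RLE/bit-packed hybrid encoding: ToyRunDecoder, GetBatch, 
and the (value, run length) input format are all hypothetical names and 
simplifications chosen for illustration. The point it tries to show is that 
the batch call writes straight into a buffer owned by the caller. 

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy decoder over (value, run_length) pairs.
// Assumption: this stands in for a real level decoder; it is NOT the
// parquet-cpp RleDecoder API and NOT the Parquet hybrid RLE format.
class ToyRunDecoder {
 public:
  explicit ToyRunDecoder(std::vector<std::pair<std::int16_t, std::size_t>> runs)
      : runs_(std::move(runs)) {}

  // Value-at-a-time access: returns false once the input is exhausted.
  bool Get(std::int16_t* out) { return GetBatch(out, 1) == 1; }

  // Batch access: writes up to max_values levels into caller-owned memory
  // and returns how many were actually written.
  std::size_t GetBatch(std::int16_t* out, std::size_t max_values) {
    assert(out != nullptr || max_values == 0);
    std::size_t written = 0;
    while (written < max_values && run_index_ < runs_.size()) {
      const auto& run = runs_[run_index_];
      const std::size_t remaining = run.second - offset_in_run_;
      const std::size_t n = std::min(remaining, max_values - written);
      // A whole run (or the part that fits) is emitted in one step.
      std::fill_n(out + written, n, run.first);
      written += n;
      offset_in_run_ += n;
      if (offset_in_run_ == run.second) {  // run fully consumed
        ++run_index_;
        offset_in_run_ = 0;
      }
    }
    return written;
  }

 private:
  std::vector<std::pair<std::int16_t, std::size_t>> runs_;
  std::size_t run_index_ = 0;      // index of the run currently being read
  std::size_t offset_in_run_ = 0;  // values already emitted from that run
};
```

With this shape, a caller can allocate one buffer, fill it with a single 
GetBatch call, and hand it to another thread for processing while the next 
batch is decoded. Get is kept only as a thin wrapper so that existing 
value-at-a-time call sites would still work in this sketch. 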

This vectorized reader interface is restricted to the context of a 
single data page that has been decompressed in memory. 

> Improve handling of null values
> -------------------------------
>
>                 Key: PARQUET-459
>                 URL: https://issues.apache.org/jira/browse/PARQUET-459
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Deepak Majeti
>
> Currently, the default value of the type is returned for NULL values and is 
> incorrect.
> This JIRA will correctly identify a NULL value with the help of an additional 
> variable that will be set for NULL values. 
> This feature depends on reading the repetition level (PARQUET-169).



