[ 
https://issues.apache.org/jira/browse/PARQUET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113509#comment-15113509
 ] 

Wes McKinney commented on PARQUET-459:
--------------------------------------

Related to PARQUET-435. A main issue here is that there are typically more 
repetition/definition levels than values. For both nested and flat data 
iteration, it would probably be better to retrieve a batch of repetition / 
definition levels and a batch of values, and leave the iteration / null 
inference to the downstream application developer. 
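
As a rough sketch of what leaving null inference to the application could look 
like for a flat (non-repeated) column: the reader hands back a batch of 
definition levels plus the densely packed non-null values, and the caller 
expands them. The names SpacedBatch and ExpandNulls are made up for 
illustration and are not parquet-cpp API; the sketch also assumes int64 values 
and that a value is present exactly when its definition level equals the 
column's maximum definition level.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type (not parquet-cpp API): one slot per logical row.
struct SpacedBatch {
  std::vector<int64_t> values;  // null slots hold a placeholder 0
  std::vector<bool> is_null;    // true where the row is NULL
};

// Hypothetical helper: walks the definition levels and consumes one packed
// value for every level equal to max_def_level. Assumes a flat column, so
// repetition levels are ignored here.
SpacedBatch ExpandNulls(const std::vector<int16_t>& def_levels,
                        const std::vector<int64_t>& packed_values,
                        int16_t max_def_level) {
  SpacedBatch out;
  out.values.reserve(def_levels.size());
  out.is_null.reserve(def_levels.size());
  std::size_t next_value = 0;
  for (int16_t level : def_levels) {
    if (level == max_def_level) {
      // at() throws std::out_of_range if the levels claim more values
      // than the batch actually contains.
      out.values.push_back(packed_values.at(next_value++));
      out.is_null.push_back(false);
    } else {
      out.values.push_back(0);
      out.is_null.push_back(true);
    }
  }
  return out;
}
```

With definition levels {1, 0, 1, 1, 0}, packed values {10, 20, 30} and a 
maximum definition level of 1, rows 1 and 4 come back flagged as null, which 
is the "more levels than values" case described above.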

I think the idea behind returning a default (undefined) value in the current 
API is that the caller checks the definition level to determine whether the 
returned value is actually null. At a minimum, we need some unit tests for the 
current API.

I'm going to implement PARQUET-435 and PARQUET-451 together and will let you 
know.

> Improve handling of null values
> -------------------------------
>
>                 Key: PARQUET-459
>                 URL: https://issues.apache.org/jira/browse/PARQUET-459
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Deepak Majeti
>
> Currently, the default value of the type is returned for NULL values and is 
> incorrect.
> This JIRA will correctly identify a NULL value with the help of an additional 
> variable that will be set for NULL values. 
> This feature depends on reading the repetition level (PARQUET-169).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
