[ https://issues.apache.org/jira/browse/PARQUET-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou reassigned PARQUET-2067:
---------------------------------------

    Assignee: William Butler

> [C++]  null_count and num_nulls incorrect for repeated columns
> --------------------------------------------------------------
>
>                 Key: PARQUET-2067
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2067
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Micah Kornfield
>            Assignee: William Butler
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: cpp-6.0.0
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently only nulls at the leaf are accounted for in the null count 
> statistics. For nested lists this is incorrect because null lists have zero 
> elements and don't show up in the leaf.
>  
> Example from mailing list discussion
>  
> [[0, 1], None, [2, None, 3]]
>  
> should have a null count of 2 (it currently reports as 1).
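
To make the arithmetic in the example concrete, here is a minimal sketch in plain Python. It is not the parquet-cpp implementation and does not touch any Parquet API; the function names leaf_null_count and total_null_count are hypothetical, chosen only for illustration. It assumes a single level of list nesting, as in the example above.

```python
def leaf_null_count(rows):
    """Count only nulls that appear as list elements (leaf-level nulls).

    A null list contributes no elements, so it is skipped here. This
    mirrors the behaviour the issue describes as incorrect.
    """
    return sum(
        1
        for row in rows
        if row is not None
        for value in row
        if value is None
    )


def total_null_count(rows):
    """Count leaf-level nulls plus lists that are themselves null."""
    null_lists = sum(1 for row in rows if row is None)
    return null_lists + leaf_null_count(rows)


rows = [[0, 1], None, [2, None, 3]]
print(leaf_null_count(rows))   # 1: only the None inside the third list
print(total_null_count(rows))  # 2: the null list plus the null element
```

The difference between the two results is exactly the null list, which has no leaf element to be counted against.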



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
