Micah Kornfield created ARROW-9603:
--------------------------------------

             Summary: [C++][Parquet] Write Arrow relies on unspecified behavior 
for nested types
                 Key: ARROW-9603
                 URL: https://issues.apache.org/jira/browse/ARROW-9603
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Micah Kornfield


The WriteArrow implementations in parquet/column_writer.cc check null 
counts/required data at certain points and pass the null bitmap through for 
encoding.  This only works for nested data types if a null slot on a parent 
implies a null slot on the leaf.  This relationship is not required by the 
specifications.
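
A minimal sketch of the mismatch, using plain std::vector<bool> instead of 
Arrow's real array classes. ToyStructColumn and the helper functions are 
hypothetical names made up for illustration; they are not part of 
column_writer.cc. The sketch assumes a single optional struct wrapping one 
optional leaf.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a struct<leaf> column; not Arrow's real types.
struct ToyStructColumn {
  std::vector<bool> parent_valid;  // validity of each struct slot
  std::vector<bool> leaf_valid;    // validity of each child (leaf) slot
};

// Null count as seen when only the leaf's own bitmap is consulted.
inline std::size_t LeafOnlyNullCount(const ToyStructColumn& col) {
  std::size_t nulls = 0;
  for (bool valid : col.leaf_valid) nulls += valid ? 0 : 1;
  return nulls;
}

// Null count the leaf column logically has: a slot is null if either
// the parent or the leaf is null.
inline std::size_t LogicalNullCount(const ToyStructColumn& col) {
  std::size_t nulls = 0;
  for (std::size_t i = 0; i < col.leaf_valid.size(); ++i) {
    if (!(col.parent_valid[i] && col.leaf_valid[i])) ++nulls;
  }
  return nulls;
}

// Parent is null at slot 1, but the leaf bitmap still marks slot 1 valid.
inline ToyStructColumn MakeMismatchedColumn() {
  return ToyStructColumn{{true, false, true}, {true, true, true}};
}
```

For the column returned by MakeMismatchedColumn(), the leaf-only count is 0 
while the logical count is 1, so a decision made from the leaf bitmap alone 
would treat the column as having no nulls.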

 

Most paths for creating arrays produce child bitmaps that satisfy this 
relationship, so hitting this bug would be rare, but we should still fix it.

 

All branches that rely on reading nullness should generate a new null bitmap 
based on definition levels if the column is nested, and decisions should be 
based off of that.
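
The proposal above could look roughly like the following sketch. It assumes 
struct-only nesting (no list ancestors), where a leaf value is present 
exactly when its definition level equals the column's maximum definition 
level. ValidityFromDefLevels and DefLevelsResult are hypothetical names for 
illustration, not existing functions in the Parquet C++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: a rebuilt validity bitmap plus its null count.
struct DefLevelsResult {
  std::vector<bool> valid;
  std::size_t null_count = 0;
};

// Rebuild leaf validity from definition levels. Assumes no repeated
// ancestors, so every definition level corresponds to one leaf slot and
// a value is present only when def_level == max_def_level.
inline DefLevelsResult ValidityFromDefLevels(
    const std::vector<int16_t>& def_levels, int16_t max_def_level) {
  DefLevelsResult out;
  out.valid.reserve(def_levels.size());
  for (int16_t level : def_levels) {
    const bool present = (level == max_def_level);
    out.valid.push_back(present);
    if (!present) ++out.null_count;
  }
  return out;
}
```

For an optional struct with an optional leaf (maximum definition level 2), 
a level of 0 means the struct is null, 1 means the struct is present but the 
leaf is null, and 2 means the leaf value is present. Checks such as "does 
this column have nulls" would then use the rebuilt null_count rather than 
the leaf array's own null count.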



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
