Micah Kornfield created ARROW-9603:
--------------------------------------
Summary: [C++][Parquet] Write Arrow relies on unspecified behavior
for nested types
Key: ARROW-9603
URL: https://issues.apache.org/jira/browse/ARROW-9603
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Micah Kornfield
The WriteArrow implementations in parquet/column_writer.cc check null
counts/required data at certain points and pass the null bitmap through for
encoding. This only works for nested data types if a null slot on a parent
implies a null slot on the leaf. This relationship is not required by the
specifications.
Most paths for creating arrays follow this pattern, so hitting this bug would
be rare, but we should still fix it.
All branches that rely on reading nullness should generate a new null bitmap
based on definition levels if the column is nested, and decisions should be
based on that bitmap rather than on the leaf's own.
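
As a rough illustration of the proposed fix, here is a minimal sketch that
does not use the Arrow or Parquet APIs. It assumes a single optional struct
with one optional leaf field (maximum definition level 2) and models validity
bitmaps as std::vector<bool>. The helper names ComputeDefLevels,
ValidityFromDefLevels and CountNulls are hypothetical and exist only for this
example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow/Parquet API): definition levels for an
// optional struct containing one optional leaf field.
//   0 = struct is null, 1 = struct present but leaf null, 2 = leaf present.
std::vector<std::int16_t> ComputeDefLevels(const std::vector<bool>& parent_valid,
                                           const std::vector<bool>& leaf_valid) {
  assert(parent_valid.size() == leaf_valid.size());
  std::vector<std::int16_t> def_levels(parent_valid.size());
  for (std::size_t i = 0; i < parent_valid.size(); ++i) {
    if (!parent_valid[i]) {
      def_levels[i] = 0;  // A null parent hides whatever the leaf bitmap says.
    } else if (!leaf_valid[i]) {
      def_levels[i] = 1;
    } else {
      def_levels[i] = 2;
    }
  }
  return def_levels;
}

// Hypothetical helper: a leaf slot carries a value only when its definition
// level equals the maximum level for the column.
std::vector<bool> ValidityFromDefLevels(const std::vector<std::int16_t>& def_levels,
                                        std::int16_t max_def_level) {
  std::vector<bool> valid(def_levels.size());
  for (std::size_t i = 0; i < def_levels.size(); ++i) {
    valid[i] = (def_levels[i] == max_def_level);
  }
  return valid;
}

// Counts the unset (null) slots in a validity vector.
std::size_t CountNulls(const std::vector<bool>& valid) {
  std::size_t nulls = 0;
  for (bool v : valid) {
    if (!v) ++nulls;
  }
  return nulls;
}
```

With parent validity {1, 0, 1} and leaf validity {1, 1, 0}, the leaf's own
bitmap reports one null, while the definition levels {2, 0, 1} yield a derived
bitmap {1, 0, 0} with two nulls. A writer that trusts the leaf's null count
would treat slot 1 as holding a value even though its parent is null.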