Alex Levenson created PARQUET-215:
-------------------------------------
Summary: Parquet Thrift should discard records with unrecognized union members
Key: PARQUET-215
URL: https://issues.apache.org/jira/browse/PARQUET-215
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Reporter: Alex Levenson
When parquet-thrift encounters a Thrift record with an unknown union member
while writing a file, it should treat the record as bad and discard it.
Currently, because unions are treated as structs with one optional field per
union member, parquet-thrift writes such a record as an empty struct without
complaint, and then crashes in the read path when it tries to read that record
back.
We should discard these records in the write path, just as we discard other
unparseable records.
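
To illustrate the proposed behavior, here is a minimal, language-neutral sketch
in Python. It is not parquet-mr code: UnionRecord, is_valid_union and
write_records are hypothetical names, and the lists stand in for the real file
writer and the bad-record handling. It assumes a union is represented as a
struct with one optional field per member, so a union whose member was not
recognized shows up with no field set.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class UnionRecord:
    """Hypothetical stand-in for a Thrift union modeled as a struct
    with one optional field per union member."""
    int_value: Optional[int] = None
    str_value: Optional[str] = None


def is_valid_union(record: UnionRecord) -> bool:
    # A well-formed union has exactly one member set. If the member was
    # unrecognized, every known field is None and the struct is empty.
    set_fields = [v for v in vars(record).values() if v is not None]
    return len(set_fields) == 1


def write_records(
    records: Iterable[UnionRecord],
    sink: List[UnionRecord],
    discarded: List[UnionRecord],
) -> None:
    # Validate in the write path so that an empty union never reaches the
    # file, instead of failing later in the read path.
    for record in records:
        if is_valid_union(record):
            sink.append(record)
        else:
            discarded.append(record)
```

With this check, a record such as UnionRecord() lands in the discarded list
and only records with exactly one member set are written.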