[
https://issues.apache.org/jira/browse/PARQUET-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoine Pitrou resolved PARQUET-1374.
-------------------------------------
Resolution: Fixed
Fix Version/s: cpp-1.6.0
This was fixed as part of PARQUET-1508.
> [C++] Segfault on writing zero columns
> --------------------------------------
>
> Key: PARQUET-1374
> URL: https://issues.apache.org/jira/browse/PARQUET-1374
> Project: Parquet
> Issue Type: Bug
> Reporter: Philip Felton
> Priority: Minor
> Labels: pull-request-available
> Fix For: cpp-1.6.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Here's a gist which reproduces it:
> [https://gist.github.com/philjdf/594ab431f135a040586aff08c7fb7666]
> # The problem starts with the call to ParquetFileWriter::Close().
> # As a result of that call,
> FileMetaDataBuilder::FileMetaDataBuilderImpl::Finish() gets called, which
> relies on metadata_ being non-null. At the end of Finish(), it std::moves
> metadata_ elsewhere, leaving it null, so it evidently assumes that it is
> only called once.
> # Later on still inside Close(), FlatSchemaConverter::Convert() gets called,
> which throws an exception because we have no columns.
> # In handling this exception, we leave the try block, which destructs our
> ParquetFileWriter. The destructor calls Close() again, which calls Finish()
> again; metadata_ is now null, so it segfaults.
> So FileSerializer::Close in file_writer.cc is presumably wrong: it should set
> is_open_ to false at the start rather than the end of the if block.
> It's better to get an exception rather than a segfault, but ideally I'd like
> to write/read Parquet files with zero rows and/or zero columns. It means one
> less edge case for client code.
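
The double-Close() sequence described in the report, and the suggested fix of clearing is_open_ before doing any work, can be sketched in isolation. This is a minimal model under stated assumptions, not the actual Parquet code: MetadataBuilderSketch, WriterSketch and FinishCallsAfterFailedClose are hypothetical names, the metadata is a plain string, and a second Finish() throws std::logic_error where the real code dereferences a null pointer.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the metadata builder: Finish() hands off
// ownership of metadata_, so it is only valid to call it once.
struct MetadataBuilderSketch {
  std::unique_ptr<std::string> metadata_ =
      std::make_unique<std::string>("meta");
  int finish_calls = 0;

  std::unique_ptr<std::string> Finish() {
    ++finish_calls;
    // The real code would dereference metadata_ here; this sketch throws
    // instead so that misuse is observable without crashing.
    if (!metadata_) throw std::logic_error("Finish() called twice");
    return std::move(metadata_);  // metadata_ is null from here on
  }
};

// Hypothetical stand-in for the file writer.
struct WriterSketch {
  MetadataBuilderSketch* builder_;
  int num_columns_;
  bool is_open_ = true;

  WriterSketch(MetadataBuilderSketch* builder, int num_columns)
      : builder_(builder), num_columns_(num_columns) {}

  void Close() {
    if (!is_open_) return;
    // Cleared first, so an exception below cannot lead to a second Finish().
    is_open_ = false;
    auto metadata = builder_->Finish();
    if (num_columns_ == 0) {
      throw std::invalid_argument("schema has no columns");
    }
  }

  ~WriterSketch() {
    // Destructors must not throw; swallow anything Close() raises.
    try {
      Close();
    } catch (...) {
    }
  }
};

// Returns how many times Finish() ran when Close() throws and the
// destructor then runs during stack unwinding.
int FinishCallsAfterFailedClose() {
  MetadataBuilderSketch builder;
  try {
    WriterSketch writer(&builder, /*num_columns=*/0);
    writer.Close();  // throws: zero columns
  } catch (const std::invalid_argument&) {
    // writer has already been destroyed; its destructor called Close() again.
  }
  return builder.finish_calls;
}
```

With is_open_ cleared at the top of Close(), the destructor's second Close() returns immediately, so Finish() runs only once. Moving the assignment to the end of the if block in this sketch would reproduce the second Finish() call that the report describes.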