[ 
https://issues.apache.org/jira/browse/PARQUET-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou reassigned PARQUET-1374:
---------------------------------------

    Assignee: Wes McKinney

> [C++] Segfault on writing zero columns
> --------------------------------------
>
>                 Key: PARQUET-1374
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1374
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Philip Felton
>            Assignee: Wes McKinney
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: cpp-1.6.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Here's a gist which reproduces it: 
> [https://gist.github.com/philjdf/594ab431f135a040586aff08c7fb7666]
>  # The problem starts with the call to ParquetFileWriter::Close().
>  # That call invokes 
> FileMetaDataBuilder::FileMetaDataBuilderImpl::Finish(), which relies on 
> metadata_ being non-null. At the end of Finish(), it std::moves metadata_ 
> somewhere else, leaving it null. So it evidently assumes it is only ever 
> called once.
>  # Later on still inside Close(), FlatSchemaConverter::Convert() gets called, 
> which throws an exception because we have no columns.
>  # In handling this exception, we leave the try block, which destructs our 
> ParquetFileWriter. The destructor calls Close() again, which calls Finish() 
> again. This time metadata_ is null, so Finish() segfaults.
> So FileSerializer::Close in file_writer.cc is presumably wrong: it should set 
> is_open_ to false at the start rather than the end of the if block.
> An exception is better than a segfault, but ideally I'd like to write/read 
> Parquet files with zero rows and/or zero columns. It means one less edge 
> case for client code.
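
The sequence described in the report can be sketched in isolation. The code below is not the parquet-cpp implementation: SketchWriter, FileMetaData, CloseStats and RunScenario are hypothetical stand-ins, and Finish() records that it was handed a null metadata_ instead of dereferencing it, so the sketch can be run safely. It only illustrates why clearing is_open_ at the start of Close() makes the second call from the destructor a no-op.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the real metadata type.
struct FileMetaData {};

// Records what happened so the two orderings can be compared.
struct CloseStats {
  int finish_calls = 0;
  bool finish_saw_null = false;
};

class SketchWriter {
 public:
  SketchWriter(bool clear_flag_first, CloseStats* stats)
      : clear_flag_first_(clear_flag_first),
        stats_(stats),
        metadata_(new FileMetaData()) {}

  // Like the writer in the report, the destructor closes the file.
  ~SketchWriter() {
    try {
      Close();
    } catch (...) {
      // Exceptions must not escape a destructor.
    }
  }

  void Close() {
    if (!is_open_) return;
    // Suggested ordering: mark the writer closed before doing any work.
    if (clear_flag_first_) is_open_ = false;
    Finish();
    ConvertSchema();   // throws when there are no columns
    is_open_ = false;  // original ordering: never reached if the above throws
  }

 private:
  void Finish() {
    ++stats_->finish_calls;
    if (metadata_ == nullptr) {
      // The real code would dereference the null pointer here.
      stats_->finish_saw_null = true;
      return;
    }
    // Hand the metadata off; metadata_ is null afterwards.
    std::unique_ptr<FileMetaData> released = std::move(metadata_);
  }

  void ConvertSchema() {
    if (num_columns_ == 0) throw std::runtime_error("schema has no columns");
  }

  bool clear_flag_first_;
  CloseStats* stats_;
  std::unique_ptr<FileMetaData> metadata_;
  int num_columns_ = 0;
  bool is_open_ = true;
};

// Runs the scenario from the report: Close() throws, then the writer is
// destroyed while the exception leaves the try block.
CloseStats RunScenario(bool clear_flag_first) {
  CloseStats stats;
  try {
    SketchWriter writer(clear_flag_first, &stats);
    writer.Close();
  } catch (const std::runtime_error&) {
    // The caller still sees the exception from the first Close().
  }
  return stats;
}
```

With the flag cleared at the end, RunScenario(false) reaches Finish() a second time with a null metadata_. With the flag cleared first, RunScenario(true) calls Finish() once and the caller only sees the exception.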



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
