[
https://issues.apache.org/jira/browse/PARQUET-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642453#comment-17642453
]
Steve Loughran commented on PARQUET-2216:
-----------------------------------------
* OutputFile may not implement Closeable, but the {{PositionOutputStream}}
returned by the {{create()}} method does.
* the writer close chain seems to go all the way through too.
Looking at the code, one place for improvement would be for
{{ParquetFileWriter.end(Map<String, String> extraMetaData)}} to close its
output stream in a finally clause, so that even if the write of the footer
fails, the local fs client can do all it can to clean up, release connections,
etc.
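A minimal sketch of that pattern, not the actual parquet-mr code: the class
{{FooterCloseSketch}}, the {{TrackingStream}} helper and this simplified
{{end()}} signature are all hypothetical stand-ins, used only to show that the
stream is closed whether or not the footer write succeeds.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class FooterCloseSketch {

    /** Hypothetical stream that records close() and can simulate a write failure. */
    static final class TrackingStream extends OutputStream {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private final boolean failOnWrite;
        boolean closed = false;

        TrackingStream(boolean failOnWrite) {
            this.failOnWrite = failOnWrite;
        }

        @Override
        public void write(int b) throws IOException {
            if (failOnWrite) {
                throw new IOException("simulated footer write failure");
            }
            sink.write(b);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /**
     * Stand-in for an end()-style method (assumed shape, not the real one):
     * write the footer, then close the stream in a finally clause so that
     * close() runs even when the write throws.
     */
    static void end(OutputStream out, byte[] footer) throws IOException {
        try {
            out.write(footer);
            out.flush();
        } finally {
            out.close();
        }
    }

    /** Runs end() against a tracking stream and reports whether it was closed. */
    static boolean closedAfterEnd(boolean failOnWrite) {
        TrackingStream out = new TrackingStream(failOnWrite);
        try {
            end(out, "footer".getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Expected when failOnWrite is true; the caller still sees the failure.
        }
        return out.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed after successful end: " + closedAfterEnd(false));
        System.out.println("closed after failed end: " + closedAfterEnd(true));
    }
}
```

One caveat with a plain finally: if {{close()}} itself throws, it replaces the
original write exception. A try-with-resources block would keep the write
failure as the primary exception and attach the close failure as suppressed.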
> Parquet writer classes don't close underlying output stream in case of errors.
> ------------------------------------------------------------------------------
>
> Key: PARQUET-2216
> URL: https://issues.apache.org/jira/browse/PARQUET-2216
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.12.3
> Reporter: Andrei Lopukhov
> Priority: Major
> Attachments: TestExample.java
>
>
> org.apache.parquet.io.OutputFile interface does not implement Closeable.
> In my opinion it implies that created streams are fully managed by parquet-mr
> classes.
> Unfortunately, the opened stream will not be closed in case of an IO or other failure.
> There are two places I can find for this problem:
> * During writer creation
> (org.apache.parquet.hadoop.ParquetWriter.Builder#build()) - created stream
> should be closed if writer creation fails.
> * During writer close (org.apache.parquet.hadoop.ParquetWriter#close) -
> underlying stream should be closed regardless of any failures encountered.
> I haven't examined ParquetReader as closely, though.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)