Repository: parquet-mr
Updated Branches:
  refs/heads/master 4453aa3bf -> 09d28fe79
PARQUET-783: Close the underlying stream when an H2SeekableInputStream is closed

This PR addresses https://issues.apache.org/jira/browse/PARQUET-783.

`ParquetFileReader` opens a `SeekableInputStream` to read a footer. In the process, it opens a new `FSDataInputStream` and wraps it. However, `H2SeekableInputStream` does not override the `close` method. Therefore, when `ParquetFileReader` closes the wrapper, the underlying `FSDataInputStream` is not closed. As a result, these stale connections can exhaust the connection resources of a cluster's data nodes and lead to mysterious HDFS read failures in HDFS clients, e.g.

```
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
```

Author: Michael Allman <[email protected]>

Closes #388 from mallman/parquet-783-close_underlying_inputstream and squashes the following commits:

f4b27c1 [Michael Allman] PARQUET-783 Close the underlying stream when an H2SeekableInputStream is closed

Project: http://git-wip-us.apache.org/repos/asf/parquet-mr/repo
Commit: http://git-wip-us.apache.org/repos/asf/parquet-mr/commit/09d28fe7
Tree: http://git-wip-us.apache.org/repos/asf/parquet-mr/tree/09d28fe7
Diff: http://git-wip-us.apache.org/repos/asf/parquet-mr/diff/09d28fe7

Branch: refs/heads/master
Commit: 09d28fe7995db1a4da2c651d362007d2082c663c
Parents: 4453aa3
Author: Michael Allman <[email protected]>
Authored: Mon Dec 5 15:27:14 2016 -0800
Committer: Julien Le Dem <[email protected]>
Committed: Mon Dec 5 15:27:14 2016 -0800

----------------------------------------------------------------------
 .../org/apache/parquet/hadoop/util/H2SeekableInputStream.java | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/parquet-mr/blob/09d28fe7/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
----------------------------------------------------------------------
diff --git a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
index a706546..ec4567e 100644
--- a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
+++ b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
@@ -45,6 +45,11 @@ class H2SeekableInputStream extends SeekableInputStream {
   }
 
   @Override
+  public void close() throws IOException {
+    stream.close();
+  }
+
+  @Override
   public long getPos() throws IOException {
     return stream.getPos();
   }
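
The leak described in the commit message can be reproduced without Hadoop. The following is a minimal sketch, not the parquet-mr code: `TrackingStream`, `LeakyWrapper`, `ClosingWrapper`, and `closesUnderlying` are hypothetical names invented for illustration, and a `ByteArrayInputStream` stands in for `FSDataInputStream`. It relies only on the fact that `java.io.InputStream.close()` does nothing by default, so a wrapper that does not override `close` never reaches the stream it wraps.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class CloseDelegationSketch {

  /** Hypothetical stand-in for the underlying stream; records whether close() was called. */
  static class TrackingStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackingStream(byte[] data) {
      super(data);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  /** Wrapper with no close() override: it inherits the no-op InputStream.close(). */
  static class LeakyWrapper extends InputStream {
    private final InputStream stream;

    LeakyWrapper(InputStream stream) {
      this.stream = stream;
    }

    @Override
    public int read() throws IOException {
      return stream.read();
    }
  }

  /** Wrapper that forwards close() to the wrapped stream, as the patch above does. */
  static class ClosingWrapper extends InputStream {
    private final InputStream stream;

    ClosingWrapper(InputStream stream) {
      this.stream = stream;
    }

    @Override
    public int read() throws IOException {
      return stream.read();
    }

    @Override
    public void close() throws IOException {
      stream.close();
    }
  }

  /** Closes a wrapper and reports whether the underlying stream was closed too. */
  static boolean closesUnderlying(boolean delegate) {
    TrackingStream underlying = new TrackingStream(new byte[] {1, 2, 3});
    InputStream wrapper =
        delegate ? new ClosingWrapper(underlying) : new LeakyWrapper(underlying);
    try (InputStream in = wrapper) {
      in.read();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return underlying.closed;
  }

  public static void main(String[] args) {
    System.out.println("without close() override: " + closesUnderlying(false));
    System.out.println("with close() override:    " + closesUnderlying(true));
  }
}
```

In this sketch the caller closes the wrapper correctly in both cases (via try-with-resources); only the wrapper that forwards `close` releases the underlying stream, which is why the fix belongs in the wrapper class rather than in its callers.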
