Todd Gao created PARQUET-2134:
---------------------------------

             Summary: Incorrect type checking in HadoopStreams.wrap
                 Key: PARQUET-2134
                 URL: https://issues.apache.org/jira/browse/PARQUET-2134
             Project: Parquet
          Issue Type: Bug
          Components: parquet-mr
            Reporter: Todd Gao


The method 
[HadoopStreams.wrap|https://github.com/apache/parquet-mr/blob/4d062dc37577e719dcecc666f8e837843e44a9be/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java#L51]
 wraps an FSDataInputStream in a SeekableInputStream. 

It checks whether the underlying stream of the passed FSDataInputStream 
implements ByteBufferReadable: if so, it wraps the FSDataInputStream in an 
H2SeekableInputStream; otherwise, it wraps it in an H1SeekableInputStream.

In some cases, we may add another wrapper over FSDataInputStream. For example, 

{code:java}
class CustomDataInputStream extends FSDataInputStream {
    public CustomDataInputStream(FSDataInputStream original) {
        super(original);
    }
}
{code}

Suppose we create an FSDataInputStream whose underlying stream does not 
implement ByteBufferReadable, and then create a CustomDataInputStream around 
it. If we pass that CustomDataInputStream to HadoopStreams.wrap to create a 
SeekableInputStream, we may get an error like 
{quote}java.lang.UnsupportedOperationException: Byte-buffer read unsupported by 
input stream{quote}.

We can fix this by checking the underlying stream of the FSDataInputStream 
recursively.
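
The recursive check could look roughly like the sketch below. It is a self-contained model, not the parquet-mr code: BufferReadable, WrapperStream, BufferCapableStream, singleLevelCheck and recursiveCheck are hypothetical stand-ins for ByteBufferReadable, FSDataInputStream and the check inside HadoopStreams.wrap. It assumes, as in Hadoop, that the wrapper type itself declares the byte-buffer read interface, which is why looking only one level down is misleading when wrappers are nested.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

public class RecursiveUnwrapSketch {
    /** Hypothetical stand-in for Hadoop's ByteBufferReadable. */
    interface BufferReadable {}

    /**
     * Hypothetical stand-in for FSDataInputStream: it declares the interface
     * itself (assumption), whether or not the stream it wraps supports it.
     */
    static class WrapperStream extends FilterInputStream implements BufferReadable {
        WrapperStream(InputStream in) { super(in); }
        InputStream getWrappedStream() { return in; }
    }

    /** An innermost stream that really supports byte-buffer reads. */
    static class BufferCapableStream extends ByteArrayInputStream implements BufferReadable {
        BufferCapableStream() { super(new byte[0]); }
    }

    /** Models the reported behaviour: inspects only the directly wrapped stream. */
    static boolean singleLevelCheck(WrapperStream stream) {
        return stream.getWrappedStream() instanceof BufferReadable;
    }

    /** Models the proposed fix: unwraps nested wrappers before checking. */
    static boolean recursiveCheck(WrapperStream stream) {
        InputStream current = stream;
        while (current instanceof WrapperStream) {
            current = ((WrapperStream) current).getWrappedStream();
        }
        return current instanceof BufferReadable;
    }

    public static void main(String[] args) {
        // Innermost stream without byte-buffer support, wrapped twice.
        InputStream plain = new ByteArrayInputStream(new byte[0]);
        WrapperStream nested = new WrapperStream(new WrapperStream(plain));
        System.out.println("single-level: " + singleLevelCheck(nested)); // true (misleading)
        System.out.println("recursive:    " + recursiveCheck(nested));   // false (correct)
    }
}
```

In this model the single-level check is satisfied by the inner wrapper, which corresponds to choosing H2SeekableInputStream for a stream that cannot serve byte-buffer reads; the recursive check reaches the innermost stream and would fall back to H1SeekableInputStream.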




--
This message was sent by Atlassian Jira
(v8.20.1#820001)
