[
https://issues.apache.org/jira/browse/PARQUET-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516600#comment-17516600
]
ASF GitHub Bot commented on PARQUET-2134:
-
shangxinli commented on PR #951:
URL: https://github.com/apache/parquet-mr/pull/951#issuecomment-1086966074
Thanks for adding the check and debug log. LGTM! One more thing (sorry for
not asking in the first round of review): do you think it makes sense to add tests?
> Incorrect type checking in HadoopStreams.wrap
> -
>
> Key: PARQUET-2134
> URL: https://issues.apache.org/jira/browse/PARQUET-2134
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.8.3, 1.10.1, 1.11.2, 1.12.2
> Reporter: Todd Gao
> Priority: Minor
>
> The method
> [HadoopStreams.wrap|https://github.com/apache/parquet-mr/blob/4d062dc37577e719dcecc666f8e837843e44a9be/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java#L51]
> wraps an FSDataInputStream in a SeekableInputStream.
> It checks whether the underlying stream of the passed FSDataInputStream
> implements ByteBufferReadable: if it does, the FSDataInputStream is wrapped in
> an H2SeekableInputStream; otherwise, it is wrapped in an H1SeekableInputStream.
> In some cases, we may add another wrapper over FSDataInputStream. For
> example,
> {code:java}
> class CustomDataInputStream extends FSDataInputStream {
>   public CustomDataInputStream(FSDataInputStream original) {
>     super(original);
>   }
> }
> {code}
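> Suppose we create an FSDataInputStream whose underlying stream does not
> implement ByteBufferReadable, and then create a CustomDataInputStream from
> it. The stream that HadoopStreams.wrap inspects is then the inner
> FSDataInputStream, which itself implements ByteBufferReadable, so the check
> passes even though the innermost stream cannot serve byte-buffer reads. If we
> use HadoopStreams.wrap to create a SeekableInputStream, we may get an error like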
> {quote}java.lang.UnsupportedOperationException: Byte-buffer read unsupported
> by input stream{quote}
> We can fix this by recursively checking the underlying stream of the
> FSDataInputStream.
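
The recursive check proposed above could look roughly like the following. This is a minimal, self-contained sketch and not the code from the PR: StubByteBufferReadable, StubFSDataInputStream, ByteBufferCapableStream and WrapCheck are hypothetical stand-ins for the Hadoop and parquet-mr classes, written so that the example compiles without Hadoop on the classpath. It only models the type check, not the actual reads.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical stand-in for org.apache.hadoop.fs.ByteBufferReadable.
interface StubByteBufferReadable {}

// Hypothetical stand-in for FSDataInputStream. Like the real class, it
// implements the byte-buffer interface itself, whatever stream it wraps.
class StubFSDataInputStream extends FilterInputStream implements StubByteBufferReadable {
    StubFSDataInputStream(InputStream in) {
        super(in);
    }

    InputStream getWrappedStream() {
        return in;
    }
}

// The extra wrapper from the issue description.
class CustomDataInputStream extends StubFSDataInputStream {
    CustomDataInputStream(StubFSDataInputStream original) {
        super(original);
    }
}

// Hypothetical underlying stream that does support byte-buffer reads.
class ByteBufferCapableStream extends ByteArrayInputStream implements StubByteBufferReadable {
    ByteBufferCapableStream(byte[] data) {
        super(data);
    }
}

final class WrapCheck {
    // Check as described in the issue: only looks one level down.
    static boolean shallowCheck(StubFSDataInputStream stream) {
        return stream.getWrappedStream() instanceof StubByteBufferReadable;
    }

    // Recursive variant: keeps unwrapping nested wrappers and decides
    // based on the innermost stream.
    static boolean recursiveCheck(StubFSDataInputStream stream) {
        InputStream wrapped = stream.getWrappedStream();
        if (wrapped instanceof StubFSDataInputStream) {
            return recursiveCheck((StubFSDataInputStream) wrapped);
        }
        return wrapped instanceof StubByteBufferReadable;
    }

    public static void main(String[] args) {
        StubFSDataInputStream plain =
            new StubFSDataInputStream(new ByteArrayInputStream(new byte[0]));
        StubFSDataInputStream custom = new CustomDataInputStream(plain);
        System.out.println("shallow check:   " + shallowCheck(custom));
        System.out.println("recursive check: " + recursiveCheck(custom));
    }
}
```

In this model the shallow check accepts the CustomDataInputStream because the stream it inspects is the inner wrapper, while the recursive check reaches the plain ByteArrayInputStream and rejects it. A unit test for the real fix could follow the same shape, assuming a test stream that does not implement ByteBufferReadable is available.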
--
This message was sent by Atlassian Jira
(v8.20.1#820001)