[jira] [Commented] (PARQUET-2134) Incorrect type checking in HadoopStreams.wrap

ASF GitHub Bot (Jira) Wed, 16 Mar 2022 00:22:05 -0700


    [ 
https://issues.apache.org/jira/browse/PARQUET-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507386#comment-17507386
 ]


ASF GitHub Bot commented on PARQUET-2134:
-----------------------------------------

7c00 commented on a change in pull request #951:
URL: https://github.com/apache/parquet-mr/pull/951#discussion_r827693142



##########
File path: 
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java
##########
@@ -66,6 +67,15 @@ public static SeekableInputStream wrap(FSDataInputStream stream) {
     }
   }
 
+  private static boolean isWrappedStreamByteBufferReadable(FSDataInputStream stream) {
+    InputStream wrapped = stream.getWrappedStream();
+    if (wrapped instanceof FSDataInputStream) {
+      return isWrappedStreamByteBufferReadable(((FSDataInputStream) wrapped));

Review comment:
       Yes, it could be, but such a case would be hard to construct. As its 
code shows, FSDataInputStream is a wrapper around an InputStream. When we check 
the wrapped stream recursively, we eventually reach a stream whose type is not 
FSDataInputStream. A developer could override `getWrappedStream` to 
`return this` and make the recursion never terminate, but that would make no 
sense.
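
If a guard against that self-referencing case were ever wanted, one possible shape is sketched below. This is not the code from the pull request: `WrapperStream` is a hypothetical stand-in for FSDataInputStream so the example runs without Hadoop on the classpath, and `innermost` is an illustrative helper name. It follows `getWrappedStream` in a loop and remembers each wrapper by identity, so a wrapper that returns itself is rejected instead of being visited forever.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class UnwrapGuardSketch {
    /** Hypothetical stand-in for FSDataInputStream; not the Hadoop class. */
    public static class WrapperStream extends FilterInputStream {
        public WrapperStream(InputStream in) { super(in); }
        public InputStream getWrappedStream() { return in; }
    }

    /**
     * Follows getWrappedStream() until a stream that is not a wrapper is reached.
     * Throws if the same wrapper instance is seen twice, which indicates a cycle.
     */
    public static InputStream innermost(WrapperStream stream) {
        // Identity-based set: compares wrappers with ==, not equals().
        Set<InputStream> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        InputStream current = stream;
        while (current instanceof WrapperStream) {
            if (!seen.add(current)) {
                throw new IllegalStateException("Cycle detected while unwrapping stream");
            }
            current = ((WrapperStream) current).getWrappedStream();
        }
        return current;
    }

    public static void main(String[] args) {
        InputStream leaf = new ByteArrayInputStream(new byte[0]);
        WrapperStream nested = new WrapperStream(new WrapperStream(leaf));
        System.out.println(innermost(nested) == leaf); // prints "true"

        // The pathological override described in the comment above.
        WrapperStream selfReferencing = new WrapperStream(leaf) {
            @Override
            public InputStream getWrappedStream() { return this; }
        };
        try {
            innermost(selfReferencing);
        } catch (IllegalStateException e) {
            System.out.println("cycle rejected"); // prints "cycle rejected"
        }
    }
}
```

Whether the extra bookkeeping is worth it for a case that "makes no sense" is a judgment call; the plain recursion is simpler and terminates for every ordinary chain of wrappers.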






> Incorrect type checking in HadoopStreams.wrap
> ---------------------------------------------
>
>                 Key: PARQUET-2134
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2134
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.8.3, 1.10.1, 1.11.2, 1.12.2
>            Reporter: Todd Gao
>            Priority: Minor
>
> The method 
> [HadoopStreams.wrap|https://github.com/apache/parquet-mr/blob/4d062dc37577e719dcecc666f8e837843e44a9be/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java#L51]
>  wraps an FSDataInputStream in a SeekableInputStream. 
> It checks whether the underlying stream of the passed FSDataInputStream 
> implements ByteBufferReadable: if so, it wraps the FSDataInputStream in an 
> H2SeekableInputStream; otherwise, it wraps it in an H1SeekableInputStream.
> In some cases, we may add another wrapper around an FSDataInputStream. For 
> example, 
> {code:java}
> class CustomDataInputStream extends FSDataInputStream {
>     public CustomDataInputStream(FSDataInputStream original) {
>         super(original);
>     }
> }
> {code}
> Suppose we create an FSDataInputStream whose underlying stream does not 
> implement ByteBufferReadable, and then create a CustomDataInputStream around 
> it. If we use HadoopStreams.wrap to create a SeekableInputStream, we may get 
> an error like 
> {quote}java.lang.UnsupportedOperationException: Byte-buffer read unsupported 
> by input stream{quote}
> We can fix this by recursively checking the underlying stream of the 
> FSDataInputStream.
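
The difference between the one-level check and the recursive check described above can be illustrated with a small, self-contained sketch. None of these types are the Hadoop classes: `BufferReadable` stands in for ByteBufferReadable and `WrapperStream` for FSDataInputStream. The sketch assumes, as Hadoop's FSDataInputStream does, that the wrapper class itself declares the byte-buffer interface regardless of what it wraps, which is why looking only one level down can give the wrong answer.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

public class WrapCheckSketch {
    /** Stand-in for ByteBufferReadable (a marker only in this sketch). */
    public interface BufferReadable {}

    /**
     * Stand-in for FSDataInputStream. It declares BufferReadable itself,
     * whether or not the stream it wraps supports byte-buffer reads.
     */
    public static class WrapperStream extends FilterInputStream implements BufferReadable {
        public WrapperStream(InputStream in) { super(in); }
        public InputStream getWrappedStream() { return in; }
    }

    /** One-level check: looks only at the directly wrapped stream. */
    public static boolean shallowCheck(WrapperStream stream) {
        return stream.getWrappedStream() instanceof BufferReadable;
    }

    /** Recursive check: unwraps nested wrappers before testing the interface. */
    public static boolean recursiveCheck(WrapperStream stream) {
        InputStream wrapped = stream.getWrappedStream();
        if (wrapped instanceof WrapperStream) {
            return recursiveCheck((WrapperStream) wrapped);
        }
        return wrapped instanceof BufferReadable;
    }

    public static void main(String[] args) {
        // A leaf stream without byte-buffer support, wrapped twice,
        // like CustomDataInputStream around an FSDataInputStream.
        InputStream plain = new ByteArrayInputStream(new byte[0]);
        WrapperStream custom = new WrapperStream(new WrapperStream(plain));

        System.out.println(shallowCheck(custom));   // prints "true" (misleading)
        System.out.println(recursiveCheck(custom)); // prints "false"
    }
}
```

In this model the shallow check reports support because the inner wrapper carries the interface, while the recursive check looks through to the leaf stream and reports that it does not.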



