[ 
https://issues.apache.org/jira/browse/PARQUET-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509496#comment-17509496
 ] 

ASF GitHub Bot commented on PARQUET-2134:
-----------------------------------------

shangxinli commented on a change in pull request #951:
URL: https://github.com/apache/parquet-mr/pull/951#discussion_r830648757



##########
File path: 
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java
##########
@@ -66,6 +67,15 @@ public static SeekableInputStream wrap(FSDataInputStream stream) {
     }
   }
 
+  private static boolean isWrappedStreamByteBufferReadable(FSDataInputStream stream) {
+    InputStream wrapped = stream.getWrappedStream();
+    if (wrapped instanceof FSDataInputStream) {
+      return isWrappedStreamByteBufferReadable(((FSDataInputStream) wrapped));

Review comment:
       I understand this would be a very rare case, but if it ever happens, the 
resulting 'hang' would be hard to debug. Let's do two things: 1) Add a check for 
whether the wrapped stream is 'this', and throw an exception if it is; 2) Add a 
debug log, so that when it hangs, a developer can enable debug logging and see 
what parquet-mr is doing.
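
A minimal sketch of what the two suggestions might look like follows. It is not the code from the pull request. `WrapperStream` and `BufferReadable` are hypothetical stand-ins for Hadoop's `FSDataInputStream` and `ByteBufferReadable` so that the example compiles without Hadoop on the classpath, and `java.util.logging` is used here only as an assumption in place of whatever logger parquet-mr uses.

```java
import java.io.FilterInputStream;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

class WrappedStreamGuard {
  private static final Logger LOG = Logger.getLogger(WrappedStreamGuard.class.getName());

  /** Hypothetical stand-in for org.apache.hadoop.fs.ByteBufferReadable. */
  interface BufferReadable {}

  /** Hypothetical stand-in for FSDataInputStream: exposes the stream it wraps. */
  static class WrapperStream extends FilterInputStream {
    WrapperStream(InputStream in) {
      super(in);
    }

    InputStream getWrappedStream() {
      return in;
    }
  }

  static boolean isWrappedStreamByteBufferReadable(WrapperStream stream) {
    InputStream wrapped = stream.getWrappedStream();
    // Suggestion 1: a stream that reports itself as its own wrapped stream
    // would recurse forever, so fail fast with a descriptive exception.
    if (wrapped == stream) {
      throw new IllegalStateException(
          "Stream wraps itself: " + stream.getClass().getName());
    }
    if (wrapped instanceof WrapperStream) {
      // Suggestion 2: leave a debug-level trace of each unwrapping step.
      LOG.log(Level.FINE, "Checking nested wrapped stream {0}", wrapped.getClass().getName());
      return isWrappedStreamByteBufferReadable((WrapperStream) wrapped);
    }
    return wrapped instanceof BufferReadable;
  }
}
```

The identity check only catches a stream that returns itself directly. A longer cycle through several wrappers would still recurse, which is where the debug log would help.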





> Incorrect type checking in HadoopStreams.wrap
> ---------------------------------------------
>
>                 Key: PARQUET-2134
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2134
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.8.3, 1.10.1, 1.11.2, 1.12.2
>            Reporter: Todd Gao
>            Priority: Minor
>
> The method 
> [HadoopStreams.wrap|https://github.com/apache/parquet-mr/blob/4d062dc37577e719dcecc666f8e837843e44a9be/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java#L51]
>  wraps an FSDataInputStream in a SeekableInputStream. 
> It checks whether the underlying stream of the passed FSDataInputStream 
> implements ByteBufferReadable: if it does, the FSDataInputStream is wrapped in 
> an H2SeekableInputStream; otherwise, it is wrapped in an H1SeekableInputStream.
> In some cases, we may add another wrapper over FSDataInputStream. For 
> example, 
> {code:java}
> class CustomDataInputStream extends FSDataInputStream {
>     public CustomDataInputStream(FSDataInputStream original) {
>         super(original);
>     }
> }
> {code}
> Suppose we create an FSDataInputStream whose underlying stream does not 
> implement ByteBufferReadable, and then create a CustomDataInputStream around 
> it. If we then use HadoopStreams.wrap to create a SeekableInputStream, we may 
> get an error like 
> {quote}java.lang.UnsupportedOperationException: Byte-buffer read unsupported 
> by input stream{quote}
> We can fix this by recursively checking the underlying stream of the 
> FSDataInputStream.
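
The difference between a one-level check and a recursive check can be sketched as below. This is an illustration under stated assumptions, not the parquet-mr implementation: `DataStream`, `CustomDataStream`, and `BufferReadable` are hypothetical stand-ins for `FSDataInputStream`, the issue's `CustomDataInputStream`, and `ByteBufferReadable`. The stand-in wrapper always declares the byte-buffer interface, as `FSDataInputStream` itself does, which is why looking only one level down gives the wrong answer once a second wrapper is added.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

class RecursiveWrapCheck {
  /** Hypothetical stand-in for org.apache.hadoop.fs.ByteBufferReadable. */
  interface BufferReadable {}

  /**
   * Hypothetical stand-in for FSDataInputStream. It always declares the
   * byte-buffer interface, whether or not the wrapped stream supports it.
   */
  static class DataStream extends FilterInputStream implements BufferReadable {
    DataStream(InputStream in) {
      super(in);
    }

    InputStream getWrappedStream() {
      return in;
    }
  }

  /** Mirrors the CustomDataInputStream wrapper from the issue description. */
  static class CustomDataStream extends DataStream {
    CustomDataStream(DataStream original) {
      super(original);
    }
  }

  /** One-level check: looks only at the directly wrapped stream. */
  static boolean shallowCheck(DataStream stream) {
    return stream.getWrappedStream() instanceof BufferReadable;
  }

  /** Recursive check: unwraps nested DataStream layers down to the innermost stream. */
  static boolean recursiveCheck(DataStream stream) {
    InputStream wrapped = stream.getWrappedStream();
    if (wrapped instanceof DataStream) {
      return recursiveCheck((DataStream) wrapped);
    }
    return wrapped instanceof BufferReadable;
  }
}
```

With a plain `ByteArrayInputStream` wrapped first in a `DataStream` and then in a `CustomDataStream`, `shallowCheck` on the outer wrapper answers true because it only sees the inner `DataStream`, while `recursiveCheck` reaches the innermost stream and answers false.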



