vburenin commented on a change in pull request #2500:
URL: https://github.com/apache/hudi/pull/2500#discussion_r588912086



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java
##########
@@ -74,20 +75,21 @@
 
   public HoodieLogFileReader(FileSystem fs, HoodieLogFile logFile, Schema readerSchema, int bufferSize,
                              boolean readBlockLazily, boolean reverseReader) throws IOException {
+    FSDataInputStream inputStreamLocal;
     FSDataInputStream fsDataInputStream = fs.open(logFile.getPath(), bufferSize);
-    if (FSUtils.isGCSInputStream(fsDataInputStream)) {
-      this.inputStream = new TimedFSDataInputStream(logFile.getPath(), new FSDataInputStream(
-          new BufferedFSInputStream((FSInputStream) ((
-              (FSDataInputStream) fsDataInputStream.getWrappedStream()).getWrappedStream()), bufferSize)));
-    } else if (fsDataInputStream.getWrappedStream() instanceof FSInputStream) {
-      this.inputStream = new TimedFSDataInputStream(logFile.getPath(), new FSDataInputStream(
-          new BufferedFSInputStream((FSInputStream) fsDataInputStream.getWrappedStream(), bufferSize)));
+
+    if (fsDataInputStream.getWrappedStream() instanceof FSInputStream) {
+      inputStreamLocal = new TimedFSDataInputStream(logFile.getPath(), new FSDataInputStream(

Review comment:
   The problem with this PR right now is that I hit a case where
   `fsDataInputStream.getWrappedStream() instanceof FSInputStream` was not true,
   because `fsDataInputStream.getWrappedStream()` returned an `FSDataInputStream`
   when running as a Spark job. The same check was true when I ran
   DeltaStreamer locally as is. (Most likely I had different GCS connector versions.)
   That is why I was asking for the multi-level `if`.
   I would also suggest moving this `if` outside the scope of the first `if`:
   ```
   if (FSUtils.isGCSFileSystem(fs)) {
     inputStreamLocal = new SchemeAwareFSDataInputStream(inputStreamLocal, true);
   }
   ```
   Otherwise, in the fallback `else` scenario we can miss the GCS filesystem
   and crash on an incorrect seek.
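
   To illustrate the structure being suggested, here is a minimal,
   self-contained sketch. It does not use the Hadoop or Hudi classes: `DataStream`,
   `RawStream`, `SchemeAwareStream`, `unwrap` and `open` are hypothetical stand-ins
   (roughly for `FSDataInputStream`, `FSInputStream` and
   `SchemeAwareFSDataInputStream`), and the `isGcs` flag stands in for
   `FSUtils.isGCSFileSystem(fs)`. It only shows the control flow: peel off
   however many wrapper layers the connector produced, then apply the GCS
   wrapper outside the first `if` so the fallback path gets it too.

   ```java
   import java.io.ByteArrayInputStream;
   import java.io.FilterInputStream;
   import java.io.InputStream;

   final class StreamUnwrapSketch {

     /** Hypothetical stand-in for a wrapper stream that exposes what it wraps. */
     static class DataStream extends FilterInputStream {
       DataStream(InputStream in) {
         super(in);
       }

       InputStream getWrappedStream() {
         return in;
       }
     }

     /** Hypothetical stand-in for the raw, seekable stream at the bottom. */
     static class RawStream extends ByteArrayInputStream {
       RawStream(byte[] data) {
         super(data);
       }
     }

     /** Hypothetical stand-in for the scheme-aware (GCS) wrapper. */
     static class SchemeAwareStream extends DataStream {
       SchemeAwareStream(InputStream in) {
         super(in);
       }
     }

     /** Peels off DataStream layers until a non-wrapper stream is reached. */
     static InputStream unwrap(InputStream stream) {
       InputStream current = stream;
       while (current instanceof DataStream) {
         current = ((DataStream) current).getWrappedStream();
       }
       return current;
     }

     /** Builds the stream to read from; isGcs is an assumed filesystem check. */
     static DataStream open(DataStream opened, boolean isGcs) {
       InputStream innermost = unwrap(opened);
       DataStream result;
       if (innermost instanceof RawStream) {
         // The real code would re-buffer the raw stream here.
         result = new DataStream(innermost);
       } else {
         // Fallback: keep the stream exactly as the filesystem returned it.
         result = opened;
       }
       // Applied outside the first if, so the fallback path is covered too.
       if (isGcs) {
         result = new SchemeAwareStream(result);
       }
       return result;
     }

     public static void main(String[] args) {
       DataStream twoLayers = new DataStream(new DataStream(new RawStream(new byte[] {1, 2, 3})));
       System.out.println(open(twoLayers, true).getClass().getSimpleName());
     }
   }
   ```

   With this shape, a stream wrapped once (local run) and a stream wrapped
   twice (the Spark job case above) both reach the raw-stream branch, and the
   GCS wrapper no longer depends on which branch was taken.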



