vburenin edited a comment on pull request #2500:
URL: https://github.com/apache/hudi/pull/2500#issuecomment-792948352
I can clearly reproduce the issue with this code:
```java
LOG.info("fsDataInputStream.getWrappedStream: "
    + fsDataInputStream.getWrappedStream().getClass().getCanonicalName());
LOG.info("fsDataInputStream.getWrappedStream: instanceof FSInputStream "
    + (fsDataInputStream.getWrappedStream() instanceof FSInputStream));
if (fsDataInputStream.getWrappedStream() instanceof FSInputStream) {
  LOG.info("fsDataInputStream.getWrappedStream: instanceof FSInputStream "
      + (fsDataInputStream.getWrappedStream() instanceof FSInputStream));
  inputStreamLocal = new TimedFSDataInputStream(logFile.getPath(), new FSDataInputStream(
      new BufferedFSInputStream((FSInputStream) fsDataInputStream.getWrappedStream(), bufferSize)));
  if (FSUtils.isGCSFileSystem(fs)) {
    inputStreamLocal = new SchemeAwareFSDataInputStream(inputStreamLocal, true);
  }
} else {
  // fsDataInputStream.getWrappedStream() may be a BufferedFSInputStream;
  // need to wrap in another BufferedFSInputStream to make bufferSize work?
  inputStreamLocal = fsDataInputStream;
}
LOG.info("inputStreamLocal: " + inputStreamLocal.getClass().getCanonicalName());
```
225852 [Executor task launch worker for task 174] INFO org.apache.hudi.common.table.log.HoodieLogFileReader - fsDataInputStream.getWrappedStream: org.apache.hadoop.fs.FSDataInputStream
225852 [Executor task launch worker for task 174] INFO org.apache.hudi.common.table.log.HoodieLogFileReader - fsDataInputStream.getWrappedStream: instanceof FSInputStream false
225852 [Executor task launch worker for task 174] INFO org.apache.hudi.common.table.log.HoodieLogFileReader - inputStreamLocal: org.apache.hudi.common.fs.TimedFSDataInputStream
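So, clearly, there is a bug: the wrapped stream is itself an FSDataInputStream, so the `instanceof FSInputStream` check fails and the else branch is taken.
It actually got slightly worse, as it no longer handles the GCS last-byte seek issue, and on top of that the buffered reader is no longer used.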
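To illustrate why the check fails, here is a minimal, self-contained sketch that uses only `java.io` and no Hadoop classes. `Wrapper` and `unwrap` are hypothetical stand-ins (`Wrapper` plays the role of FSDataInputStream, `ByteArrayInputStream` the role of the underlying FSInputStream); this is not the Hudi code or the fix proposed in the PR, just one possible way to look through nested wrappers before the `instanceof` test.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

public class UnwrapSketch {
  // Hypothetical stand-in for FSDataInputStream: a wrapper that exposes its inner stream.
  static class Wrapper extends FilterInputStream {
    Wrapper(InputStream in) {
      super(in);
    }

    InputStream getWrappedStream() {
      return in;
    }
  }

  // Peel off nested wrappers until the innermost non-wrapper stream is reached.
  static InputStream unwrap(InputStream stream) {
    while (stream instanceof Wrapper) {
      stream = ((Wrapper) stream).getWrappedStream();
    }
    return stream;
  }

  public static void main(String[] args) {
    InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
    // A wrapper around a wrapper, as in the log output above.
    Wrapper doubleWrapped = new Wrapper(new Wrapper(raw));

    // A single getWrappedStream() call only reaches the inner wrapper,
    // so a type check against the raw stream type fails.
    System.out.println(doubleWrapped.getWrappedStream() instanceof ByteArrayInputStream); // false

    // Unwrapping in a loop reaches the underlying stream.
    System.out.println(unwrap(doubleWrapped) == raw); // true
  }
}
```

With something along these lines, the buffered and GCS-aware wrapping could be applied to the innermost stream instead of being skipped; whether that is the right fix for HoodieLogFileReader is a separate question.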