yihua commented on code in PR #13797:
URL: https://github.com/apache/hudi/pull/13797#discussion_r2353448628


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroReaderContext.java:
##########
@@ -127,6 +128,24 @@ private HoodieAvroReaderContext(
     this.isMultiFormat = tableConfig.isMultipleBaseFileFormatsEnabled();
   }
 
+  @Override
+  public ClosableIterator<IndexedRecord> getFileRecordIterator(
+      StoragePathInfo storagePathInfo, long start, long length, Schema 
dataSchema, Schema requiredSchema,
+      HoodieStorage storage) throws IOException {
+    HoodieAvroFileReader reader;
+    boolean isLogFile = FSUtils.isLogFile(storagePathInfo.getPath());
+    if (reusableFileReaders.containsKey(storagePathInfo.getPath())) {
+      reader = reusableFileReaders.get(storagePathInfo.getPath());
+    } else {
+      HoodieFileFormat fileFormat = isMultiFormat && !isLogFile ? 
HoodieFileFormat.fromFileExtension(storagePathInfo.getPath().getFileExtension())
 : baseFileFormat;
+      reader = (HoodieAvroFileReader) HoodieIOFactory.getIOFactory(storage)
+          
.getReaderFactory(HoodieRecord.HoodieRecordType.AVRO).getFileReader(new 
HoodieConfig(),
+              storagePathInfo, fileFormat, Option.empty());
+    }

Review Comment:
   nit: add a method like this for shared logic between two 
`getFileRecordIterator` methods?
   ```
   private HoodieAvroFileReader getHFileReader(StoragePath path, 
Function<HoodieFileFormat, HoodieAvroFileReader> func) {
     if (reusableFileReaders.containsKey(path)) {
         reader = reusableFileReaders.get(path);
       } else {
         HoodieFileFormat fileFormat = isMultiFormat && !isLogFile ? 
HoodieFileFormat.fromFileExtension(path.getFileExtension()) : baseFileFormat;
         reader = func.apply(fileFormat);
       }
   }
   ```



##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReaderFactory.java:
##########
@@ -77,6 +78,20 @@ public HoodieFileReader getFileReader(HoodieConfig 
hoodieConfig, StoragePath pat
     }
   }
 
+  public HoodieFileReader getFileReader(HoodieConfig hoodieConfig, 
StoragePathInfo pathInfo, HoodieFileFormat format,

Review Comment:
   Should we revisit all usage of `getFileReader(HoodieConfig hoodieConfig, 
StoragePath path, HoodieFileFormat format, Option<Schema> schemaOption)` if 
HFile is used to replace the usage with this new API with `StoragePathInfo` 
object if it is available for use?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to