prashantwason commented on a change in pull request #1687:
URL: https://github.com/apache/hudi/pull/1687#discussion_r435628047
##########
File path:
hudi-client/src/main/java/org/apache/hudi/table/action/commit/CommitActionExecutor.java
##########
@@ -89,11 +87,12 @@ public CommitActionExecutor(JavaSparkContext jsc,
       throw new HoodieUpsertException(
           "Error in finding the old file path at commit " + instantTime + " for fileId: " + fileId);
     } else {
-      AvroReadSupport.setAvroReadSchema(table.getHadoopConf(), upsertHandle.getWriterSchema());
       BoundedInMemoryExecutor<GenericRecord, GenericRecord, Void> wrapper = null;
-      try (ParquetReader<IndexedRecord> reader =
-          AvroParquetReader.<IndexedRecord>builder(upsertHandle.getOldFilePath()).withConf(table.getHadoopConf()).build()) {
-        wrapper = new SparkBoundedInMemoryExecutor(config, new ParquetReaderIterator(reader),
+      try {
+        HoodieStorageReader<IndexedRecord> storageReader =
+            HoodieStorageReaderFactory.getStorageReader(table.getHadoopConf(), upsertHandle.getOldFilePath());
+        wrapper = new SparkBoundedInMemoryExecutor(config,
+            storageReader.getRecordIterator(upsertHandle.getWriterSchema()),
Review comment:
       This is just creating a ParquetReader and getting an iterator to read all the records. Each record needs to be read in full here, since we are merging. I don't see how predicates are applicable here; aren't they applied in the InputFormat?
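
The reviewer's point is that the merge path must consume every record from the old file, so a filtering predicate on the reader would not reduce the work done here. A minimal, library-free sketch of that merge shape (the `Rec` type and `merge` helper below are illustrative stand-ins, not Hudi's API):

```java
import java.util.*;

public class MergeSketch {
    // Stand-in for an Avro/Parquet record: a record key plus a payload.
    public record Rec(String key, String payload) {}

    // Merge pass: consume the old file's iterator to the end, replacing
    // records whose key has an incoming update. No record can be skipped,
    // which is why predicate pushdown does not help on this path.
    public static List<Rec> merge(Iterator<Rec> oldRecords, Map<String, Rec> updates) {
        List<Rec> out = new ArrayList<>();
        while (oldRecords.hasNext()) {
            Rec old = oldRecords.next();
            out.add(updates.getOrDefault(old.key(), old));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> oldFile = List.of(new Rec("a", "v1"), new Rec("b", "v1"));
        Map<String, Rec> updates = Map.of("b", new Rec("b", "v2"));
        System.out.println(merge(oldFile.iterator(), updates));
    }
}
```

In the actual diff, the iterator comes from `storageReader.getRecordIterator(...)` (previously `new ParquetReaderIterator(reader)`), and the merge itself happens downstream in the upsert handle; this sketch only shows why the whole old file is read.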
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]