nsivabalan commented on code in PR #13498:
URL: https://github.com/apache/hudi/pull/13498#discussion_r2231762529


##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/BaseSparkInternalRowReaderContext.java:
##########
@@ -110,6 +112,23 @@ public HoodieRecord<InternalRow> constructHoodieRecord(BufferedRecord<InternalRo
     return new HoodieSparkRecord(hoodieKey, row, HoodieInternalRowUtils.getCachedSchema(schema), false);
   }
 
+  @Override
+  public InternalRow constructEngineRecord(Schema schema,

Review Comment:
   MIT does very similar per-column processing. In practice, when we merge 
multiple log records, all of them are expected to carry just a few columns, so 
the per-column cost is small. It is only when merging with the base record that 
we might incur the perf hit. This is what the existing MIT partial encoding 
does as well. 
   
   
https://github.com/apache/hudi/blob/0fe119a0cf1e0b8ef44a2049fd15e56fcb62cfb9/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/DefaultSparkRecordMerger.java#L59
 
   
https://github.com/apache/hudi/blob/0fe119a0cf1e0b8ef44a2049fd15e56fcb62cfb9/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/merge/SparkRecordMergingUtils.java#L122
   
   The reader schema dictates how many columns we process here. With queries, 
if someone were to run "select a,b,c from tbl", we should only be processing 3 
columns here. During compaction, however, we process all columns, but only 
when merging with the base file record. 
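   
   To make the cost argument concrete, here is a minimal sketch of a per-column 
merge driven by the reader schema. It is not Hudi code: `PartialMergeSketch`, 
`mergeProjected`, the map-based rows, and the "absent or null means not 
updated" convention are all assumptions made for illustration. 

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hudi.
final class PartialMergeSketch {

  /**
   * Merges two versions of a row, visiting only the columns named in
   * readerColumns. The loop runs once per projected column, so the work is
   * bounded by the reader schema, not by the table schema.
   *
   * Assumption: a column that is absent (or null) in the newer row was not
   * updated, so the older value is kept.
   */
  static Map<String, Object> mergeProjected(List<String> readerColumns,
                                            Map<String, Object> older,
                                            Map<String, Object> newer) {
    Map<String, Object> merged = new LinkedHashMap<>();
    for (String column : readerColumns) {
      Object newValue = newer.get(column);
      merged.put(column, newValue != null ? newValue : older.get(column));
    }
    return merged;
  }
}
```

   With a base row holding columns a, b, c, d and a log record that only sets 
b, calling `mergeProjected` with reader columns (a, b, c) touches three columns 
and never looks at d. Passing the full column list models the compaction case, 
where every column is visited. 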
   
   In this patch, Lin did not add support for MIT to work with the newly added 
PartialUpdateModes. We will be putting out a separate patch for that, after 
which users should be able to leverage MIT partial encoding even for a table 
created with OverwriteNonDefaultsWithLatest semantics. 
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
