the-other-tim-brown commented on code in PR #13559:
URL: https://github.com/apache/hudi/pull/13559#discussion_r2209044808


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -116,19 +127,29 @@ public class HoodieMergeHandle<T, I, K, O> extends HoodieWriteHandle<T, I, K, O>
 
   protected Option<String[]> partitionFields = Option.empty();
   protected Object[] partitionValues = new Object[0];
+  protected KeyBasedFileGroupRecordBuffer<T> recordBuffer;
+  protected HoodieReaderContext<T> readerContext;
+  protected Option<String> orderingFieldName;
 
   public HoodieMergeHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
                            Iterator<HoodieRecord<T>> recordItr, String partitionPath, String fileId,
-                           TaskContextSupplier taskContextSupplier, Option<BaseKeyGenerator> keyGeneratorOpt) {
+                           TaskContextSupplier taskContextSupplier, Option<BaseKeyGenerator> keyGeneratorOpt,
+                           HoodieReaderContext<T> readerContext) {
     this(config, instantTime, hoodieTable, recordItr, partitionPath, fileId, taskContextSupplier,
-        getLatestBaseFile(hoodieTable, partitionPath, fileId), keyGeneratorOpt);
+        getLatestBaseFile(hoodieTable, partitionPath, fileId), keyGeneratorOpt, readerContext, false);
   }
 
   public HoodieMergeHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
                            Iterator<HoodieRecord<T>> recordItr, String partitionPath, String fileId,
-                           TaskContextSupplier taskContextSupplier, HoodieBaseFile baseFile, Option<BaseKeyGenerator> keyGeneratorOpt) {
+                           TaskContextSupplier taskContextSupplier, HoodieBaseFile baseFile,
+                           Option<BaseKeyGenerator> keyGeneratorOpt,
+                           HoodieReaderContext<T> readerContex, boolean useFileGroupRecordBuffer) {

Review Comment:
   typo: `readerContex` -> `readerContext`



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -406,7 +454,9 @@ public void write(HoodieRecord<T> oldRecord) {
     boolean copyOldRecord = true;
     String key = oldRecord.getRecordKey(oldSchema, keyGeneratorOpt);
     TypedProperties props = config.getPayloadConfig().getProps();
-    if (keyToNewRecords.containsKey(key)) {
+    if (recordBuffer != null && recordBuffer.containsLogRecord(key)) {

Review Comment:
   Is it possible to refactor `write` to avoid going through `HoodieRecord` by using the FileGroupReader? We are essentially reading the base file and returning an iterator that merges it with the updates, which is what the FGReader already does while avoiding extra conversion where possible. It would let us focus on a single code path and apply performance improvements made for compaction to our CoW tables, and vice versa.
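
   The shape of the suggested refactor can be sketched as a single merged pass over the sorted base file with a keyed buffer of updates, which is the kind of work the FileGroupReader performs. This is a minimal, self-contained illustration; the class and method names (`MergeSketch`, `mergeBaseAndUpdates`) are hypothetical stand-ins, not Hudi APIs:

   ```java
   import java.util.*;

   // Hypothetical sketch of a merge pass: base records with matching keys
   // are replaced by their update, and leftover updates become inserts.
   class MergeSketch {
       static List<String> mergeBaseAndUpdates(List<Map.Entry<String, String>> base,
                                               Map<String, String> updates) {
           List<String> out = new ArrayList<>();
           // copy so we can track which updates were consumed by the base scan
           Map<String, String> pending = new LinkedHashMap<>(updates);
           for (Map.Entry<String, String> rec : base) {
               String merged = pending.remove(rec.getKey());
               out.add(merged != null ? merged : rec.getValue());
           }
           out.addAll(pending.values()); // remaining updates are new inserts
           return out;
       }
   }
   ```

   A single iterator of this shape would replace the per-record `keyToNewRecords` / `containsLogRecord` lookups in `write`, so the same merge path serves both compaction and CoW upserts.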



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -116,19 +127,29 @@ public class HoodieMergeHandle<T, I, K, O> extends HoodieWriteHandle<T, I, K, O>
 
   protected Option<String[]> partitionFields = Option.empty();
   protected Object[] partitionValues = new Object[0];
+  protected KeyBasedFileGroupRecordBuffer<T> recordBuffer;

Review Comment:
   Let's make this the FileGroupRecordBuffer interface so it is more flexible if we want to provide new implementations in the future.
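
   The suggested change amounts to typing the field against the interface rather than the concrete class. A minimal self-contained sketch of the idea; `FileGroupRecordBuffer`, `KeyBasedBuffer`, and `Handle` here are simplified stand-ins, not the actual Hudi types:

   ```java
   // Stand-in interface: only the lookup used by the write path.
   interface FileGroupRecordBuffer<T> {
       boolean containsLogRecord(String key);
   }

   // One possible implementation; others can be swapped in later.
   class KeyBasedBuffer<T> implements FileGroupRecordBuffer<T> {
       private final java.util.Set<String> keys;
       KeyBasedBuffer(java.util.Set<String> keys) { this.keys = keys; }
       public boolean containsLogRecord(String key) { return keys.contains(key); }
   }

   class Handle<T> {
       // declared against the interface, not the key-based implementation
       protected FileGroupRecordBuffer<T> recordBuffer;
       Handle(FileGroupRecordBuffer<T> buffer) { this.recordBuffer = buffer; }
       boolean hasUpdateFor(String key) {
           return recordBuffer != null && recordBuffer.containsLogRecord(key);
       }
   }
   ```

   With the field declared this way, a new buffer implementation only needs to implement the interface; no change to the handle is required.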



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to