xiarixiaoyao commented on a change in pull request #3181:
URL: https://github.com/apache/hudi/pull/3181#discussion_r661072382
##########
File path:
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java
##########
@@ -41,11 +43,26 @@
while (baseIterator.hasNext()) {
GenericRecord record = (GenericRecord) baseIterator.next();
HoodieRecord<T> hoodieRecord =
SpillableMapUtils.convertToHoodieRecordPayload(record, payloadClass);
- scanner.processNextRecord(hoodieRecord);
+ processNextRecord(scanner, hoodieRecord);
}
return new HoodieFileSliceReader(scanner.iterator());
}
+  private static void processNextRecord(HoodieMergedLogRecordScanner scanner, HoodieRecord<? extends HoodieRecordPayload> hoodieRecord) {
Review comment:
Thanks for your review. Commit 3 is only used to trigger clustering.
In commit 1, we create a table and assign the age column the value 1.
In commit 2, we update age to 1001 where keyid < 5, which produces log files.
In commit 3, we only trigger clustering, and that is where the bug shows up.
In the previous code, we combined the base file record and the log file record
by calling HoodieMergedLogRecordScanner.processNextRecord. However, that
function is not suitable for this scene: if we call it, the record from the
base file is kept and the record from the log file is discarded, which is wrong.
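To make the difference concrete, here is a minimal, self-contained sketch (hypothetical class and method names, not the actual Hudi API) of the two merge policies described above. With the wrong policy, the first record seen for a key (the base-file record, age = 1) is kept and the later log record (age = 1001) is discarded; with the correct policy, the log record overrides the base record.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the merge semantics discussed above;
// keys stand in for record keys, values for the "age" column.
public class MergePolicySketch {

    // Wrong for this scene: the first record seen (the base record) wins,
    // so the update from the log file is silently dropped.
    static void keepFirst(Map<String, Integer> merged, String key, int age) {
        merged.putIfAbsent(key, age);
    }

    // Correct for this scene: the log record overrides the base record.
    static void logWins(Map<String, Integer> merged, String key, int age) {
        merged.put(key, age);
    }

    public static void main(String[] args) {
        Map<String, Integer> wrong = new HashMap<>();
        keepFirst(wrong, "keyid=1", 1);     // base file: age = 1
        keepFirst(wrong, "keyid=1", 1001);  // log file: update discarded

        Map<String, Integer> right = new HashMap<>();
        logWins(right, "keyid=1", 1);       // base file: age = 1
        logWins(right, "keyid=1", 1001);    // log file: update kept

        System.out.println(wrong.get("keyid=1")); // 1    -> stale value (the bug)
        System.out.println(right.get("keyid=1")); // 1001 -> expected value
    }
}
```

This is why the patch routes file-slice merging through its own processNextRecord helper instead of the scanner's default behavior.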
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]