yihua commented on code in PR #13384:
URL: https://github.com/apache/hudi/pull/13384#discussion_r2127753677


##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/BufferedRecord.java:
##########
@@ -77,6 +77,10 @@ public static <T> BufferedRecord<T> 
forDeleteRecord(DeleteRecord deleteRecord, C
     return new BufferedRecord<>(deleteRecord.getRecordKey(), orderingValue, 
null, null, true);
   }
 
+  public static <T> BufferedRecord<T> forDeleteRecord(String recordKey, 
Comparable orderingValue) {
+    return new BufferedRecord<>(recordKey, orderingValue, null, null, true);

Review Comment:
   nit: modify the existing method to call this one
   ```
   public static <T> BufferedRecord<T> forDeleteRecord(DeleteRecord 
deleteRecord, Comparable orderingValue) {
       return forDeleteRecord(deleteRecord.getRecordKey(), orderingValue);
     }
   ```



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/FileGroupRecordBuffer.java:
##########
@@ -289,9 +289,11 @@ protected Option<BufferedRecord<T>> 
doProcessNextDataRecord(BufferedRecord<T> ne
                 if (combinedRecordData != existingRecord.getRecord()) {
                   Pair<HoodieRecord, Schema> combinedRecordAndSchema = 
combinedRecordAndSchemaOpt.get();
                   return 
Option.of(BufferedRecord.forRecordWithContext(combinedRecordData, 
combinedRecordAndSchema.getRight(), readerContext, orderingFieldName, false));
+                } else {
+                  return Option.empty();
                 }
               }
-              return Option.empty();
+              return 
Option.of(BufferedRecord.forDeleteRecord(newRecord.getRecordKey(), 
newRecord.getOrderingValue()));

Review Comment:
   It's better to use `DEFAULT_ORDERING_VALUE` as the ordering value as the 
user-defined payload or merger signifies the delete at this particular commit.



##########
hudi-client/hudi-java-client/src/test/java/org/apache/hudi/testutils/HoodieJavaClientTestHarness.java:
##########
@@ -341,7 +341,9 @@ public void validateMetadata(HoodieTestTable testTable, 
List<String> inflightCom
             .collect(Collectors.toList());
     Map<String, List<StoragePathInfo>> partitionToFilesMap =
         tableMetadata.getAllFilesInPartitions(fullPartitionPaths);
-    assertEquals(fsPartitions.size(), partitionToFilesMap.size());

Review Comment:
   So if all files in a partition and the partition itself are deleted, is the 
partition removed from the `__all_partitions__` record in MDT files partition?



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/FileGroupRecordBuffer.java:
##########
@@ -301,7 +303,7 @@ protected Option<BufferedRecord<T>> 
doProcessNextDataRecord(BufferedRecord<T> ne
                   props);
 
               if (!combinedRecordAndSchemaOpt.isPresent()) {
-                return Option.empty();
+                return 
Option.of(BufferedRecord.forDeleteRecord(newRecord.getRecordKey(), 
newRecord.getOrderingValue()));

Review Comment:
   Similar here on using `DEFAULT_ORDERING_VALUE`



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/FileGroupRecordBuffer.java:
##########
@@ -289,9 +289,11 @@ protected Option<BufferedRecord<T>> 
doProcessNextDataRecord(BufferedRecord<T> ne
                 if (combinedRecordData != existingRecord.getRecord()) {
                   Pair<HoodieRecord, Schema> combinedRecordAndSchema = 
combinedRecordAndSchemaOpt.get();
                   return 
Option.of(BufferedRecord.forRecordWithContext(combinedRecordData, 
combinedRecordAndSchema.getRight(), readerContext, orderingFieldName, false));
+                } else {
+                  return Option.empty();
                 }
               }
-              return Option.empty();
+              return 
Option.of(BufferedRecord.forDeleteRecord(newRecord.getRecordKey(), 
newRecord.getOrderingValue()));

Review Comment:
   Could you add unit tests on both cases (e.g., in `TestCustomMerger`) where 
the user-defined payload and merger return `Option.empty()` after merging to 
signal deletes, and the file group reader should handle it properly?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to