nsivabalan commented on code in PR #18029:
URL: https://github.com/apache/hudi/pull/18029#discussion_r2765810230


##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -616,6 +616,15 @@ public final class HoodieMetadataConfig extends HoodieConfig {
           + "honor the set value for number of tasks. If not, number of write status's from data "
           + "table writes will be used for metadata table record preparation");
 
+  public static final ConfigProperty<Boolean> RECORD_INDEX_BOOTSTRAP_VALIDATION_ENABLE = ConfigProperty
+      .key(METADATA_PREFIX + ".record.index.bootstrap.validation.enable")

Review Comment:
   We already have a feature called bootstrap, so we avoid using 
   "bootstrap" when talking about MDT partitions. 
   We generally say "initializing MDT partitions" instead. 
   
   So, can we name the config 
   `hoodie.metadata.record.index.enable.validation.on.initialization`?



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -833,6 +845,79 @@ private int estimateFileGroupCount(HoodieData<HoodieRecord> records) {
     );
   }
 
+  /**
+   * Validates the record index after bootstrap by comparing the expected record count with the actual
+   * record count stored in the metadata table. The validation is performed in a distributed manner
+   * using the engine context to count records from HFiles in parallel.
+   *
+   * @param recordIndexRecords the HoodieData containing the expected records
+   * @param fileGroupCount the expected number of file groups
+   */
+  private void validateRecordIndex(HoodieData<HoodieRecord> recordIndexRecords, int fileGroupCount) {
+    String partitionName = MetadataPartitionType.RECORD_INDEX.getPartitionPath();
+    HoodieTableFileSystemView fsView = HoodieTableMetadataUtil.getFileSystemViewForMetadataTable(metadataMetaClient);

Review Comment:
   We need to close the FSV (file system view) at the end of this method, 
   including on the path where validation fails.
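
   One way to guarantee the close is a `try`/`finally` around the validation body. The sketch below is only an illustration of that shape: `StubFileSystemView`, `countRecords`, and `validate` are hypothetical stand-ins, not Hudi classes or methods. In the PR the view would come from `HoodieTableMetadataUtil.getFileSystemViewForMetadataTable(metadataMetaClient)`.

```java
public class FsvCloseSketch {

  /** Hypothetical stand-in for a file system view that holds resources. */
  public static class StubFileSystemView {
    private boolean closed = false;

    /** Pretends to count records; fails if the view was already closed. */
    public long countRecords() {
      if (closed) {
        throw new IllegalStateException("view already closed");
      }
      return 42L;
    }

    public void close() {
      closed = true;
    }

    public boolean isClosed() {
      return closed;
    }
  }

  /**
   * Compares the expected and actual record counts. The finally block
   * closes the view whether the comparison passes or throws.
   */
  public static void validate(StubFileSystemView fsView, long expectedCount) {
    try {
      long actualCount = fsView.countRecords();
      if (actualCount != expectedCount) {
        throw new IllegalStateException(
            "Record count mismatch: expected " + expectedCount + ", found " + actualCount);
      }
    } finally {
      fsView.close();
    }
  }

  public static void main(String[] args) {
    StubFileSystemView passing = new StubFileSystemView();
    validate(passing, 42L);
    System.out.println("closed after success: " + passing.isClosed());

    StubFileSystemView failing = new StubFileSystemView();
    try {
      validate(failing, 7L);
    } catch (IllegalStateException e) {
      System.out.println("closed after failure: " + failing.isClosed());
    }
  }
}
```

   If the real view type implements `AutoCloseable` (not verified here), a try-with-resources block would be the more compact equivalent.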



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
