prashantwason commented on code in PR #18029:
URL: https://github.com/apache/hudi/pull/18029#discussion_r2770832919
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -833,6 +845,79 @@ private int estimateFileGroupCount(HoodieData<HoodieRecord> records) {
);
}
+ /**
+ * Validates the record index after bootstrap by comparing the expected record count with the actual
+ * record count stored in the metadata table. The validation is performed in a distributed manner
+ * using the engine context to count records from HFiles in parallel.
+ *
+ * @param recordIndexRecords the HoodieData containing the expected records
+ * @param fileGroupCount the expected number of file groups
+ */
+ private void validateRecordIndex(HoodieData<HoodieRecord> recordIndexRecords, int fileGroupCount) {
+ String partitionName = MetadataPartitionType.RECORD_INDEX.getPartitionPath();
+ HoodieTableFileSystemView fsView = HoodieTableMetadataUtil.getFileSystemViewForMetadataTable(metadataMetaClient);
Review Comment:
Done. Added a try-finally block to ensure the FSView is closed at the end of
the method, including when validation throws.
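
For readers following the thread, here is a minimal sketch of the try-finally pattern described in this reply. It is not the code from the PR: `FakeFsView`, `countRecords`, and `validate` are hypothetical stand-ins, and the only assumption about the real file system view is that it exposes a `close()` method.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class FsViewCloseSketch {

    // Hypothetical stand-in for the metadata table file system view.
    // Only the close() behaviour matters for this illustration.
    static final class FakeFsView implements AutoCloseable {
        final AtomicBoolean closed = new AtomicBoolean(false);

        // Pretends to count records; can be told to fail to simulate a validation error.
        long countRecords(boolean fail) {
            if (fail) {
                throw new IllegalStateException("record count mismatch");
            }
            return 42L;
        }

        @Override
        public void close() {
            closed.set(true);
        }
    }

    // The view is closed in the finally block, so it is released on both
    // the success path and the exception path.
    static long validate(FakeFsView fsView, boolean fail) {
        try {
            return fsView.countRecords(fail);
        } finally {
            fsView.close();
        }
    }

    public static void main(String[] args) {
        FakeFsView ok = new FakeFsView();
        System.out.println("count=" + validate(ok, false) + " closed=" + ok.closed.get());

        FakeFsView bad = new FakeFsView();
        try {
            validate(bad, true);
        } catch (IllegalStateException e) {
            System.out.println("failed, closed=" + bad.closed.get());
        }
    }
}
```

When the view is created and consumed in the same method, a try-with-resources statement would give the same guarantee, provided the view type implements `AutoCloseable`.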
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -616,6 +616,15 @@ public final class HoodieMetadataConfig extends HoodieConfig {
+ "honor the set value for number of tasks. If not, number of write status's from data "
+ "table writes will be used for metadata table record preparation");
+ public static final ConfigProperty<Boolean> RECORD_INDEX_BOOTSTRAP_VALIDATION_ENABLE = ConfigProperty
+ .key(METADATA_PREFIX + ".record.index.bootstrap.validation.enable")
Review Comment:
Done. Renamed to `hoodie.metadata.record.index.enable.validation.on.initialization`.
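
As a rough illustration of how a caller might toggle the renamed key, the sketch below reads it from plain `java.util.Properties`. The class name `ValidationConfigSketch` and the helper `isValidationEnabled` are hypothetical, and the `false` default is an assumption made for this example only; the thread does not state the actual default.

```java
import java.util.Properties;

public class ValidationConfigSketch {

    // Key name as given in the review reply above.
    static final String VALIDATION_KEY =
        "hoodie.metadata.record.index.enable.validation.on.initialization";

    // Hypothetical helper: treats a missing key as disabled (assumed default).
    static boolean isValidationEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(VALIDATION_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println("unset: " + isValidationEnabled(props));

        props.setProperty(VALIDATION_KEY, "true");
        System.out.println("set:   " + isValidationEnabled(props));
    }
}
```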