prashantwason opened a new issue, #18028:
URL: https://github.com/apache/hudi/issues/18028

   ## Description
   
   When bootstrapping the record index, nothing currently validates that the 
number of records written to the metadata table matches the number of records 
expected from the source data. A mismatch goes unreported, which can lead to 
silent data integrity issues.
   
   ## Proposed Change
   
   Add validation logic that, once the record index bootstrap completes:
   1. Reads the total record count from the base files
   2. Compares this expected record count (from the source data) with the 
actual record count in the metadata table
   3. Fails the bootstrap if the counts do not match (when duplicates are not 
allowed)
   
   This helps ensure data integrity during the bootstrap process.
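
A minimal sketch of the three steps above, written as a standalone Java helper. Everything in it is hypothetical and for illustration only: the class `RecordIndexBootstrapValidator`, its methods, and the exception type are not part of Hudi, and the per-file counts and the metadata table count are assumed to be obtained elsewhere and passed in as plain numbers.

```java
import java.util.List;

// Hypothetical helper, not an existing Hudi class: illustrates the proposed check.
final class RecordIndexBootstrapValidator {

    /** Hypothetical exception raised when the two counts disagree. */
    static final class RecordCountMismatchException extends RuntimeException {
        RecordCountMismatchException(String message) {
            super(message);
        }
    }

    /** Step 1: sum the per-base-file record counts to get the expected total. */
    static long expectedRecordCount(List<Long> recordsPerBaseFile) {
        long total = 0L;
        for (long count : recordsPerBaseFile) {
            // addExact throws ArithmeticException on overflow instead of wrapping around.
            total = Math.addExact(total, count);
        }
        return total;
    }

    /**
     * Steps 2 and 3: compare expected and actual counts and fail on a mismatch.
     * The check is skipped when duplicates are allowed, as the issue proposes.
     */
    static void validate(long expected, long actual, boolean duplicatesAllowed) {
        if (duplicatesAllowed) {
            return;
        }
        if (expected != actual) {
            throw new RecordCountMismatchException(
                "Record index bootstrap wrote " + actual
                    + " records but " + expected + " were expected");
        }
    }
}
```

A caller would compute the expected total from the base files, read the actual count from the metadata table, and call `validate(expected, actual, duplicatesAllowed)` before marking the bootstrap as complete. Throwing an unchecked exception is one way to abort; how the real bootstrap should surface the failure is left open here.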
   
   ## Related
   
   - Original internal JIRA: HUDI-2750
   - Related to the record index bootstrap functionality in 
`HoodieBackedTableMetadataWriter`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.