tarun11Mavani opened a new pull request, #16743: URL: https://github.com/apache/pinot/pull/16743
In #16344, we added commit-time compaction, which removes invalid and soft-deleted records from consuming segments before they are committed. Removing invalid records is safe, but compacting soft-deleted records can cause inconsistencies in which deleted rows reappear as valid.

Example: Suppose segment S2 contains a delete record R2 (deleteColumn=true) for a primary key that already exists as R1 in segment S0. During commit, R2 is removed from S2. After a server restart, Pinot reads R1 from S0 and treats it as valid, since R2 no longer exists to mark it as deleted.

To prevent this, this change removes the deleteRecordColumn parameter from the CompactedPinotSegmentRecordReader constructor, ensuring soft-deleted records are not compacted.

### Test

Added integration test testCommitTimeCompactionPreservesDeletedRecords().
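
The restart scenario from the example (R1 in S0, delete record R2 in S2) can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not Pinot's actual upsert metadata code: the class `UpsertReplaySketch`, the `Row` type, and the `visibleKeys` method are hypothetical names introduced here, and the model assumes that on restart the latest row per primary key (by comparison value) decides whether that key is visible.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified model of rebuilding upsert state from committed segments. */
final class UpsertReplaySketch {

    /** One row: primary key, comparison value (e.g. a timestamp), and the soft-delete flag. */
    static final class Row {
        final String primaryKey;
        final long comparisonValue;
        final boolean deleted;

        Row(String primaryKey, long comparisonValue, boolean deleted) {
            this.primaryKey = primaryKey;
            this.comparisonValue = comparisonValue;
            this.deleted = deleted;
        }
    }

    /** Replays all segments and returns the primary keys that would be visible to queries. */
    static Set<String> visibleKeys(List<List<Row>> segments) {
        Map<String, Row> latest = new HashMap<>();
        for (List<Row> segment : segments) {
            for (Row row : segment) {
                // Keep the row with the highest comparison value for each primary key.
                latest.merge(row.primaryKey, row,
                    (oldRow, newRow) -> newRow.comparisonValue >= oldRow.comparisonValue ? newRow : oldRow);
            }
        }
        Set<String> visible = new TreeSet<>();
        for (Row row : latest.values()) {
            // A key is visible only if its latest row is not a delete record.
            if (!row.deleted) {
                visible.add(row.primaryKey);
            }
        }
        return visible;
    }
}
```

In this model, replaying S0 = [R1] and S2 = [R2] yields no visible keys, because R2 is the latest row for the key and marks it deleted. If R2 has been compacted out of S2, the replay sees only R1 and the key becomes visible again, which is the inconsistency this change avoids by keeping soft-deleted records in the committed segment.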
