tarun11Mavani opened a new issue, #17337: URL: https://github.com/apache/pinot/issues/17337
### Description

We observed data loss in a FULL upsert table, caused by an incorrect segment-deletion decision during the UpsertMergeCompactionTask.

Two segments (S1, S2) existed on two servers and were merged into a new segment (MS1). Due to asynchronous OFFLINE→ONLINE state transitions during a server restart, tie-breaking (shouldReplaceOnComparisonTie) caused validDocIds to diverge between replicas:

S1 (creation time T1) and S2 (creation time T2) were merged into a new segment MS1 with creation time T2 (the max of T1 and T2).

- Node1 retained valid records in S2 and MS1. Because MS1 and S2 had the same creationTime, records in S2 were retained, while all records from S1 were invalidated and marked valid in MS1.
- Node2 retained valid records in S2 and MS1. Because MS1 and S2 had the same creationTime, records in S2 were retained, while all records from S1 were invalidated and marked valid in MS1.
- After a server replacement, MS1 was processed before S1 and S2, so Node2 retained valid records only in MS1; S1 and S2 appeared fully invalid.
- A subsequent doSnapshot() persisted 0 validDocIds for S2 on Node2.

Because the merge-compaction task deletes a segment if any replica reports totalInvalidDoc == totalDocs, S2 was incorrectly deleted from both replicas, even though Node1 still contained valid PKs.

**Result**: Permanent data loss of valid primary keys on one node.

### Fixes

- Instead of checking whether any one replica has zero validDocIds, we should check that all replicas have 0 validDocIds [here](https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/upsertcompactmerge/UpsertCompactMergeTaskGenerator.java#L313).
- We upload a new segment with creationTime = max(creationTime of all merging segments).
  This gives the merged segment the same creation time as the merging segment with the largest creation time, so whether that segment's records are replaced by the merged segment is decided by the tie-breaking logic in shouldReplaceOnComparisonTie. We should instead set creationTime = max(creationTime of all merging segments) + 1, so that the new merged segment fully replaces all records from the merging segments without relying on a tie-break.
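
The two proposed fixes can be sketched in isolation as follows. This is a minimal illustration, not Pinot code: the class `UpsertMergeSketch`, the type `ReplicaDocStats`, all method names, and the doc counts and timestamps are hypothetical and chosen only to mirror the scenario described above (Node1 still holds valid docs in S2, Node2 reports S2 as fully invalid).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class UpsertMergeSketch {

  /** Hypothetical per-replica doc counts for one segment (not a Pinot type). */
  static final class ReplicaDocStats {
    final String server;
    final long totalDocs;
    final long totalInvalidDocs;

    ReplicaDocStats(String server, long totalDocs, long totalInvalidDocs) {
      this.server = server;
      this.totalDocs = totalDocs;
      this.totalInvalidDocs = totalInvalidDocs;
    }

    boolean isFullyInvalid() {
      return totalInvalidDocs == totalDocs;
    }
  }

  /** Behaviour described in the issue: one fully invalid replica is enough to delete. */
  static boolean deletableIfAnyReplicaFullyInvalid(List<ReplicaDocStats> replicas) {
    return replicas.stream().anyMatch(ReplicaDocStats::isFullyInvalid);
  }

  /** Proposed behaviour: delete only when every replica reports zero valid docs. */
  static boolean deletableIfAllReplicasFullyInvalid(List<ReplicaDocStats> replicas) {
    return !replicas.isEmpty()
        && replicas.stream().allMatch(ReplicaDocStats::isFullyInvalid);
  }

  /** Current choice: merged segment ties with the newest merging segment. */
  static long currentMergedCreationTime(List<Long> mergingCreationTimes) {
    return Collections.max(mergingCreationTimes);
  }

  /** Proposed choice: strictly newer than every merging segment, so no tie occurs. */
  static long proposedMergedCreationTime(List<Long> mergingCreationTimes) {
    return Collections.max(mergingCreationTimes) + 1;
  }

  public static void main(String[] args) {
    // Illustrative numbers only: S2 as seen by the two replicas.
    List<ReplicaDocStats> s2 = Arrays.asList(
        new ReplicaDocStats("Node1", 100, 40),    // still has 60 valid docs
        new ReplicaDocStats("Node2", 100, 100));  // snapshot persisted 0 valid docs

    System.out.println("any-replica rule deletes S2: "
        + deletableIfAnyReplicaFullyInvalid(s2));   // true: the data-loss case
    System.out.println("all-replica rule deletes S2: "
        + deletableIfAllReplicasFullyInvalid(s2));  // false: S2 is kept

    // Illustrative creation times T1 = 1000 and T2 = 2000.
    List<Long> times = Arrays.asList(1000L, 2000L);
    System.out.println("current MS1 creationTime:  " + currentMergedCreationTime(times));
    System.out.println("proposed MS1 creationTime: " + proposedMergedCreationTime(times));
  }
}
```

With the all-replica rule, a single replica whose validDocIds diverged can no longer trigger deletion on its own, and with the `+ 1` offset the merged segment never compares equal to a merging segment, so the outcome does not depend on segment load order.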
