tarun11Mavani opened a new issue, #17337:
URL: https://github.com/apache/pinot/issues/17337

   ### Description:
   We observed data loss in a FULL upsert table caused by an incorrect 
segment-deletion decision during the UpsertMergeCompactionTask. Two segments 
(S1, S2) existed across two servers and were merged into a new segment (MS1). 
Due to async OFFLINE→ONLINE state transitions during a server restart, 
tie-breaking (shouldReplaceOnComparisonTie) caused validDocIds to diverge 
between replicas:
   
   S1 (creation time T1) and S2 (creation time T2) were merged into a new 
segment MS1 with creation time T2 (the max of T1 and T2). 
   
   - Node1 retained valid records in S2 and MS1. Because MS1 and S2 have the 
same creationTime, records in S2 were retained, while all records from S1 were 
invalidated and marked valid in MS1.
   - Node2 retained valid records in S2 and MS1. Because MS1 and S2 have the 
same creationTime, records in S2 were retained, while all records from S1 were 
invalidated and marked valid in MS1.
   - After a server replacement, Node2 processed MS1 before S1 and S2. As a 
result, Node2 retained valid records only in MS1; S1 and S2 appeared fully 
invalid on Node2.
   - A subsequent doSnapshot() persisted 0 validDocIds for S2 on Node2. Because 
the merge-compaction task deletes a segment if any replica reports 
totalInvalidDoc == totalDocs, S2 was incorrectly deleted from both replicas, 
even though Node1 still contained valid PKs.
   
   **Result**: Permanent data loss of valid primary keys on one node.
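
The order dependence described above can be reproduced with a small model. This is only a sketch under stated assumptions, not Pinot's implementation: the class, the `Segment` type, and `replaceOnTie` are hypothetical names, all records are assumed to tie on the comparison column, and the tie-break is assumed to keep the current owner unless the incoming segment has a strictly greater creation time, which is the behavior the scenario in this issue implies.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model, NOT Pinot's implementation: every name here is hypothetical.
class TieBreakOrderSketch {
  static final class Segment {
    final String name;
    final long creationTime;
    final List<String> primaryKeys;

    Segment(String name, long creationTime, List<String> primaryKeys) {
      this.name = name;
      this.creationTime = creationTime;
      this.primaryKeys = primaryKeys;
    }
  }

  // Assumed tie-break: the incoming segment takes over a primary key only if
  // its creation time is strictly greater than the current owner's.
  static boolean replaceOnTie(long incomingCreationTime, long currentCreationTime) {
    return incomingCreationTime > currentCreationTime;
  }

  // Loads segments in the given order and returns the number of valid docs
  // each segment ends up owning, keyed by segment name.
  static Map<String, Integer> validDocCounts(List<Segment> loadOrder) {
    Map<String, Segment> ownerByPk = new HashMap<>();
    for (Segment segment : loadOrder) {
      for (String pk : segment.primaryKeys) {
        Segment current = ownerByPk.get(pk);
        if (current == null || replaceOnTie(segment.creationTime, current.creationTime)) {
          ownerByPk.put(pk, segment);
        }
      }
    }
    Map<String, Integer> counts = new TreeMap<>();
    for (Segment segment : loadOrder) {
      counts.put(segment.name, 0);
    }
    for (Segment owner : ownerByPk.values()) {
      counts.merge(owner.name, 1, Integer::sum);
    }
    return counts;
  }
}
```

With S1 (time 100, keys a and b), S2 (time 200, keys c and d), and MS1 (time 200, all four keys), loading in the order S1, S2, MS1 leaves S2 with 2 valid docs and MS1 with 2, while loading MS1 first leaves S2 with 0 and MS1 with 4. In this model, giving MS1 a creation time of 201 makes both load orders converge on MS1 owning all four keys.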
   
   ### Fixes
   
   - Instead of checking if one of the replicas has zero validDocIds, we should 
check that all replicas have 0 validDocIds 
[here](https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/upsertcompactmerge/UpsertCompactMergeTaskGenerator.java#L313).
   - We upload the merged segment with creationTime = max(creationTime of all 
merging segments). As a result, the merged segment and the merging segment with 
the largest creationTime share the same creation time, so ownership of that 
segment's records is decided by the tie-breaking logic in 
shouldReplaceOnComparisonTie. We should instead set creationTime = 
max(creationTime of all merging segments) + 1, so that the new merged segment 
fully replaces all records from the merging segments.
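
The two proposed fixes can be sketched as follows. This is a minimal illustration, not the code in UpsertCompactMergeTaskGenerator: `ReplicaSnapshot` and the method names are hypothetical, and only the `totalInvalidDocs == totalDocs` condition and the `max + 1` rule are taken from this issue.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposed fixes; none of these names are Pinot APIs.
class MergeCompactionFixSketch {
  // Per-replica doc counts for one segment, as reported by a server.
  static final class ReplicaSnapshot {
    final long totalDocs;
    final long totalInvalidDocs;

    ReplicaSnapshot(long totalDocs, long totalInvalidDocs) {
      this.totalDocs = totalDocs;
      this.totalInvalidDocs = totalInvalidDocs;
    }

    boolean isFullyInvalid() {
      return totalInvalidDocs == totalDocs;
    }
  }

  // Behavior described in the issue: delete if ANY replica reports the
  // segment as fully invalid.
  static boolean deletableIfAnyReplicaInvalid(List<ReplicaSnapshot> replicas) {
    return replicas.stream().anyMatch(ReplicaSnapshot::isFullyInvalid);
  }

  // Proposed fix 1: delete only if ALL replicas report the segment as fully
  // invalid. An empty list is treated as "not deletable" to stay conservative.
  static boolean deletableIfAllReplicasInvalid(List<ReplicaSnapshot> replicas) {
    return !replicas.isEmpty() && replicas.stream().allMatch(ReplicaSnapshot::isFullyInvalid);
  }

  // Proposed fix 2: give the merged segment a creation time strictly greater
  // than every merging segment. Throws NoSuchElementException on an empty list.
  static long mergedCreationTime(List<Long> mergingCreationTimes) {
    return Collections.max(mergingCreationTimes) + 1;
  }
}
```

For S2 in the scenario above, Node1 would report valid docs and Node2 would report none, so the any-replica check marks S2 deletable while the all-replica check does not.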
   


