tarun11Mavani opened a new pull request, #17356:
URL: https://github.com/apache/pinot/pull/17356

   ### Problem
   During upsert compact merge operations, the merged segment is created with a 
creation time equal to the maximum creation time of the segments being merged. If 
the segments being merged are UPLOADED segments (merged earlier), they can share 
the same creation time as the new merged segment. As a result, records in the 
segment with the highest creation time are not replaced, because of the tie-breaking logic in 
[shouldReplaceOnComparisonTie](https://github.com/apache/pinot/blob/52db36c816f91ef8887fddd0beade5d169824296/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java#L518).
   Not replacing records from this segment can lead to data loss, as discussed 
in #17337.
   
   ### Solution
   We set the creation time of the merged segment to max(creation time of all 
merging segments) + 1. 
   This ensures that the merged segment takes priority and that all records in 
the existing segments are replaced with records from the new merged segment. 
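
   A minimal sketch of the idea, not the code in this PR. The class and method 
names below are hypothetical, creation times are assumed to be epoch 
milliseconds, and `replacesOnTie` is a simplified stand-in that assumes the 
tie-break only lets a strictly newer creation time win (the real 
`shouldReplaceOnComparisonTie` has more conditions).

```java
import java.util.Collections;
import java.util.List;

public class MergedSegmentCreationTime {

  // Simplified stand-in for the tie-break (assumption): when comparison values
  // are equal, the incoming segment replaces the current one only if its
  // creation time is strictly greater.
  static boolean replacesOnTie(long incomingCreationTimeMs, long currentCreationTimeMs) {
    return incomingCreationTimeMs > currentCreationTimeMs;
  }

  // Creation time for the merged segment: max over the merging segments, plus 1.
  static long mergedCreationTime(List<Long> mergingCreationTimesMs) {
    if (mergingCreationTimesMs.isEmpty()) {
      throw new IllegalArgumentException("at least one merging segment is required");
    }
    long max = Collections.max(mergingCreationTimesMs);
    // addExact throws ArithmeticException instead of silently overflowing.
    return Math.addExact(max, 1L);
  }

  public static void main(String[] args) {
    List<Long> merging = List.of(1_000L, 2_000L, 2_000L);
    long newestMerging = Collections.max(merging);

    // Before: merged segment reuses the max creation time, so the tie is lost.
    System.out.println(replacesOnTie(newestMerging, newestMerging)); // false

    // After: max + 1 is strictly newer than every merging segment.
    long merged = mergedCreationTime(merging);
    System.out.println(replacesOnTie(merged, newestMerging)); // true
  }
}
```

   Because the new value is strictly greater than every merging segment's 
creation time, the merged segment wins any creation-time tie-break against them 
under the assumption above.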
   
   ### Test
   Tested in a test cluster. Verified that the new merged segment has the 
expected creation time. Validated that all records from the merging segments 
were replaced with records from the merged segment. 
   All compacted segments were deleted successfully in the next task iteration. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

