Jackie-Jiang opened a new pull request #6213:
URL: https://github.com/apache/incubator-pinot/pull/6213


   ## Description
   For an upsert table, a record with a newer timestamp replaces the existing 
record for the same primary key that has an older timestamp. When multiple 
records have the same timestamp, however, which record is preserved is 
undefined in the current implementation.
   This PR enhances the PartitionUpsertMetadataManager to preserve the most 
recently ingested record when multiple records have the same timestamp:
   - If the 2 records are not in the same segment, preserve the one in the 
segment with the larger sequence number
   - If the 2 records are in the same segment, preserve the one with the larger 
docId
   
   Note that for tables with a sorted column, the records are re-ordered when 
the segment is committed, so the re-ordered docIds, rather than the ingestion 
order, decide which record is preserved.
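
The tie-breaking rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `UpsertTieBreakSketch` class, its `RecordLocation` holder, and the `shouldReplace` and `upsert` helpers are hypothetical names, and the sketch assumes that within one partition a segment is identified by its sequence number.

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertTieBreakSketch {

  /** Hypothetical holder for where a record lives; not Pinot's actual class. */
  public static final class RecordLocation {
    public final int segmentSequenceNumber;
    public final int docId;
    public final long timestamp;

    public RecordLocation(int segmentSequenceNumber, int docId, long timestamp) {
      this.segmentSequenceNumber = segmentSequenceNumber;
      this.docId = docId;
      this.timestamp = timestamp;
    }
  }

  /** Returns true if the incoming record should replace the current one. */
  public static boolean shouldReplace(RecordLocation current, RecordLocation incoming) {
    // A strictly newer timestamp always wins; a strictly older one always loses.
    if (incoming.timestamp != current.timestamp) {
      return incoming.timestamp > current.timestamp;
    }
    // Same timestamp, different segments: the larger sequence number wins.
    if (incoming.segmentSequenceNumber != current.segmentSequenceNumber) {
      return incoming.segmentSequenceNumber > current.segmentSequenceNumber;
    }
    // Same timestamp, same segment: the larger docId wins.
    return incoming.docId > current.docId;
  }

  /** Applies the rule to a primary-key map (assumed layout, for illustration only). */
  public static void upsert(Map<String, RecordLocation> locations, String primaryKey,
      RecordLocation incoming) {
    locations.merge(primaryKey, incoming,
        (current, candidate) -> shouldReplace(current, candidate) ? candidate : current);
  }

  public static void main(String[] args) {
    Map<String, RecordLocation> locations = new HashMap<>();
    upsert(locations, "pk-1", new RecordLocation(3, 10, 1000L));
    upsert(locations, "pk-1", new RecordLocation(3, 12, 1000L)); // same segment, larger docId
    upsert(locations, "pk-1", new RecordLocation(2, 99, 1000L)); // older segment, ignored
    RecordLocation kept = locations.get("pk-1");
    System.out.println(kept.segmentSequenceNumber + ":" + kept.docId);
  }
}
```

For a committed segment of a table with a sorted column, the `docId` passed to such a comparison would be the re-ordered one, so the outcome of a tie within a segment can differ from the ingestion order.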




