Jackie-Jiang opened a new pull request #6213: URL: https://github.com/apache/incubator-pinot/pull/6213
## Description
For an upsert table, a record with a newer timestamp replaces the record with an older timestamp. When multiple records have the same timestamp, however, the current implementation does not define which record is preserved.

This PR enhances the PartitionUpsertMetadataManager to preserve the latest ingested record when multiple records have the same timestamp:
- If the 2 records are not in the same segment, preserve the one in the segment with the larger sequence number
- If the 2 records are in the same segment, preserve the one with the larger docId

Note that for tables with a sorted column, the records are re-ordered when the segment is committed, and the re-ordered docIds, rather than the ingestion order, are used to decide which record to preserve.
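
The tie-breaking rules described above can be illustrated with a minimal sketch. This is not the code from the PR: `UpsertTieBreakSketch`, `RecordLocation`, and `shouldReplace` are hypothetical names chosen for illustration, and the sketch assumes both records share the same primary key and partition, so that a different sequence number implies a different segment.

```java
// Hypothetical sketch of the tie-breaking rules; names are illustrative, not Pinot's actual API.
final class UpsertTieBreakSketch {

    /** Where a record lives: the sequence number of its segment, its docId, and its timestamp. */
    static final class RecordLocation {
        final int segmentSequenceNumber;
        final int docId;
        final long timestamp;

        RecordLocation(int segmentSequenceNumber, int docId, long timestamp) {
            this.segmentSequenceNumber = segmentSequenceNumber;
            this.docId = docId;
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns true if the candidate should replace the current record.
     * Assumes both records have the same primary key and belong to the same partition.
     */
    static boolean shouldReplace(RecordLocation current, RecordLocation candidate) {
        // A newer timestamp always wins; an older one always loses.
        if (candidate.timestamp != current.timestamp) {
            return candidate.timestamp > current.timestamp;
        }
        // Same timestamp, different segments: the larger sequence number wins.
        if (candidate.segmentSequenceNumber != current.segmentSequenceNumber) {
            return candidate.segmentSequenceNumber > current.segmentSequenceNumber;
        }
        // Same timestamp, same segment: the larger docId wins.
        return candidate.docId > current.docId;
    }
}
```

For example, under these assumptions a record at docId 11 replaces a record at docId 10 in the same segment when both carry the same timestamp, while a record in a segment with a smaller sequence number does not replace one in a later segment. For a table with a sorted column, the docIds passed in would be the re-ordered ones assigned when the segment is committed.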
