Jackie-Jiang commented on code in PR #12037:
URL: https://github.com/apache/pinot/pull/12037#discussion_r1410118166

##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -323,11 +347,13 @@ static class RecordLocation {
     private final IndexSegment _segment;
     private final int _docId;
     private final Comparable _comparisonValue;
+    private final boolean _isDeletedRecord;

Review Comment:
   Ideally we don't want to add a field here, since it adds 1 byte per entry. Currently each `RecordLocation` is 4 bytes, and this will make it 5 bytes.

##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -240,6 +243,26 @@ public void doRemoveExpiredPrimaryKeys() {
     persistWatermark(_largestSeenComparisonValue);
   }
 
+  @Override
+  public void doRemoveExpiredDeletedKeys() {
+    double threshold = _largestSeenComparisonValue - _deletedKeysTTL;
+    AtomicInteger numDeletedKeys = new AtomicInteger();
+    _primaryKeyToRecordLocationMap.forEach((primaryKey, recordLocation) -> {

Review Comment:
   Looping over all entries is a very expensive operation, so we should either visit only the deleted records (using a bitmap to find them), or combine this with `doRemoveExpiredPrimaryKeys()` and loop only once.
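
A minimal sketch of the "loop only once" option from the second review comment, to make the suggestion concrete. It is not Pinot code: `SinglePassTtlSketch`, `Location`, `sweep`, and the `metadataTtl` / `deletedKeysTtl` parameters are hypothetical names, the deleted flag is stored on the entry purely to keep the example self-contained (the first comment argues against that in the real class), and the rule that a deleted key is checked against the deleted-keys threshold first is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Pinot code: every name here is invented for illustration.
final class SinglePassTtlSketch {

  // Stand-in for RecordLocation, reduced to the two values the sweep inspects.
  static final class Location {
    final double comparisonValue;
    final boolean deleted;

    Location(double comparisonValue, boolean deleted) {
      this.comparisonValue = comparisonValue;
      this.deleted = deleted;
    }
  }

  // Removes expired entries in a single traversal of the map and returns
  // {count removed by the metadata TTL, count removed by the deleted-keys TTL}.
  static int[] sweep(Map<String, Location> map, double largestSeen,
      double metadataTtl, double deletedKeysTtl) {
    double metadataThreshold = largestSeen - metadataTtl;
    double deletedThreshold = largestSeen - deletedKeysTtl;
    int[] removed = new int[2];
    map.entrySet().removeIf(entry -> {
      Location location = entry.getValue();
      // Assumption: deleted keys are checked against their own threshold first.
      if (location.deleted && location.comparisonValue < deletedThreshold) {
        removed[1]++;
        return true;
      }
      // Any key older than the metadata threshold is dropped as well.
      if (location.comparisonValue < metadataThreshold) {
        removed[0]++;
        return true;
      }
      return false;
    });
    return removed;
  }

  public static void main(String[] args) {
    Map<String, Location> map = new ConcurrentHashMap<>();
    map.put("a", new Location(10, false)); // older than the metadata threshold
    map.put("b", new Location(10, true));  // deleted and older than the deleted-keys threshold
    map.put("c", new Location(95, true));  // deleted but still within the deleted-keys TTL
    map.put("d", new Location(99, false)); // live and recent
    int[] removed = sweep(map, 100, 60, 10);
    System.out.println("expired=" + removed[0] + " deleted=" + removed[1]
        + " remaining=" + map.size());
  }
}
```

Both thresholds are computed up front and every entry is visited once, so adding the deleted-keys check does not add a second full traversal. The bitmap-based alternative from the same comment would avoid visiting non-deleted entries at all, but it depends on segment internals that are not shown in this thread, so it is not sketched here.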