Jackie-Jiang commented on code in PR #12037:
URL: https://github.com/apache/pinot/pull/12037#discussion_r1410118166
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -323,11 +347,13 @@ static class RecordLocation {
private final IndexSegment _segment;
private final int _docId;
private final Comparable _comparisonValue;
+ private final boolean _isDeletedRecord;
Review Comment:
Ideally we don't want to add a field here, since this will add 1 byte per
entry. Currently each `RecordLocation` is 4 bytes, and this will make it 5
bytes.
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -240,6 +243,26 @@ public void doRemoveExpiredPrimaryKeys() {
persistWatermark(_largestSeenComparisonValue);
}
+ @Override
+ public void doRemoveExpiredDeletedKeys() {
+ double threshold = _largestSeenComparisonValue - _deletedKeysTTL;
+ AtomicInteger numDeletedKeys = new AtomicInteger();
+ _primaryKeyToRecordLocationMap.forEach((primaryKey, recordLocation) -> {
Review Comment:
Looping over all entries is a very expensive operation, so we should either
iterate over only the deleted records (using a bitmap to find them), or combine
this with `doRemoveExpiredPrimaryKeys()` and loop only once.
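
A rough sketch of the "loop only once" option, as an illustration rather than the actual Pinot code. The class `SinglePassTtlSketch`, the simplified `Location` type, and the `removeExpired` method are hypothetical names. It assumes both TTLs are enabled, comparison values are numeric, and a key expires when its comparison value falls below `largestSeen - ttl`:

```java
import java.util.Iterator;
import java.util.Map;

final class SinglePassTtlSketch {
  /** Hypothetical, simplified stand-in for RecordLocation. */
  static final class Location {
    final double comparisonValue;
    final boolean deleted;

    Location(double comparisonValue, boolean deleted) {
      this.comparisonValue = comparisonValue;
      this.deleted = deleted;
    }
  }

  /**
   * Removes expired keys in a single traversal of the map.
   * Returns {keys removed by the metadata TTL, deleted keys removed by the deleted-keys TTL}.
   */
  static int[] removeExpired(Map<String, Location> map, double largestSeen, double metadataTtl,
      double deletedKeysTtl) {
    // Assumption: each threshold is "largest seen comparison value minus TTL".
    double metadataThreshold = largestSeen - metadataTtl;
    double deletedThreshold = largestSeen - deletedKeysTtl;
    int[] removed = new int[2];
    Iterator<Map.Entry<String, Location>> it = map.entrySet().iterator();
    while (it.hasNext()) {
      Location loc = it.next().getValue();
      if (loc.comparisonValue < metadataThreshold) {
        // Older than the metadata TTL: drop regardless of the deleted flag.
        it.remove();
        removed[0]++;
      } else if (loc.deleted && loc.comparisonValue < deletedThreshold) {
        // Deleted record older than the deleted-keys TTL.
        it.remove();
        removed[1]++;
      }
    }
    return removed;
  }
}
```

Both checks run on each entry in the same pass, so the map is traversed once instead of twice. The bitmap option would avoid the full traversal entirely, but it needs a way to get from a deleted doc id back to its primary key, which this sketch does not cover.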
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]