Jackie-Jiang commented on code in PR #12037:
URL: https://github.com/apache/pinot/pull/12037#discussion_r1410118166


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -323,11 +347,13 @@ static class RecordLocation {
     private final IndexSegment _segment;
     private final int _docId;
     private final Comparable _comparisonValue;
+    private final boolean _isDeletedRecord;

Review Comment:
   Ideally we don't want to add a field here, since it adds 1 byte per 
entry. Currently each `RecordLocation` is 4 bytes, and this will make it 5 
bytes.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##########
@@ -240,6 +243,26 @@ public void doRemoveExpiredPrimaryKeys() {
     persistWatermark(_largestSeenComparisonValue);
   }
 
+  @Override
+  public void doRemoveExpiredDeletedKeys() {
+    double threshold = _largestSeenComparisonValue - _deletedKeysTTL;
+    AtomicInteger numDeletedKeys = new AtomicInteger();
+    _primaryKeyToRecordLocationMap.forEach((primaryKey, recordLocation) -> {

Review Comment:
   Looping over all entries is a very expensive operation, so we should either 
iterate over only the deleted records (using a bitmap to find them), or combine 
this with `doRemoveExpiredPrimaryKeys()` and loop only once.
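
   A rough sketch of the second option (one loop that applies both TTLs) 
might look like the following. This is not the Pinot implementation: 
`SinglePassTtlSketch`, `Location`, `removeExpired` and its parameters are 
hypothetical, simplified stand-ins, and it assumes both TTLs are enabled and 
that comparison values are plain doubles.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SinglePassTtlSketch {
  // Hypothetical, simplified stand-in for RecordLocation (assumption, not Pinot's class).
  static final class Location {
    final double comparisonValue;
    final boolean deleted;

    Location(double comparisonValue, boolean deleted) {
      this.comparisonValue = comparisonValue;
      this.deleted = deleted;
    }
  }

  /**
   * Walks the map once and applies both TTL checks per entry.
   * Returns {number of expired live keys removed, number of expired deleted keys removed}.
   */
  static int[] removeExpired(Map<String, Location> map, double largestSeen,
      double metadataTtl, double deletedKeysTtl) {
    double keyThreshold = largestSeen - metadataTtl;
    double deletedThreshold = largestSeen - deletedKeysTtl;
    int[] removed = new int[2];
    Iterator<Map.Entry<String, Location>> it = map.entrySet().iterator();
    while (it.hasNext()) {
      Location loc = it.next().getValue();
      if (loc.deleted && loc.comparisonValue < deletedThreshold) {
        it.remove();   // deleted record older than the deleted-keys TTL
        removed[1]++;
      } else if (loc.comparisonValue < keyThreshold) {
        it.remove();   // any record older than the metadata TTL
        removed[0]++;
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    Map<String, Location> map = new ConcurrentHashMap<>();
    map.put("a", new Location(10, false));
    map.put("b", new Location(80, true));
    map.put("c", new Location(95, true));
    int[] removed = removeExpired(map, 100, 50, 10);
    System.out.println("expired=" + removed[0] + " deleted=" + removed[1]
        + " remaining=" + map.size());
  }
}
```

   The point is only that each entry is visited once per cleanup cycle 
instead of twice; the bitmap option would avoid the full scan altogether by 
visiting only the deleted docs.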



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

