jeqo commented on code in PR #15324:
URL: https://github.com/apache/kafka/pull/15324#discussion_r1482430278


##########
storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerStateManager.java:
##########
@@ -177,7 +177,7 @@ private void addProducerId(long producerId, ProducerStateEntry entry) {
     }
 
     private void removeProducerIds(List<Long> keys) {
-        producers.keySet().removeAll(keys);
+        keys.forEach(producers.keySet()::remove);

Review Comment:
   Interesting, seems to be related. Attaching the flame graph captured during high CPU utilization to spot the root method.
   
   Looking at the `AbstractSet#removeAll` implementation:
   
   ```
       public boolean removeAll(Collection<?> c) {
           Objects.requireNonNull(c);
           boolean modified = false;
   
           if (size() > c.size()) {
               for (Object e : c)
                   modified |= remove(e);
           } else {
               for (Iterator<?> i = iterator(); i.hasNext(); ) {
                   if (c.contains(i.next())) {
                       i.remove();
                       modified = true;
                   }
               }
           }
           return modified;
       }
   
   ```
   
   It seems that in my case it's hitting the second branch, as the time is burning in `AbstractList#contains`.
   
   For the expiration removal to hit the second branch, the number of expired keys must equal the number of producers (it cannot be higher, since the expired keys are a subset of the producers map). This seems plausible: we have hit this issue even when no producers were running (so no new producer ids were being created), but while the cluster was rebalancing (i.e. old producer id snapshots being loaded).
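   
   A quick standalone sketch to sanity-check this (class name, sizes, and the timing code are mine, not part of the PR), comparing both approaches when the key list and the map have the same size:
   
   ```
   import java.util.ArrayList;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   
   public class RemoveAllSketch {
       public static void main(String[] args) {
           int n = 50_000;
           Map<Long, Object> producers = new HashMap<>();
           List<Long> keys = new ArrayList<>();
           for (long i = 0; i < n; i++) {
               producers.put(i, new Object());
               keys.add(i);
           }
   
           // equal sizes -> AbstractSet#removeAll takes the second branch and calls
           // ArrayList#contains once per map entry, i.e. roughly O(n^2)
           long start = System.nanoTime();
           producers.keySet().removeAll(keys);
           System.out.printf("removeAll: %d ms%n", (System.nanoTime() - start) / 1_000_000);
   
           // refill and remove again via the proposed fix: one HashMap#remove per key, O(n)
           for (long i = 0; i < n; i++) {
               producers.put(i, new Object());
           }
           start = System.nanoTime();
           keys.forEach(producers.keySet()::remove);
           System.out.printf("forEach:   %d ms%n", (System.nanoTime() - start) / 1_000_000);
       }
   }
   ```
   
   The gap between the two widens quadratically with `n`, which matches the behaviour seen in the flame graph.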
   
   In hindsight, the JDK implementation could have extended the first condition to also cover the `c.size() == size()` case, since that branch does not depend on the argument collection's `contains` implementation.
   On the other hand, if a `HashSet` were used for `keys` instead of the current `ArrayList`, `removeAll` would behave pretty much the same as the proposed fix.
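   
   For illustration only (a hypothetical variant, not what this PR changes), reusing `producers` and `keys` from the sketch above plus `java.util.Set`/`java.util.HashSet` imports:
   
   ```
           // hypothetical: HashSet#contains is O(1), so AbstractSet#removeAll's second
           // branch stays linear overall, roughly matching the proposed forEach fix
           Set<Long> expiredKeys = new HashSet<>(keys);
           producers.keySet().removeAll(expiredKeys);
   ```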
   
   btw, I will be simplifying the expression even further to:
   
   ```
           keys.forEach(producers::remove);
   ```
   
   Both lead to the same `HashMap#remove` implementation in the end.
   
   We could even go one step further: if the number of expired producer ids equals the number of all producer ids, we could clear the whole map instead of removing entries one by one, since the expired ids come from the same producers map. Something like:
   
   ```
           if (keys.size() == producers.size()) {
               clearProducerIds();
           } else {
               keys.forEach(producers::remove);
               producerIdCount = producers.size();
           }
   ```
   
   but performance-wise, execution time is pretty much the same as the fixed version (linear, removing each key one by one), and readability doesn't improve much.
   
   PS: using JDK 17.


