divijvaidya commented on PR #13078:
URL: https://github.com/apache/kafka/pull/13078#issuecomment-1385362966

   **Proposal: Using concurrent map for ProducerStateManager.producers 
(currently a mutable.Map)**
   
   **Pros**
   - Simplifies code (prevents future bugs by accidental update to map without 
updating metrics) to implement KIP-847
   - Does not have any locking concerns since calculation of size() in a 
concurrent map does't require locks. [From Java docs for 
`concurrentMap`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ConcurrentHashMap.html):
   > Bear in mind that the results of aggregate status methods including size, 
isEmpty, and containsValue are typically useful only when a map is not 
undergoing concurrent updates in other threads. Otherwise the results of these 
methods reflect transient states that may be adequate for monitoring or 
estimation purposes, but not for program control.
   
   **Cons**
   - The metrics for number of producers may be inaccurate when multiple 
threads are modifying the map concurrently. `producers` map is modified during 
`truncation` (when the replica is catching up) or when a producer is expired 
due to timeout or on every completed transaction. That means it is modified 
fairly frequently. Hence, possibility of the `producerId` metric displaying 
inaccurate (usually off-by-one since there are at max 2 threads modifying this 
map) information is fairly high.
   
   **My opinion**
   In cases when we have just one (or two) producers, this metric would be 
highly unreliable (not just stale) as it provides an "approximation" of size(). 
It is not un-common to produce data from limited set of producers (with a large 
number of consumers) and hence, I would incline towards sticking to current 
approach of keeping this metric accurate.
   
   Having said that, I don't have a strong opinion here. We can always come 
back and make this metric accurate if we find that inaccuracy is causing 
problems while troubleshooting. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to