divijvaidya commented on PR #13078: URL: https://github.com/apache/kafka/pull/13078#issuecomment-1385362966
**Proposal: Using concurrent map for ProducerStateManager.producers (currently a mutable.Map)** **Pros** - Simplifies code (prevents future bugs by accidental update to map without updating metrics) to implement KIP-847 - Does not have any locking concerns since calculation of size() in a concurrent map does't require locks. [From Java docs for `concurrentMap`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ConcurrentHashMap.html): > Bear in mind that the results of aggregate status methods including size, isEmpty, and containsValue are typically useful only when a map is not undergoing concurrent updates in other threads. Otherwise the results of these methods reflect transient states that may be adequate for monitoring or estimation purposes, but not for program control. **Cons** - The metrics for number of producers may be inaccurate when multiple threads are modifying the map concurrently. `producers` map is modified during `truncation` (when the replica is catching up) or when a producer is expired due to timeout or on every completed transaction. That means it is modified fairly frequently. Hence, possibility of the `producerId` metric displaying inaccurate (usually off-by-one since there are at max 2 threads modifying this map) information is fairly high. **My opinion** In cases when we have just one (or two) producers, this metric would be highly unreliable (not just stale) as it provides an "approximation" of size(). It is not un-common to produce data from limited set of producers (with a large number of consumers) and hence, I would incline towards sticking to current approach of keeping this metric accurate. Having said that, I don't have a strong opinion here. We can always come back and make this metric accurate if we find that inaccuracy is causing problems while troubleshooting. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org