lhotari opened a new pull request #9764: URL: https://github.com/apache/pulsar/pull/9764
### Motivation Currently `ConcurrentOpenHashMap.forEach` keeps a read lock in each section for the duration of the processing. This can lead to various issues when I/O operations are done in the processing function. This is a common patter in Pulsar code base. The I/O usually happens to Zookeeper and it leads to threads piling up in waiting for locks in ConcurrentOpenHashMap when there are both reads and writes for the same section in concurrently executing threads. It is also possible that long living locks lead to dead locks. This is the current implementation of `ConcurrentOpenHashMap.forEach` https://github.com/apache/pulsar/blob/34ca8938ca13ea60b05d164c27d9755855caf87c/pulsar-common/src/main/java/org/apache/pulsar/common/util/collections/ConcurrentOpenHashMap.java#L157-L161 and the referenced `Section.forEach` https://github.com/apache/pulsar/blob/34ca8938ca13ea60b05d164c27d9755855caf87c/pulsar-common/src/main/java/org/apache/pulsar/common/util/collections/ConcurrentOpenHashMap.java#L356-L395 ### Modifications - Add a new method `ConcurrentOpenHashMap.forEachInSnapshot` which makes a copy of the entries in each section before processing the entries. - Replace all current usages of `forEach` with `forEachInSnapshot` There are tradeoffs involved. Some memory allocations will have to be made to make a copy. However, allocations aren't a real issue. JVMs are really efficient for such short time allocations and the allocations aren't causing new bottlenecks or performance issues. The other tradeoff is a related to data consistency. The entry might have already been removed from the collection at the time it is processed. This is also a non-issue in most cases since it would be a bad solution in the first place to determine state based on the membership in the collection. Since the previous `forEach` implementation remains, it's possible to choose between the usage of `forEach` or `forEachInSnapshot` in each specific use case. If this feature isn't needed, it is possible to replace the `forEach` implemention with the snapshotting implemention completely. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
