BewareMyPower opened a new issue, #23215: URL: https://github.com/apache/pulsar/issues/23215
### Search before asking - [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar. ### Motivation There was a discussion at Sep. 2023 before to replace Customized Map with ConcurrentOpenHashMap. In this issue, I'd focus on the `ConcurrentOpenHashMap`. Here is the only advantage of this map. > Provides similar methods as a {@code ConcurrentMap<K,V>} but since it's an open hash map with linear probing, no node allocations are required to store the values. Let me count the disadvantages. ### 1. Bad performance This map was aded in the initial commit (on 2016). However, this implementation was just based on the Java 7 implementation of the `ConcurrentHashMap`, which uses a segment based lock. Actually, this solution was discarded by the Java team since Java 8. https://github.com/apache/pulsar/pull/20647/#issuecomment-1607257960 did a benchmark and found the performance was much worse than the current `ConcurrentHashMap` provided by Java library. We can also search the PROs of the Java 8 design in network, or just ask for ChatGPT. Besides, the frequently used `keys()` and `values()` methods just copy the keys and values to a new list. While the `ConcurrentHashMap` just returns a thread-safe internal view that users can choose whether to make a copy. Anyway, to prove the performance is worse than `ConcurrentHashMap`, we need to have more tests and research. So it's the least important reason. ### 2. Lack of the updates This class was rarely updated. What I can remember is the shrink support two years ago. https://github.com/apache/pulsar/pull/14663 From https://github.com/apache/bookkeeper/pull/3061, we can see the motivation is the frequently appeared Full GC caused by this implementation. However, adding a `shrink` method makes it harder to use. There are already many parameters to tune, see it's builder: ```java public static class Builder<K, V> { int expectedItems = DefaultExpectedItems; int concurrencyLevel = DefaultConcurrencyLevel; float mapFillFactor = DefaultMapFillFactor; float mapIdleFactor = DefaultMapIdleFactor; float expandFactor = DefaultExpandFactor; float shrinkFactor = DefaultShrinkFactor; boolean autoShrink = DefaultAutoShrink; ``` Many `xxxFactor`s and the concurrency level. It's hard to determine a proper value by default. However, it makes new developers hard to modify it. ### 3. Bad debug experience When I debugged the topics maintained in a `BrokerService`. <img width="1031" alt="image" src="https://github.com/user-attachments/assets/5dfb13d9-7897-429e-ab06-4888595a06c1"> As you can see. There are 16 sections. And I have to iterate over all these sections and expand the `table` array to find the target topic. Let's compare it with the official `ConcurrentHashMap` (I replaced it locally) <img width="1211" alt="image" src="https://github.com/user-attachments/assets/49fbb5ba-7cb0-4225-922f-737af90ef5a7"> ### 4. Not friendly to new developers Many places just use it as a concurrent hash map. **What's the reason for new developers to not use the official `ConcurrentHashMap`, which is developed and consistently improved by a professional team?** Just to reduce the node allocation? With the improving JVM GC? As I've mentioned, this class might be introduced at the Java 7 era. Now the minimum required Java version of broker side is 17. We have ZGC. We have Shenandoah GC. We have many more JVM developers developing better GC. I'm suspecting if the advantage makes sense. I cannot think of a reason to choose this hard-to-maintain class rather than well-maintained official `ConcurrentHashMap`. For example, when I maintained a KoP, I encountered the deadlock of `ConcurrentLongHashMap` (maybe the similar implementation). https://github.com/streamnative/kop/pull/620 And it's hard to know if this case is fixed. So I have to switch to the official `ConcurrentHashMap`. ### Solution Replace `ConcurrentOpenHashMap` with the official Java `ConcurrentHashMap`. ### Alternatives N/A ### Anything else? N/A ### Are you willing to submit a PR? - [X] I'm willing to submit a PR! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
