wuzhanpeng edited a comment on pull request #12202: URL: https://github.com/apache/pulsar/pull/12202#issuecomment-929973940
> I suspect the write lock may not be the root cause of this issue.
>
> When checking `isProducersExceeded`, it will load policy data from zk if the topic policy is not configured. Once the first thread calls `isProducersExceeded`, the policy will be cached, and the following checks won't be blocked.
>
> So in my opinion, you'd better check the zk read latency, or whether there is something wrong in `ZooKeeperDataCache`. You can also check the stack for `ZooKeeperDataCache`.

Thank you for the reminder. We also checked why the caching of the ns policy did not take effect. What actually happens is that once the broker gets into the loop-waiting problem, every time `ZooKeeperDataCache#get` times out it also invalidates the z-path. As a result, the next time the ns policy is requested, the data still has to be fetched from zk. Therefore, once the problem occurs, it is difficult for the cache to be populated successfully and for the broker to get out of the predicament.

There are many ways to break the deadlock condition in this scenario. However, IMHO reducing the use of locks may be a more thorough solution. After all, if producers in shared mode account for most of the topics, the lock itself also reduces overall performance. In addition, the logic involved in the cache layer is extensive, so avoiding changes to the current cache design may be the safer solution.
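
To make the invalidate-on-timeout loop easier to picture, here is a minimal sketch. It is not Pulsar's `ZooKeeperDataCache`: the class `InvalidatingCache`, its `invalidateOnTimeout` flag, and the path `/ns/policies` are hypothetical names used only for illustration, and the loader returns a future that never completes to stand in for a zk read that is stuck behind the waiting threads.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical stand-in for a data cache; NOT the real ZooKeeperDataCache. */
class InvalidatingCache {
    private final ConcurrentHashMap<String, CompletableFuture<String>> entries =
            new ConcurrentHashMap<>();
    private final Function<String, CompletableFuture<String>> loader;
    private final boolean invalidateOnTimeout;
    private final AtomicInteger loads = new AtomicInteger();

    InvalidatingCache(Function<String, CompletableFuture<String>> loader,
                      boolean invalidateOnTimeout) {
        this.loader = loader;
        this.invalidateOnTimeout = invalidateOnTimeout;
    }

    /** Returns the cached value, issuing a backend load on a cache miss. */
    String get(String path, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        CompletableFuture<String> future = entries.computeIfAbsent(path, p -> {
            loads.incrementAndGet();   // count every load sent to the backend
            return loader.apply(p);
        });
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            if (invalidateOnTimeout) {
                // Dropping the entry means the next caller starts a fresh load.
                entries.remove(path, future);
            }
            throw e;
        }
    }

    int loadCount() {
        return loads.get();
    }
}

class InvalidatingCacheDemo {
    /** Calls get() three times against a stalled backend and returns the load count. */
    static int loadsAfterThreeTimeouts(boolean invalidateOnTimeout) throws Exception {
        // A future that never completes simulates a zk read that cannot finish.
        InvalidatingCache cache =
                new InvalidatingCache(p -> new CompletableFuture<>(), invalidateOnTimeout);
        for (int i = 0; i < 3; i++) {
            try {
                cache.get("/ns/policies", 10);
            } catch (TimeoutException expected) {
                // Every call times out because the backend is stalled.
            }
        }
        return cache.loadCount();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("loads with invalidation: " + loadsAfterThreeTimeouts(true));
        System.out.println("loads without invalidation: " + loadsAfterThreeTimeouts(false));
    }
}
```

With invalidation enabled, each timed-out call discards the pending entry, so every later caller starts a new backend read and the cache never fills. The sketch only illustrates that behavior; it is not a proposed change to the cache layer.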
