wuzhanpeng edited a comment on pull request #12202:
URL: https://github.com/apache/pulsar/pull/12202#issuecomment-929973940


   > I doubt that the write lock is the root cause of this issue.
   > 
   > When checking `isProducersExceeded`, the broker loads the policy data from zk if 
the topic policy is not configured. Once the first thread has called `isProducersExceeded`, 
the policy is cached, and the following checks won't be blocked.
   > 
   > So in my opinion, you'd better check the zk read latency, or whether 
something is wrong in `ZooKeeperDataCache`. You can also check the stack traces for 
`ZooKeeperDataCache`.
   
   Thank you for your reminder~ 
   
   We also checked why caching of the ns policy did not take effect. 
What actually happens is that once the broker gets stuck in this circular wait, 
every `ZooKeeperDataCache#get` call that times out also invalidates the 
z-path. As a result, the next read of the ns policy 
still has to fetch the data from zk. Therefore, once the problem occurs, it is 
difficult for the cache to be populated and for the broker to recover on its own.
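
   To make the loop concrete, here is a minimal sketch of a cache that 
invalidates an entry whenever a read times out. It is not Pulsar's 
`ZooKeeperDataCache`; the class name, the loader, the 20 ms timeout, and the 
example path are all assumptions made for illustration. A stalled zk read is 
modeled as a future that never completes.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a timeout-invalidating cache; NOT Pulsar's ZooKeeperDataCache.
final class TimeoutInvalidatingCacheSketch {
    private final Map<String, CompletableFuture<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, CompletableFuture<String>> loader;
    private final long timeoutMillis;
    final AtomicInteger loads = new AtomicInteger(); // counts reads that reach the backing store

    TimeoutInvalidatingCacheSketch(Function<String, CompletableFuture<String>> loader,
                                   long timeoutMillis) {
        this.loader = loader;
        this.timeoutMillis = timeoutMillis;
    }

    Optional<String> get(String path) {
        // Start a load only if the path has no cached (possibly still pending) future.
        CompletableFuture<String> future = cache.computeIfAbsent(path, p -> {
            loads.incrementAndGet();
            return loader.apply(p);
        });
        try {
            return Optional.of(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException | ExecutionException e) {
            // The step described above: a failed or timed-out read drops the entry,
            // so the next caller has to go back to the backing store.
            cache.remove(path, future);
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    // Runs three reads of one (made-up) policy path and reports how many hit the store.
    static int loadsAfterThreeGets(boolean stalled) {
        Function<String, CompletableFuture<String>> loader = stalled
                ? p -> new CompletableFuture<String>()               // never completes
                : p -> CompletableFuture.completedFuture("policies"); // answers immediately
        TimeoutInvalidatingCacheSketch c = new TimeoutInvalidatingCacheSketch(loader, 20);
        for (int i = 0; i < 3; i++) {
            c.get("/admin/policies/tenant/ns");
        }
        return c.loads.get();
    }

    public static void main(String[] args) {
        System.out.println("healthy loads: " + loadsAfterThreeGets(false)); // healthy loads: 1
        System.out.println("stalled loads: " + loadsAfterThreeGets(true));  // stalled loads: 3
    }
}
```

   In the healthy case the first read populates the cache and later reads are 
served from it. In the stalled case every read times out, removes the entry, 
and forces the next read back to the store, so the cache never warms up.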
   
   There are many ways to break the deadlock condition in this scenario. 
However, IMHO reducing the use of locks may be a more thorough solution. After 
all, if shared-mode producers account for most of the topics, the 
lock itself also reduces overall performance. In 
addition, the cache layer is involved in a lot of logic, so avoiding 
changes to the current cache design may be the safer option.



