Re: [PR] BP-66: support throttling for zookeeper read of rereplication [bookkeeper]

via GitHub Sun, 28 Apr 2024 19:10:52 -0700


thetumbled commented on PR #4258:
URL: https://github.com/apache/bookkeeper/pull/4258#issuecomment-2081781109


   > You've mentioned 400 bookies in the cluster. In such configuration (and in 
general) I'd recommend to not run autorecovery on every bookie, and not run it 
as a part of a bookie process. I'd go as far as to call it the best practice.
   > 
   > E.g. when autorecovery needs to run it will compete for resources with the 
bookie, potentially OOMing it (though that has been improved over the years 
IIRC) etc. Normally one does not need 400 AR services anyway. I'd run 3, maybe 
5 as separate processes, even if they are running on a subset of bookie nodes 
(better - just separately).
   > 
   > With 400 AR you are also getting into the case where they frequently collide 
trying to grab a ledger for rereplication from ZK and back off/wait, thus many of 
the AR services won't be productive anyway.
   > 
   > With all that in mind, you can tune dedicated AR service to have stricter 
settings for some of the existing throttles, such as:
   > 
   > ```
   > zkRequestRateLimit
   > auditorMaxNumberOfConcurrentOpenLedgerOperations
   > rereplicationEntryBatchSize
   > ```
   > 
   > and others
   > 
   > See detailed descriptions in 
https://github.com/apache/bookkeeper/blob/master/conf/bk_server.conf and in the 
corresponding code for the configs.
   > 
   > I think this should cover your usecase without any changes unless I have 
missed some nuanced point.
   
   Deploying a dedicated cluster for AR is one way to relieve the pressure on 
ZooKeeper, but we would prefer to solve this problem without adding complexity 
to the cluster or making it harder to maintain. This proposal (BP-66) fixes our 
problem well, and we have not observed any negative effects.
   As for the concern about replicators colliding, I have studied this issue 
before. When a replicator tries to acquire a task, it shuffles the znode list 
before iterating over it, so we have not run into this problem so far.
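   To illustrate why shuffling helps, here is a minimal sketch of the idea, not 
the actual BookKeeper code. The class `ShuffledTaskPicker`, its `acquire` 
method, and the in-memory `locked` set (a stand-in for the ZooKeeper lock 
znodes) are all hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ShuffledTaskPicker {
    // Hypothetical stand-in for the per-ledger lock znodes in ZooKeeper.
    private final Set<String> locked = ConcurrentHashMap.newKeySet();

    /**
     * Tries to acquire one under-replicated ledger.
     * The candidate list is shuffled first, so concurrent replicators
     * are unlikely to all start with the same ledger.
     * Returns the acquired ledger name, or null if every ledger is taken.
     */
    public String acquire(List<String> underReplicated, Random rng) {
        // Copy before shuffling so the caller's list is left untouched.
        List<String> candidates = new ArrayList<>(underReplicated);
        Collections.shuffle(candidates, rng);
        for (String ledger : candidates) {
            // Set.add returns true only for the first caller, which
            // mimics winning the race to create the lock znode.
            if (locked.add(ledger)) {
                return ledger;
            }
        }
        return null;
    }
}
```

   Without the shuffle, every replicator would walk the list in the same order 
and contend for the first entry; with it, the first attempt of each replicator 
is spread across the whole list.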
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
