thetumbled commented on PR #4258: URL: https://github.com/apache/bookkeeper/pull/4258#issuecomment-2081781109
> You've mentioned 400 bookies in the cluster. In such configuration (and in general) I'd recommend to not run autorecovery on every bookie, and not run it as a part of a bookie process. I'd go as far as to call it the best practice.
>
> E.g. when autorecovery needs to run it will compete for resources with the bookie, potentially OOMing it (though that has been improved over the years IIRC) etc. Normally one does not need 400 AR services anyway. i'd run 3, maybe 5 as a separate processes even if they are running on a subset of bookie nodes (better - just separately).
>
> With 400 AR you also getting into the case when they frequently collide trying to grab ledger for rereplication from ZK and backoff/wait, thus many of the AR services won't be productive anyway.
>
> With all that in mind, you can tune dedicated AR service to have stricter settings for some of the existing throttles, such as:
>
> ```
> zkRequestRateLimit
> auditorMaxNumberOfConcurrentOpenLedgerOperations
> rereplicationEntryBatchSize
> ```
>
> and others
>
> See detailed descriptions in https://github.com/apache/bookkeeper/blob/master/conf/bk_server.conf and in the corresponding code for the configs.
>
> I think this should cover your usecase without any changes unless I have missed some nuanced point.

Deploying a dedicated cluster for AR is one way to relieve the pressure on ZK, but we would prefer to solve this problem without adding complexity to the cluster or making it harder to maintain. This PIP fixes our problem well without any negative effects.

As for the concern about replicators colliding, I have studied this issue before. When a replicator tries to acquire a task, it shuffles the znode list before iterating over it, so we have not run into that problem so far.
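
To illustrate the shuffle point, here is a minimal sketch of the idea rather than the actual BookKeeper code. The class `ShuffledTaskAcquisition`, its `acquire` method, and the in-memory map that stands in for the lock znodes in ZK are all hypothetical names made up for this example. Each worker shuffles its own copy of the pending list before trying to claim an entry, so concurrent workers tend to start at different entries instead of all contending for the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; not taken from the BookKeeper code base. */
public class ShuffledTaskAcquisition {
    // Assumption: an in-memory map stands in for the lock znodes (task id -> owner).
    private final Map<String, String> locks = new ConcurrentHashMap<>();

    /** Tries to claim one pending task; returns its id, or empty if all are taken. */
    public Optional<String> acquire(String workerId, List<String> pendingTasks, Random rnd) {
        // Shuffle a private copy so each worker walks the list in its own order.
        List<String> candidates = new ArrayList<>(pendingTasks);
        Collections.shuffle(candidates, rnd);
        for (String task : candidates) {
            // putIfAbsent is atomic: only one worker can become the owner of a task.
            if (locks.putIfAbsent(task, workerId) == null) {
                return Optional.of(task);
            }
        }
        return Optional.empty();
    }

    /** Snapshot of the current task -> owner assignments. */
    public Map<String, String> owners() {
        return new HashMap<>(locks);
    }

    public static void main(String[] args) {
        ShuffledTaskAcquisition registry = new ShuffledTaskAcquisition();
        List<String> pending = Arrays.asList("ledger-1", "ledger-2", "ledger-3", "ledger-4");
        Random rnd = new Random();
        for (int i = 0; i < 5; i++) {
            String worker = "replicator-" + i;
            Optional<String> claimed = registry.acquire(worker, pending, rnd);
            System.out.println(worker + " -> " + claimed.orElse("(nothing left)"));
        }
    }
}
```

With four pending entries and five workers, each of the first four workers ends up with a distinct entry and the fifth finds nothing left; which worker gets which entry varies from run to run because of the shuffle.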