wolfstudy opened a new pull request, #3799: URL: https://github.com/apache/bookkeeper/pull/3799
Descriptions of the changes in this PR:

### Motivation

In the current implementation, Auto Recovery does not reconnect automatically when the session between Bookie and ZK expires. The Bookie server itself does have reconnection logic, so when Auto Recovery and the Bookie server are deployed together, a ZK session expiration stops Auto Recovery, which in turn brings down the entire Bookie service.

In a production environment, however, session expiration between Bookie and ZK is a frequent event. Consider the following scenario:

From the ZK side, we can see that the current ZK zxid has overflowed, triggering XidRolloverException.

When this happens, the session between Bookie and ZK expires.

Since Auto Recovery has no reconnection logic, the entire Bookie service goes down.

### Changes

- Add retry logic for auto recovery

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
