wolfstudy opened a new pull request, #3799:
URL: https://github.com/apache/bookkeeper/pull/3799

   Descriptions of the changes in this PR:
   
   
   
   ### Motivation
   
   In the current implementation, Auto Recovery does not reconnect 
automatically when the Session between Bookie and ZK expires. The bookie 
server itself does have reconnection logic, but when Auto Recovery and Bookie 
Server are deployed together, a Session expiration with ZK stops Auto Recovery 
and takes the entire Bookie service down with it.
   
   In a production environment, session expiration between Bookie and ZK 
happens frequently, for example in the following scenario:
   
   
   From the ZK side, we can see that the current ZK zxid has overflowed, 
triggering XidRolloverException:
   
   
![image](https://user-images.githubusercontent.com/20965307/220550723-4a40cdb4-6fed-417b-9799-32fa624119e0.png)
   
![image](https://user-images.githubusercontent.com/20965307/220550795-b69fc1b9-0b39-4af0-b04f-00a3cdc3adc2.png)
   
   When the above situation occurs, we can see that the Session between Bookie 
and ZK expires:
   
   
![image](https://user-images.githubusercontent.com/20965307/220548677-0e754d4d-5d05-4204-ba43-5501795e7675.png)
   
![image](https://user-images.githubusercontent.com/20965307/220551410-bf1db659-966f-4d55-b709-b2e8f5da9ece.png)
   
   Since Auto Recovery has no reconnection logic, the entire Bookie service 
goes down.
   
   ### Changes
   
   - Add retry logic for auto recovery
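
   As a rough illustration of what retry logic of this kind can look like, 
here is a minimal, self-contained sketch. It is not the code in this PR: 
`SessionExpiredException`, `RecoveryTask` and `runWithRetry` are hypothetical 
names invented for the example, not BookKeeper or ZooKeeper APIs, and the 
attempt limit and fixed backoff are assumptions.

```java
public class AutoRecoveryRetrySketch {

    /** Hypothetical stand-in for a "ZK session expired" signal; not a real BookKeeper class. */
    public static class SessionExpiredException extends Exception {
        public SessionExpiredException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical unit of work that may fail when the session expires. */
    public interface RecoveryTask<T> {
        T run() throws SessionExpiredException;
    }

    /**
     * Runs the task and restarts it when the session expires.
     * After maxAttempts failed attempts the last exception is rethrown.
     */
    public static <T> T runWithRetry(RecoveryTask<T> task, int maxAttempts, long backoffMs)
            throws SessionExpiredException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SessionExpiredException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.run();
            } catch (SessionExpiredException e) {
                last = e;
                // Wait before the next attempt, but not after the final one.
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate two session expirations followed by a successful run.
        String result = runWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new SessionExpiredException("session expired");
            }
            return "recovered";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

   The point of the sketch is only that a session expiration is treated as a 
recoverable event that restarts the work, instead of one that shuts the 
process down.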
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
