rdhabalia opened a new pull request #2535: Add ledger op timeout to avoid 
topics stuck on ledger-creation
URL: https://github.com/apache/incubator-pulsar/pull/2535
 
 
   ### Motivation
   
   We frequently hit an issue where many topics across all brokers get 
stuck in the creating-ledger state and the create-ledger callback never 
completes. Right now, it mainly happens when:
   1. The ZK leader restarts
   2. The ZK quorum restarts due to [32 bit rollover](https://jira.apache.org/jira/browse/ZOOKEEPER-1278)

   In such cases, the zk-client doesn't complete the callback and the broker 
keeps waiting in the creating-ledger state until we unload the topic. This 
happens to us frequently, such stuck topics are hard to catch, and recovery 
requires a rolling restart of the brokers.
   So, the broker should have a way to time out a LedgerOp (create/delete) and 
complete the callback with a timeout exception.
   
   ### Modifications
   
   Add a ledger-op timeout configuration. The managed-ledger fails the callback 
if ledger creation doesn't complete within that duration.
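
   As a rough illustration of the idea (not the code in this PR), the sketch 
below wraps a callback so that it is completed exactly once, either by the real 
result or by a `TimeoutException` from a scheduled timer. The names 
`LedgerOpTimeoutSketch`, `OpCallback` and `withTimeout` are hypothetical and 
exist only for this example.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;

public class LedgerOpTimeoutSketch {

    /** Hypothetical callback shape: receives either a result or an error. */
    public interface OpCallback<T> extends BiConsumer<T, Throwable> {
    }

    /**
     * Wraps {@code delegate} so it is invoked exactly once: with the real
     * outcome if it arrives in time, otherwise with a TimeoutException.
     */
    public static <T> OpCallback<T> withTimeout(OpCallback<T> delegate,
                                                ScheduledExecutorService scheduler,
                                                long timeout, TimeUnit unit) {
        // Guards against both the timer and the real callback firing.
        AtomicBoolean done = new AtomicBoolean(false);

        // Timer path: fail the operation if nothing completed it in time.
        ScheduledFuture<?> timer = scheduler.schedule(() -> {
            if (done.compareAndSet(false, true)) {
                delegate.accept(null, new TimeoutException("ledger op timed out"));
            }
        }, timeout, unit);

        // Normal path: forward the outcome and cancel the pending timer.
        return (result, error) -> {
            if (done.compareAndSet(false, true)) {
                timer.cancel(false);
                delegate.accept(result, error);
            }
        };
    }
}
```

   The wrapped callback would be handed to the asynchronous create/delete call 
in place of the original one. One thing such a design has to consider: if the 
real callback arrives after the timeout has already fired, it is ignored here, 
so a ledger that was created late would need separate cleanup.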
   
   ### Result
   
   The broker can recover stuck managed-ledgers that are waiting on a 
ledger-creation callback.
   
