xiang092689 commented on issue #17812:
URL: https://github.com/apache/pulsar/issues/17812#issuecomment-1647335052

   We also encountered this problem in version 2.11.0 (w:2, r:2, a:1).
   
   The scenario: we rebooted a machine that hosts one bookie and one broker.
   After the rebalance, the new topic owner tried to open the metadata ledger.
   
   1. Cursor recovery fails with LedgerRecoveryException
   The BookKeeper client fails to open the metadata ledger with (-8,0) and 
then returns LedgerRecoveryException (-10).
   Pulsar catches the exception, initializes the cursor with the earliest 
position recorded in ZooKeeper, and sets the cursor state to NoLedger.
   The cursor is then closed because recovery failed.
   When a cursor closes, the broker persists the mark-delete position to 
ZooKeeper if the cursor state is not Closed or Closing.
   There is one additional effect, however: because opening the metadata 
ledger failed, the cursor initializes cursorLedger as null, so the cursor 
ledger ID in ZooKeeper is set to -1 at the same time.
   
   2. The cursor is reset to the earliest position
   Now consider the next recovery round.
   The broker creates a new cursor ledger starting from the earliest 
position, which is the one persisted when the cursor was closed in step 1.
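
   The two rounds above can be pictured with a small toy model. This is only 
a sketch and not Pulsar code: the class and method names, the ledger id 42, 
and the use of plain numbers as positions are all assumptions made for 
illustration.

```java
/** Toy model of the two recovery rounds; not actual Pulsar code. */
final class CursorResetSketch {
    static final long NO_LEDGER = -1L;

    /** Hypothetical stand-in for the cursor record kept in ZooKeeper. */
    static final class ZkCursorInfo {
        long cursorLedgerId;
        long markDeletePosition;

        ZkCursorInfo(long cursorLedgerId, long markDeletePosition) {
            this.cursorLedgerId = cursorLedgerId;
            this.markDeletePosition = markDeletePosition;
        }
    }

    /**
     * Runs one recover-then-close round and returns the position the cursor starts from.
     * positionInLedger is the up-to-date position stored in the cursor ledger;
     * openFails simulates the ledger open ending in LedgerRecoveryException.
     */
    static long recoverThenClose(ZkCursorInfo zk, long positionInLedger, boolean openFails) {
        boolean ledgerUsable = zk.cursorLedgerId != NO_LEDGER && !openFails;
        if (ledgerUsable) {
            return positionInLedger; // normal path: the ledger holds the latest position
        }
        // Fallback: start from the (older) position recorded in ZooKeeper.
        long start = zk.markDeletePosition;
        // Closing in the NoLedger state writes that position back with ledger id -1,
        // so the reference to the original cursor ledger is lost.
        // (A real broker would later create a new cursor ledger from this position.)
        zk.markDeletePosition = start;
        zk.cursorLedgerId = NO_LEDGER;
        return start;
    }

    public static void main(String[] args) {
        ZkCursorInfo zk = new ZkCursorInfo(42L, 0L);      // ZooKeeper only knows the earliest position
        long round1 = recoverThenClose(zk, 1000L, true);  // step 1: the open fails once
        long round2 = recoverThenClose(zk, 1000L, false); // step 2: bookie is back, ledger id is gone
        System.out.println(round1 + " " + round2 + " " + zk.cursorLedgerId);
    }
}
```

   In this model a single failed open is enough: the second round can no 
longer find ledger 42, so the cursor stays at position 0 even though the 
ledger itself held position 1000.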
   
   That is the whole story.
   
   I think the broker's handling is fine in itself, but there should be some 
tolerance when we hit a LedgerRecoveryException.
   The BookKeeper client returns LedgerRecoveryException whenever the rc is 
anything other than "timeout" or "authentication failed", which covers too 
many error cases and makes the problem easier to reproduce.
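
   To make the concern concrete, here is a rough sketch of the collapse and 
of one possible kind of tolerance. It is not the BookKeeper source and not a 
proposed patch: the Rc enum, mapOpenFailure, and shouldRetryRecovery are 
hypothetical names, and the retry limit is an arbitrary assumption.

```java
/** Simplified model of the result-code collapse; not BookKeeper's real code. */
final class OpenErrorMappingSketch {
    /** Hypothetical, reduced set of result codes. CODE_MINUS_8 stands for the (-8) seen above. */
    enum Rc { OK, TIMEOUT, UNAUTHORIZED, CODE_MINUS_8, LEDGER_RECOVERY }

    /** Everything except success, timeout, and auth failure is reported as LEDGER_RECOVERY. */
    static Rc mapOpenFailure(Rc rc) {
        switch (rc) {
            case OK:
            case TIMEOUT:
            case UNAUTHORIZED:
                return rc;
            default:
                return Rc.LEDGER_RECOVERY;
        }
    }

    /**
     * Hypothetical tolerance: retry the open a bounded number of times before
     * falling back to the position stored in ZooKeeper.
     */
    static boolean shouldRetryRecovery(Rc mapped, int attempt, int maxAttempts) {
        return mapped == Rc.LEDGER_RECOVERY && attempt < maxAttempts;
    }

    public static void main(String[] args) {
        Rc mapped = mapOpenFailure(Rc.CODE_MINUS_8);
        System.out.println(mapped + " retry=" + shouldRetryRecovery(mapped, 0, 3));
    }
}
```

   Because many different causes end up as the same code, the caller cannot 
tell a bookie that is briefly unavailable from a ledger that is really 
damaged, which is why a bounded retry is shown here as one option among 
others.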
   
   To be honest, I don't know how to fix it properly.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
