poorbarcode opened a new pull request, #16841:
URL: https://github.com/apache/pulsar/pull/16841

   ### Motivation
   
   If the meta-ledger fails to initialize while a mark-delete is executing, the 
mark-delete request can hang until it times out after 20 seconds. This happens 
in roughly 1 out of 1000 runs. To reproduce it, run the unit test 
`ManagedCursorTest.markDeleteWithErrors` 1000 times.
   
   When the problem occurs, the actual execution process is as follows:
   
   | Time | `cursor mark deleted` | `meta thread` |
   | ----------- | ----------- | ----------- |
   | 1 | check meta-ledger state | |
   | 2 | start creating the ledger | |
   | 3 | | ledger creation fails |
   | 4 | | loop over pending requests and fail their callbacks |
   | 5 | append to pending requests queue | |
   | 6 | waiting for callback... | |
   | 7 | after 20s... | |
   | 8 | timeout exception | |
   
   - Each of the other two columns represents a separate thread.
   - The Time column only indicates the order of the steps, not the 
actual time.
   - The important steps are explained below:
   
   step-4: If the ledger fails to be created, a failure callback is triggered 
for the pending requests. Requests that have not been queued yet are 
skipped.
   
   
https://github.com/apache/pulsar/blob/c217b8f559292fd34c6a4fb4b30aab213720d962/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L2570-L2577
   
   step-5 ( <strong>Highlight</strong> ): If the meta-ledger needs to be 
created, ledger creation is triggered first, and only then is the current 
request put into the `pending requests queue`. Step 4 may already have 
completed before the request is queued, in which case this request never 
receives a callback.
   
   
https://github.com/apache/pulsar/blob/c217b8f559292fd34c6a4fb4b30aab213720d962/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L1870-L1876
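
The interleaving in steps 2 to 5 can be replayed deterministically on a single thread. The sketch below is only an illustration and is not taken from `ManagedCursorImpl`: the class `LostCallbackDemo`, the `pending` queue, and the methods `failPendingRequests` and `replayRace` are hypothetical stand-ins, and a `CompletableFuture` plays the role of the mark-delete callback.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

final class LostCallbackDemo {
    // Hypothetical stand-in for the cursor's pending requests queue.
    static final Queue<CompletableFuture<Void>> pending = new ArrayDeque<>();

    // Step 4: ledger creation failed, so fail every request queued so far.
    static void failPendingRequests() {
        CompletableFuture<Void> queued;
        while ((queued = pending.poll()) != null) {
            queued.completeExceptionally(
                    new IllegalStateException("create ledger failed"));
        }
    }

    // Replays steps 2 to 5 in the problematic order.
    static CompletableFuture<Void> replayRace() {
        pending.clear();
        CompletableFuture<Void> request = new CompletableFuture<>();
        // Steps 2 to 4: creation is triggered and fails, and the queue
        // (still empty at this point) is drained.
        failPendingRequests();
        // Step 5: the request is queued only after the drain has finished.
        pending.add(request);
        return request;
    }

    public static void main(String[] args) {
        CompletableFuture<Void> request = replayRace();
        // Nobody will ever complete this request; a caller can only time out.
        System.out.println("completed: " + request.isDone());
        System.out.println("still queued: " + pending.size());
    }
}
```

In this replay the request stays in the queue with its future never completed, which matches steps 6 to 8 of the table: the caller waits until its own timeout fires.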
   
   ### Modifications
   
   When a ledger needs to be created, make `append to pending requests queue` 
and `create ledger` execute serially.
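
One way to picture the serialized ordering is to queue the request before ledger creation is started, so that a failure handler running on another thread always finds it. The following is a minimal sketch under that assumption and is not the actual patch: `SerializedEnqueueDemo`, `markDelete`, `failPendingRequests`, and `failsFast` are hypothetical names, and the failing ledger creation is simulated by a task submitted to a single-thread executor.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SerializedEnqueueDemo {
    // Hypothetical stand-in for the cursor's pending requests queue.
    static final Queue<CompletableFuture<Void>> pending = new ConcurrentLinkedQueue<>();

    // Runs on the simulated meta thread when ledger creation fails.
    static void failPendingRequests() {
        CompletableFuture<Void> queued;
        while ((queued = pending.poll()) != null) {
            queued.completeExceptionally(
                    new IllegalStateException("create ledger failed"));
        }
    }

    static CompletableFuture<Void> markDelete(ExecutorService metaThread) {
        CompletableFuture<Void> request = new CompletableFuture<>();
        // Queue the request first ...
        pending.add(request);
        // ... and only then start the (simulated, always failing) creation.
        // Submitting the task happens-before its execution, so the drain
        // on the meta thread is guaranteed to see the queued request.
        metaThread.execute(SerializedEnqueueDemo::failPendingRequests);
        return request;
    }

    // Returns true if the request is failed by the callback instead of hanging.
    static boolean failsFast() {
        ExecutorService metaThread = Executors.newSingleThreadExecutor();
        try {
            markDelete(metaThread).get(5, TimeUnit.SECONDS);
            return false; // unexpected: the simulated creation always fails
        } catch (ExecutionException e) {
            return true;  // the failure callback reached the request
        } catch (InterruptedException | TimeoutException e) {
            return false; // the request was left hanging
        } finally {
            metaThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fails fast: " + failsFast());
    }
}
```

With this ordering the request receives the failure callback as soon as the simulated creation fails, instead of waiting for the caller's timeout.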
   
   ### Documentation
   
   Check the box below or label this PR directly.
   
   Need to update docs? 
   
   - [ ] `doc-required` 
   (Your PR needs to update docs and you will update later)
     
   - [x] `doc-not-needed` 
   (Please explain why)
     
   - [ ] `doc` 
   (Your PR contains doc changes)
   
   - [ ] `doc-complete`
   (Docs have been already added)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
