kezhuw opened a new issue #7490:
URL: https://github.com/apache/pulsar/issues/7490


   **Describe the bug**
   When adding a new entry, a bookie failure combined with a ZooKeeper 
disconnection may cause an inconsistency between `LedgerInfo.getEntries` and 
`LedgerMetadata.getLastEntryId`. This can introduce a duplicated entry.
   
   **To Reproduce**
   1. Add `entry-a` to `ledger1`.
   2. The bookie persists the entry on disk but fails to respond due to, say, 
a machine crash.
   3. ZooKeeper disconnects, so writing the metadata and closing the ledger 
fail.
   4. `ledgerClosed` is reported to `ManagedLedger`.
   5. ZooKeeper reconnects.
   6. `ManagedLedger` rolls to `ledger2`, and `entry-a` is added successfully.
   7. `ManagedLedger` does not count `entry-a` in `LedgerInfo.getEntries`, but 
after recovery `entry-a` is counted in `LedgerMetadata.getLastEntryId` and 
`LedgerHandle.getLastAddConfirmed`.
   8. `ManagedCursor` does not use `LedgerInfo.getEntries` to restrict its 
reading.
   9. `LedgerOffloader` does not use `LedgerInfo.getEntries` either.
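
   The sequence above can be modeled with a small standalone sketch. It does 
not use the Pulsar or BookKeeper APIs; `FakeLedger`, `FakeLedgerInfo`, and 
`simulate` are hypothetical names introduced only to illustrate how the two 
counts can diverge and how `entry-a` ends up stored twice.

```java
import java.util.ArrayList;
import java.util.List;

class DuplicateEntrySketch {
    /** Hypothetical stand-in for a BookKeeper ledger: what bookies have durably stored. */
    static final class FakeLedger {
        final List<String> persisted = new ArrayList<>();

        // Last entry ID as ledger recovery would discover it from the bookies.
        long lastEntryIdAfterRecovery() {
            return persisted.size() - 1;
        }
    }

    /** Hypothetical stand-in for the managed ledger's bookkeeping of one ledger. */
    static final class FakeLedgerInfo {
        long entries; // entries whose add was acknowledged to the managed ledger
    }

    /** Returns {entries counted for ledger1, entries recovered in ledger1, stored copies of entry-a}. */
    static long[] simulate() {
        FakeLedger ledger1 = new FakeLedger();
        FakeLedgerInfo info1 = new FakeLedgerInfo();
        FakeLedger ledger2 = new FakeLedger();
        FakeLedgerInfo info2 = new FakeLedgerInfo();

        // Steps 1-2: the bookie persists entry-a, but the response is lost,
        // so the managed ledger never increments its count for ledger1.
        ledger1.persisted.add("entry-a");

        // Steps 3-6: ledger1 is reported closed, and the pending add is
        // retried on ledger2, where it succeeds and is counted.
        ledger2.persisted.add("entry-a");
        info2.entries++;

        // Step 7: recovery of ledger1 finds entry-a even though it was never counted.
        long counted = info1.entries;
        long recovered = ledger1.lastEntryIdAfterRecovery() + 1;
        long copies = ledger1.persisted.stream().filter("entry-a"::equals).count()
                + ledger2.persisted.stream().filter("entry-a"::equals).count();
        return new long[] {counted, recovered, copies};
    }

    public static void main(String[] args) {
        long[] r = simulate();
        System.out.println("counted=" + r[0] + " recovered=" + r[1] + " copies=" + r[2]);
    }
}
```

   In this model, any reader that walks `ledger1` up to the recovered last 
entry ID, rather than up to the counted number of entries, sees `entry-a` in 
both ledgers.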
   
   I have added a test case that reproduces this: 
https://github.com/kezhuw/pulsar/commit/bc0de5e9e110d931340ce6fd2a85911892630f33.
   
   Strictly speaking, I think `OpAddEntry.handleAddTimeoutFailure` has this 
issue too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

