poorbarcode opened a new pull request, #20128:
URL: https://github.com/apache/pulsar/pull/20128

   ### Motivation
   
   **Background of Deduplication:** Every message carries a `sequence-id` 
attribute, and the Broker records the last `sequence-id` for each producer. 
If a producer sends a message with a smaller `sequence-id`, the Broker rejects it.
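
The check described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the Broker's actual class: the name `DedupSketch` and its methods are made up for illustration, and treating an equal `sequence-id` as a duplicate is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of broker-side deduplication; not Pulsar code.
class DedupSketch {
    // Highest sequence-id accepted so far, keyed by producer name.
    private final Map<String, Long> lastSequenceId = new HashMap<>();

    // Returns true if the message is accepted, false if it is rejected.
    boolean accept(String producer, long sequenceId) {
        Long last = lastSequenceId.get(producer);
        // Assumption: an equal id (a resend of the same message) is also rejected.
        if (last != null && sequenceId <= last) {
            return false;
        }
        lastSequenceId.put(producer, sequenceId);
        return true;
    }

    // Feeds the ids in the given order and collects the rejected ones.
    static List<Long> rejected(String producer, long... ids) {
        DedupSketch broker = new DedupSketch();
        List<Long> dropped = new ArrayList<>();
        for (long id : ids) {
            if (!broker.accept(producer, id)) {
                dropped.add(id);
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Message 4 arrives after 5, so it is the only one rejected.
        System.out.println(rejected("producer-a", 1, 2, 3, 5, 4));
    }
}
```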
   
   ---
   
   **Background of Geo-Replication:** The replicator works like this: read 
messages from the source cluster; send the messages to the target cluster; if 
an error occurs, the Replicator rewinds these messages; loop to the next 
read.
   
   **Background of ManagedCursor:** ManagedCursor tracks the next position to 
read from BookKeeper (BK), called `readPosition`, and updates it after each read 
completes. For example: 
   1. initialize `readPosition` to `0`.
   2. read 10 messages.
   3. update `readPosition` to `10`.
   
   --- 
   
   <strong>(Highlight)</strong> Therefore, if messages are sent out of order, 
many messages will be discarded. For example, if the messages are sent in the 
order `1,2,3,5,4`, message `4` will be discarded by the deduplication check.
   
   ---
   
   **There are two scenarios that cause messages to go out of order**
   - scenario-1
     - issue the first read from the original cluster
     - issue the second read from the original cluster, before the first one completes
     - receive messages `[1,2,3]` (the callback of the first read)
     - send messages `[1,2,3]` to the remote cluster, but the send fails. 
The Replicator triggers `rewind` and tries to send again.
     - receive messages `[4,5]` (the callback of the second read)
     - send messages `[4,5]` to the remote cluster.
     - receive messages `[1,2,3]` again after `rewind`; they now arrive after `[4,5]`.
   
   - scenario-2
   
   | time | `thread rewind` | `thread reading messages`|
   | -- | --- | --- |
   | 1 | receive messages `[3:0 ~ 4:0]` |  |
   | 1 |  | async read start(`readPosition=4:0, markDeleted=3:0`) |
   | 2 | rewind `readPosition` (` --> 3:0`) | |
   | 3 |  | read complete, set `readPosition`(` --> 5:0`) |
   | 4 | receive messages `[4:0 ~ 5:0]` |  |
   | 5 | the messages `3:0 ~ 4:0` will no longer be received until the next `rewind` | |
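
To make the two scenarios concrete, here is a single-threaded replay of both interleavings. It is only a sketch under stated assumptions: `OutOfOrderSketch` and its methods are hypothetical names, the remote cluster is reduced to a "last `sequence-id` seen" check, and positions such as `3:0` are simplified to plain integers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replay of the two scenarios; not Pulsar code.
class OutOfOrderSketch {

    // Scenario-1: the first attempt to send [1,2,3] failed, so the remote
    // cluster sees [4,5] first and the retried [1,2,3] afterwards.
    static List<Long> scenario1Persisted() {
        long[][] arrivalOrder = { {4, 5}, {1, 2, 3} };
        List<Long> persisted = new ArrayList<>();
        long lastSequenceId = 0; // simplified remote-side deduplication state
        for (long[] batch : arrivalOrder) {
            for (long id : batch) {
                if (id > lastSequenceId) {
                    persisted.add(id);
                    lastSequenceId = id;
                } // otherwise the message is discarded as a duplicate
            }
        }
        return persisted;
    }

    // Scenario-2: a rewind is overwritten by a read that was already in flight.
    // Assumption: position 3 stands for 3:0, 4 for 4:0, and so on.
    static long scenario2ReadPosition() {
        long readPosition = 4;         // time 1: the first batch was already received
        long readStart = readPosition; // time 1: async read starts at position 4
        readPosition = 3;              // time 2: rewind moves readPosition back
        readPosition = readStart + 1;  // time 3: read completes and overwrites the rewind
        return readPosition;           // the next read starts at 5, so 3 is skipped
    }

    public static void main(String[] args) {
        System.out.println(scenario1Persisted());    // messages 1, 2, 3 are lost
        System.out.println(scenario2ReadPosition()); // rewind to 3 had no effect
    }
}
```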
   
   ### Modifications
   
   Make the replicator process messages strictly sequentially.
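
As an illustration of the idea only, and not the code in this PR, a strictly sequential loop could look like the sketch below. `SequentialReplicatorSketch`, its fields, and the injected send failure are all hypothetical; the point is that the next read is not issued until the current batch has been sent or rewound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of sequential replication; not the PR's implementation.
class SequentialReplicatorSketch {
    private final long[] source;                        // messages in the source cluster
    private final List<Long> remote = new ArrayList<>(); // messages persisted remotely
    private int readPosition = 0;
    private int failuresLeft;                           // sends that fail before succeeding

    SequentialReplicatorSketch(long[] source, int failuresToInject) {
        this.source = source;
        this.failuresLeft = failuresToInject;
    }

    // Stand-in for sending one batch to the remote cluster.
    private boolean send(long[] batch) {
        if (failuresLeft > 0) {
            failuresLeft--;
            return false;
        }
        for (long id : batch) {
            remote.add(id);
        }
        return true;
    }

    // Only one read is in flight at a time, so a rewind cannot race with a read.
    List<Long> replicate(int batchSize) {
        while (readPosition < source.length) {
            int start = readPosition;
            int end = Math.min(start + batchSize, source.length);
            long[] batch = Arrays.copyOfRange(source, start, end);
            readPosition = end;       // the read completes before anything else runs
            if (!send(batch)) {
                readPosition = start; // rewind, then re-read the same batch
            }
        }
        return remote;
    }

    public static void main(String[] args) {
        long[] messages = {1, 2, 3, 4, 5};
        // The first send fails once; the order on the remote side is still preserved.
        System.out.println(new SequentialReplicatorSketch(messages, 1).replicate(3));
    }
}
```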
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: 
   - https://github.com/poorbarcode/pulsar/pull/87

