vineeth1995 opened a new pull request, #19605: URL: https://github.com/apache/pulsar/pull/19605
### Motivation

This completes https://github.com/apache/pulsar/issues/16551 and extends the part-1 PR https://github.com/apache/pulsar/pull/17962. It handles the replicator and message-ordering guarantee part of blue-green deployment.

**Replicator and message ordering handling**

A. Incoming replication messages from other regions' replicator producers to the Blue cluster

Migration does not affect the ordering of messages coming from other regions to the blue/green cluster. Once the blue cluster is marked as migrated, it rejects replication writes from remote regions and redirects the remote producers to the Green cluster, where new messages are written. Consumers of the Blue cluster are redirected to green only after they have received all messages from blue. So migration gives an ordering guarantee for messages replicated from remote regions.

B. Outgoing replication messages from the Blue cluster's replicator producers to other regions

The broker can give an ordering guarantee in this case, with the trade-off that the topic is unavailable until the blue cluster has replicated every message that was published to it before the topic was terminated.

- The Blue cluster marks the topic as terminated and migrated.
- The topic does not redirect producers/consumers until all replicators reach the end of the topic and have replicated all messages to the remote regions.
- The topic sends a TOPIC_UNAVAILABLE message to producers/consumers so that they keep retrying until the replicators reach the end of the topic.
- The broker disconnects all replicators and deletes them once they reach the end of the topic.
- The broker then starts sending the migrated command to producers/consumers to redirect clients to the green cluster.

### Modifications

- Handle producers so that message ordering is guaranteed when a topic has been migrated but a replication backlog still exists.
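The gating rule described above (answer TOPIC_UNAVAILABLE while any replicator still has a backlog, and send the migrated command only once all replicators are drained and removed) can be modeled roughly as follows. This is a minimal illustrative sketch in Python, not the broker code from this PR: `Topic`, `Replicator`, `Response`, and `respond_to_client` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Response(Enum):
    ACCEPT = "accept"                        # topic not migrated: serve normally
    TOPIC_UNAVAILABLE = "topic_unavailable"  # client should retry later
    MIGRATED = "migrated"                    # client should reconnect to green


@dataclass
class Replicator:
    remote_cluster: str
    backlog: int  # messages not yet replicated to remote_cluster


@dataclass
class Topic:
    migrated: bool = False
    replicators: list = field(default_factory=list)

    def mark_migrated(self):
        # The topic is terminated at this point, so no new messages are
        # appended and each replicator backlog can only shrink.
        self.migrated = True

    def drop_drained_replicators(self):
        # Replicators that reached the end of the topic are removed.
        self.replicators = [r for r in self.replicators if r.backlog > 0]

    def respond_to_client(self):
        if not self.migrated:
            return Response.ACCEPT
        self.drop_drained_replicators()
        if self.replicators:
            # Backlog remains: redirecting now could reorder messages.
            return Response.TOPIC_UNAVAILABLE
        return Response.MIGRATED


# Hypothetical scenario: one message is still waiting to reach region2.
topic = Topic(replicators=[Replicator("region2", backlog=1)])
topic.mark_migrated()
first = topic.respond_to_client()    # Response.TOPIC_UNAVAILABLE

topic.replicators[0].backlog = 0     # the replicator catches up
second = topic.respond_to_client()   # Response.MIGRATED
```

The cost of this rule is the unavailability window: clients keep retrying for as long as the slowest replicator needs to drain.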
Example use case:

- producer1 sends messages msg1, msg2 -> region1
- region1 replicator -> msg1 -> region2, but region2 has a connectivity issue with region1; as a result, region1 has a replication backlog (msg2) for region2
- region1 is marked for blue-green migration: region1 -> region1A
- If producer1 is redirected to region1A:
  - producer1 sends msg3 to region1A
  - region1A is connected to region2
  - region1A sends msg3 to region2
- Meanwhile, if region1 gets its connection back to region2, region1 sends msg2 (the replication backlog) to region2
- The region2 consumer consumes in the order msg1, msg3, msg2, which is wrong; it should be msg1, msg2, msg3

----------

So we don't want to redirect producer1 until the replicator has no backlog. This PR handles this use case by making sure the replication backlog is drained before redirecting producers to the green cluster.

### Verifying this change

Added an end-to-end test to verify this change.

### Does this pull request potentially affect one of the following parts:

*If the box was checked, please highlight the changes*

- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [ ] Anything that affects deployment

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [x] `doc-complete` <!-- Docs have been already added -->

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
