GitHub user lhotari added a comment to the discussion: Questions regarding pulsar active-active geo-replication
I'm sorry I missed the reference to the [active-active replication docs](https://pulsar.apache.org/docs/3.2.x/concepts-replication/#active-active-replication) in your question. Thanks for the follow-up.

> If only once, then how is the subscription being tracked across clusters without subscription replication?

It seems that the example in the documentation is missing that detail. Without subscription replication, the subscriptions in each cluster would be completely independent.

> Eg. A client application in its service url added the URLs of both cluster A and cluster B as comma separated values.

This detail makes the scenario active-passive from the application (consumers/producers) point of view. The Pulsar client and its consumer would connect to only one cluster at a time. This is needed for consistent usage of replicated subscriptions. As I mentioned in my previous message, the behaviour isn't consistent when the replicated subscription is actively used in more than one cluster at a time. Even with replicated subscriptions, the diagram doesn't make full sense to me, since it shows two separate consumers, C1 and C2. When the client has 2 service URLs, it connects to the first cluster that is available, and this is the correct way to use replicated subscriptions.

There are important limitations for replicated subscriptions. For at-least-once messaging, with a consumer for a replicated subscription consuming on only one cluster at a time, this is usually fine when delayed messages aren't used. The main limitation of replicated subscriptions is that only the "mark delete" position is replicated. Individually "deleted" (acknowledged) messages beyond that position are ignored. This is explained in [Penghui's presentation at 1:12:26](https://www.youtube.com/watch?v=17jQIOVeu4s&t=1h12m26s). Naturally, batch index acknowledgements aren't supported either.
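
To illustrate why replicating only the "mark delete" position can cause duplicates, here is a minimal Python sketch. It is a simplified model, not Pulsar code: plain integers stand in for message positions, and the function names `mark_delete_position` and `redelivered_after_failover` are hypothetical, made up for this example.

```python
def mark_delete_position(acked, last_position=-1):
    """Return the highest position p such that every message in
    (last_position, p] has been acknowledged (simplified model)."""
    pos = last_position
    while pos + 1 in acked:
        pos += 1
    return pos


def redelivered_after_failover(acked, published):
    """Positions the other cluster would deliver again, assuming only
    the mark delete position was replicated and individual
    acknowledgements beyond it were not."""
    start = mark_delete_position(acked) + 1
    return list(range(start, published))


# Ten messages at positions 0..9; everything except position 3 is acknowledged.
acked = {0, 1, 2, 4, 5, 6, 7, 8, 9}
print(mark_delete_position(acked))            # 2
print(redelivered_after_failover(acked, 10))  # [3, 4, 5, 6, 7, 8, 9]
```

In this model a single unacknowledged message at position 3 holds the mark delete position at 2, so a consumer that switches clusters would see positions 4 to 9 again even though it had already acknowledged them.
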
Delayed messages prevent the mark delete position from moving forward until the delayed message has been delivered and acknowledged. This is why combining delayed messages with replicated subscriptions isn't a good solution if a large number of duplicates is a problem when the consumer switches to consuming from the other cluster.

The current documentation for geo-replication needs improvements so that it doesn't cause surprises and unrealistic expectations. Contributions to improve the docs and clarify the points that you have brought up in your questions are more than welcome.

GitHub link: https://github.com/apache/pulsar/discussions/22315#discussioncomment-8882348
