GitHub user StormLord07 edited a discussion: Guaranteeing reply delivery to the 
same instance in KeyShared subscription without per-instance topics

Hello,

I’m trying to implement a request/response pattern between two programs using 
Pulsar, but I want to avoid:

* Creating a large number of per-instance topics or subscriptions
* Duplicating messages to all instances

# Idea

* **Program A** may have multiple instances.

  * All share the same KeyShared subscription on a topic.
  * Each instance has its own logical “key” (e.g., `key_1`, `key_2`, …) and 
consumes only messages routed to it by message key hash range.
  * When sending a request, it uses a producer and sets a message property 
`reply_id = key_1` to indicate where to send the response.

* **Program B** also has multiple instances.

  * Consumes from the same request topic.
  * When replying, it publishes a message with `key = key_1`.

---

# Problem

If Program B publishes with `key = key_1`, can we guarantee that the message 
will be routed back to the *same instance* of Program A that originally sent 
the request?
Assumptions:

* Using KeyShared subscription mode
* Key stays consistent (`key_1` maps to the same consumer)
* Auto-split or sticky hash range policy is used

---

# Reasoning

* ConsumerExclusive: An instance that is starting up is **not guaranteed** to 
find the previous subscription freed (the broker may still consider the old 
consumer name active). This forces us to create a new subscription each time, 
which leads to topic/subscription bloat over time. Replication also tends to 
slow things down a lot, since long-disconnected exclusive consumers sometimes 
stick around and keep being replicated.
* ConsumerShared: We risk the response being routed to an instance that is in 
the process of shutting down, resulting in lost or undeliverable replies.
* ConsumerFailover seems like a good idea, but I am a bit fuzzy on how it 
works. It seems even worse, since the message will most certainly be lost to 
the old instance, which is still the master at that moment.

The other problem is that we can't know exactly which instance is being 
restarted. I may be wrong about that, though; I do not know exactly how they 
are being restarted.

---

Edit:

I’ve implemented STICKY consumers with a very narrow range. For example, I 
calculate:

```c
int32_t range = (murmur3_32(<key>) % 65536) + 1;
```

Then I assign the range as `{range, range}`, ensuring that messages with the 
same key always land on the same consumer.
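For illustration, here is a minimal, self-contained sketch of the slot computation above. It assumes the standard MurmurHash3 x86_32 algorithm with seed 0; whether that matches the broker's hash byte for byte is an assumption that should be checked against the Pulsar client in use. The helper name `sticky_slot` is hypothetical. The `+ 1` mirrors the formula in this post. Doing the arithmetic on `uint32_t` avoids a negative remainder, which a signed hash value could otherwise produce.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static inline uint32_t rotl32(uint32_t x, int r) {
    return (x << r) | (x >> (32 - r));
}

/* Standard MurmurHash3 x86_32. Assumption: the seed and byte order used
 * here (seed passed by caller, little-endian blocks) match the broker. */
uint32_t murmur3_32(const void *data, size_t len, uint32_t seed) {
    const uint8_t *p = (const uint8_t *)data;
    const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
    uint32_t h = seed;
    size_t nblocks = len / 4;

    /* Body: mix one little-endian 32-bit block at a time. */
    for (size_t i = 0; i < nblocks; i++) {
        uint32_t k = (uint32_t)p[4 * i]
                   | ((uint32_t)p[4 * i + 1] << 8)
                   | ((uint32_t)p[4 * i + 2] << 16)
                   | ((uint32_t)p[4 * i + 3] << 24);
        k *= c1; k = rotl32(k, 15); k *= c2;
        h ^= k; h = rotl32(h, 13); h = h * 5u + 0xe6546b64u;
    }

    /* Tail: the remaining 1 to 3 bytes, if any. */
    const uint8_t *tail = p + nblocks * 4;
    uint32_t k = 0;
    switch (len & 3) {
        case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
        case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
        case 1: k ^= tail[0];
                k *= c1; k = rotl32(k, 15); k *= c2; h ^= k;
    }

    /* Finalization: force all bits of the hash to avalanche. */
    h ^= (uint32_t)len;
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical helper: maps a logical key to a single slot in 1..65536,
 * following the formula in the post. The sticky range would then be
 * {slot, slot}. */
int32_t sticky_slot(const char *key) {
    uint32_t h = murmur3_32(key, strlen(key), 0u);
    return (int32_t)(h % 65536u) + 1;
}
```

Calling `sticky_slot("key_1")` on the consumer side and using the same key string on the producer side only lines up if both sides hash the key the same way, which is exactly the stability property the question is about.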

However, as the number of program instances increases, we run into the birthday 
paradox: the probability of collisions grows rapidly. When that happens, the 
resulting bugs will be extremely difficult to diagnose.
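To put rough numbers on that, the sketch below computes the probability that at least two of `n` instances end up on the same slot, assuming each instance's slot is uniformly distributed over 65536 values (an idealization of the hash). The function name `collision_probability` is hypothetical.

```c
/* Probability that at least two of n instances share a slot, assuming each
 * slot is drawn independently and uniformly from `slots` possible values. */
double collision_probability(unsigned n, unsigned slots) {
    if (n > slots) {
        return 1.0; /* pigeonhole: more instances than slots */
    }
    double p_unique = 1.0;
    /* The (i+1)-th instance must avoid the i slots already taken. */
    for (unsigned i = 1; i < n; i++) {
        p_unique *= 1.0 - (double)i / (double)slots;
    }
    return 1.0 - p_unique;
}
```

Under that assumption, the collision probability is roughly 7% at 100 instances and reaches about 50% at around 300 instances.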

# TLDR
The question is whether the KeyShared hash-range mapping is stable enough 
between producer and consumer keys to make this safe, or whether we still need 
per-instance topics/subscriptions for strict targeting. Or is there a 
better/suggested way?

GitHub link: https://github.com/apache/pulsar/discussions/24616
