Hi!
*I posted this to Google Groups, but the message somehow disappeared, so I
am sending it again here. Sorry for the duplication.*
I am evaluating Pulsar as our event bus, and it's awesome!
Our services (written in Node.js) need to listen to multiple tenants (or
all tenants; we have 10k tenants, and the number is growing), and the list
of tenants can change dynamically at runtime (changes are not that
frequent: at most 200-300 per day).
Pulsar sounds like an excellent fit for this because I can create a topic
per tenant, like "tenant:XX:events" (XX = tenant ID), and use a shared
subscription for each consumer group.
As I said, the list of tenants to subscribe to can change at runtime. When
it does, all consumers in a group get a message (it's broadcast via Redis
pub/sub).
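
To make this concrete, here is a minimal sketch of how a consumer could turn those broadcast messages into topic names. The message shape (`TenantChange`) and the `TenantTopicTracker` class are made up for illustration, and the actual Redis and Pulsar client calls are left out, so this only shows the bookkeeping on the consumer side.

```typescript
// Hypothetical shape of the message broadcast over Redis pub/sub.
// The field names are assumptions for this sketch, not an existing format.
type TenantChange = { action: "add" | "remove"; tenantId: string };

// What the consumer should do with its subscription after a change.
type TopicAction = { kind: "subscribe" | "unsubscribe"; topic: string };

// Builds the per-tenant topic name, following the "tenant:XX:events" pattern.
function topicForTenant(tenantId: string): string {
  return `tenant:${tenantId}:events`;
}

// Tracks which tenant topics this consumer should currently be attached to.
class TenantTopicTracker {
  private readonly tenants = new Set<string>();

  // Returns the action to perform, or null when the change is a no-op
  // (for example, the same "add" message delivered twice).
  apply(change: TenantChange): TopicAction | null {
    const topic = topicForTenant(change.tenantId);
    if (change.action === "add") {
      if (this.tenants.has(change.tenantId)) return null;
      this.tenants.add(change.tenantId);
      return { kind: "subscribe", topic };
    }
    if (!this.tenants.delete(change.tenantId)) return null;
    return { kind: "unsubscribe", topic };
  }

  // All topics this consumer should currently be subscribed to.
  topics(): string[] {
    return Array.from(this.tenants, (id) => topicForTenant(id)).sort();
  }
}

// Usage: feed each broadcast message into the tracker and act on the result.
const tracker = new TenantTopicTracker();
console.log(tracker.apply({ action: "add", tenantId: "42" }));
console.log(tracker.topics());
```

Every consumer in the group would run this independently, which is exactly why I am asking below whether many consumers adding the same topic at the same time is a problem.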
I am not sure what the best way to implement this is. I see two options:
- Client-side: each consumer receives the tenant it needs to listen to and
  adds the topic to the shared subscription. This sounds like the right
  solution, but:
    - Since all consumers will add the same topic at the same time, are
      there any issues with this? Or do I need to make sure it happens only
      once, so that only one consumer mutates the shared subscription?
    - There are consumers (a small fraction, but important ones) that need
      to listen to all events, which means the subscription consumes all
      topics. Does that make sense in terms of performance? Attaching a
      subscription to 10k+ topics?
- Functions: I thought about creating a function that holds a list of
  application subscriptions (not Pulsar subscriptions), listens to a main
  topic called "events" (or to all tenant topics? I am not sure how to
  implement that with a function), and routes events to per-service topics
  based on those subscriptions. For example, a service named "users" will
  have a "users-service" topic, and the function will route all events for
  that service to the "users-service" topic. This sounds like a good
  solution as well, but:
    - I am not sure where functions run. If they run in a separate
      container, we will waste a lot of traffic. I see there is a threaded
      option for running functions: does the function then run inside
      Pulsar, so that no traffic is wasted?
    - Are functions overkill for this?
    - Storing the application subscriptions: I can save them in our
      database, and I see I can also store them in Pulsar state tables.
      Which is preferred here?
    - Once I want to listen to more topics, should I notify the function
      somehow so it reloads the list of subscriptions (since I will cache
      it), or do I need to implement some kind of refresh timer?
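
For the function option, the routing logic I have in mind looks roughly like the sketch below. It is plain TypeScript to show the idea only and does not use the Pulsar Functions API. `SubscriptionRouter`, the loader callback, the event shape, the "audit-service" topic, and the 60-second refresh interval are all assumptions for the sketch. It uses the refresh-timer variant from the last question (the cached list is reloaded once it is older than the interval) and also exposes `reload()` for the notification variant.

```typescript
// Hypothetical event shape: every event carries the tenant it belongs to.
type TenantEvent = { tenantId: string; payload: unknown };

// Application subscriptions: service topic -> tenant IDs it wants,
// or "*" for services that listen to all tenants.
type Subscriptions = Map<string, Set<string> | "*">;

class SubscriptionRouter {
  private cache: Subscriptions = new Map();
  private loadedAt = Number.NEGATIVE_INFINITY; // forces a load on first use
  private readonly loadSubscriptions: () => Subscriptions;
  private readonly refreshMs: number;
  private readonly now: () => number;

  // The loader is synchronous to keep the sketch short; a real one would
  // probably read from a database or a state table asynchronously.
  constructor(
    loadSubscriptions: () => Subscriptions,
    refreshMs: number = 60000,
    now: () => number = () => Date.now(),
  ) {
    this.loadSubscriptions = loadSubscriptions;
    this.refreshMs = refreshMs;
    this.now = now;
  }

  // Returns the service topics an event should be forwarded to.
  route(event: TenantEvent): string[] {
    if (this.now() - this.loadedAt >= this.refreshMs) this.reload();
    const targets: string[] = [];
    this.cache.forEach((tenants, serviceTopic) => {
      if (tenants === "*" || tenants.has(event.tenantId)) {
        targets.push(serviceTopic);
      }
    });
    return targets.sort();
  }

  // Forces a reload, e.g. when a "subscriptions changed" message arrives.
  reload(): void {
    this.cache = this.loadSubscriptions();
    this.loadedAt = this.now();
  }
}

// Usage: "users-service" wants tenant 42 only, "audit-service" wants everything.
const router = new SubscriptionRouter(() => {
  const subs: Subscriptions = new Map();
  subs.set("users-service", new Set(["42"]));
  subs.set("audit-service", "*");
  return subs;
});
console.log(router.route({ tenantId: "42", payload: {} }));
```

With a timer alone, a new subscription can take up to one interval to become visible, so I suspect I would want both the timer and the explicit reload.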
Hopefully this makes sense! If you have any questions or want me to
elaborate, please let me know!
If you would rather I ask somewhere else (like Slack), let me know and I
will ask there instead.