Hi,

We’re evaluating Pulsar for a somewhat unusual use case: we want to create a 
number of topics, each with a very large number of partitions (tens, or 
ideally even hundreds, of thousands). The reason is that we want consumers 
to be able to seek efficiently to a given message key. By hashing a given key 
to a given topic partition we can let consumers subscribe only to that 
partition and thus ignore the vast majority of other messages.
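
To make the idea concrete, here is a minimal sketch of the key-to-partition 
mapping, using only the Go standard library. The FNV-1a hash, the topic name 
and the example key are assumptions for illustration only; whatever function 
is used, the producer's message router and the consumers would have to agree 
on it. The "-partition-N" suffix is how Pulsar names the internal topic behind 
each partition.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a message key to a partition index in [0, numPartitions).
// FNV-1a is an assumption chosen for illustration; the producer's router
// must use the same function for consumers to find the right partition.
func partitionFor(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numPartitions))
}

// partitionTopic returns the name of the internal topic that backs a single
// partition of a partitioned topic.
func partitionTopic(topic string, partition int) string {
	return fmt.Sprintf("%s-partition-%d", topic, partition)
}

func main() {
	const topic = "persistent://public/default/events" // hypothetical topic name
	const numPartitions = 50000

	// A consumer interested in one key would subscribe only to this topic.
	p := partitionFor("customer-42", numPartitions) // hypothetical key
	fmt.Println(partitionTopic(topic, p))
}
```

A consumer would then subscribe to just the one partition topic returned for 
its key rather than to the whole partitioned topic.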

I’ve had a go at building a proof of concept of this with Pulsar, without 
much success. What happens is something like the following:

Environment:
Pulsar 2.7.3 configured with 10 brokers, 10 bookies and 5 zookeepers. 
Previously tested as handling 100k messages/sec on a topic with 100 partitions.

* Create a partitioned topic with 50k partitions
* Create a publisher using the Go library that publishes to the topic.
* Publisher tries to create 50k producers (this is done by the Go library; in 
my code I am creating a single producer). I can see the log lines showing that 
producers are being created, but after a minute or so they seem to disconnect. 
The publisher then seems to get itself into a state whereby it is trying to 
create 50k producers, but before it can do so they all disconnect and the cycle 
repeats.
* During the above I can see that both the brokers and the zookeepers are using 
a lot of CPU.

Does anyone have any hints as to how I can achieve what I want here[1] or, 
alternatively, confirm that Pulsar is the wrong tool for the job? I do realise 
that I could remodel the situation as having 50k topics each with a single 
partition, but I’m assuming that as far as Pulsar is concerned these two 
situations are largely equivalent, as an n-partition topic is modelled as n 
individual topics under the hood.

Thanks,

Chris

[1] where “doing what I want” could either be setting up Pulsar to have topics 
with a large number of partitions or, more generally, some pattern that would 
allow consumers to efficiently consume a given message key when the number of 
message keys is measured in the hundreds of thousands or even millions.
