Hi all, I'm wondering about experiences with a large number of feeds created and managed on a single Kafka cluster. Specifically, if anyone can share how many distinct feeds they run on their Kafka cluster and what overall throughput they see, that would be very helpful.
Some background: I'm planning to set up a system around Kafka that will (hopefully, eventually) carry more than 10,000 feeds in parallel. I expect event volume across these feeds to follow a Zipfian distribution: a long tail of small feeds and a handful of very large ones, but with consumers attached to every feed.

I'm trying to decide between relying on Kafka's own feed (topic) mechanism to keep the data streams separate, or creating one large aggregate feed and using Kafka's partitioning along with some custom logic to keep the feeds apart (roughly the kind of keyed-producer setup sketched below). I would prefer to use Kafka's built-in feed mechanism, because there are significant benefits to that approach, but I can also imagine that this many feeds was never among the base assumptions about how the system would be used, which would make performance questionable. Any input is appreciated.

--Eric
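To make the aggregate-feed option concrete, here is a minimal sketch using Kafka's Java producer client. The broker address, the topic name "all-feeds", the key "feed-42", and the payload are all hypothetical; the idea it illustrates is that keying each record by a feed ID makes the default partitioner hash every feed onto a fixed partition, so this is a sketch of one possible setup rather than a prescribed design.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class AggregateFeedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Assumption: a broker reachable on localhost.
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // "all-feeds" and "feed-42" are hypothetical names. Using the
                // feed ID as the record key means the default partitioner
                // hashes each feed onto a fixed partition, so events within
                // one feed stay ordered. Consumers still need custom logic to
                // filter records by key and split the streams back apart.
                producer.send(new ProducerRecord<>("all-feeds", "feed-42", "event payload"));
            }
        }
    }

The trade-off this sketch exposes is that the "custom logic" lives on the consumer side: every consumer of the aggregate feed reads (and discards) records for feeds it doesn't care about, whereas with one topic per feed Kafka does that separation for you.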