Todd Palino <tpal...@gmail.com>
April 9, 2015 at 11:58 AM
1000s of partitions should not be a problem at all. Our largest clusters
have over 30k partitions in them without a problem (running on 40
brokers).
We've run into some issues when you have more than 4000 partitions (either
leader or replica) on a single broker, but that was on older code so there
may be less of an issue now. You'll want to keep an eye on your retention
settings, combined with the number of open file handles allowed for your
broker process. We run with the limit set to 200k right now so we have
plenty of headroom.
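For a concrete way to keep an eye on that, here's a minimal sketch of checking descriptor usage from inside a JVM. It assumes a Unix-like host and the HotSpot com.sun.management extensions, and the class name is just for illustration:

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;
    import com.sun.management.UnixOperatingSystemMXBean;

    public class FdHeadroomCheck {
        public static void main(String[] args) {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            if (os instanceof UnixOperatingSystemMXBean) {
                UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
                long open = unix.getOpenFileDescriptorCount();
                long max = unix.getMaxFileDescriptorCount();  // e.g. a 200k ulimit
                System.out.printf("open file descriptors: %d / %d (%.1f%% used)%n",
                        open, max, 100.0 * open / max);
            }
        }
    }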
The 100k consumers I'm not as sure about. We have active clusters that have
over 250k open network connections across all the brokers combined (about
12-15k per broker), but most of those connections are producers, not
consumers. While the brokers themselves may be able to handle that number of
consumers, especially if you scale out horizontally a bit and make sure you
use a high enough partition count so you don't get hot brokers, that's not
where I think you'll hit a problem. It's actually Zookeeper that will give
you the headache, and it will be hard to see coming.
Zookeeper has a default limit of 1 MB on the size of the data in a znode.
This is usually fine, although some of the cluster commands, like partition
moves and preferred replica election, can hit it if you have a high number
of topics. What is less well understood is that the list of child nodes of a
znode must ALSO fit inside that limit. So if you have 100k consumers, and
each group name is at least 10 characters long (don't forget the overhead of
serializing the list!), you'll blow the limit on the /consumers node. We
actually ran into this in one of our ZK clusters for a different
application. It not only caused ZK to fail, it corrupted the snapshots in
the ensemble.
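To put rough numbers on that, a minimal sketch is below; the per-entry overhead is an assumption about how the child list is serialized, not a measured figure:

    public class ConsumersZnodeEstimate {
        public static void main(String[] args) {
            long groups = 100_000;       // consumer groups registered under /consumers
            long nameBytes = 10;         // ~10 characters per group name
            long perEntryOverhead = 4;   // assumed length prefix per entry in the child list
            long totalBytes = groups * (nameBytes + perEntryOverhead);
            System.out.printf("child list of /consumers: ~%.2f MB (default znode limit is ~1 MB)%n",
                    totalBytes / (1024.0 * 1024.0));
        }
    }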
Now, you could conceivably raise the limit in Zookeeper (you need to set it
to the same value on both the Zookeeper servers and their clients), but I
think you're going to run into other problems: possibly with Zookeeper
itself under the amount of traffic you'll get from those consumers, and also
with Kafka not handling that number of consumers well or hitting previously
unknown race conditions.
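For reference, the knob involved is ZooKeeper's jute.maxbuffer system property. A minimal sketch of setting it on a client JVM is below, with an arbitrary 4 MB value and placeholder hostnames; the same value would have to be configured on every ZooKeeper server as well:

    import org.apache.zookeeper.ZooKeeper;

    public class LargerZnodeLimitClient {
        public static void main(String[] args) throws Exception {
            // Must match the jute.maxbuffer value configured on the ZooKeeper servers.
            System.setProperty("jute.maxbuffer", String.valueOf(4 * 1024 * 1024));
            ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30_000, event -> { });
            System.out.println("session state: " + zk.getState());
            zk.close();
        }
    }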
Now, as far as your model goes, I think you should rethink it a little. We
have a similar model in place, which we're in the process of getting rid of,
for reading metrics out of Kafka. All the servers that store metrics in RRD
files consume ALL the metrics data, and then they throw out everything they
don't have an RRD for. It's not only inefficient, it magnifies any increase
in incoming traffic many-fold on the consume side. We nearly took down a
cluster at one point because a 1.5 MB/sec increase in traffic on the produce
side turned into a 100-fold increase on the consume side. Kafka can be part
of your system, but if that's the way you're going to go, I think you should
put a layer between Kafka and the consumers to route the messages properly:
a queueing solution that consumes the data out of Kafka once and separates
it into per-customer buckets (with no retention) that your customers then
pull from.
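A minimal sketch of what that routing layer could look like is below. It uses the newer Java consumer API (which postdates this thread) purely for illustration; the topic name, bootstrap servers, and the in-memory per-user queues are all placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.LinkedBlockingQueue;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class FanOutRouter {
        // One in-memory queue per userid; customers pull from these instead of from Kafka.
        private final Map<String, BlockingQueue<String>> perUserQueues = new ConcurrentHashMap<>();

        public void run() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka1:9092");   // placeholder
            props.put("group.id", "fan-out-router");         // one group reads everything once
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("customer-logs"));  // placeholder topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        String userId = record.key();  // assumes messages are keyed by userid
                        perUserQueues
                                .computeIfAbsent(userId, id -> new LinkedBlockingQueue<>())
                                .offer(record.value());
                    }
                }
            }
        }
    }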
Another solution is to use keyed partitioning, if it is possible with your
architecture, to bucket the userids into separate partitions. That way the
customers could consume just the bucket they are interested in. It would
require more up-front work to write the custom partitioner, but it would be
very efficient as you move forward.
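For illustration, a minimal sketch of such a partitioner against the newer Java producer's pluggable Partitioner interface (not the producer API this thread was written against) is below; the murmur2-based hashing is just one consistent choice:

    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.utils.Utils;

    /** Routes each userid key to a stable partition so a customer can read just one bucket. */
    public class UserIdPartitioner implements Partitioner {

        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            // Assumes every record is keyed by the userid (keyBytes != null).
            List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
            int numPartitions = partitions.size();
            // murmur2 keeps the mapping stable across producer restarts; any consistent hash works.
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        @Override
        public void close() { }
    }

The producer would pick it up via its partitioner.class setting, and a customer would only need to know which partition its userid hashes to.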
-Todd
Ralph Caraveo <decka...@gmail.com>
April 8, 2015 at 10:35 PM
Hello Kafka Friends,
We are considering a use case where we'd like to have a Kafka cluster with
potentially thousands of partitions, using a hashed key on customer userids.
We have heard that Kafka can support thousands of partitions in a single
cluster, and I wanted to find out whether it's reasonable to have that many
partitions.
Additionally, we'd like to have potentially hundreds of thousands of
consumers consuming a somewhat low volume of log data from these partitions.
I'd also like to know whether having that many consumers is reasonable or
recommended with Kafka.
The scenario would be something like this: we have 100,000 to 200,000
customers, and we'd like to have their data sharded by userid into a cluster
of, say, 4000 partitions. We'd then like to have a consumer running for each
userid that consumes that user's log data.
In this scenario (assuming 100,000 userids) we'd have 100,000 / 4000 = 25
consumers reading each partition, where each consumer reads every offset and
ignores any message whose key doesn't match the userid it is assigned to.
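Concretely, each per-userid consumer in this model would look something like the sketch below; it uses the newer Java consumer API for illustration, and the topic name, bootstrap server, and userid-to-partition mapping are placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PerUserFilteringConsumer {
        public static void main(String[] args) {
            String userId = "user-00042";                            // the one userid this consumer cares about
            int partition = Math.floorMod(userId.hashCode(), 4000);  // placeholder userid-to-partition mapping

            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka1:9092");           // placeholder
            props.put("group.id", "consumer-" + userId);
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(new TopicPartition("customer-logs", partition)));
                while (true) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                        if (!userId.equals(record.key())) {
                            continue;  // roughly 24 of every 25 messages get thrown away here
                        }
                        System.out.println(record.value());  // process this user's log line
                    }
                }
            }
        }
    }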
My gut feeling with all of this tells me that this may not be a sound
solution, because we'd need to have a ton of file descriptors open and there
could be a lot of overhead on Kafka managing this volume of consumers.
Any guidance is appreciated... mainly I'm just looking to see if this is a
reasonable use of Kafka or if we need to go back to the drawing board.
I appreciate any help!
-Ralph