Todd Palino <tpal...@gmail.com>
April 9, 2015 at 11:58 AM
1000s of partitions should not be a problem at all. Our largest clusters
have over 30k partitions in them without a problem (running on 40
brokers).
We've run into some issues when you have more than 4000 partitions (either
leader or replica) on a single broker, but that was on older code so there
may be less of an issue now. You'll want to keep an eye on your retention
settings, combined with the number of open file handles allowed for your
broker process. We run with the limit set to 200k right now so we have
plenty of headroom.
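For a concrete way to keep an eye on that, here's a minimal sketch of checking descriptor usage from inside a JVM. It assumes a Unix-like host and the HotSpot com.sun.management extensions, and the class name is just for illustration:

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;
    import com.sun.management.UnixOperatingSystemMXBean;

    public class FdHeadroomCheck {
        public static void main(String[] args) {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            if (os instanceof UnixOperatingSystemMXBean) {
                UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
                long open = unix.getOpenFileDescriptorCount();
                long max = unix.getMaxFileDescriptorCount();  // e.g. a 200k ulimit
                System.out.printf("open file descriptors: %d / %d (%.1f%% used)%n",
                        open, max, 100.0 * open / max);
            }
        }
    }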
The 100k consumers I'm not as sure about. We have active clusters that have
over 250k open network connections across all the brokers combined (about
12-15k per broker), but most of those connections are producers, not
consumers. While the brokers themselves may be able to handle that number of
consumers, especially if you scale out horizontally a bit and make sure you
use a high enough partition count so you don't get hot brokers, that's not
where I think you'll hit a problem. It's actually Zookeeper that will give
you the headache, and it will be hard to see coming.
Zookeeper has a default limit of 1 MB on the size of the data in a znode.
This is usually fine, although some of the cluster commands, like partition
moves and preferred replica election, can hit it if you have a high number
of topics. What is less well understood is that the list of child nodes of a
znode must ALSO fit inside that limit. So if you have 100k consumers, and
each group name is at least 10 characters long (don't forget the overhead of
serializing the list!), you'll blow the limit on the /consumers node. We
actually ran into this in one of our ZK clusters for a different
application. It not only caused ZK to fail, it corrupted the snapshots in
the ensemble.
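To put rough numbers on that, a minimal sketch is below; the per-entry overhead is an assumption about how the child list is serialized, not a measured figure:

    public class ConsumersZnodeEstimate {
        public static void main(String[] args) {
            long groups = 100_000;       // consumer groups registered under /consumers
            long nameBytes = 10;         // ~10 characters per group name
            long perEntryOverhead = 4;   // assumed length prefix per entry in the child list
            long totalBytes = groups * (nameBytes + perEntryOverhead);
            System.out.printf("child list of /consumers: ~%.2f MB (default znode limit is ~1 MB)%n",
                    totalBytes / (1024.0 * 1024.0));
        }
    }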
Now, you could conceivably raise the limit in Zookeeper (you need to set it
to the same value on both the Zookeeper servers and their clients), but I
think you're going to run into other problems: possibly with Zookeeper
itself under the amount of traffic you'll get from those consumers, and also
with Kafka not handling that number of consumers well or hitting previously
unknown race conditions.
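For reference, the knob involved is ZooKeeper's jute.maxbuffer system property. A minimal sketch of setting it on a client JVM is below, with an arbitrary 4 MB value and placeholder hostnames; the same value would have to be configured on every ZooKeeper server as well:

    import org.apache.zookeeper.ZooKeeper;

    public class LargerZnodeLimitClient {
        public static void main(String[] args) throws Exception {
            // Must match the jute.maxbuffer value configured on the ZooKeeper servers.
            System.setProperty("jute.maxbuffer", String.valueOf(4 * 1024 * 1024));
            ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30_000, event -> { });
            System.out.println("session state: " + zk.getState());
            zk.close();
        }
    }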
Now, as far as your model goes, I think you should rethink it a little. We
have a similar model in place, which we're in the process of getting rid of,
for reading metrics out of Kafka. All the servers that store metrics in RRD
files consume ALL the metrics data, and then they throw out everything they
don't have an RRD for. It's not only inefficient, it magnifies any increase
in incoming traffic many-fold on the consume side. We nearly took down a
cluster at one point because a 1.5 MB/sec increase in traffic on the produce
side turned into a 100-fold increase on the consume side. Kafka can be part
of your system, but if that's the way you're going to go, I think you should
put a layer between Kafka and the consumers to route the messages properly:
a queueing solution that consumes the data out of Kafka once and separates
it into per-customer buckets (with no retention) that your customers then
pull from.
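A minimal sketch of what that routing layer could look like is below. It uses the newer Java consumer API (which postdates this thread) purely for illustration; the topic name, bootstrap servers, and the in-memory per-user queues are all placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.LinkedBlockingQueue;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class FanOutRouter {
        // One in-memory queue per userid; customers pull from these instead of from Kafka.
        private final Map<String, BlockingQueue<String>> perUserQueues = new ConcurrentHashMap<>();

        public void run() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka1:9092");   // placeholder
            props.put("group.id", "fan-out-router");         // one group reads everything once
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("customer-logs"));  // placeholder topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        String userId = record.key();  // assumes messages are keyed by userid
                        perUserQueues
                                .computeIfAbsent(userId, id -> new LinkedBlockingQueue<>())
                                .offer(record.value());
                    }
                }
            }
        }
    }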
Another solution is to use keyed partitioning, if it is possible with your
architecture, to bucket the userids into separate partitions. That way the
customers could consume just the bucket they are interested in. It would
require more up-front work to write the custom partitioner, but it would be
very efficient as you move forward.
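For illustration, a minimal sketch of such a partitioner against the newer Java producer's pluggable Partitioner interface (not the producer API this thread was written against) is below; the murmur2-based hashing is just one consistent choice:

    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.utils.Utils;

    /** Routes each userid key to a stable partition so a customer can read just one bucket. */
    public class UserIdPartitioner implements Partitioner {

        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            // Assumes every record is keyed by the userid (keyBytes != null).
            List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
            int numPartitions = partitions.size();
            // murmur2 keeps the mapping stable across producer restarts; any consistent hash works.
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        @Override
        public void close() { }
    }

The producer would pick it up via its partitioner.class setting, and a customer would only need to know which partition its userid hashes to.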
-Todd
Ralph Caraveo <decka...@gmail.com>
April 8, 2015 at 10:35 PM
Hello Kafka Friends,
We are considering a use case where we'd like to have a Kafka cluster with
potentially thousands of partitions, using a hashed key on customer userids.
We have heard that Kafka can support thousands of partitions in a single
cluster, and I wanted to find out whether it's reasonable to have that many
partitions.
Additionally, we'd like to have potentially hundreds of thousands of
consumers consuming a somewhat low volume of log data from these partitions.
I'd also like to know whether having that many consumers is reasonable or
recommended with Kafka.
The scenario would be something like this: we have 100,000 to 200,000
customers, and we'd like to have their data sharded by userid into a cluster
of, say, 4000 partitions. We'd then like to have a consumer running for each
userid that consumes that user's log data.
In this scenario (assuming 100,000 userids) we'd have 100,000 / 4000 = 25
consumers reading each partition, where each consumer reads every offset and
ignores any message whose key doesn't match the userid it is assigned to.
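Concretely, each per-userid consumer in this model would look something like the sketch below; it uses the newer Java consumer API for illustration, and the topic name, bootstrap server, and userid-to-partition mapping are placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PerUserFilteringConsumer {
        public static void main(String[] args) {
            String userId = "user-00042";                            // the one userid this consumer cares about
            int partition = Math.floorMod(userId.hashCode(), 4000);  // placeholder userid-to-partition mapping

            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka1:9092");           // placeholder
            props.put("group.id", "consumer-" + userId);
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(new TopicPartition("customer-logs", partition)));
                while (true) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                        if (!userId.equals(record.key())) {
                            continue;  // roughly 24 of every 25 messages get thrown away here
                        }
                        System.out.println(record.value());  // process this user's log line
                    }
                }
            }
        }
    }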
My gut feeling with all of this tells me that this may not be a sound
solution, because we'd need to have a ton of file descriptors open and there
could be a lot of overhead on Kafka managing this volume of consumers.
Any guidance is appreciated... mainly I'm just looking to see if this is a
reasonable use of Kafka or if we need to go back to the drawing board.
I appreciate any help!
-Ralph