As I noted, it really does depend on what you need. With a small number of
topics, I would make the number of partitions a multiple of the number of
brokers. That will balance them across the cluster, while still giving you
some freedom to have larger partition counts for larger topics.
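
As a rough sketch of that rule combined with the sizing guidelines quoted
below (at least as many partitions as consumers in the largest group, and
roughly 50GB or less on disk per partition), the arithmetic might look like
this. The function name, its parameters, and the example numbers are
hypothetical and only for illustration; this is not a Kafka API or an
official sizing calculator.

```python
import math


def suggest_partition_count(num_brokers, largest_group_consumers,
                            expected_topic_size_gb,
                            max_partition_size_gb=50):
    """Illustrative helper (hypothetical, not part of Kafka).

    Picks the smallest partition count that is a multiple of the broker
    count while satisfying the consumer-count and on-disk-size guidelines.
    """
    # Guideline: at least as many partitions as consumers in the largest group.
    by_consumers = largest_group_consumers
    # Guideline: keep each partition at or below the size limit on disk.
    by_size = math.ceil(expected_topic_size_gb / max_partition_size_gb)
    needed = max(1, by_consumers, by_size)
    # Round up to a multiple of the broker count so partitions spread evenly.
    return math.ceil(needed / num_brokers) * num_brokers


if __name__ == "__main__":
    # Assumed example: 6 brokers, 10 consumers, about 400GB retained on disk.
    print(suggest_partition_count(6, 10, 400))
```

Other application requirements (keyed messages, specific topic counts) and
the per-broker partition limit still need to be checked by hand.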

-Todd

On Wed, Apr 8, 2015 at 9:29 AM, Akshat Aranya <aara...@gmail.com> wrote:

> Thanks for the info, Todd.  This is very useful.  Please see my question
> inline:
>
> On Mon, Apr 6, 2015 at 10:24 AM, Todd Palino <tpal...@gmail.com> wrote:
>
> >
> >     - Partition count (leader and follower combined) on each broker
> >       should stay under 4000
> >
> > As far as topic volume goes, it varies widely. We have topics that only
> > see a single message per minute (or less). Our largest topic by bytes has
> > a peak rate of about 290 Mbits/sec. Our largest topic by messages has a
> > peak rate of about 225k messages/sec. Note that those are in the same
> > cluster.
> > When we are sizing topics (number of partitions), we use the following
> > guidelines:
> >     - Have at least as many partitions as there are consumers in the
> >       largest group
> >     - Keep partition size on disk under 50GB per partition (better
> >       balance)
> >     - Take into account any other application requirements (keyed
> >       messages, specific topic counts required, etc.)
> >
> What would you say is a recommended configuration when you don't have too
> many topics?  It seems like having too many partitions is not recommended,
> but at the same time, you need more partitions to be able to utilize all
> the disks and handle the data rate, especially for high-volume topics.
>
> > I hope this helps. I'll be covering some of this at my ApacheCon talk
> > (Kafka at Scale: Multi-Tier Architectures) and at the meet up that Jun
> > has set up at ApacheCon. If you have any questions, just ask!
> >
> > -Todd
> >
> >
> > On Mon, Apr 6, 2015 at 9:35 AM, Rama Ramani <rama.ram...@live.com> wrote:
> >
> > > Hello,
> > >           I am trying to understand some of the common Kafka deployment
> > > sizes ("small", "medium", "large") and configuration to come up with a
> > > set of common templates for deployment on Linux. Some of the Qs to
> > > answer are:
> > >
> > > - Number of nodes in the cluster
> > > - Machine Specs (cpu, memory, number of disks, network etc.)
> > > - Speeds & Feeds of messages
> > > - What are some of the best practices to consider when laying out the
> > > clusters?
> > > -  Is there a sizing calculator for coming up with this?
> > >
> > > If you can please share pointers to existing materials or specific
> > > details of your deployment, that will be great.
> > >
> > > Regards
> > > Rama
> > >
> >
>
