Re: Topic Partitioning Strategy For Large Data

2014-05-25 Thread Drew Goya
A few things I've learned: 1) Don't break things up into separate topics unless the data in them is truly independent. Consumer behavior can be extremely variable; don't assume you will always be consuming as fast as you are producing. 2) Keep time-related messages in the same partition. Again
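One way to read point 2 is to give related messages the same key and derive the partition from that key, so they land in one partition and keep their relative order. The sketch below is only an illustration of that idea: `partition_for` is a hypothetical helper, the CRC32 hash is a stand-in rather than the hash any Kafka client actually uses, and keying by source host is an assumption about what "related" means for application logs.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index (hypothetical helper).

    CRC32 is used because it is deterministic across processes; it is a
    stand-in, not the hash used by Kafka's own partitioners.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Assumption: log lines from the same host are the "related" messages.
events = [
    ("host-17", "10:00:01 request started"),
    ("host-42", "10:00:01 cache miss"),
    ("host-17", "10:00:02 request finished"),
]
assignments = [(host, partition_for(host, 12)) for host, _ in events]
```

Both `host-17` events map to the same partition index, so a single consumer of that partition sees them in the order they were produced.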

Re: Topic Partitioning Strategy For Large Data

2014-05-23 Thread Joel Koshy
Take a look at: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIchoosethenumberofpartitionsforatopic?
On Fri, May 23, 2014 at 12:49:39PM -0700, Bhavesh Mistry wrote:
> Hi Kafka Users,
>
> We are trying to transport 4TB data per day on single topic. It is operation applica

Topic Partitioning Strategy For Large Data

2014-05-23 Thread Bhavesh Mistry
Hi Kafka Users, We are trying to transport 4 TB of data per day on a single topic. The data is operational application logs. How do we estimate the number of partitions and choose a partitioning strategy? Our goal is to drain (from consumer side) from the Kafka Brokers as soon as messages arrive (keep the lag as min
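As a rough starting point for the estimate asked about here, the daily volume can be converted to an average rate and divided by a measured per-partition throughput. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and a common rule of thumb of sizing for whichever side, producer or consumer, is slower per partition. The per-partition rates and the headroom factor are hypothetical placeholders, not measurements from this thread; real values have to come from benchmarking on the target hardware.

```python
import math

SECONDS_PER_DAY = 86_400


def average_throughput_mb_s(tb_per_day: float) -> float:
    """Average ingest rate in MB/s, using decimal units (1 TB = 1,000,000 MB)."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


def estimate_partitions(
    target_mb_s: float,
    producer_mb_s_per_partition: float,
    consumer_mb_s_per_partition: float,
    headroom: float = 2.0,
) -> int:
    """Partitions needed so the slower side can still sustain the target rate.

    headroom scales the average rate to leave room for peaks and for
    consumers catching up after falling behind (assumed factor).
    """
    required = target_mb_s * headroom
    slowest = min(producer_mb_s_per_partition, consumer_mb_s_per_partition)
    return math.ceil(required / slowest)


avg = average_throughput_mb_s(4.0)  # 4 TB/day, from the question
# Hypothetical measured rates: 20 MB/s produce, 10 MB/s consume per partition.
partitions = estimate_partitions(avg, 20.0, 10.0)
```

At 4 TB per day the average works out to roughly 46 MB/s. Log traffic is rarely uniform over a day, so the peak rate, not the average, is what the headroom factor is meant to cover.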