Re: Rebalance not happening even after increasing max retries causing conflict in ZK

2014-11-07 Thread Mohit Kathuria
Hi all, Can someone help here? We are getting constant rebalance failures each time a consumer is added beyond a certain number. I did quite a lot of debugging on this and am still not able to figure out the pattern. -Thanks, Mohit On Mon, Nov 3, 2014 at 10:53 PM, Mohit Kathuria

Re: Disactivating Yammer Metrics Monitoring

2014-11-07 Thread François Langelier
Good to know! Thanks Jason, I'll look at it ASAP! :) François Langelier Étudiant en génie Logiciel - École de Technologie Supérieure http://www.etsmtl.ca/ Membre Club Capra http://capra.etsmtl.ca/ VP-Communication - CS Games http://csgames.org 2014 Jeux de Génie http://www.jdgets.com/ 2011 à

Interrupting controlled shutdown breaks Kafka cluster

2014-11-07 Thread Solon Gordon
Hi all, My team has observed that if a broker process is killed in the middle of the controlled shutdown procedure, the remaining brokers start spewing errors and do not properly rebalance leadership. The cluster cannot recover without major manual intervention. Here is how to reproduce the

Re: corrupt recovery checkpoint file issue....

2014-11-07 Thread Guozhang Wang
Jun, Checking the OffsetCheckpoint.write function: if fileOutputStream.getFD.sync throws an exception, it will just be caught and forgotten, and the swap will still happen. Maybe we need to catch the SyncFailedException and re-throw it as a FATAL error to skip the swap. Guozhang On Thu, Nov 6,
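The "skip the swap when the sync fails" idea can be sketched roughly as follows. This is a minimal Python illustration, not the Kafka code (which is Scala): the function name `write_checkpoint`, the injectable `sync` parameter, and the on-disk layout are assumptions made for the example. The point is only that a failed sync propagates, so the rename over the previous checkpoint never runs.

```python
import os


def write_checkpoint(path, offsets, sync=os.fsync):
    """Write offsets to a temp file, sync it, then swap it over `path`.

    `offsets` maps (topic, partition) -> offset. `sync` is injectable only
    so the failure path can be exercised; it is a hypothetical parameter.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        # Assumed layout for this sketch: entry count, then one entry per line.
        f.write(f"{len(offsets)}\n")
        for (topic, partition), offset in sorted(offsets.items()):
            f.write(f"{topic} {partition} {offset}\n")
        f.flush()
        # If the sync raises, the exception propagates and the swap
        # below is never reached, so the old checkpoint stays intact.
        sync(f.fileno())
    # Swap the fully synced temp file over the previous checkpoint.
    os.replace(tmp_path, path)
```

With this shape, a caller that sees the exception can treat it as fatal rather than continuing with a checkpoint file of unknown integrity.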

Re: OffsetOutOfRange errors

2014-11-07 Thread Guozhang Wang
Hi Jim, When messages get cleaned based on the data retention policy (by time or by size), the brokers will not inform ZK of the deletion event. The underlying assumption is that when consumers are fetching data at around the tail of the log (i.e. they are not lagging much, which is the normal case)

Re: OffsetOutOfRange errors

2014-11-07 Thread Jimmy John
The current setting is to commit to ZK every 100 messages read. The read buffer size is 262144 bytes. So we will read in a bunch of messages in a batch. And while iterating through those messages, we commit the offset to ZK every 100. jim On Fri, Nov 7, 2014 at 10:13 AM, Guozhang Wang
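The commit pattern described here (iterate over a fetched batch, commit every 100 messages) might look like the sketch below. It is an illustration only: `consume_batch`, the `commit` callback, and the `(offset, payload)` message shape are hypothetical stand-ins, not the API of any particular Kafka client.

```python
def consume_batch(messages, commit, commit_interval=100):
    """Process (offset, payload) pairs, committing every `commit_interval`.

    `commit` is a hypothetical callback that stores the next offset to
    read (for example, by writing it to ZK).
    """
    processed = 0
    last_offset = None
    for offset, payload in messages:
        # Application-specific handling of `payload` would go here.
        processed += 1
        last_offset = offset
        if processed % commit_interval == 0:
            # Commit the offset of the next message to read.
            commit(offset + 1)
    return last_offset
```

In this sketch, up to `commit_interval - 1` messages after the last commit would be read again following a restart.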

Re: OffsetOutOfRange errors

2014-11-07 Thread Guozhang Wang
When would you read offsets from ZK, only when starting up? Also, what are your data retention config values on the broker? Guozhang On Fri, Nov 7, 2014 at 10:30 AM, Jimmy John jj...@livefyre.com wrote: The current setting is to commit to ZK every 100 messages read. The read buffer size is

Re: OffsetOutOfRange errors

2014-11-07 Thread Jason Rosenberg
The bottom line is that you are likely not consuming messages fast enough, so you are falling behind. You are steadily consuming older and older messages, and eventually you are consuming messages older than the retention time window set for your kafka broker. That's the typical scenario for
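As a rough back-of-the-envelope check of the "falling behind" scenario, the sketch below estimates how long a slow consumer can run before the messages it is reading are older than the retention window. It assumes constant produce and consume rates, a consumer that starts fully caught up, and purely time-based retention; the function name and parameters are made up for the example. After t seconds the consumer is reading a message produced at time t * consume_rate / produce_rate, so that message's age is t * (1 - consume_rate / produce_rate).

```python
def seconds_until_out_of_range(produce_rate, consume_rate, retention_seconds):
    """Estimate when a lagging consumer reaches data older than retention.

    Rates are in messages per second. Returns None if the consumer keeps
    up, in which case it never falls out of the retention window.
    """
    if consume_rate >= produce_rate:
        return None
    # Age of the message being read grows as t * (1 - consume/produce);
    # solve for the t at which that age equals the retention window.
    return retention_seconds / (1 - consume_rate / produce_rate)
```

For instance, a consumer running at 90% of the produce rate with a 7-day retention window would reach the edge of the window after roughly 70 days under these assumptions. Real brokers delete whole log segments, so the actual boundary is coarser than this estimate.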

Re: Interrupting controlled shutdown breaks Kafka cluster

2014-11-07 Thread Guozhang Wang
Solon, Which version of Kafka are you running and are you enabling auto leader rebalance at the same time? Guozhang On Fri, Nov 7, 2014 at 8:41 AM, Solon Gordon so...@knewton.com wrote: Hi all, My team has observed that if a broker process is killed in the middle of the controlled shutdown

Re: Interrupting controlled shutdown breaks Kafka cluster

2014-11-07 Thread Solon Gordon
We're using 0.8.1.1 with auto.leader.rebalance.enable=true. On Fri, Nov 7, 2014 at 2:35 PM, Guozhang Wang wangg...@gmail.com wrote: Solon, Which version of Kafka are you running and are you enabling auto leader rebalance at the same time? Guozhang On Fri, Nov 7, 2014 at 8:41 AM, Solon

Re: Announcing Confluent

2014-11-07 Thread vipul jhawar
Best of luck. Will stay tuned to news. On Thu, Nov 6, 2014 at 11:58 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey all, I’m happy to announce that Jun Rao, Neha Narkhede and I are creating a company around Kafka called Confluent. We are planning on productizing the kind of Kafka-based

Add partitions with replica assignment in same command

2014-11-07 Thread Allen Wang
I am trying to figure out how to add partitions and assign replicas using one admin command. I tried kafka.admin.TopicCommand to increase the partition number from 9 to 12 with the following options: /apps/kafka/bin/kafka-run-class.sh kafka.admin.TopicCommand --zookeeper ${ZOOKEEPER} --alter