Guy, I am adding a FAQ to the website. Here is the content.
My consumer seems to have stopped, why?

First, try to figure out whether the consumer has really stopped or is just
slow, using our tool ConsumerOffsetChecker.

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group consumer-group1 --zkconnect zkhost:zkport --topic topic1

consumer-group1,topic1,0-0 (Group,Topic,BrokerId-PartitionId)
  Owner = consumer-group1-consumer1
  Consumer offset = 70121994703
                  = 70,121,994,703 (65.31G)
  Log size = 70122018287
           = 70,122,018,287 (65.31G)
  Consumer lag = 23584
               = 23,584 (0.00G)

Run the tool a few times and compare the output. If the consumer offset is
not moving after some time, then the consumer has likely stopped. If the
consumer offset is moving, but the consumer lag (the difference between the
end of the log and the consumer offset) is increasing, the consumer is slower
than the producer.

If the consumer is slow, the typical solution is to increase the degree of
parallelism in the consumer. This may require increasing the number of
partitions of the topic.

If the consumer has stopped, one of the typical causes is that the
application code that consumes messages threw an uncaught exception and
therefore killed the consumer thread. We recommend using a try/catch clause
to log all Throwable in the consumer logic.

Thanks,

Jun

On Thu, Jul 5, 2012 at 6:48 AM, Guy Doulberg <guy.doulb...@conduit.com> wrote:

> Hi guys,
>
> I am running a kafka cluster with 3 brokers (0.7.0).
>
> I have 2 consumer-groups on the same topic,
>
> One consumer-group is working fine (meaning it never stops consuming),
>
> Unfortunately the other consumer-group - which contains one consumer, is
> consuming until it suddenly stops...
>
> In the logs of that consumer or the brokers, I can't find anything that
> can indicate why it stopped consuming.
>
> As far as I know, there is no re-balancing in the consumer (also there is
> one consumer),
> I read about bug https://issues.apache.org/jira/browse/KAFKA-256 that
> was fixed at 0.7.1, but I am not sure it is relevant to my case, since
> there is no re-balancing
>
> Any ideas what I can do here?
>
> Thanks
> Guy Doulberg
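
P.S. To illustrate the try/catch recommendation in the FAQ answer, here is a minimal sketch in Java. It deliberately does not use the Kafka consumer API: a plain Iterator<byte[]> stands in for the message stream, and SafeConsumerLoop, MessageHandler and consume are hypothetical names made up for this example. Continuing with the next message after logging is also just one possible choice; an application may prefer to stop or retry instead.

```java
import java.util.Arrays;
import java.util.Iterator;

public class SafeConsumerLoop {

    /** Hypothetical application callback; not part of Kafka. */
    public interface MessageHandler {
        void handle(byte[] payload) throws Exception;
    }

    /**
     * Passes every message from the stream to the handler. Any Throwable
     * raised by the handler is logged instead of escaping, so one bad
     * message cannot silently kill the consuming thread.
     * Returns the number of messages whose processing failed.
     */
    public static int consume(Iterator<byte[]> stream, MessageHandler handler) {
        int failures = 0;
        while (stream.hasNext()) {
            byte[] payload = stream.next();
            try {
                handler.handle(payload);
            } catch (Throwable t) {
                // Log everything, including Errors, so the cause is visible.
                failures++;
                System.err.println("Failed to process message: " + t);
                t.printStackTrace();
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Stand-in for a message stream; the second message makes the handler fail.
        Iterator<byte[]> stream = Arrays.asList(
                "first".getBytes(), "boom".getBytes(), "third".getBytes()).iterator();

        int failures = consume(stream, payload -> {
            String text = new String(payload);
            if (text.equals("boom")) {
                throw new IllegalStateException("cannot process " + text);
            }
            System.out.println("processed " + text);
        });

        System.out.println("failed messages: " + failures);
    }
}
```

With a loop like this, a failure in the application code shows up in the consumer's log with a stack trace, which is the information that was missing in the case described in this thread.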