Sam,

That seems like a bug. If you can reproduce it with Kafka 0.7.1, would you mind filing a bug and attaching a test case?
Thanks,
Neha

On Wed, Jul 18, 2012 at 12:04 PM, Sam William <sa...@stumbleupon.com> wrote:
> Neha,
>   Here is the full stack trace
>
> org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/sampd-rate2/ids/sampd-rate2_sv4r25s49-1342637842023-a4b442c4
>         at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:750)
>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:744)
>         at kafka.utils.ZkUtils$.readData(ZkUtils.scala:163)
>         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$getTopicCount(ZookeeperConsumerConnector.scala:421)
>         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:460)
>         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:437)
>         at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78)
>         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:433)
>         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.handleChildChange(ZookeeperConsumerConnector.scala:375)
>         at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568)
>         at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/sampd-rate2/ids/sampd-rate2_sv4r25s49-1342637842023-a4b442c4
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
>         at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
>         at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
>         at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>
> On Jul 17, 2012, at 3:03 PM, Neha Narkhede wrote:
>
> > Sam,
> >
> > Could you please send the entire stack trace for that exception? It
> > means that the consumer couldn't complete a rebalancing operation, and it is
> > possible that the consumer is not pulling all the data for the requested
> > topics.
> >
> > Thanks,
> > Neha
> >
> > On Tue, Jul 17, 2012 at 1:29 PM, Sam William <sa...@stumbleupon.com> wrote:
> >
> >> On Mar 20, 2012, at 8:49 AM, Neha Narkhede wrote:
> >>
> >>> Peter,
> >>>
> >>>>> If this exception is thrown, will the consumer then intelligently wait
> >>>>> for the rebalancing to complete and then resume consumption?
> >>>
> >>> If this exception is thrown, it means that the consumer has failed the
> >>> current rebalancing attempt and will retry only when one of the
> >>> following happens:
> >>>
> >>> 1. New partitions are added to the topic it is consuming
> >>> 2. Existing partitions become unavailable
> >>> 3. New consumer instances are brought up for the consumer group it belongs to
> >>> 4. Existing consumer instances die for the consumer group it belongs to
> >>>
> >>> Until then, the consumer is not fully functional. So this particular
> >>> exception should be monitored, and the consumer instance should be
> >>> restarted.
> >>>
> >>> Having said that, it is pretty rare for the consumer to run out of
> >>> rebalancing attempts. One of the common causes is using ZooKeeper
> >>> 3.3.3, which causes older ephemeral nodes to be retained.
> >>> Which version of Kafka are you using?
> >>> Would you mind attaching the entire log for the consumer? It will help
> >>> us debug the cause of this exception and see if it is an actual bug.
> >>>
> >>> Thanks,
> >>> Neha
> >>
> >> Neha,
> >>   I see this exception
> >>
> >> 2012-07-17 12:58:12,238 ERROR [ZkClient-EventThread-17-11.zookeeper.,12.zookeeper.,13.zookeeper.,14.zookeeper.,16.zookeeper./kafka] zkclient.ZkEventThread Error handling event ZkEvent[Children of /consumers/live-event-sense-new8/ids changed sent to kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener@6d9dd520]
> >> java.lang.RuntimeException: live-event-sense-new8_sv4r25s49-1342554132312-c04abfef can't rebalance after 4 retries
> >>
> >> occurring very often. I use ZK 3.4.3. I'm not handling/monitoring this
> >> exception. The consumer seems to continue just fine after this happens. I
> >> do not see any of the 4 conditions you mentioned happening. Am I missing
> >> something?
> >>
> >> Thanks,
> >> Sam
> >>
> >>> On Tue, Mar 20, 2012 at 2:42 AM, Peter Thygesen <pt.activ...@gmail.com> wrote:
> >>>> When I shut down my consumer with Ctrl-C and try to restart it quickly
> >>>> afterwards, I usually get ConsumerRebalanceFailedException (see below). The
> >>>> application then seems to hang, or at least I'm not sure if it is still
> >>>> running. If this exception is thrown, will the consumer then intelligently
> >>>> wait for the rebalancing to complete and then resume consumption?
> >>>>
> >>>> I found a page, https://cwiki.apache.org/KAFKA/consumer-co-ordinator.html,
> >>>> that describes something about the Consumer Co-ordinator. According to this,
> >>>> the consumer group remains in this state until the next rebalancing attempt
> >>>> is triggered. But when is it triggered?
> >>>>
> >>>> Could a shutdown hook with a consumer.commitOffsets help?
> >>>> Does consumer.shutdown implicitly call commitOffsets?
> >>>>
> >>>> Exception in thread "main" kafka.common.ConsumerRebalanceFailedException:
> >>>> contentItem-consumer-group-1_cphhdfs01node09-1332175323213-e6a3010f can't
> >>>> rebalance after 4 retries
> >>>>         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:467)
> >>>>         at kafka.consumer.ZookeeperConsumerConnector.consume(ZookeeperConsumerConnector.scala:204)
> >>>>         at kafka.javaapi.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:75)
> >>>>         at kafka.javaapi.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:89)
> >>>>         at com.infopaq.research.repository.uima.ContentItemClient.consume(ContentItemClient.java:75)
> >>>>         at com.infopaq.research.repository.uima.ContentItemClient.main(ContentItemClient.java:111)
> >>>>
> >>>> Brgds,
> >>>> Peter Thygesen
> >>>>
> >>>> BTW: Great work, very interesting project.
> >>
> >> Sam William
> >> sa...@stumbleupon.com
>
> Sam William
> sa...@stumbleupon.com
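
A minimal sketch of the two points discussed in the thread, assuming the 0.7-era high-level consumer API: committing offsets from a shutdown hook before calling shutdown(), and treating ConsumerRebalanceFailedException as fatal for the instance so it can be restarted, as suggested above. The topic name, group id, ZooKeeper connect string, and class name are placeholders, not part of the original discussion.

import java.util.Collections;
import java.util.Properties;

import kafka.common.ConsumerRebalanceFailedException;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class RebalanceAwareConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder connection settings; property names follow the 0.7-era consumer config.
        props.put("zk.connect", "localhost:2181");
        props.put("groupid", "example-group");

        final ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Shutdown hook: commit offsets explicitly before shutting the connector down,
        // so a quick restart does not re-consume messages that were already processed.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                connector.commitOffsets();
                connector.shutdown();
            }
        });

        try {
            // createMessageStreams triggers the initial rebalance; if it cannot complete
            // within the configured number of retries, it throws
            // ConsumerRebalanceFailedException (the exception seen in this thread).
            connector.createMessageStreams(Collections.singletonMap("example-topic", 1));
            // ... hand the returned streams to worker threads here ...
        } catch (ConsumerRebalanceFailedException e) {
            // Per the advice above: treat this as fatal for the instance. Shut down and
            // let a supervisor restart the process (or back off and recreate the connector).
            connector.shutdown();
            throw e;
        }
    }
}

Committing explicitly in the hook is the conservative choice, regardless of whether shutdown() also commits under the autocommit setting.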