First you want to find out the reason for the frequent ZooKeeper session expirations. A few common causes:
1. Client-side GC (turn on JVM GC logs to validate this)
2. Long I/O pauses on the ZooKeeper servers (grep for fsync in the ZooKeeper log4j log)
3. GC on the ZooKeeper server (again, turn on JVM GC logs to validate this)

You would want to fix the root cause of the session expirations. I'm not sure yet whether you are hitting KAFKA-992 or not.

Thanks,
Neha

On Sun, Jan 12, 2014 at 7:21 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> This might be an issue of KAFKA-992. The title only mentions the server, but this bug actually applies to consumers as well.
>
> Which version are you using?
>
> Guozhang
>
> On Sun, Jan 12, 2014 at 8:00 AM, Seshadri, Balaji <balaji.sesha...@dish.com> wrote:
>
> > Please see the log.
> >
> > seshbal:tm1mwdpl03-/apps/tc/tm1-deshdpconsumer101/apache-tomcat-7.0.42/logs
> > >ls -1 catalina*|xargs -exec grep "Expired"
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} DEBUG [catalina-exec-3-EventThread] (ZkClient.java:351) - Received event: WatchedEvent state:Expired type:None path:null
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} DEBUG [catalina-exec-6-EventThread] (ZkClient.java:351) - Received event: WatchedEvent state:Expired type:None path:null
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} INFO [catalina-exec-6-EventThread] (ZkClient.java:449) - zookeeper state changed (Expired)
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} INFO [catalina-exec-3-EventThread] (ZkClient.java:449) - zookeeper state changed (Expired)
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} DEBUG [catalina-exec-3-EventThread] (ZkEventThread.java:88) - New event: ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@152e95fa]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} DEBUG [catalina-exec-6-EventThread] (ZkEventThread.java:88) - New event: ZkEvent[State changed to Expired sent to
> > kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@4090755c]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} DEBUG [ZkClient-EventThread-55-tvip-m1-mw-zookeeper.dish.com:2181] (ZkEventThread.java:69) - Delivering event #22 ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@152e95fa]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,265} DEBUG [ZkClient-EventThread-90-tvip-m1-mw-zookeeper.dish.com:2181] (ZkEventThread.java:69) - Delivering event #23 ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@4090755c]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} DEBUG [catalina-exec-10-EventThread] (ZkClient.java:351) - Received event: WatchedEvent state:Expired type:None path:null
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} INFO [catalina-exec-10-EventThread] (ZkClient.java:449) - zookeeper state changed (Expired)
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} DEBUG [catalina-exec-5-EventThread] (ZkClient.java:351) - Received event: WatchedEvent state:Expired type:None path:null
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} DEBUG [catalina-exec-10-EventThread] (ZkEventThread.java:88) - New event: ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@1901a132]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} INFO [catalina-exec-5-EventThread] (ZkClient.java:449) - zookeeper state changed (Expired)
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} DEBUG [ZkClient-EventThread-162-tvip-m1-mw-zookeeper.dish.com:2181] (ZkEventThread.java:69) - Delivering event #24 ZkEvent[State changed to Expired sent to
> > kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@1901a132]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} DEBUG [catalina-exec-5-EventThread] (ZkEventThread.java:88) - New event: ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@10d1a0d2]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,267} DEBUG [ZkClient-EventThread-79-tvip-m1-mw-zookeeper.dish.com:2181] (ZkEventThread.java:69) - Delivering event #25 ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@10d1a0d2]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,268} DEBUG [catalina-exec-8-EventThread] (ZkClient.java:351) - Received event: WatchedEvent state:Expired type:None path:null
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,268} INFO [catalina-exec-8-EventThread] (ZkClient.java:449) - zookeeper state changed (Expired)
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,269} DEBUG [catalina-exec-8-EventThread] (ZkEventThread.java:88) - New event: ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@b977ea5]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,269} DEBUG [ZkClient-EventThread-128-tvip-m1-mw-zookeeper.dish.com:2181] (ZkEventThread.java:69) - Delivering event #28 ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@b977ea5]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,272} DEBUG [catalina-exec-2-EventThread] (ZkClient.java:351) - Received event: WatchedEvent state:Expired type:None path:null
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,273} INFO [catalina-exec-2-EventThread] (ZkClient.java:449) - zookeeper state changed (Expired)
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,273} DEBUG [catalina-exec-2-EventThread] (ZkEventThread.java:88) - New event: ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@376807ed]
> > catalina.out.2014-01-08_16-07-05:{2014-01-10 00:58:03,273} DEBUG [ZkClient-EventThread-41-tvip-m1-mw-zookeeper.dish.com:2181] (ZkEventThread.java:69) - Delivering event #32 ZkEvent[State changed to Expired sent to kafka.consumer.ZookeeperConsumerConnector$ZKSessionExpireListener@376807ed]
> >
> > -----Original Message-----
> > From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com]
> > Sent: Sunday, January 12, 2014 8:51 AM
> > To: 'users@kafka.apache.org'
> > Subject: RE: Looks like consumer fetchers get stopped we are not getting any data
> >
> > I do see the ZkClient expired event when this really happened.
> >
> > -----Original Message-----
> > From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com]
> > Sent: Saturday, January 11, 2014 10:31 PM
> > To: users@kafka.apache.org
> > Subject: RE: Looks like consumer fetchers get stopped we are not getting any data
> >
> > Currently it's set to the default; what timeout should we increase it to?
> > ________________________________________
> > From: Guozhang Wang [wangg...@gmail.com]
> > Sent: Saturday, January 11, 2014 3:53 PM
> > To: users@kafka.apache.org
> > Subject: Re: Looks like consumer fetchers get stopped we are not getting any data
> >
> > From the logs it seems the consumer's ZK registration has been lost, while KAFKA-693 is mainly due to a server-side issue. Could you check if there is a session timeout from the consumer in the ZK log?
> >
> > Guozhang
> >
> > On Sat, Jan 11, 2014 at 2:33 PM, Seshadri, Balaji <balaji.sesha...@dish.com> wrote:
> >
> > > We found the below bug in the Kafka JIRA.
> > > https://issues.apache.org/jira/browse/KAFKA-693
> > >
> > > Could you guys please let us know if this bug is causing the fetchers to get stopped because of the rebalance failure. We are using 0.8-beta, and this fix seems to be in the 0.8 release.
> > >
> > > The issue happens only when we run it in Tomcat; when we use JUnits with the same group.id to test, we receive the messages. It would be great if you could help us confirm whether it is a bug.
> > >
> > > Also, we would like to know if we need to upgrade the server/client to 0.8 or just the server-side code.
> > >
> > > ________________________________________
> > > From: Rob Withers [robert.w.with...@gmail.com]
> > > Sent: Friday, January 10, 2014 10:17 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: Looks like consumer fetchers get stopped we are not getting any data
> > >
> > > That was an interesting section too. Which GC settings would you suggest?
> > >
> > > Thank you,
> > > - charlie
> > >
> > > > On Jan 10, 2014, at 10:11 PM, Jun Rao <jun...@gmail.com> wrote:
> > > >
> > > > Have you looked at our FAQ, especially
> > > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog ?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Fri, Jan 10, 2014 at 2:25 PM, Seshadri, Balaji <balaji.sesha...@dish.com> wrote:
> > > >
> > > >> Any clue would be helpful.
> > > >>
> > > >> -----Original Message-----
> > > >> From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com]
> > > >> Sent: Friday, January 10, 2014 12:46 PM
> > > >> To: users@kafka.apache.org
> > > >> Subject: RE: Looks like consumer fetchers get stopped we are not getting any data
> > > >>
> > > >> Yes, the rebalance begins and exceptions occur.
> > > >> > > > >> > > > >> {2014-01-10 00:58:11,293} INFO > > > >> > > > [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660 > > > b_watcher_executor] > > > >> (?:?) - [account-i > > > >> nfo-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660b], > > > >> Cleared > > > the > > > >> data chunks in all the consumer message iterators > > > >> {2014-01-10 00:58:11,293} INFO > > > >> > > > [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660 > > > b_watcher_executor] > > > >> (?:?) - [account-i > > > >> nfo-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660b], > > > Committing > > > >> all offsets after clearing the fetcher queues > > > >> {2014-01-10 00:58:11,298} DEBUG [catalina-exec-12-SendThread( > > > >> tvip-m1-mw-zookeeper.dish.com:2181)] (ClientCnxn.java:759) - Got > > > >> ping response for sessionid: 0x1437b2879870005 af ter 0ms > > > >> {2014-01-10 00:58:11,313} INFO > > > >> > > > [account-activated-hadoop-consumer_tm1mwdpl04-1389222557906-562b6738_w > > > atcher_executor] > > > >> (?:?) 
- > > > >> [account-activated-hadoop-consumer_tm1mwdpl04-1389222557906-562b673 > > > >> 8], > > > >> begin rebalancing consumer > > > >> account-activated-hadoop-consumer_tm1mwdpl04-1389222557906-562b6738 > > > >> try > > > #1 > > > >> {2014-01-10 00:58:11,314} DEBUG [catalina-exec-12-SendThread( > > > >> tvip-m1-mw-zookeeper.dish.com:2181)] (ClientCnxn.java:839) - > > > >> Reading reply sessionid:0x1437b2879870005, packet:: clientPath:null > > > serverPath:null > > > >> finished:false header:: 627,8 replyHeader:: 627,51539619966,0 > > > request:: > > > >> '/brokers/ids,F response:: v{'1} > > > >> {2014-01-10 00:58:11,315} DEBUG [catalina-exec-12-SendThread( > > > >> tvip-m1-mw-zookeeper.dish.com:2181)] (ClientCnxn.java:839) - > > > >> Reading reply sessionid:0x1437b2879870005, packet:: clientPath:null > > > serverPath:null > > > >> finished:false header:: 628,4 replyHeader:: 628,51539619966,0 > > > request:: > > > >> '/brokers/ids/1,F response:: > > > >> > > > #7b2022686f7374223a22746d312d6b61666b6162726f6b6572313031222c20226a6d7 > > > 85f706f7274223a393939392c2022706f7274223a393039322c202276657273696f6e2 > > > 23a31207d,s{47244644685,47244644685,1388537628753,1388537628753,0,0,0, > > > 163056791896588316,74,0,47244644685} > > > >> {2014-01-10 00:58:11,316} DEBUG [catalina-exec-12-SendThread( > > > >> tvip-m1-mw-zookeeper.dish.com:2181)] (ClientCnxn.java:839) - > > > >> Reading reply sessionid:0x1437b2879870005, packet:: clientPath:null > > > serverPath:null > > > >> finished:false header:: 629,4 replyHeader:: 629,51539619966,-101 > > > >> request:: > > > >> > > > '/consumers/account-activated-hadoop-consumer/ids/account-activated-ha > > > doop-consumer_tm1mwdpl04-1389222557906-562b6738,F > > > >> response:: > > > >> {2014-01-10 00:58:11,316} INFO > > > >> > > > [account-activated-hadoop-consumer_tm1mwdpl04-1389222557906-562b6738_w > > > atcher_executor] > > > >> (?:?) 
- > > > >> [account-activated-hadoop-consumer_tm1mwdpl04-1389222557906-562b673 > > > >> 8], > > > >> exception during rebalance > > > >> org.I0Itec.zkclient.exception.ZkNoNodeException: > > > >> org.apache.zookeeper.KeeperException$NoNodeException: > > > >> KeeperErrorCode = NoNode for > > > >> > > > /consumers/account-activated-hadoop-consumer/ids/account-activated-had > > > oop-consumer_tm1mwdpl04-1389222557906-562b6738 > > > >> at > > > >> > org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) > > > >> at > > > >> org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) > > > >> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) > > > >> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) > > > >> at kafka.utils.ZkUtils$.readData(Unknown Source) > > > >> at kafka.consumer.TopicCount$.constructTopicCount(Unknown > > Source) > > > >> at > > > >> > > > kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$c > > > onsumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(Unk > > > nown > > > >> Source) > > > >> at > > > >> > > > kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfu > > > n$syncedRebalance$1.apply$mcVI$sp(Unknown > > > >> Source) > > > >> at > > > scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:142) > > > >> at > > > >> > > > kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedR > > > ebalance(Unknown > > > >> Source) > > > >> at > > > >> > > > kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1 > > > .run(Unknown > > > >> Source) Caused by: > > org.apache.zookeeper.KeeperException$NoNodeException: > > > >> KeeperErrorCode = NoNode for > > > >> > > > /consumers/account-activated-hadoop-consumer/ids/account-activated-had > > > oop-consumer_tm1mwdpl04-1389222557906-562b6738 > > > >> at > > > >> > org.apache.zookeeper.KeeperException.create(KeeperException.java:102) > > > >> at > > > >> 
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> > > >>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
> > > >>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
> > > >>         at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
> > > >> :
> > > >>
> > > >> -----Original Message-----
> > > >> From: Guozhang Wang [mailto:wangg...@gmail.com]
> > > >> Sent: Friday, January 10, 2014 12:30 PM
> > > >> To: users@kafka.apache.org
> > > >> Subject: Re: Looks like consumer fetchers get stopped we are not getting any data
> > > >>
> > > >> Actually, the broken channel is closed by the shutdown of the leader-finder-thread, which is shut down either by a rebalance retry or by shutting down the consumer.
> > > >>
> > > >> Do you see "begin rebalance ..." before this log entry? And if yes, search to see if the rebalance keeps failing.
> > > >>
> > > >> Guozhang
> > > >>
> > > >> On Fri, Jan 10, 2014 at 11:23 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> > > >>
> > > >>> From your logs the channel with the brokers is broken; are the brokers alive at that time?
> > > >>>
> > > >>> Guozhang
> > > >>>
> > > >>> On Fri, Jan 10, 2014 at 10:52 AM, Withers, Robert <robert.with...@dish.com> wrote:
> > > >>>
> > > >>>> The core problem is that our consumers stop consuming and lag increases. We found this FAQ entry: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped,why? This lists 3 possibilities.
> > > >>>>
> > > >>>> The blog also talks earlier about spurious rebalances, due to improper GC settings, but we couldn't find what GC settings to use. We are considering changing the zookeeper timeouts.
> > > >>>> We are a little confused about the various issues, the sequence of issues, and what could cause the consumers to stop reading. If the fetchers get shut down due to a ClosedByInterruptException in the "leader_finder" thread, which tells the "executor_watcher" thread to shut down the fetchers, that would be another reason the consumers stop processing data. Is this possible?
> > > >>>>
> > > >>>> Thank you,
> > > >>>> rob
> > > >>>>
> > > >>>> -----Original Message-----
> > > >>>> From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com]
> > > >>>> Sent: Friday, January 10, 2014 11:40 AM
> > > >>>> To: users@kafka.apache.org
> > > >>>> Subject: RE: Looks like consumer fetchers get stopped we are not getting any data
> > > >>>>
> > > >>>> It would be helpful if you guys could shed some light on why all the fetchers are getting stopped.
> > > >>>>
> > > >>>> -----Original Message-----
> > > >>>> From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com]
> > > >>>> Sent: Friday, January 10, 2014 11:28 AM
> > > >>>> To: users@kafka.apache.org
> > > >>>> Subject: RE: Looks like consumer fetchers get stopped we are not getting any data
> > > >>>>
> > > >>>> We also got the below error when this happens.
> > > >>>>
> > > >>>> {2014-01-10 00:58:11,292} INFO [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660b_watcher_executor] (?:?)
- > > > >>>> [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad > > > >>>> 5966 > > > >>>> 0b], > > > >>>> exception during rebalance > > > >>>> org.I0Itec.zkclient.exception.ZkNoNodeException: > > > >>>> org.apache.zookeeper.KeeperException$NoNodeException: > > > >>>> KeeperErrorCode = NoNode for > > > >> > > > /consumers/account-info-updated-hadoop-consumer/ids/account-info-updat > > > ed-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660b > > > >>>> at > > > >>>> > > org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) > > > >>>> at > > > >>>> > org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) > > > >>>> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) > > > >>>> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) > > > >>>> at kafka.utils.ZkUtils$.readData(Unknown Source) > > > >>>> at kafka.consumer.TopicCount$.constructTopicCount(Unknown > > > >> Source) > > > >>>> at > > > >>>> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.ka > > > >>>> fka$ > > > >>>> consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalan > > > >>>> ce(U > > > >>>> nknown > > > >>>> Source) > > > >>>> at > > > >>>> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$a > > > >>>> nonf un$syncedRebalance$1.apply$mcVI$sp(Unknown > > > >>>> Source) > > > >>>> at > > > >>>> scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:142) > > > >>>> at > > > >>>> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.sy > > > >>>> nced > > > >>>> Rebalance(Unknown > > > >>>> Source) > > > >>>> at > > > >>>> kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$a > > > >>>> non$ > > > >>>> 1.run(Unknown > > > >>>> Source) Caused by: > > > org.apache.zookeeper.KeeperException$NoNodeException: > > > >>>> KeeperErrorCode = NoNode for > > > >> > > > /consumers/account-info-updated-hadoop-consumer/ids/account-info-updat > > > ed-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660b 
> > > >>>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > > >>>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> > > >>>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
> > > >>>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
> > > >>>>         at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
> > > >>>>         at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
> > > >>>>         at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
> > > >>>>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
> > > >>>>         ... 9 more
> > > >>>>
> > > >>>> -----Original Message-----
> > > >>>> From: Seshadri, Balaji [mailto:balaji.sesha...@dish.com]
> > > >>>> Sent: Friday, January 10, 2014 10:52 AM
> > > >>>> To: users@kafka.apache.org
> > > >>>> Subject: Looks like consumer fetchers get stopped we are not getting any data
> > > >>>>
> > > >>>> Please let us know why we are not getting any data from Kafka after this log; can you guys let us know?
> > > >>>>
> > > >>>> What could be causing all the associated fetchers to be stopped, and why is it not retrying?
> > > >>>>
> > > >>>> {2014-01-10 00:58:09,284} WARN [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad59660b-leader-finder-thread] (?:?)
- Fetching topic metadata with correlation id 3 for topics > > > >>>> [Set(account-info-updated)] from broker > > > >>>> [id:1,host:tm1-kafkabroker101,port:9092] failed > > > >>>> java.nio.channels.ClosedByInterruptException > > > >>>> at > > > >> > > > java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterru > > > ptibleChannel.java:202) > > > >>>> at > > > >> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:506) > > > >>>> at > > > java.nio.channels.SocketChannel.write(SocketChannel.java:493) > > > >>>> at kafka.network.BoundedByteBufferSend.writeTo(Unknown > > Source) > > > >>>> at kafka.network.Send$class.writeCompletely(Unknown Source) > > > >>>> at > > > >>>> kafka.network.BoundedByteBufferSend.writeCompletely(Unknown > > > >>>> Source) > > > >>>> at kafka.network.BlockingChannel.send(Unknown Source) > > > >>>> at kafka.producer.SyncProducer.liftedTree1$1(Unknown > Source) > > > >>>> at > > > >>>> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(U > > > >>>> nkno > > > >>>> wn > > > >>>> Source) > > > >>>> at kafka.producer.SyncProducer.send(Unknown Source) > > > >>>> at kafka.client.ClientUtils$.fetchTopicMetadata(Unknown > > Source) > > > >>>> at kafka.client.ClientUtils$.fetchTopicMetadata(Unknown > > Source) > > > >>>> at > > > >>>> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(U > > > >>>> nkno > > > >>>> wn > > > >>>> Source) > > > >>>> at kafka.utils.ShutdownableThread.run(Unknown Source) > > > >>>> {2014-01-10 00:58:09,284} DEBUG > > > >>>> [account-access-hadoop-consumer_tm1mwdpl04-1389222551916-a0c87abc > > > >>>> _wat > > > >>>> cher_executor] > > > >>>> (?:?) - initial fetch offset of account-access:27: fetched offset > > > >>>> = > > > >> 9655: > > > >>>> consumed offset = 9655 is 9655 > > > >>>> {2014-01-10 00:58:09,284} DEBUG > > > >>>> [bill-generated-hadoop-consumer_tm1mwdpl04-1389222547995-29a6dce9 > > > >>>> _wat > > > >>>> cher_executor] > > > >>>> (?:?) 
- initial consumer offset of bill-generated:11: fetched > > > >>>> offset = > > > >> 152: > > > >>>> consumed offset = 152 is 152 > > > >>>> {2014-01-10 00:58:09,284} DEBUG > > > >>>> [outbound-communications-hadoop-consumer_tm1mwdpl04-1389222550693 > > > >>>> -8bc > > > >>>> 34b77_watcher_executor] > > > >>>> (?:?) - > > > >>>> [outbound-communications-hadoop-consumer_tm1mwdpl04-1389222550693 > > > >>>> -8bc > > > >>>> 34b77], > > > >>>> outbound-communications:108: fetched offset = 1689: consumed > > > >>>> offset = > > > >>>> 1689 selected new offset 1689 > > > >>>> {2014-01-10 00:58:09,284} DEBUG [catalina-exec-3-SendThread( > > > >>>> tvip-m1-mw-zookeeper.dish.com:2181)] (ClientCnxn.java:839) - > > > >>>> Reading reply sessionid:0x1434b49cf56383b, packet:: > > > >>>> clientPath:null serverPath:null finished:false header:: 279,4 > > replyHeader:: > > > >> 279,51539617506,0 request:: > > > >>>> '/consumers/outbound-call-attempted-hadoop-consumer/offsets/outbo > > > >>>> und- > > > >>>> call-attempted/14,F > > > >>>> response:: > > > >>>> #30,s{39860186414,39860186414,1387517714994,1387517714994,0,0,0,0 > > > >>>> ,1,0 > > > >>>> ,39860186414} > > > >>>> {2014-01-10 00:58:09,285} INFO > > > >>>> > > > >>>> [outbound-communications-hadoop-consumer_tm1mwdpl04-1389222550693 > > > >>>> -8bc > > > >>>> 34b77_watcher_executor] > > > >>>> (?:?) - > > > >>>> [outbound-communications-hadoop-consumer_tm1mwdpl04-1389222550693 > > > >>>> -8bc > > > >>>> 34b77], > > > >>>> outbound-communications-hadoop-consumer_tm1mwdpl04-1389222550693- > > > >>>> 8bc3 > > > >>>> 4b77-3 > > > >>>> attempting to claim partition 109 > > > >>>> {2014-01-10 00:58:09,284} DEBUG > > > >>>> [bill-generated-hadoop-consumer_tm1mwdpl04-1389222547995-29a6dce9 > > > >>>> _wat > > > >>>> cher_executor] > > > >>>> (?:?) 
- initial fetch offset of bill-generated:11: fetched offset > > > >>>> = > > > 152: > > > >>>> consumed offset = 152 is 152 > > > >>>> {2014-01-10 00:58:09,284} DEBUG [catalina-exec-12-SendThread( > > > >>>> tvip-m1-mw-zookeeper.dish.com:2181)] (ClientCnxn.java:839) - > > > >>>> Reading reply sessionid:0x1437b2879870005, packet:: > > > >>>> clientPath:null serverPath:null finished:false header:: 619,1 > > replyHeader:: > > > >> 619,51539617508,0 request:: > > > >>>> '/consumers/account-activated-hadoop-consumer/owners/account-acti > > > >>>> vate > > > >>>> d/68,#6163636f756e742d6163746976617465642d6861646f6f702d636f6e737 > > > >>>> 56d6 > > > >>>> 5725f746d316d7764706c30342d313338393232323535373930362d3536326236 > > > >>>> 3733 > > > >>>> 382d30,v{s{31,s{'world,'anyone}}},1 > > > >>>> response:: > > > >>>> '/consumers/account-activated-hadoop-consumer/owners/account-acti > > > >>>> vate > > > >>>> d/68 > > > >>>> {2014-01-10 00:58:09,284} DEBUG > > > >>>> [account-access-hadoop-consumer_tm1mwdpl04-1389222551916-a0c87abc > > > >>>> _wat > > > >>>> cher_executor] > > > >>>> (?:?) - > > > >>>> [account-access-hadoop-consumer_tm1mwdpl04-1389222551916-a0c87abc > > > >>>> ], > > > >>>> account-access:27: fetched offset = 9655: consumed offset = 9655 > > > >>>> selected new offset 9655 > > > >>>> {2014-01-10 00:58:09,284} INFO > > > >>>> > > > >>>> [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad > > > >>>> 5966 > > > >>>> 0b_watcher_executor] > > > >>>> (?:?) - > > > >>>> [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad > > > >>>> 5966 > > > >>>> 0b-leader-finder-thread], > > > >>>> Shutdown completed > > > >>>> {2014-01-10 00:58:09,284} INFO > > > >>>> > > > >>>> [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad > > > >>>> 5966 > > > >>>> 0b-leader-finder-thread] > > > >>>> (?:?) 
- > > > >>>> [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad > > > >>>> 5966 > > > >>>> 0b-leader-finder-thread], > > > >>>> Stopped > > > >>>> {2014-01-10 00:58:09,285} INFO > > > >>>> > > > >>>> [account-info-updated-hadoop-consumer_tm1mwdpl04-1389222553159-ad > > > >>>> 5966 > > > >>>> 0b_watcher_executor] > > > >>>> (?:?) - [ConsumerFetcherManager-1389222553163] Stopping all > > > >>>> fetchers > > > >>> > > > >>> > > > >>> -- > > > >>> -- Guozhang > > > >> > > > >> > > > >> > > > >> -- > > > >> -- Guozhang > > > >> > > > >> > > > > > > > > > > > -- > > -- Guozhang > > > > > > -- > -- Guozhang >