Thanks for the input. Yes, that directory is open for all users (rwx).

I don't think the lack of logging is related to my consumer dying, but it
certainly doesn't help that I have no logs to debug with.
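
To figure out why nothing is being written, I may try turning on log4j's
own internal diagnostics when starting the broker, since log4j 1.x will
report to stderr why an appender (e.g. the one writing to
/kafka-log4j/server.log) failed to open. A rough sketch, assuming the
stock start script picks up KAFKA_OPTS (adjust for your setup):

```shell
# log4j 1.x prints its internal configuration steps (and appender
# failures) to stderr when the log4j.debug system property is set.
export KAFKA_OPTS="-Dlog4j.debug=true"
bin/kafka-server-start.sh config/server.properties
```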

I am struggling to find a reason for this. I deployed the same code and
the same version of Kafka/Zookeeper locally, and I am unable to reproduce
the problem. Granted, my local setup does have a few different components,
but it's a start.
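
While I keep digging, one thing I plan to experiment with is giving the
consumer more headroom during rebalances. Assuming clj-kafka passes
consumer properties straight through to the underlying high-level
consumer, the relevant Kafka 0.8 settings would look roughly like this
(the values are guesses to experiment with, not recommendations; 0.8's
defaults are 4 retries and a 2000 ms backoff):

```properties
# Sketch of consumer properties to loosen rebalance behavior.
rebalance.max.retries=10
rebalance.backoff.ms=4000
# A session timeout shorter than a long GC pause lets ZooKeeper expire
# the consumer's ephemeral node, which can trigger exactly this NoNode
# failure during the subsequent rebalance.
zookeeper.session.timeout.ms=10000
zookeeper.connection.timeout.ms=10000
```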

Any other ideas on what to look for?

Thanks again for your help
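
P.S. Per the FAQ item quoted below, I'm also adding a catch-all around our
consumer logic so that nothing can kill the consumer thread silently.
Roughly like this (a minimal Java sketch; `Handler`, the logger, and the
message iterator are stand-ins for our actual clj-kafka consumer code):

```java
import java.util.Iterator;
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedConsumerLoop {
    private static final Logger LOG = Logger.getLogger("consumer");

    // Stand-in for the real per-message handler; in our app this is the
    // Clojure code invoked for each message pulled off the Kafka stream.
    interface Handler { void handle(String message) throws Exception; }

    // Drains the iterator, logging every Throwable so a bad message
    // cannot silently kill the consumer thread (the FAQ recommendation).
    static int run(Iterator<String> messages, Handler handler) {
        int failures = 0;
        while (messages.hasNext()) {
            String msg = messages.next();
            try {
                handler.handle(msg);
            } catch (Throwable t) {
                failures++;
                LOG.log(Level.SEVERE, "consumer logic failed on: " + msg, t);
            }
        }
        return failures;
    }
}
```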


On Fri, Nov 8, 2013 at 4:00 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Do you have write permissions in /kafka-log4j? Your logs should be
> going there (at least per your log4j config) - and you may want to use
> a different log4j config for your consumer so it doesn't collide with
> the broker's.
>
> I doubt the consumer thread dying issue is related to yours - again,
> logs would help.
>
> Also, you may want to try with the latest HEAD as opposed to the beta.
>
> Thanks,
>
> Joel
>
> On Fri, Nov 08, 2013 at 01:18:07PM -0500, Ahmed H. wrote:
> > Hello,
> >
> > I am using the beta right now.
> >
> > I'm not sure if it's GC or something else at this point. To be honest,
> > I've never really fiddled with any GC settings before. The system can
> > run for as long as a day without failing, or as little as a few hours.
> > The lack of a pattern makes it a little harder to debug. As I mentioned
> > before, the activity on this system is fairly consistent throughout the
> > day.
> >
> > On the link that you sent, I see this, which could very well be the
> > reason:
> >
> >    - One of the typical causes is that the application code that
> >    consumes messages somehow died and therefore killed the consumer
> >    thread. We recommend using a try/catch clause to log all Throwable
> >    in the consumer logic.
> >
> > That is entirely possible. I wanted to check the kafka logs for any
> > clues, but for some reason kafka is not writing any logs :/. Here are
> > my log4j settings for kafka:
> >
> > > log4j.rootLogger=INFO, stdout
> > > log4j.appender.stdout=org.apache.log4j.ConsoleAppender
> > > log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
> > >
> > > log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.kafkaAppender.File=/kafka-log4j/server.log
> > > log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > >
> > > log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.stateChangeAppender.File=/kafka-log4j/state-change.log
> > > log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > >
> > > log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.requestAppender.File=/kafka-log4j/kafka-request.log
> > > log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > >
> > > log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
> > > log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
> > > log4j.appender.controllerAppender.File=/kafka-log4j/controller.log
> > > log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
> > > log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
> > >
> > > log4j.logger.kafka=INFO, kafkaAppender
> > > log4j.logger.kafka.network.RequestChannel$=TRACE, requestAppender
> > > log4j.additivity.kafka.network.RequestChannel$=false
> > > log4j.logger.kafka.request.logger=TRACE, requestAppender
> > > log4j.additivity.kafka.request.logger=false
> > > log4j.logger.kafka.controller=TRACE, controllerAppender
> > > log4j.additivity.kafka.controller=false
> > > log4j.logger.state.change.logger=TRACE, stateChangeAppender
> > > log4j.additivity.state.change.logger=false
> >
> >
> >
> > Thanks
> >
> >
> > On Thu, Nov 7, 2013 at 5:06 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > > Can you see if this applies in your case:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog%3F
> > >
> > > Also, what version of kafka 0.8 are you using? If not the beta, then
> > > what's the git hash?
> > >
> > > Joel
> > >
> > > On Thu, Nov 07, 2013 at 02:51:41PM -0500, Ahmed H. wrote:
> > > > Hello all,
> > > >
> > > > I am not sure if this is a Kafka issue, or an issue with the client
> > > > that I am using.
> > > >
> > > > We have a fairly small setup, where everything sits on one server
> > > > (Kafka 0.8 and Zookeeper). The message frequency is not too high
> > > > (1-2 per second).
> > > >
> > > > The setup works fine for a certain period of time but at some
> > > > point, it just dies, and exceptions are thrown. This is pretty much
> > > > a daily occurrence, but there is no pattern. Based on the logs, it
> > > > appears that the Kafka client tries to rebalance with Zookeeper and
> > > > fails; it retries several times, but after a few tries it gives up.
> > > > Here is the stack trace:
> > > >
> > > > > 04:56:07,234 INFO  [kafka.consumer.SimpleConsumer] (ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0) Reconnect due to socket error: : java.nio.channels.ClosedByInterruptException
> > > > >   at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) [rt.jar:1.7.0_25]
> > > > >   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:402) [rt.jar:1.7.0_25]
> > > > >   at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:220) [rt.jar:1.7.0_25]
> > > > >   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) [rt.jar:1.7.0_25]
> > > > >   at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) [rt.jar:1.7.0_25]
> > > > >   at kafka.utils.Utils$.read(Utils.scala:394) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.network.Receive$class.readCompletely(Transmission.scala:56) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:71) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:108) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:108) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:108) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:107) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:107) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:107) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:106) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > 04:56:07,238 WARN  [kafka.consumer.ConsumerFetcherThread] (ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0) [ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 0; ClientId: kafkaqueue.notifications-ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0; ReplicaId: -1; MaxWait: 100 ms; MinBytes: 1 bytes; RequestInfo: [kafkaqueue.notifications,0] -> PartitionFetchInfo(216003,1048576): java.nio.channels.ClosedByInterruptException
> > > > >   at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) [rt.jar:1.7.0_25]
> > > > >   at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:650) [rt.jar:1.7.0_25]
> > > > >   at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:43) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.reconnect(SimpleConsumer.scala:56) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:77) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:108) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:108) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:108) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:107) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:107) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:107) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:106) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > 04:56:07,240 INFO  [kafka.consumer.ConsumerFetcherThread] (ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0) [ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0], Stopped
> > > > > 04:56:07,240 INFO  [kafka.consumer.ConsumerFetcherThread] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0], Shutdown completed
> > > > > 04:56:07,241 INFO  [kafka.consumer.ConsumerFetcherManager] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [ConsumerFetcherManager-1383643783834] All connections stopped
> > > > > 04:56:07,241 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Cleared all relevant queues for this fetcher
> > > > > 04:56:07,242 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Cleared the data chunks in all the consumer message iterators
> > > > > 04:56:07,242 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Committing all offsets after clearing the fetcher queues
> > > > > 04:56:07,245 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Releasing partition ownership
> > > > > 04:56:07,248 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Consumer kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5 rebalancing the following partitions: ArrayBuffer(0) for topic kafkaqueue.notifications with consumers: List(kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0)
> > > > > 04:56:07,249 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0 attempting to claim partition 0
> > > > > 04:56:07,252 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0 successfully owned partition 0 for topic kafkaqueue.notifications
> > > > > 04:56:07,253 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Updating the cache
> > > > > 04:56:07,254 INFO  [proj.hd.core] (clojure-agent-send-off-pool-5) Invalid node name. Not performing walk. Node name:  POC6O003.2:BER:1/19/1
> > > > > 04:56:07,254 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], Consumer kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5 selected partitions : kafkaqueue.notifications:0: fetched offset = 216003: consumed offset = 216003
> > > > > 04:56:07,255 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5_watcher_executor) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5], end rebalancing consumer kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5 try #0
> > > > > 04:56:07,257 INFO  [kafka.consumer.ConsumerFetcherManager$LeaderFinderThread] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) [kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread], Starting
> > > > > 04:56:07,265 INFO  [kafka.utils.VerifiableProperties] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Verifying properties
> > > > > 04:56:07,265 INFO  [kafka.utils.VerifiableProperties] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Property metadata.broker.list is overridden to test-server.localnet:9092
> > > > > 04:56:07,266 INFO  [kafka.utils.VerifiableProperties] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Property request.timeout.ms is overridden to 30000
> > > > > 04:56:07,266 INFO  [kafka.utils.VerifiableProperties] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Property client.id is overridden to kafkaqueue.notifications
> > > > > 04:56:07,267 INFO  [kafka.client.ClientUtils$] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Fetching metadata from broker id:0,host:test-server.localnet,port:9092 with correlation id 15 for 1 topic(s) Set(kafkaqueue.notifications)
> > > > > 04:56:07,268 INFO  [kafka.producer.SyncProducer] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Connected to test-server.localnet:9092 for producing
> > > > > 04:56:07,272 INFO  [kafka.producer.SyncProducer] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) Disconnecting from test-server.localnet:9092
> > > > > 04:56:07,274 INFO  [kafka.consumer.ConsumerFetcherManager] (kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-leader-finder-thread) [ConsumerFetcherManager-1383643783834] Adding fetcher for partition [kafkaqueue.notifications,0], initOffset 216003 to broker 0 with fetcherId 0
> > > > > 04:56:07,275 INFO  [kafka.consumer.ConsumerFetcherThread] (ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0) [ConsumerFetcherThread-kafkaqueue.notifications_test-server.localnet-1383643783745-3757e7a5-0-0], Starting
> > > > > 04:56:07,281 INFO  [proj.hd.core] (clojure-agent-send-off-pool-5) Invalid node name. Not performing walk. Node name:  B2Z_0053.2:Rx Frequency:1/2/1
> > > > > 04:56:10,010 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701_watcher_executor) [kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701], begin rebalancing consumer kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701 try #0
> > > > > 04:56:10,020 INFO  [kafka.consumer.ZookeeperConsumerConnector] (kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701_watcher_executor) [kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701], exception during rebalance : org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/kafkaqueue.topology.updates/ids/kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701
> > > > >   at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) [zkclient-0.3.jar:0.3]
> > > > >   at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) [zkclient-0.3.jar:0.3]
> > > > >   at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) [zkclient-0.3.jar:0.3]
> > > > >   at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) [zkclient-0.3.jar:0.3]
> > > > >   at kafka.utils.ZkUtils$.readData(ZkUtils.scala:407) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:52) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:401) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78) [scala-library-2.9.2.jar:]
> > > > >   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:369) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > >   at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > > > > Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/kafkaqueue.topology.updates/ids/kafkaqueue.topology.updates_test-server.localnet-1383643783747-c7775701
> > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > >   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > >   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160) [zookeeper-3.4.3.jar:3.4.3-1240972]
> > > > >   at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103) [zkclient-0.3.jar:0.3]
> > > > >   at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770) [zkclient-0.3.jar:0.3]
> > > > >   at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766) [zkclient-0.3.jar:0.3]
> > > > >   at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) [zkclient-0.3.jar:0.3]
> > > > >   ... 9 more
> > > >
> > > >
> > > > The attempts to rebalance occur a few times, but eventually this
> > > > message shows up: "can't rebalance after 4 retries".
> > > >
> > > > Our app is deployed in JBoss, and the only way to recover from this
> > > > is to restart JBoss.
> > > >
> > > > This started happening after we went from Kafka 0.7 to Kafka 0.8.
> > > > Nothing else on our system changed except for that. We are
> > > > connecting to Kafka using a Clojure library called clj-kafka (
> > > > https://github.com/pingles/clj-kafka). clj-kafka was updated to
> > > > work with Kafka 0.8...
> > > >
> > > > My apologies if this post doesn't belong here. I'm hoping that this
> > > > may be a generic issue rather than an issue specific to how we're
> > > > connecting to Kafka. Any ideas are appreciated.
> > > >
> > > > Thanks!
> > >
> > >
>
>
