[
https://issues.apache.org/jira/browse/KAFKA-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092050#comment-14092050
]
Joe Stein commented on KAFKA-1585:
----------------------------------
FWIW there were a lot of bug fixes released in Zookeeper 3.4.6
(http://zookeeper.apache.org/doc/r3.4.6/releasenotes.html) since version 3.4.5.
You could be hitting ZOOKEEPER-1382, which was fixed in the 3.4.6 release.
See https://kafka.apache.org/documentation.html#zk for the Zookeeper version
currently recommended for Kafka 0.8.1.1, though folks are using 3.4.6 in
production and that should be the Zookeeper version for 0.8.2.
Regarding your logs, it looks like before this happened you had errors, then a
reconnect and a consumer shutdown:
Line 132356: 18:31:38,948 [7-cloudera:2181] INFO kafka.utils.Logging$class -
[Q_dev-1407608193903-1cb30b18], Q_dev-1407608193903-1cb30b18-0 attempting to
claim partition 0
Line 132357: 18:31:38,975 [26-d7f0e66a-0-0] ERROR kafka.utils.Logging$class -
[ConsumerFetcherThread-Q_dev-1407608195226-d7f0e66a-0-0], Current offset 15 for
partition [gk.q.event,0] out of range; reset offset to 0
Line 132358: 18:31:38,980 [62-1d81f64b-0-0] ERROR kafka.utils.Logging$class -
[ConsumerFetcherThread-Q_dev-1407608193962-1d81f64b-0-0], Current offset 4 for
partition [gk.q.mail.api,0] out of range; reset offset to 0
Line 132359: 18:31:38,994 [84-ceea5788-0-0] WARN kafka.utils.Logging$class -
Reconnect due to socket error: null
Line 132360: 18:31:38,995 [84-ceea5788-0-0] INFO kafka.utils.Logging$class -
[ConsumerFetcherThread-dev_dev-1407608194884-ceea5788-0-0], Stopped
Line 132361: 18:31:38,995 [atcher_executor] INFO kafka.utils.Logging$class -
[ConsumerFetcherThread-dev_dev-1407608194884-ceea5788-0-0], Shutdown completed
Line 132362: 18:31:38,995 [atcher_executor] INFO kafka.utils.Logging$class -
[ConsumerFetcherManager-1407608194890] All connections stopped
Line 132363: 18:31:38,996 [atcher_executor] INFO kafka.utils.Logging$class -
[dev_dev-1407608194884-ceea5788], Cleared all relevant queues for this fetcher
Line 132364: 18:31:38,996 [atcher_executor] INFO kafka.utils.Logging$class -
[dev_dev-1407608194884-ceea5788], Cleared the data chunks in all the consumer
message iterators
Line 132365: 18:31:38,996 [atcher_executor] INFO kafka.utils.Logging$class -
[dev_dev-1407608194884-ceea5788], Committing all offsets after clearing the
fetcher queues
Line 132366: 18:31:38,996 [atcher_executor] INFO kafka.utils.Logging$class -
[dev_dev-1407608194884-ceea5788], Releasing partition ownership
Line 132367: 18:31:39,005 [7-cloudera:2181] INFO kafka.utils.Logging$class -
conflict in /consumers/Q/owners/gk.q.log/0 data: Q_dev-1407608193903-1cb30b18-0
stored data: Q_dev-1407608205503-9cfb99aa-0
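Entries like the last one above can be pulled out of a full consumer log to see which owner paths keep conflicting and which consumer ids are involved. What follows is only a minimal sketch, not part of Kafka: the function name `find_conflicts` is hypothetical, and the regular expression assumes the message always has the "conflict in <path> data: <id> stored data: <id>" shape shown in this extract.

```python
import re

# Assumed message shape, taken from the log extract in this comment:
#   conflict in <zk path> data: <our consumer id> stored data: <owner id in zk>
CONFLICT_RE = re.compile(r"conflict in (\S+) data: (\S+) stored data: (\S+)")


def find_conflicts(log_text):
    """Return (path, data, stored_data) tuples for every conflict entry.

    Whitespace is collapsed first so that entries wrapped across
    several lines (as in a mail extract) still match.
    """
    flat = " ".join(log_text.split())
    return [m.groups() for m in CONFLICT_RE.finditer(flat)]
```

If the same path shows up repeatedly with the same stored id, the old ephemeral node is probably still held by a session that has not expired.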
Likely what happened is that when it reconnected, the timeout with zk never
occurred and it got stuck there. It could be the Zk bug; it could also be
somewhat related to KAFKA-1387 or KAFKA-1451. I will link the JIRAs so that when
we test 0.8.2 we can see about reproducing this on a good zk version.
To resolve that, you can stop the consumers, wait for the zk nodes to expire,
and start the consumers up again.
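The "wait for the zk nodes to expire" step can be scripted rather than guessed at. This is a rough sketch under stated assumptions, not a Kafka tool: `wait_for_node_expiry` and its parameters are hypothetical names, and `node_exists` is any callable you supply that reports whether the owner node (for example /consumers/Q/owners/gk.q.log/0) is still present in Zookeeper.

```python
import time


def wait_for_node_expiry(node_exists, timeout_s=60.0, poll_interval_s=1.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until node_exists() returns False.

    Returns True once the node is gone, or False if timeout_s elapses
    first. node_exists is a caller-supplied callable (an assumption of
    this sketch); it could wrap a Zookeeper client's "exists" check.
    sleep and clock are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if not node_exists():
            return True   # ephemeral node expired, safe to restart
        if clock() >= deadline:
            return False  # still present, do not restart yet
        sleep(poll_interval_s)
```

Run it after stopping the consumers and only start them again once it returns True. If it times out, the old session is probably still alive on the Zookeeper side.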
> Client: Infinite "conflict in /consumers/"
> ------------------------------------------
>
> Key: KAFKA-1585
> URL: https://issues.apache.org/jira/browse/KAFKA-1585
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 0.8.1.1
> Reporter: Artur Denysenko
> Priority: Critical
> Fix For: 0.8.2
>
> Attachments: kafka_consumer_ephemeral_node_extract.zip
>
>
> Periodically we have kafka consumers cycling in "conflict in /consumers/" and
> "I wrote this conflicted ephemeral node".
> Please see attached log extract.
> After restarting the process kafka consumers are working perfectly.
> We are using Zookeeper 3.4.5
--
This message was sent by Atlassian JIRA
(v6.2#6252)