[jira] [Commented] (KAFKA-6848) Kafka consumer failed to get correct offset after commit

zhenyu jiang (JIRA) Fri, 18 Jan 2019 01:44:07 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746084#comment-16746084
 ]


zhenyu jiang commented on KAFKA-6848:
-------------------------------------

This description is very similar to the problem I encountered. My situation is 
as follows:

*The application log looks like this:*
{code:java}
2019-01-18 03:30:29.873 INFO [dmp-notifier,,]
 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] 
o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-1, 
groupId=notifier] Discovered group coordinator 10.211.6.56:9092 (id: 2147483645 
rack: null)
2019-01-18 03:30:32.699 ERROR [dmp-notifier,,]
 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] 
o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-1, 
groupId=notifier] Offset commit failed on partition dmp.notifier.notice-0 at 
offset 294: This is not the correct coordinator.
2019-01-18 03:30:32.699 INFO [dmp-notifier,,]
 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] 
o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-1, 
groupId=notifier] Marking the coordinator 10.211.6.56:9092 (id: 2147483645 
rack: null) dead
2019-01-18 03:30:32.699 WARN [dmp-notifier,,]
 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] 
o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-1, 
groupId=notifier] Asynchronous auto-commit of offsets 
{dmp.notifier.notice-0=OffsetAndMetadata{offset=294, metadata=''}, 
dmp.notifier.notice-2=OffsetAndMetadata{offset=438, metadata=''}, 
dmp.notifier.notice-1=OffsetAndMetadata{offset=45, metadata=''}, 
dmp.notifier.notice-4=OffsetAndMetadata{offset=35, metadata=''}, 
dmp.notifier.notice-3=OffsetAndMetadata{offset=1242, metadata=''}} failed: 
Offset commit failed with a retriable exception. You should retry committing 
the latest consumed offsets.{code}

*The state-change.log.2019-01-18-03 on 10.211.6.56 looks like this:*
{code:java}
[2019-01-18 03:30:32,697] TRACE Controller 2 epoch 17 started leader election 
for partition [dmp.notifier.notice,2] (state.change.logger)
[2019-01-18 03:30:32,710] TRACE Controller 2 epoch 17 elected leader 3 for 
Offline partition [dmp.notifier.notice,2] (state.change.logger)
[2019-01-18 03:30:32,748] TRACE Controller 2 epoch 17 changed partition 
[dmp.notifier.notice,2] from OfflinePartition to OnlinePartition with leader 3 
(state.change.logger){code}
*The server.log.2019-01-18-03 on another Kafka node in the same cluster looks like this:*
 
{code:java}
[2019-01-18 03:30:26,609] INFO Updated PartitionLeaderEpoch. New: {epoch:32, 
offset:24117087}, Current: {epoch:31, offset24116582} for Partition: 
__consumer_offsets-16. Cache now contains 29 entries. 
(kafka.server.epoch.LeaderEpochFileCache)
[2019-01-18 03:30:42,140] WARN Client session timed out, have not heard from 
server in 4090ms for sessionid 0x2684b1cd93e0003 
(org.apache.zookeeper.ClientCnxn)
[2019-01-18 03:30:42,140] INFO Client session timed out, have not heard from 
server in 4090ms for sessionid 0x2684b1cd93e0003, closing socket connection and 
attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2019-01-18 03:30:42,240] INFO zookeeper state changed (Disconnected) 
(org.I0Itec.zkclient.ZkClient)
[2019-01-18 03:30:42,450] INFO Opening socket connection to server 
prod-dmp3.fengdai.org/10.211.6.57:2181. Will not attempt to authenticate using 
SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-01-18 03:30:42,451] INFO Socket connection established to 
prod-dmp3.fengdai.org/10.211.6.57:2181, initiating session 
(org.apache.zookeeper.ClientCnxn)
[2019-01-18 03:30:42,452] INFO Session establishment complete on server 
prod-dmp3.fengdai.org/10.211.6.57:2181, sessionid = 0x2684b1cd93e0003, 
negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2019-01-18 03:30:42,452] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient){code}

The service logs on 10.211.6.56 show no exceptions at this time, but there are 
many exceptions before it (for other topics); please see the attachment. 
[^kafka_service-log.log]
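
As a rough cross-check of the logs above, the sketch below estimates which 
__consumer_offsets partition holds the offsets for groupId=notifier. It assumes 
the broker maps a group to a partition as the non-negative hash of the group id 
modulo the offsets topic partition count, and that the cluster uses the default 
of 50 partitions (offsets.topic.num.partitions); neither is confirmed by the 
logs. The class and method names are made up for illustration.

```java
// Sketch only: mirrors the assumed group -> __consumer_offsets partition mapping.
final class OffsetsPartition {

    /** Non-negative form of a hash code; Integer.MIN_VALUE is mapped to 0. */
    static int nonNegative(int hash) {
        return hash == Integer.MIN_VALUE ? 0 : Math.abs(hash);
    }

    /** Partition of __consumer_offsets expected to store this group's offsets. */
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        return nonNegative(groupId.hashCode()) % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        int partitions = 50; // assumption: default offsets.topic.num.partitions
        System.out.println("notifier -> __consumer_offsets-"
                + partitionFor("notifier", partitions));
    }
}
```

Under these assumptions the group maps to __consumer_offsets-16, the same 
partition whose leader epoch changed at 03:30:26 in the server log. That would 
be consistent with the group coordinator moving shortly before the "This is not 
the correct coordinator" commit failure at 03:30:32, although the logs alone do 
not prove it.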

 

> Kafka consumer failed to get correct offset after commit
> --------------------------------------------------------
>
>                 Key: KAFKA-6848
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6848
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.11.0.0
>            Reporter: YY.Roy
>            Priority: Major
>         Attachments: kafka_service-log.log
>
>
> I use kafka consumer java api to poll messages from broker, and here is the 
> code:
> Consumer consumer = new Consumer(props);
> consumer.assign(topicPartitions);
> long nextOffset = consumer.position(topicPartition);
> consumer.poll();
> consumer.commitSync();
>  
> The above code is called by a Quartz scheduler every minute, and the group.id 
> is always the same. It ran properly for the past several days until today 
> around 8:20:35 am. Since then, the position API has always returned the older 
> offset committed two days ago, not the latest one, which was committed around 
> 8:20:33 am. It seems the Kafka offset of this group.id just went backward.
>  
> I polled the offset messages from the Kafka internal topic __consumer_offsets 
> and saw that the latest message was correct. It looks like this:
> [eb89887c591b4d2a98c7,my-topic-eb89887c591b4d2a98c7,0]::[OffsetMetadata[447648316,NO_METADATA],CommitTime
>  1525220421173,ExpirationTime 1526430021173]
> The CommitTime shows it was indeed the last successful commit.
> But the position API then returned a wrong offset, namely the one in the first 
> message of the corresponding partition of __consumer_offsets. It is as if the 
> Kafka broker regards this older committed offset as the correct offset for this 
> group.id, whereas the correct one should have been the last message in 
> __consumer_offsets.
> Then I checked the broker server log and found some connection errors at 
> exactly the time position was called:
> 08:20:33,261 WARN Attempting to send response via channel for which there is 
> no open connection, connection id 2 (kafka.network.Processor)
> Some other consumers were trying to call position at this time, and the 
> leader of those topics is this broker too. After that, they all got wrong 
> offsets, which were older commits in __consumer_offsets. 
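
The timestamps in the __consumer_offsets record quoted above can be decoded to 
check them against the reported timeline. The sketch below is illustrative 
only: the class name is hypothetical, and rendering the time in UTC+8 is an 
assumption about the reporter's time zone that the issue does not state.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

// Sketch only: decodes the CommitTime and ExpirationTime quoted in the issue.
final class CommitTimestamps {
    static final long COMMIT_TIME_MS = 1525220421173L;     // CommitTime from the record
    static final long EXPIRATION_TIME_MS = 1526430021173L; // ExpirationTime from the record

    /** Commit time as an absolute instant. */
    static Instant commitInstant() {
        return Instant.ofEpochMilli(COMMIT_TIME_MS);
    }

    /** Span between the commit and its expiration. */
    static Duration retention() {
        return Duration.ofMillis(EXPIRATION_TIME_MS - COMMIT_TIME_MS);
    }

    public static void main(String[] args) {
        System.out.println("commit (UTC):   " + commitInstant());
        // Assumption: the reporter's clock is UTC+8.
        System.out.println("commit (UTC+8): "
                + commitInstant().atOffset(ZoneOffset.ofHours(8)));
        System.out.println("retention days: " + retention().toDays());
    }
}
```

The commit time decodes to 2018-05-02T00:20:21.173Z, which is 08:20:21 in 
UTC+8 and therefore close to the times mentioned in the description, and the 
record expires 14 days after the commit.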



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
