[ 
https://issues.apache.org/jira/browse/KAFKA-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026907#comment-15026907
 ] 

Ben Stopford edited comment on KAFKA-2891 at 11/25/15 4:01 PM:
---------------------------------------------------------------

Yes, I get something similar. It worked fine for about six runs, then I got a 
run with:

At least one acked message did not appear in the consumed messages. 
acked_minus_consumed: set([29073, 29067, 29076, 29070, 29079])

Which I have not seen before (i.e. just a few messages missing). 
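
For reference, the check behind that message amounts to a set difference. Here is a minimal sketch, assuming the test holds the acked and consumed messages as collections of integers. The function name and the surrounding id range are made up for illustration; only the five missing ids come from the failure above.

```python
def acked_minus_consumed(acked, consumed):
    """Return the acked messages that never reached the consumer, sorted."""
    return sorted(set(acked) - set(consumed))


# Hypothetical data: a contiguous run of acked ids, with the five ids from
# the failure report absent on the consumer side.
missing_in_report = {29067, 29070, 29073, 29076, 29079}
acked = range(29060, 29085)
consumed = [m for m in acked if m not in missing_in_report]

missing = acked_minus_consumed(acked, consumed)
# Distance between neighbouring missing ids.
gaps = [b - a for a, b in zip(missing, missing[1:])]

print(missing)
print(gaps)
```

Sorting the result makes the spacing between the missing ids easier to read than the unordered set repr. What that spacing means is not established here.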

The five messages were produced at 11:29:19. Interestingly, this time 
corresponds to the consumer's first notifications that the coordinator is dead 
(below). These come a few seconds after the second node (of 3) is shut down.

{quote}
[2015-11-25 11:28:55,958] INFO Kafka version : 0.9.1.0-SNAPSHOT 
(org.apache.kafka.common.utils.AppInfoParser)
[2015-11-25 11:28:55,958] INFO Kafka commitId : 6f3c8e2c5079f00e 
(org.apache.kafka.common.utils.AppInfoParser)
[2015-11-25 11:29:02,628] INFO Attempt to heart beat failed since member id is 
not valid, reset it and try to re-join group. 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2015-11-25 11:29:02,649] ERROR Error ILLEGAL_GENERATION occurred while 
committing offsets for group unique-test-group-0.206159604113 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2015-11-25 11:29:02,649] WARN Auto offset commit failed:  
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2015-11-25 11:29:19,376] INFO Marking the coordinator 2147483644 dead. 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2015-11-25 11:29:19,383] INFO Marking the coordinator 2147483644 dead. 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2015-11-25 11:29:19,386] INFO Marking the coordinator 2147483644 dead. 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
{quote}
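
The correlation between the produce time and the log can be checked mechanically. A rough sketch, assuming the wrapped log entries have been joined back onto single lines and that the produce time is only known to the second. The helper names `parse` and `events_near` are hypothetical and not part of any Kafka tooling.

```python
import re
from datetime import datetime, timedelta

# Matches the "[yyyy-mm-dd hh:mm:ss,SSS] LEVEL text" layout seen in the log.
LOG_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\] (\w+) (.*)$"
)


def parse(line):
    """Return (timestamp, level, text) for a log line, or None if it does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    stamp, millis, level, text = m.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return when + timedelta(milliseconds=int(millis)), level, text


def events_near(lines, moment, window):
    """Return parsed entries whose timestamp lies within `window` of `moment`."""
    parsed = (parse(line) for line in lines)
    return [p for p in parsed if p is not None and abs(p[0] - moment) <= window]


# Entries copied from the log above, unwrapped.
log = [
    "[2015-11-25 11:29:02,649] WARN Auto offset commit failed:  (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)",
    "[2015-11-25 11:29:19,376] INFO Marking the coordinator 2147483644 dead. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)",
    "[2015-11-25 11:29:19,383] INFO Marking the coordinator 2147483644 dead. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)",
    "[2015-11-25 11:29:19,386] INFO Marking the coordinator 2147483644 dead. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)",
]

produced_at = datetime(2015, 11, 25, 11, 29, 19)
hits = events_near(log, produced_at, timedelta(seconds=1))
for when, level, text in hits:
    print(when, level, text)
```

With a one-second window only the three "Marking the coordinator ... dead" entries fall next to the produce time. The failed offset commit at 11:29:02 is well outside it.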



> Gaps in messages delivered by new consumer after Kafka restart
> --------------------------------------------------------------
>
>                 Key: KAFKA-2891
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2891
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.0
>            Reporter: Rajini Sivaram
>            Priority: Critical
>
> Replication tests when run with the new consumer with SSL/SASL were failing 
> very often because messages were not being consumed from some topics after a 
> Kafka restart. The fix in KAFKA-2877 has made this a lot better. But I am 
> still seeing some failures (less often now) because a small set of messages 
> are not received after Kafka restart. This failure looks slightly different 
> from the one before the fix for KAFKA-2877 was applied, hence the new defect. 
> The test fails because not all acked messages are received by the consumer, 
> and the number of missing messages is quite small.
> [~benstopford] Are the upgrade tests working reliably with KAFKA-2877 now?
> Not sure if any of these log entries are important:
> {quote}
> [2015-11-25 14:41:12,342] INFO SyncGroup for group test-consumer-group failed 
> due to NOT_COORDINATOR_FOR_GROUP, will find new coordinator and rejoin 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2015-11-25 14:41:12,342] INFO Marking the coordinator 2147483644 dead. 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2015-11-25 14:41:12,958] INFO Attempt to join group test-consumer-group 
> failed due to unknown member id, resetting and retrying. 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2015-11-25 14:41:42,437] INFO Fetch offset null is out of range, resetting 
> offset (org.apache.kafka.clients.consumer.internals.Fetcher)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
