[
https://issues.apache.org/jira/browse/KAFKA-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Gustafson resolved KAFKA-5430.
------------------------------------
Resolution: Fixed
I think this issue was fixed in KAFKA-5154. Please reopen if you think it is
still a problem.
> new consumers getting data for revoked partitions
> -------------------------------------------------
>
> Key: KAFKA-5430
> URL: https://issues.apache.org/jira/browse/KAFKA-5430
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer
> Affects Versions: 0.10.2.0
> Reporter: Lior Chaga
> Priority: Major
> Attachments: consumer-thread.log, consumer-thread.log,
> kafka_trace.log.gz
>
>
> Due to bad configuration applied to network components, we experienced issues
> with communication between kafka brokers (causing under replication) as well
> as producers/consumers not being able to work against kafka.
> The symptoms on the consumer were many errors of the following form:
> {code}
> 2017-06-04 04:27:35,200 ERROR [Kafka Topics Cosumer
> requestlegacy.consumer-11_session_parser_02] TaboolaKafkaConsumer [] - Failed
> committing to kafka topicPartitions
> [requestlegacy-2,requestlegacy-0,requestlegacy-1]
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset
> commit failed with a retriable exception. You should retry committing offsets.
> Caused by: org.apache.kafka.common.errors.DisconnectException
> {code}
> So far so good. However, upon network recovery, there were several rebalance
> operations, which eventually resulted in only one consumer (#14) being
> assigned with all topic partitions (at this case we're talking about a
> consumer groups for which all consumers are running in same process):
> {code}
> 2017-06-04 04:27:02,168 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-8, requestlegacy-9]
> for group session_parser_02
> 2017-06-04 04:27:04,208 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-15_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-10, requestlegacy-11]
> for group session_parser_02
> 2017-06-04 04:27:18,167 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-12_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-3, requestlegacy-4,
> requestlegacy-5] for group session_parser_02
> 2017-06-04 04:27:20,232 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-11_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-2, requestlegacy-0,
> requestlegacy-1] for group session_parser_02
> 2017-06-04 04:27:20,236 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-15_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-9, requestlegacy-10,
> requestlegacy-11] for group session_parser_02
> 2017-06-04 04:27:20,237 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-12_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-3, requestlegacy-4, requestlegacy-5]
> for group session_parser_02
> 2017-06-04 04:27:20,237 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-6, requestlegacy-7, requestlegacy-8]
> for group session_parser_02
> 2017-06-04 04:27:20,332 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-11_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-2, requestlegacy-0, requestlegacy-1]
> for group session_parser_02
> 2017-06-04 04:28:52,368 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-13_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-6, requestlegacy-7]
> for group session_parser_02
> 2017-06-04 04:29:15,201 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-11_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-2, requestlegacy-0,
> requestlegacy-1] for group session_parser_02
> 2017-06-04 04:30:22,379 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-6, requestlegacy-7,
> requestlegacy-8] for group session_parser_02
> 2017-06-04 04:30:24,431 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-15_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-9, requestlegacy-10,
> requestlegacy-11] for group session_parser_02
> 2017-06-04 04:30:38,229 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-12_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-3, requestlegacy-4,
> requestlegacy-5] for group session_parser_02
> 2017-06-04 04:30:39,156 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-13_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-6, requestlegacy-7] for group
> session_parser_02
> 2017-06-04 04:30:39,156 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-15_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-10, requestlegacy-11] for group
> session_parser_02
> 2017-06-04 04:30:39,156 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-11_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-2, requestlegacy-0, requestlegacy-1]
> for group session_parser_02
> 2017-06-04 04:30:39,156 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-12_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-3, requestlegacy-4, requestlegacy-5]
> for group session_parser_02
> 2017-06-04 04:30:39,157 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-8, requestlegacy-9] for group
> session_parser_02
> 2017-06-04 04:49:51,118 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-13_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-6, requestlegacy-7]
> for group session_parser_02
> 2017-06-04 04:50:40,847 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-12_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-3, requestlegacy-4,
> requestlegacy-5] for group session_parser_02
> 2017-06-04 04:50:58,898 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-15_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-10, requestlegacy-11]
> for group session_parser_02
> 2017-06-04 04:50:59,010 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-11_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-2, requestlegacy-0,
> requestlegacy-1] for group session_parser_02
> 2017-06-04 04:52:39,732 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-8, requestlegacy-9]
> for group session_parser_02
> 2017-06-04 04:55:57,633 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-8, requestlegacy-9] for group
> session_parser_02
> 2017-06-04 04:57:37,662 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-8, requestlegacy-9]
> for group session_parser_02
> 2017-06-04 05:00:57,601 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-2, requestlegacy-3, requestlegacy-4,
> requestlegacy-5, requestlegacy-6, requestlegacy-7, requestlegacy-8,
> requestlegacy-9, requestlegacy-0, requestlegacy-1, requestlegacy-10,
> requestlegacy-11] for group session_parser_02
> 2017-06-04 05:02:37,605 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] -
> Revoking previously assigned partitions [requestlegacy-2, requestlegacy-3,
> requestlegacy-4, requestlegacy-5, requestlegacy-6, requestlegacy-7,
> requestlegacy-8, requestlegacy-9, requestlegacy-0, requestlegacy-1,
> requestlegacy-10, requestlegacy-11] for group session_parser_02
> 2017-06-04 05:02:37,666 INFO [Kafka Topics Cosumer
> requestlegacy.consumer-14_session_parser_02] ConsumerCoordinator [] - Setting
> newly assigned partitions [requestlegacy-2, requestlegacy-3, requestlegacy-4,
> requestlegacy-5, requestlegacy-6, requestlegacy-7, requestlegacy-8,
> requestlegacy-9, requestlegacy-0, requestlegacy-1, requestlegacy-10,
> requestlegacy-11] for group session_parser_02
> {code}
> The interesting part is that after the last rebalance, other consumers were
> still polling messages for hours, and failed upon commit (notice the revoke
> message on consumer-12, and the commit failure below on same consumer, 2
> minutes later):
> {code}
> 2017-06-04 04:52:41,396 ERROR [Kafka Topics Cosumer
> requestlegacy.consumer-12_session_parser_02] TaboolaKafkaConsumer [] - Failed
> committing to kafka topicPartitions
> [requestlegacy-3,requestlegacy-4,requestlegacy-5]
> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be
> completed since the group has already rebalanced and assigned the partitions
> to another member. This means that the time between subsequent calls to
> poll() was longer than the configured max.poll.interval.ms, which typically
> implies that the poll loop is spending too much time message processing. You
> can address this either by increasing the session timeout or by reducing the
> maximum size of batches returned in poll() with max.poll.records.
> {code}
> To complete the picture from kafka side, kafka-consumer-group reports all
> partitions assigned to consumer-14:
> {code}
> requestlegacy 0 9092336 9092337 1
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 1 9094923 9094923 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 2 9091382 9091382 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 3 9097052 9097052 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 4 9097542 9097542 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 5 9098915 9098915 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 6 9098308 9098309 1
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 7 9096984 9096984 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 8 9100548 9100548 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 9 9097615 9097615 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 10 9095553 9095553 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> requestlegacy 11 9089520 9089520 0
> consumer-14-aafabf03-2e94-42ef-922a-ba699c5a8517 /10.105.137.28
> consumer-14
> {code}
> We took a trace log (attached) of org.apache.kafka for our consuming service
> when we discovered the issue, prior to restarting the service (which resulted
> in full recovery).
> Also, this happened for multiple consumer groups, this is just one example.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)