[ https://issues.apache.org/jira/browse/KAFKA-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807895#comment-16807895 ]
Guozhang Wang commented on KAFKA-4600:
--------------------------------------

Hello Braedon,

Could you confirm whether your issue is that 1) when the rebalance listener callback throws, "the error is logged and the consumer proceeds on as if nothing happened", or 2) you want the consumer coordinator not to catch the exception at all and to always expose it to client callers?

For 1), because of KAFKA-5154, the consumer will NOT proceed as if nothing happened: it will keep trying to re-join the group, and in the meantime it will not consume messages from the newly assigned partitions. If the rebalance listener throws consistently, no new messages will be consumed and the consumer will repeatedly log an ERROR message. If the exception is transient, the listener will succeed on the next rebalance.

For 2), this is a user-facing interface change, since the protocol as originally designed does not propagate a listener callback's exception all the way up to the user and crash the consumer client. So if you propose, with good reasons, that we change this behavior, we should discuss it in a KIP.

> Consumer proceeds on when ConsumerRebalanceListener fails
> ---------------------------------------------------------
>
>                 Key: KAFKA-4600
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4600
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.10.1.1
>            Reporter: Braedon Vickers
>            Priority: Major
>
> One of the use cases for a ConsumerRebalanceListener is to load state
> necessary for processing a partition when it is assigned. However, when
> ConsumerRebalanceListener.onPartitionsAssigned() fails for some reason (i.e.
> the state isn't loaded), the error is logged and the consumer proceeds on as
> if nothing happened, happily consuming messages from the new partition. When
> the state is relied upon for correct processing, this can be very bad, e.g.
> data loss can occur.
> It would be better if the error was propagated up so it could be dealt with
> normally. At the very least the assignment should fail so the consumer
> doesn't see any messages from the new partitions, and the rebalance can be
> reattempted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
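
For the state-loading use case in the report, here is a minimal Java sketch of one way an application could surface the failure itself instead of relying on the coordinator to propagate the exception. `AssignmentGuard` and all of its methods are hypothetical names, not part of the Kafka client. Partitions are represented as plain strings and the state loader as a `java.util.function.Consumer` so that the sketch runs without the Kafka library. It assumes the listener's `onPartitionsAssigned()` would delegate to `onAssigned()`, `onPartitionsRevoked()` to `onRevoked()`, and that the poll loop would call `throwIfFailed()` before processing any records.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper, NOT part of the Kafka client API.
// Plain collections are used on the assumption that the rebalance callbacks
// and the record processing both run on the thread that calls poll().
final class AssignmentGuard {
    // Partitions whose state failed to load, mapped to the failure cause.
    private final Map<String, Exception> failed = new LinkedHashMap<>();
    // Partitions whose state loaded successfully.
    private final Set<String> ready = new HashSet<>();

    // Meant to be called from onPartitionsAssigned(). It never throws, so
    // there is no exception left for the coordinator to catch and log.
    void onAssigned(Collection<String> partitions, Consumer<String> stateLoader) {
        for (String p : partitions) {
            try {
                stateLoader.accept(p);
                failed.remove(p);
                ready.add(p);
            } catch (Exception e) {
                ready.remove(p);
                failed.put(p, e);
            }
        }
    }

    // Meant to be called from onPartitionsRevoked(): forget revoked partitions.
    void onRevoked(Collection<String> partitions) {
        for (String p : partitions) {
            ready.remove(p);
            failed.remove(p);
        }
    }

    boolean isReady(String partition) {
        return ready.contains(partition);
    }

    // Meant to be called from the poll loop before processing records, so the
    // failure reaches application code where it can be handled normally.
    void throwIfFailed() {
        if (!failed.isEmpty()) {
            Map.Entry<String, Exception> first = failed.entrySet().iterator().next();
            throw new IllegalStateException(
                "State load failed for " + first.getKey(), first.getValue());
        }
    }
}

final class AssignmentGuardDemo {
    public static void main(String[] args) {
        AssignmentGuard guard = new AssignmentGuard();
        // Simulated assignment in which loading state for one partition fails.
        guard.onAssigned(Arrays.asList("orders-0", "orders-1"), p -> {
            if (p.equals("orders-1")) {
                throw new IllegalStateException("store unavailable");
            }
        });
        System.out.println("orders-0 ready: " + guard.isReady("orders-0"));
        System.out.println("orders-1 ready: " + guard.isReady("orders-1"));
        try {
            guard.throwIfFailed();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the listener no longer throws, this sketch does not depend on how the coordinator treats listener exceptions. The application decides whether a failed state load should stop processing, skip the affected partitions, or trigger a retry.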