[ 
https://issues.apache.org/jira/browse/KAFKA-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807895#comment-16807895
 ] 

Guozhang Wang commented on KAFKA-4600:
--------------------------------------

Hello Braedon,

Could you confirm whether your issue is that 1) when the rebalance listener 
callback throws, "the error is logged and the consumer proceeds on as if 
nothing happened", or 2) you want the consumer coordinator to not catch the 
exception at all and to always expose it to client callers?

For 1), because of KAFKA-5154, the consumer will NOT proceed as if nothing 
happened: it will keep trying to re-join the group, and while doing so it will 
not consume messages from the newly assigned partitions. If the rebalance 
listener throws consistently, then no new messages will be consumed and the 
consumer will repeatedly log an ERROR message. If the exception is transient, 
then the rebalance listener will succeed on the next rebalance attempt.

For 2), this is a user-facing interface change, since the original design is 
that an exception from the listener callback is not propagated all the way to 
the caller, where it would crash the consumer client. So if you would like to 
propose changing this behavior, with good reasons, we should discuss it in a 
KIP.
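
For applications that need to react to a listener failure themselves, without 
any change to the consumer, one possible pattern is to capture the exception 
inside the callback and check for it after each poll(). The following is a 
minimal sketch only: FailureCapturingListener, loadState and throwIfFailed are 
hypothetical names, not part of the Kafka API, and the class is deliberately 
free of Kafka imports so that it stands alone. Its onPartitionsAssigned method 
is meant to be delegated to from a real 
ConsumerRebalanceListener.onPartitionsAssigned() implementation.

```java
import java.util.Collection;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Kafka): runs the state-loading logic for
 * newly assigned partitions and remembers the first failure so that the
 * application's poll loop can surface it.
 *
 * @param <P> partition type; with Kafka this would be TopicPartition
 */
class FailureCapturingListener<P> {
    // Application-supplied logic that loads per-partition state (assumption).
    private final Consumer<Collection<P>> loadState;
    // First failure seen since the last call to throwIfFailed().
    private final AtomicReference<RuntimeException> failure = new AtomicReference<>();

    FailureCapturingListener(Consumer<Collection<P>> loadState) {
        this.loadState = loadState;
    }

    /** Intended to be called from ConsumerRebalanceListener.onPartitionsAssigned(). */
    void onPartitionsAssigned(Collection<P> partitions) {
        try {
            loadState.accept(partitions);
        } catch (RuntimeException e) {
            // Remember the failure for the poll loop, then rethrow so the
            // caller of this callback still sees the exception.
            failure.compareAndSet(null, e);
            throw e;
        }
    }

    /** Call after each poll(); rethrows and clears a captured failure, if any. */
    void throwIfFailed() {
        RuntimeException e = failure.getAndSet(null);
        if (e != null) {
            throw e;
        }
    }
}
```

In a poll loop, the application would call throwIfFailed() right after poll() 
returns and before processing any records, so that records from partitions 
whose state failed to load are never handled.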

> Consumer proceeds on when ConsumerRebalanceListener fails
> ---------------------------------------------------------
>
>                 Key: KAFKA-4600
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4600
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.10.1.1
>            Reporter: Braedon Vickers
>            Priority: Major
>
> One of the use cases for a ConsumerRebalanceListener is to load state 
> necessary for processing a partition when it is assigned. However, when 
> ConsumerRebalanceListener.onPartitionsAssigned() fails for some reason (i.e. 
> the state isn't loaded), the error is logged and the consumer proceeds on as 
> if nothing happened, happily consuming messages from the new partition. When 
> the state is relied upon for correct processing, this can be very bad, e.g. 
> data loss can occur.
> It would be better if the error was propagated up so it could be dealt with 
> normally. At the very least the assignment should fail so the consumer 
> doesn't see any messages from the new partitions, and the rebalance can be 
> reattempted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
