[
https://issues.apache.org/jira/browse/KAFKA-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042077#comment-16042077
]
huxihx commented on KAFKA-5406:
-------------------------------
One option is to estimate how long the network takes to recover and make
sure that `rebalance.max.retries` * `rebalance.backoff.ms` is no less than
that period. Some application-level logic may also be required to handle a
long network outage.
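
As a rough illustration of that sizing rule, here is a minimal sketch. The
object name `RebalanceBudget`, its helper methods, and the example values
(a 2000 ms backoff and a 60 s outage) are assumptions made up for this
example, not Kafka APIs or recommended settings. The product only
approximates the time spent backing off between attempts; the time each
rebalance attempt itself takes is not counted.

```scala
// Hypothetical helper (not part of Kafka): sizes the old consumer's
// rebalance retry settings against an expected network outage.
object RebalanceBudget {

  /** Approximate back-off window in ms: rebalance.max.retries * rebalance.backoff.ms. */
  def retryWindowMs(maxRetries: Int, backoffMs: Long): Long =
    maxRetries.toLong * backoffMs

  /** Smallest retry count whose back-off window is no less than the outage. */
  def retriesNeeded(expectedOutageMs: Long, backoffMs: Long): Int = {
    require(backoffMs > 0, "backoffMs must be positive")
    // Integer ceiling division, so a partial interval still counts as a retry.
    ((expectedOutageMs + backoffMs - 1) / backoffMs).toInt
  }

  def main(args: Array[String]): Unit = {
    val backoffMs = 2000L         // assumed value for rebalance.backoff.ms
    val expectedOutageMs = 60000L // assumed worst-case outage, for illustration
    val retries = retriesNeeded(expectedOutageMs, backoffMs)
    // Print the two settings in consumer-properties form.
    println(s"rebalance.max.retries=$retries")
    println(s"rebalance.backoff.ms=$backoffMs")
  }
}
```

If the expected outage cannot be bounded, no finite retry count will cover
it, which is where the application-level handling comes in.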
> NoNodeException result in rebalance failed
> ------------------------------------------
>
> Key: KAFKA-5406
> URL: https://issues.apache.org/jira/browse/KAFKA-5406
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 0.8.2.2, 0.10.0.0
> Environment: windows8.1 centos6.4
> Reporter: xiaoguy
> Priority: Critical
> Labels: easyfix, patch
> Attachments: log.log
>
>
> Hi guys, I ran into this problem recently.
> Because the network is unstable, the consumer rebalance failed after 5
> tries. The log shows that the zk path /consumers/$(groupIdName)/ids/ is
> empty, and the consumer seems unable to register after the network
> recovers. So I read the Kafka source code (0.8.2.2) and found that
> handleNewSession in
> consumer/ZookeeperConsumerConnector$ZKSessionExpireListener is never
> called, and handleStateChanged does nothing.
> So I changed the code as follows, and it seems to work. I also checked
> version 0.10.0.0, and it has the same problem. Is this a bug? I'm
> confused. Thank you.
> def handleStateChanged(state: KeeperState) {
>   // zkclient reconnects for us; additionally re-run the new-session
>   // handling once the connection is re-established.
>   if (state == KeeperState.SyncConnected) {
>     handleNewSession()
>   }
>   // Debug trace of every state change.
>   System.err.println("----------------ZKSessionExpireListener------------" +
>     "handleStateChanged-----state:" + state + "----" + state.getIntValue)
> }