Thanks, Jun.

Our ZK server's version is 3.3.5, and the ZooKeeper jar on the client side is 3.3.3.
I will update the client ZK version and check whether this solves the issue.

On Tue, Sep 11, 2012 at 10:16 PM, Jun Rao <jun...@gmail.com> wrote:

> If your ZK session expires but rebalance doesn't happen, this suggests
> that a ZK watcher is not fired properly. Which version of ZK are you using?
> You should use at least 3.3.4 for both the client and the server.
>
> Thanks,
>
> Jun
>
> On Tue, Sep 11, 2012 at 12:18 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:
>
> > Jun, thanks for the reply.
> >
> > But it seems that no rebalance was triggered: no rebalance log shows up
> > in our log, and I checked the cluster just now. The number of partitions
> > owned by these 3 nodes is still:
> > 40, 24 and 32
> > (the first node has fewer consumers (14) than the third one (16), while
> > it owns far more partitions than the other 2 nodes)
> >
> > We have observed several session timeouts of our consumers since
> > the deployment of our Kafka cluster.
> >
> > Some of them caused really awkward issues: after a session timeout, the
> > consumers on that node released partition ownership, but no rebalance
> > was triggered. When querying with kafka.tools.ConsumerOffsetChecker, we
> > observed that the owner of some partitions was null. In this case, those
> > partitions just stopped being consumed.
> >
> > I don't know if this happens to anyone else.
> >
> > On Tue, Sep 11, 2012 at 11:30 AM, Jun Rao <jun...@gmail.com> wrote:
> >
> > > ZK session expiration does trigger consumer rebalancing. However, the
> > > load should still be balanced after the new session is established.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Sep 10, 2012 at 3:44 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:
> > >
> > > > We have three nodes in our Kafka cluster, and I noticed that after 2
> > > > of our consumers encountered a session timeout:
> > > >
> > > > > [2012-09-09 22:57:13,502] INFO Client session timed out, have not heard from server in 4368ms for sessionid 0x338706f2acf72f5, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> > > > > [2012-09-09 22:57:13,603] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
> > > > > [2012-09-09 22:57:14,594] INFO Opening socket connection to server unode22-ins-db1/10.18.10.32:2181 (org.apache.zookeeper.ClientCnxn)
> > > > > [2012-09-09 22:57:14,595] INFO Socket connection established to unode22-ins-db1/10.18.10.32:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> > > > > [2012-09-09 22:57:14,596] INFO Session establishment complete on server unode22-ins-db1/10.18.10.32:2181, sessionid = 0x338706f2acf72f5, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> > > > > [2012-09-09 22:57:14,596] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
> > > >
> > > > the partition ownership became quite uneven.
> > > >
> > > > The numbers of consumers on these 3 nodes are:
> > > > 14, 12 and 16
> > > >
> > > > while the actual numbers of partitions owned by these 3 nodes are
> > > > (I checked this using ConsumerOffsetChecker):
> > > > 40, 24 and 32
> > > >
> > > > Is this expected behaviour after a client session timeout?
> > > >
> > > > --
> > > > Best Regards
> > > >
> > > > ----------------------
> > > > 刘明敏 | mmLiu
> > > >
> > >
> >
> > --
> > Best Regards
> >
> > ----------------------
> > 刘明敏 | mmLiu
>

--
Best Regards

----------------------
刘明敏 | mmLiu
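
To put the reported imbalance in numbers, the consumer and partition counts quoted in this thread can be compared against a simple proportional share. This is only a rough sanity check, not how Kafka assigns partitions: it assumes that a balanced assignment would give every consumer about the same number of partitions, and the thread does not say how many topics or consumer groups are involved. The function names below are hypothetical helpers, not part of any Kafka tool.

```python
# Figures reported in the thread, per node.
consumers = [14, 12, 16]
partitions_owned = [40, 24, 32]


def proportional_share(consumers, total_partitions):
    """Partitions each node would own if every consumer held an equal share.

    Assumption (not stated in the thread): a balanced assignment gives each
    consumer roughly the same number of partitions.
    """
    total_consumers = sum(consumers)
    return [total_partitions * c / total_consumers for c in consumers]


def imbalance_report(consumers, partitions_owned):
    """Return (node, owned, proportional share, owned - share) per node."""
    expected = proportional_share(consumers, sum(partitions_owned))
    return [
        (node, owned, round(exp, 2), round(owned - exp, 2))
        for node, (owned, exp) in enumerate(zip(partitions_owned, expected), start=1)
    ]


if __name__ == "__main__":
    for node, owned, exp, diff in imbalance_report(consumers, partitions_owned):
        print(f"node {node}: owns {owned}, proportional share {exp}, difference {diff:+}")
```

Under that assumption, the first node owns 40 partitions against a proportional share of 32, while the other two nodes own fewer than their share, which matches the observation that ownership is skewed toward the first node.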