If your ZK session expires but no rebalance happens, this suggests that a ZK watcher is not being fired properly. Which version of ZK are you using? You should use at least 3.3.4 for both the client and the server.

Thanks,

Jun

On Tue, Sep 11, 2012 at 12:18 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:

> Jun, thanks for the reply.
>
> But it seems that no rebalance was triggered: no rebalance log shows up in
> our logs, and I checked the cluster just now. The number of partitions
> owned by these 3 nodes is still:
> 40, 24 and 32
> (the first node has fewer consumers (14) than the third one (16), while it
> owns far more partitions than the other 2 nodes)
>
> We have observed session timeouts on our consumers several times since we
> deployed our Kafka cluster.
>
> Some of them caused really awkward issues: after a session timeout, the
> consumers on that node released their partition ownership, but no rebalance
> was triggered. When querying with kafka.tools.ConsumerOffsetChecker, we
> observed that some partitions' owner was null. In this case, those
> partitions simply stopped being consumed.
>
> I don't know if this has happened to anyone else.
>
> On Tue, Sep 11, 2012 at 11:30 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > ZK session expiration does trigger consumer rebalancing. However, the
> > load should still be balanced after the new session is established.
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Sep 10, 2012 at 3:44 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:
> >
> > > We have three nodes in our Kafka cluster, and I noticed that after 2 of
> > > our consumers encountered a session timeout:
> > >
> > > > [2012-09-09 22:57:13,502] INFO Client session timed out, have not heard from server in 4368ms for sessionid 0x338706f2acf72f5, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> > > > [2012-09-09 22:57:13,603] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
> > > > [2012-09-09 22:57:14,594] INFO Opening socket connection to server unode22-ins-db1/10.18.10.32:2181 (org.apache.zookeeper.ClientCnxn)
> > > > [2012-09-09 22:57:14,595] INFO Socket connection established to unode22-ins-db1/10.18.10.32:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> > > > [2012-09-09 22:57:14,596] INFO Session establishment complete on server unode22-ins-db1/10.18.10.32:2181, sessionid = 0x338706f2acf72f5, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> > > > [2012-09-09 22:57:14,596] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
> > >
> > > the partition ownership becomes quite uneven.
> > >
> > > The number of consumers on these 3 nodes is:
> > > 14, 12 and 16
> > >
> > > while the actual number of partitions owned by these 3 nodes is (I
> > > checked this using ConsumerOffsetChecker):
> > > 40, 24 and 32
> > >
> > > Is this an expected behaviour after a client session timeout?
> > >
> > > --
> > > Best Regards
> > >
> > > ----------------------
> > > 刘明敏 | mmLiu
>
> --
> Best Regards
>
> ----------------------
> 刘明敏 | mmLiu
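
To make the numbers in this thread easier to reason about, here is a minimal Python sketch of a range-style partition assignment: consumers are sorted by id, each gets an equal share of the partitions, and the first few get one extra. Several things here are assumptions not confirmed by the thread: that all 96 partitions (40 + 24 + 32) belong to a single topic, that the consumer ids on the first node sort ahead of the others, and the names `range_assign`, `partitions_per_node` and the `nodeN-XX` ids, which are hypothetical. It illustrates how an uneven per-node count can arise from an even per-consumer split; it is not the actual Kafka rebalance code, and it does not explain partitions with a null owner.

```python
def range_assign(num_partitions, consumers):
    """Split partitions 0..num_partitions-1 into contiguous ranges
    over the consumers, taken in sorted order of their ids."""
    ordered = sorted(consumers)
    per_consumer, extra = divmod(num_partitions, len(ordered))
    assignment = {}
    for pos, consumer in enumerate(ordered):
        # The first `extra` consumers each take one additional partition.
        start = per_consumer * pos + min(pos, extra)
        count = per_consumer + (1 if pos < extra else 0)
        assignment[consumer] = list(range(start, start + count))
    return assignment


def partitions_per_node(assignment):
    """Sum owned partitions per node; the node name is the id prefix
    before the first '-' (a hypothetical naming scheme)."""
    totals = {}
    for consumer, parts in assignment.items():
        node = consumer.split("-")[0]
        totals[node] = totals.get(node, 0) + len(parts)
    return totals


# Hypothetical consumer ids: 14, 12 and 16 consumers on three nodes.
consumers = (
    [f"node1-{i:02d}" for i in range(14)]
    + [f"node2-{i:02d}" for i in range(12)]
    + [f"node3-{i:02d}" for i in range(16)]
)

totals = partitions_per_node(range_assign(96, consumers))
print(totals)  # -> {'node1': 40, 'node2': 24, 'node3': 32}
```

Under these assumptions, 96 partitions over 42 consumers gives 2 partitions each with 12 left over, so the 12 consumers that sort first own 3 partitions apiece. If they all sit on the first node, that node ends up with 40 partitions even though the per-consumer split differs by at most one.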