ZooKeeper 3.3.3 has known bugs in its watcher and session handling. We recommend running both the server and the client on 3.3.4 or higher.
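
As a rough illustration of that check, here is a minimal Python sketch that flags whichever side is running below 3.3.4. The helper names (`parse_version`, `sides_below_minimum`) are made up for this example and are not part of ZooKeeper or Kafka. It also assumes you have already collected the two version strings yourself, and that they look like `3.3.5` or `3.3.5-<build>`.

```python
# Minimum ZooKeeper version recommended in this thread.
MIN_ZK_VERSION = (3, 3, 4)


def parse_version(version):
    """Turn a string such as '3.3.5' or '3.3.5-1301095' into a tuple of ints.

    Assumption: anything after the first '-' is a build suffix and is ignored.
    """
    numeric_part = version.strip().split("-")[0]
    return tuple(int(piece) for piece in numeric_part.split("."))


def sides_below_minimum(client_version, server_version, minimum=MIN_ZK_VERSION):
    """Return the sides ('client', 'server') whose version is below `minimum`."""
    versions = {"client": client_version, "server": server_version}
    return [
        side
        for side, version in versions.items()
        if parse_version(version) < minimum
    ]


if __name__ == "__main__":
    # Versions reported in this thread: client jar 3.3.3, server 3.3.5.
    print(sides_below_minimum("3.3.3", "3.3.5"))
```

With the versions reported in this thread (client 3.3.3, server 3.3.5) the sketch returns `['client']`, so only the client jar needs upgrading.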

Thanks,
Neha

On Tue, Sep 11, 2012 at 8:22 AM, mmLiu <diveintotomor...@gmail.com> wrote:

> Thanks, Jun.
>
> Our ZK server's version is 3.3.5, and the ZooKeeper jar on the client
> side is 3.3.3.
>
> I will update the client ZK version and check whether this solves the
> issue.
>
> On Tue, Sep 11, 2012 at 10:16 PM, Jun Rao <jun...@gmail.com> wrote:
>
>> If your ZK session expires but rebalance doesn't happen, this suggests
>> that a ZK watcher is not fired properly. Which version of ZK are you
>> using? You should use at least 3.3.4 for both the client and the server.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Sep 11, 2012 at 12:18 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:
>>
>> > Jun, thanks for the reply.
>> >
>> > But it seems that no rebalance was triggered: no rebalance log shows
>> > up in our log, and I checked the cluster just now. The number of
>> > partitions owned by these 3 nodes is still:
>> > 40, 24 and 32
>> > (the first node has fewer consumers (14) than the third one (16),
>> > while it owns far more partitions than the other 2 nodes)
>> >
>> > We have observed several session timeouts of our consumers since
>> > deploying our Kafka cluster.
>> >
>> > Some of them caused really awkward issues: after a session timeout,
>> > the consumers on that node released partition ownership, but no
>> > rebalance was triggered. When querying with
>> > kafka.tools.ConsumerOffsetChecker, we observed that some partitions'
>> > owner was null. In this case, some partitions simply stopped being
>> > consumed.
>> >
>> > I don't know if this has happened to anyone else.
>> >
>> > On Tue, Sep 11, 2012 at 11:30 AM, Jun Rao <jun...@gmail.com> wrote:
>> >
>> > > ZK session expiration does trigger consumer rebalancing. However,
>> > > the load should still be balanced after the new session is
>> > > established.
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Mon, Sep 10, 2012 at 3:44 AM, 刘明敏 <diveintotomor...@gmail.com> wrote:
>> > >
>> > > > We have three nodes in our Kafka cluster, and I noticed that
>> > > > after 2 of our consumers encountered a session timeout:
>> > > >
>> > > > > [2012-09-09 22:57:13,502] INFO Client session timed out, have not heard from server in 4368ms for sessionid 0x338706f2acf72f5, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
>> > > > > [2012-09-09 22:57:13,603] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
>> > > > > [2012-09-09 22:57:14,594] INFO Opening socket connection to server unode22-ins-db1/10.18.10.32:2181 (org.apache.zookeeper.ClientCnxn)
>> > > > > [2012-09-09 22:57:14,595] INFO Socket connection established to unode22-ins-db1/10.18.10.32:2181, initiating session (org.apache.zookeeper.ClientCnxn)
>> > > > > [2012-09-09 22:57:14,596] INFO Session establishment complete on server unode22-ins-db1/10.18.10.32:2181, sessionid = 0x338706f2acf72f5, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
>> > > > > [2012-09-09 22:57:14,596] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
>> > > >
>> > > > the partition ownership became quite uneven.
>> > > >
>> > > > The number of consumers on these 3 nodes is:
>> > > > 14, 12 and 16
>> > > >
>> > > > while the actual number of partitions owned by these 3 nodes is
>> > > > (I checked this using ConsumerOffsetChecker):
>> > > > 40, 24 and 32
>> > > >
>> > > > Is this expected behaviour after a client session timeout?
>> > > >
>> > > > --
>> > > > Best Regards
>> > > >
>> > > > ----------------------
>> > > > 刘明敏 | mmLiu
>> > > >
>> > >
>> >
>> > --
>> > Best Regards
>> >
>> > ----------------------
>> > 刘明敏 | mmLiu
>> >
>
> --
> Best Regards
>
> ----------------------
> 刘明敏 | mmLiu