Took a quick glance at your issue. I see a 169.254.x.x IP address in your diags, which is a link-local address and usually means that NIC could not get an address lease. Check your ZK configs and don't use 0.0.0.0 for an IP, as it binds to all your NICs; use a pingable IP address, as in 10.0.xxx. Make sure your server.0, server.1, and server.2 settings are the same across all three ZK nodes. No 0.0.0.0.
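If it helps, here is a rough sketch of that check in Python, fed with the participant lines from your zkCli output. The function names (parse_members, check_cluster) are made up for illustration and are not part of ZooKeeper. It assumes plain IPv4 or hostname entries with one address per server.

```python
import re

# Matches 'server.N=host:quorumPort:electionPort[:role]'.
# Assumption: IPv4 or hostname only, one address per server entry.
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:]+):(\d+):(\d+)(?::(\w+))?")


def parse_members(config_text):
    """Return {server_id: host} for every server.N line in the text."""
    members = {}
    for line in config_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            members[int(match.group(1))] = match.group(2)
    return members


def check_cluster(configs):
    """configs maps node name -> config text; returns a list of problems."""
    problems = []
    parsed = {node: parse_members(text) for node, text in configs.items()}
    # Flag every entry that uses the wildcard address.
    for node, members in sorted(parsed.items()):
        for server_id, host in sorted(members.items()):
            if host == "0.0.0.0":
                problems.append(f"{node}: server.{server_id} uses 0.0.0.0")
    # Flag any difference in the membership list between nodes.
    if len({tuple(sorted(m.items())) for m in parsed.values()}) > 1:
        problems.append("server.N entries differ between nodes")
    return problems


# Participant lines as reported by zkCli on each node in this thread.
configs = {
    "vm12": "server.0=0.0.0.0:2888:3888:participant\n"
            "server.1=10.0.105.38:2888:3888:participant\n"
            "server.2=10.0.105.39:2888:3888:participant",
    "vm18": "server.0=10.0.105.32:2888:3888:participant\n"
            "server.1=0.0.0.0:2888:3888:participant",
    "vm19": "server.0=10.0.105.32:2888:3888:participant\n"
            "server.1=10.0.105.38:2888:3888:participant\n"
            "server.2=0.0.0.0:2888:3888:participant",
}

for problem in check_cluster(configs):
    print(problem)
```

On your data this flags the 0.0.0.0 entry on each node plus the mismatch between nodes. If the 0.0.0.0 was only there so the quorum ports bind inside the container, the quorumListenOnAllIPs=true setting may be worth a look instead, but test that in your own setup first.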
Steph

On Thu, Jul 6, 2023 at 10:40 AM Rounak Kakkar (rkakkar) <rkak...@cisco.com.invalid> wrote:

> Hello Team,
>
> We are using a ZK cluster with 3 nodes in the cluster (namely: vm12, vm18 and vm19).
> We are seeing a issue with ZK cluster, where on one of the nodes (vm18) in the cluster, ZK doesn’t list all the participants.
> The node on which the issue is seen is a leader and the other two nodes (vm12 and vm19) are followers and they are connected to the leader node (vm18).
>
> Could you please help us to debug this issue?
>
> ZK Version:
>
> Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715
>
> Node on which issue is seen (vm18). It shows only 2 participants.
>
> vm18:~# docker exec -i coordination-server /var/lib/zookeeper/bin/zkCli.sh -server localhost config | grep participant
> server.0=10.0.105.32:2888:3888:participant
> server.1=0.0.0.0:2888:3888:participant
> vm18:~#
>
> We can see in config there are nodes.
>
> vm18:~# echo conf | nc 10.0.105.38 2181
> clientPort=2181
> secureClientPort=-1
> dataDir=/data/6356d4ed-ed79-4481-9913-e1c470ec2e79/version-2
> dataDirSize=201387396
> dataLogDir=/data/6356d4ed-ed79-4481-9913-e1c470ec2e79/version-2
> dataLogSize=201387396
> tickTime=2000
> maxClientCnxns=0
> minSessionTimeout=4000
> maxSessionTimeout=40000
> clientPortListenBacklog=-1
> serverId=1
> initLimit=5
> syncLimit=2
> electionAlg=3
> electionPort=3888
> quorumPort=2888
> peerType=0
> membership:
> server.0=10.0.105.32:2888:3888:participant
> server.1=0.0.0.0:2888:3888:participant
> server.2=10.0.105.39:2888:3888:participant
> version=0
>
> And this node is a leader.
>
> vm18:~# echo stat | nc 10.0.105.38 2181
> Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
> Clients:
> /10.0.105.32:33086[1](queued=0,recved=229788,sent=229790)
> /10.0.105.38:52350[1](queued=0,recved=55027,sent=55034)
> /169.254.8.3:33050[1](queued=0,recved=3337,sent=3341)
> /10.0.105.38:50552[0](queued=0,recved=1,sent=0)
>
> Latency min/avg/max: 0/0.2586/1626
> Received: 296333
> Sent: 296451
> Connections: 4
> Outstanding: 0
> Zxid: 0x5000015b9
> Mode: leader
> Node count: 381
> Proposal
>
> Other two nodes where participants all three participants are seen.
>
> VM12:
>
> vm12:~# docker exec -i coordination-server /var/lib/zookeeper/bin/zkCli.sh -server localhost config | grep participant
> server.0=0.0.0.0:2888:3888:participant
> server.1=10.0.105.38:2888:3888:participant
> server.2=10.0.105.39:2888:3888:participant
> vm12:~#
>
> vm12:~# echo stat | nc 10.0.105.32 2181
> Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
> Clients:
> /10.0.105.32:41494[0](queued=0,recved=1,sent=0)
> /169.254.8.3:55894[1](queued=0,recved=3037,sent=3038)
> /10.0.105.38:35292[1](queued=0,recved=231472,sent=231474)
> /10.0.105.32:39984[1](queued=0,recved=55138,sent=55148)
> /10.0.105.39:52614[1](queued=0,recved=5568,sent=5721)
> /10.0.105.32:38996[1](queued=0,recved=6219,sent=6400)
> /10.0.105.39:53858[0](queued=0,recved=1,sent=0)
> /10.0.105.39:53880[1](queued=0,recved=54945,sent=54949)
>
> Latency min/avg/max: 0/0.4517/350
> Received: 361365
> Sent: 361728
> Connections: 8
> Outstanding: 0
> Zxid: 0x5000015d4
> Mode: follower
> Node count: 384
> vm12:~#
>
> VM19:
>
> vm19:~# docker exec -i coordination-server /var/lib/zookeeper/bin/zkCli.sh -server localhost config | grep participant
> server.0=10.0.105.32:2888:3888:participant
> server.1=10.0.105.38:2888:3888:participant
> server.2=0.0.0.0:2888:3888:participant
> vm19:~#
>
> vm19:~# echo stat | nc 10.0.105.39 2181
> Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
> Clients:
> /169.254.8.3:58810[1](queued=0,recved=3246,sent=3250)
> /10.0.105.38:43338[1](queued=0,recved=4097,sent=4258)
> /169.254.8.1:53464[1](queued=0,recved=228908,sent=228908)
> /10.0.105.39:41558[0](queued=0,recved=1,sent=0)
>
> Latency min/avg/max: 0/0.3002/276
> Received: 238713
> Sent: 238881
> Connections: 4
> Outstanding: 0
> Zxid: 0x5000015dc
> Mode: follower
> Node count: 386
> vm19:~#
>
> Thanks,
> Rounak