Thanks Steph.

We have changed 0.0.0.0 to the 10.x.x.x IP, and the config is now the same 
across all the ZK nodes in the cluster.
After this change we no longer see the issue.
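
For anyone checking the same thing, here is a minimal sketch of how one might confirm that no node still advertises the wildcard address. The function name find_wildcard_members is hypothetical (not part of ZooKeeper); it only filters the output of the conf or config commands used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `conf` / `config` output on stdin and prints
# every server.N membership line that still uses the wildcard 0.0.0.0.
# Prints nothing when all members advertise a concrete address.
find_wildcard_members() {
    grep -E '^server\.[0-9]+=0\.0\.0\.0:' || true
}

# Example usage against a live node (address taken from this thread):
#   echo conf | nc 10.0.105.38 2181 | find_wildcard_members
```

Empty output on all three nodes would suggest the 0.0.0.0 entries are gone; it does not by itself prove the quorum is healthy.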

Thanks,
Rounak

From: Jitesh Mulchandani (jmulchan) <jmulc...@cisco.com>
Date: Friday, 7 July 2023 at 14:25
To: Steph van Schalkwyk <svanschalk...@gmail.com>, u...@zookeeper.apache.org 
<u...@zookeeper.apache.org>, Rounak Kakkar (rkakkar) <rkak...@cisco.com>
Cc: dev@zookeeper.apache.org <dev@zookeeper.apache.org>
Subject: Re: Zookeeper server doesn't list all the participants
+Rounak

From: Steph van Schalkwyk <svanschalk...@gmail.com>
Date: Thursday, 6 July 2023 at 9:38 PM
To: u...@zookeeper.apache.org <u...@zookeeper.apache.org>
Cc: dev@zookeeper.apache.org <dev@zookeeper.apache.org>, Jitesh Mulchandani 
(jmulchan) <jmulc...@cisco.com>
Subject: Re: Zookeeper server doesn't list all the participants
Took a quick glance at your issue.
I see a 169.xxx.... IP address in your diags, which means that NIC cannot get 
an address reservation.
Check your ZK configs and don't use 0.0.0.0 for an IP, as it binds to all your 
NICs. Use a pingable IP address, as in 10.0.xxx. Make sure your
server.0, server.1 and server.2 settings are the same across all three ZK nodes. 
No 0.0.0.0.
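
As a rough sketch of the "same across all three nodes" check, the membership lines from each node can be extracted and compared. The helper names membership_lines and same_membership, and the file names in the usage comment, are assumptions made for illustration; only the conf command and the addresses come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: keep only the server.N membership lines from
# `conf` / `config` output on stdin, sorted so that two outputs can be
# compared regardless of line order.
membership_lines() {
    grep -E '^server\.[0-9]+=' | sort
}

# Hypothetical helper: succeeds only if the two saved outputs ($1 and $2)
# list exactly the same server.N entries.
same_membership() {
    [ "$(membership_lines < "$1")" = "$(membership_lines < "$2")" ]
}

# Example usage (file names are placeholders):
#   echo conf | nc 10.0.105.32 2181 > vm12.conf
#   echo conf | nc 10.0.105.38 2181 > vm18.conf
#   same_membership vm12.conf vm18.conf || echo "membership differs"
```

With each node listing itself as 0.0.0.0, as in the outputs further down, this comparison would fail for every pair of nodes, which is the mismatch described above.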

Steph


On Thu, Jul 6, 2023 at 10:40 AM Rounak Kakkar (rkakkar) 
<rkak...@cisco.com.invalid> wrote:
Hello Team,

We are using a ZK cluster with 3 nodes in the cluster (namely: vm12, vm18 and 
vm19).
We are seeing an issue with the ZK cluster where, on one of the nodes (vm18), 
ZK doesn’t list all the participants.
The node on which the issue is seen is the leader; the other two nodes (vm12 
and vm19) are followers and are connected to the leader node (vm18).

Could you please help us to debug this issue?

ZK Version:

Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715


Node on which issue is seen (vm18). It shows only 2 participants.

vm18:~# docker exec -i coordination-server /var/lib/zookeeper/bin/zkCli.sh -server localhost config | grep participant
server.0=10.0.105.32:2888:3888:participant
server.1=0.0.0.0:2888:3888:participant
vm18:~#


We can see in the config that all three nodes are listed.

vm18:~# echo conf | nc 10.0.105.38 2181
clientPort=2181
secureClientPort=-1
dataDir=/data/6356d4ed-ed79-4481-9913-e1c470ec2e79/version-2
dataDirSize=201387396
dataLogDir=/data/6356d4ed-ed79-4481-9913-e1c470ec2e79/version-2
dataLogSize=201387396
tickTime=2000
maxClientCnxns=0
minSessionTimeout=4000
maxSessionTimeout=40000
clientPortListenBacklog=-1
serverId=1
initLimit=5
syncLimit=2
electionAlg=3
electionPort=3888
quorumPort=2888
peerType=0
membership:
server.0=10.0.105.32:2888:3888:participant
server.1=0.0.0.0:2888:3888:participant
server.2=10.0.105.39:2888:3888:participant
version=0


And this node is the leader.

vm18:~# echo stat | nc 10.0.105.38 2181
Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
Clients:
/10.0.105.32:33086[1](queued=0,recved=229788,sent=229790)
/10.0.105.38:52350[1](queued=0,recved=55027,sent=55034)
/169.254.8.3:33050[1](queued=0,recved=3337,sent=3341)
/10.0.105.38:50552[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0.2586/1626
Received: 296333
Sent: 296451
Connections: 4
Outstanding: 0
Zxid: 0x5000015b9
Mode: leader
Node count: 381
Proposal

On the other two nodes, all three participants are listed.

VM12:

vm12:~# docker exec -i coordination-server /var/lib/zookeeper/bin/zkCli.sh -server localhost config | grep participant
server.0=0.0.0.0:2888:3888:participant
server.1=10.0.105.38:2888:3888:participant
server.2=10.0.105.39:2888:3888:participant
vm12:~#



vm12:~# echo stat | nc 10.0.105.32 2181
Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
Clients:
/10.0.105.32:41494[0](queued=0,recved=1,sent=0)
/169.254.8.3:55894[1](queued=0,recved=3037,sent=3038)
/10.0.105.38:35292[1](queued=0,recved=231472,sent=231474)
/10.0.105.32:39984[1](queued=0,recved=55138,sent=55148)
/10.0.105.39:52614[1](queued=0,recved=5568,sent=5721)
/10.0.105.32:38996[1](queued=0,recved=6219,sent=6400)
/10.0.105.39:53858[0](queued=0,recved=1,sent=0)
/10.0.105.39:53880[1](queued=0,recved=54945,sent=54949)

Latency min/avg/max: 0/0.4517/350
Received: 361365
Sent: 361728
Connections: 8
Outstanding: 0
Zxid: 0x5000015d4
Mode: follower
Node count: 384
vm12:~#


VM19:

vm19:~# docker exec -i coordination-server /var/lib/zookeeper/bin/zkCli.sh -server localhost config | grep participant
server.0=10.0.105.32:2888:3888:participant
server.1=10.0.105.38:2888:3888:participant
server.2=0.0.0.0:2888:3888:participant
vm19:~#



vm19:~# echo stat | nc 10.0.105.39 2181
Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
Clients:
/169.254.8.3:58810[1](queued=0,recved=3246,sent=3250)
/10.0.105.38:43338[1](queued=0,recved=4097,sent=4258)
/169.254.8.1:53464[1](queued=0,recved=228908,sent=228908)
/10.0.105.39:41558[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0.3002/276
Received: 238713
Sent: 238881
Connections: 4
Outstanding: 0
Zxid: 0x5000015dc
Mode: follower
Node count: 386
vm19:~#

Thanks,
Rounak
