[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gangadhar updated ZOOKEEPER-3906:
---------------------------------
    Description: 
*Issue*: Data inconsistency between the ZooKeeper leader and the ZooKeeper 
followers. The followers and the leader hold different data. When we try to 
delete a znode that is present on the followers but not on the leader, the 
delete fails with an error like *Node does not exist:*

*Expected behaviour:* The data on the ZooKeeper leader and on the ZooKeeper 
followers should be the same.

 

Steps followed as part of troubleshooting:

We have a cluster of 5 ZooKeeper servers.

*Step 1:* Verified whether all ZooKeeper servers are following the leader. As 
per the information below, all 4 followers are following the leader and are 
synced.

zk_version      3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 
02/11/2020 11:30 GMT
 zk_avg_latency  0
 zk_max_latency  823
 zk_min_latency  0
 zk_packets_received     30214264
 zk_packets_sent 32424272
 zk_num_alive_connections        7
 zk_outstanding_requests 0
 zk_server_state leader
 zk_znode_count  75190
 zk_watch_count  21394
 zk_ephemerals_count     793
 zk_approximate_data_size        24706628
 zk_open_file_descriptor_count   281
 zk_max_file_descriptor_count    4096
 zk_followers    4
 zk_synced_followers     4
 zk_pending_syncs        0
 zk_last_proposal_size   166
 zk_max_proposal_size    121947
 zk_min_proposal_size    32
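
The check in Step 1 can be sketched as a small script. This is only a sketch, assuming the statistics above come from the `mntr` four-letter command and that the command is enabled on the server (in 3.5.x it has to be listed in `4lw.commands.whitelist`). The function names and the host name in the usage note are hypothetical. Note that `zk_synced_followers` matching `zk_followers` only shows that the followers are attached to the leader; it does not by itself prove that every server holds identical data.

```python
import socket


def parse_mntr(text):
    """Parse whitespace-separated `mntr` output into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            stats[parts[0]] = parts[1].strip()
    return stats


def all_followers_synced(stats):
    """True when a leader reports every follower as synced and none pending."""
    return (
        stats.get("zk_server_state") == "leader"
        and stats.get("zk_followers") == stats.get("zk_synced_followers")
        and stats.get("zk_pending_syncs") == "0"
    )


def fetch_four_letter(host, port, command, timeout=5.0):
    """Send a four-letter command (e.g. "mntr") and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a live leader this could be used as `all_followers_synced(parse_mntr(fetch_four_letter("zk-leader.example", 2181, "mntr")))`, where the host name is a placeholder.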

*Step 2:* Checked the znode on all the ZooKeeper servers; the leader and the 
followers do not return the same information.
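
The comparison in Step 2 can be made explicit with a small helper. This is a minimal sketch, assuming the znode has already been read from each server individually (for example by pointing a client at one server at a time) and the results collected into a dict. The function names, server names and sample values are hypothetical, not taken from the affected cluster.

```python
from collections import defaultdict


def group_by_observation(observations):
    """Group servers by what each one returned for the same znode path.

    `observations` maps a server name to a (data, mzxid) tuple, or to None
    when that server reported the znode as absent.
    """
    groups = defaultdict(list)
    for server, seen in sorted(observations.items()):
        groups[seen].append(server)
    return dict(groups)


def is_consistent(observations):
    """True when every server returned the same result for the znode."""
    return len(group_by_observation(observations)) <= 1
```

More than one group in the result means the servers disagree, and the group keyed by `None` lists the servers on which the znode is missing.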

*Step 3:* Tried to delete the znode and received the error below. We suspect 
that the delete request is sent on to the ZooKeeper leader, where the znode 
is not present, and that this is why the *Node does not exist* error is 
thrown. 

*Error:* 
 14:04:54.769 [main] INFO  org.apache.zookeeper.ClientCnxnSocket - 
jute.maxbuffer value is 10485760 Bytes
 14:04:54.775 [main] INFO  org.apache.zookeeper.ClientCnxn - 
zookeeper.request.timeout value is 0. feature enabled=
 14:04:54.824 [main-SendThread(11.111.226.146:2181)] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
11-111-226-146.ebiz.verizon.com/11.111.226.146:2181. Will not attempt to 
authenticate using SASL (unknown error)
 14:04:54.831 [main-SendThread(11.111.226.146:2181)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established, initiating 
session, client: /11.111.225.75:38804, server: 
11-111-226-146.ebiz.verizon.com/11.111.20.146:2181
 14:04:54.835 [main-SendThread(11.111.226.146:2181)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on server 
11-111-226-146.ebiz.verizon.com/11.111.226.146:2181, sessionid = 
0x500001bbbeb0651, negotiated timeout = 20000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
 *Node does not exist: /namespace/$tenant/$Namespace/$zk-path*

  was:
Issue: Data inconsistency between the ZooKeeper leader and the ZooKeeper 
followers. When we do a topic lookup for one of the topics, we get a "broker 
not part of the cluster" error. We verified the following as part of 
troubleshooting.

Steps followed as part of troubleshooting:

We have a cluster of 5 ZooKeeper servers.

*Step 1:* Verified whether all ZooKeeper servers are following the leader. As 
per the information below, all 4 followers are following the leader and are 
synced.

zk_version      3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 
02/11/2020 11:30 GMT
zk_avg_latency  0
zk_max_latency  823
zk_min_latency  0
zk_packets_received     30214264
zk_packets_sent 32424272
zk_num_alive_connections        7
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count  75190
zk_watch_count  21394
zk_ephemerals_count     793
zk_approximate_data_size        24706628
zk_open_file_descriptor_count   281
zk_max_file_descriptor_count    4096
zk_followers    4
zk_synced_followers     4
zk_pending_syncs        0
zk_last_proposal_size   166
zk_max_proposal_size    121947
zk_min_proposal_size    32


*Step 2:* Checked the namespace bundle on all the ZooKeeper servers using the 
command below. Every server returned the information except the leader.

./pulsar zookeeper-shell  get /namespace/$tenant/$Namespace/$Bubdle


*Step 3:* Tried to delete the namespace bundle so that another broker could 
take ownership of the topic.

./pulsar zookeeper-shell  deleteall /namespace/$tenant/$Namespace/$Bubdle

*Error:* 
14:04:54.769 [main] INFO  org.apache.zookeeper.ClientCnxnSocket - 
jute.maxbuffer value is 10485760 Bytes
14:04:54.775 [main] INFO  org.apache.zookeeper.ClientCnxn - 
zookeeper.request.timeout value is 0. feature enabled=
14:04:54.824 [main-SendThread(11.111.226.146:2181)] INFO  
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
11-111-226-146.ebiz.verizon.com/11.111.226.146:2181. Will not attempt to 
authenticate using SASL (unknown error)
14:04:54.831 [main-SendThread(11.111.226.146:2181)] INFO  
org.apache.zookeeper.ClientCnxn - Socket connection established, initiating 
session, client: /11.111.225.75:38804, server: 
11-111-226-146.ebiz.verizon.com/11.111.20.146:2181
14:04:54.835 [main-SendThread(11.111.226.146:2181)] INFO  
org.apache.zookeeper.ClientCnxn - Session establishment complete on server 
11-111-226-146.ebiz.verizon.com/11.111.226.146:2181, sessionid = 
0x500001bbbeb0651, negotiated timeout = 20000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
*Node does not exist: /namespace/$tenant/$Namespace/$Bubdle*


> Data Inconsistency Between Zookeeper Leader and zookeeper Followers
> -------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3906
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3906
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.5.7
>            Reporter: Gangadhar
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
