[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514087#comment-15514087
 ] 

Srinivas Neginhal commented on ZOOKEEPER-2598:
----------------------------------------------

Once the ensemble is in this state:
1. Running "sync /moot/gmle/ActiveControllerCluster" or "sync 
/moot/gmle/ActiveControllerCluster/member0000000025" via zkCli.sh on ZK3 does 
not help.
2. ZNodes created on ZK1 or ZK2 show up on ZK3. ZNodes created on ZK3 show up 
on ZK1 and ZK2.
3. Ephemeral ZNodes created on ZK3 show up on ZK1 and ZK2.

None of the above got rid of the following stale ephemeral nodes, which still 
show up on ZK3:
/moot/gmle/ActiveControllerCluster/member0000000025
/moot/gmle/ActiveControllerCluster/member0000000026
/moot/gmle/ActiveControllerCluster/member0000000027
/moot/gmle/ServiceDirectory/ActiveNodes/member0000000025
/moot/gmle/ServiceDirectory/ActiveNodes/member0000000026
/moot/gmle/ServiceDirectory/ActiveNodes/member0000000027


Ephemeral node created on ZK 3:

[zk: 10.0.0.3:1300(CONNECTED) 11] create -e /testEphemeral 
create -e /testEphemeral
Created /testEphemeral
[zk: 10.0.0.3:1300(CONNECTED) 12] ls /
ls /
[bar, f, foo, moot, testEphemeral, transport-nodes, vmware, vnet-sharding, 
zookeeper]
[zk: 10.0.0.3:1300(CONNECTED) 13] ls /moot/gmle/ActiveControllerCluster
ls /moot/gmle/ActiveControllerCluster
[member0000000025, member0000000026, member0000000027, member0000000065, 
member0000000067]
[zk: 10.0.0.3:1300(CONNECTED) 14] 

The new ephemeral node shows up on ZK1 and ZK2, but the stale members do not:

[zk: 10.0.0.1:1300(CONNECTED) 9] ls /
ls /
[bar, f, foo, moot, testEphemeral, transport-nodes, vmware, vnet-sharding, 
zookeeper]
[zk: 10.0.0.1:1300(CONNECTED) 10] ls /moot/gmle/ActiveControllerCluster
ls /moot/gmle/ActiveControllerCluster
[member0000000065, member0000000067]

[zk: 10.0.0.2:1300(CONNECTED) 3] ls /
ls /
[bar, f, foo, moot, testEphemeral, transport-nodes, vmware, vnet-sharding, 
zookeeper]
[zk: 10.0.0.2:1300(CONNECTED) 4] ls /moot/gmle/ActiveControllerCluster
ls /moot/gmle/ActiveControllerCluster
[member0000000065, member0000000067]
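
The listings above can also be compared mechanically. The following is a 
minimal sketch, assuming the "ls" result from each server has been pasted in 
as a string. parse_ls and stale_children are hypothetical helper names used 
only for illustration; they are not part of zkCli.sh or any ZooKeeper API.

```python
from typing import Dict, Set


def parse_ls(output: str) -> Set[str]:
    """Parse a zkCli.sh `ls` result such as '[a, b, c]' into a set of names."""
    text = " ".join(output.split())  # rejoin lines wrapped by the terminal
    inner = text.strip().lstrip("[").rstrip("]")
    return {name.strip() for name in inner.split(",") if name.strip()}


def stale_children(listings: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each server, return the children that not every server reports."""
    common = set.intersection(*listings.values())
    return {server: children - common for server, children in listings.items()}


# Children of /moot/gmle/ActiveControllerCluster as listed in this comment.
listings = {
    "ZK1": parse_ls("[member0000000065, member0000000067]"),
    "ZK2": parse_ls("[member0000000065, member0000000067]"),
    "ZK3": parse_ls(
        "[member0000000025, member0000000026, member0000000027, "
        "member0000000065, \nmember0000000067]"
    ),
}

for server, extra in sorted(stale_children(listings).items()):
    print(server, sorted(extra))
```

With the listings from this comment, only ZK3 reports extra children: 
member0000000025, member0000000026 and member0000000027.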





> Data Inconsistency after power off/on of some nodes
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-2598
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2598
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.5.1
>         Environment: ZK is running in a docker container on a Ubuntu 14.04 VM
>            Reporter: Srinivas Neginhal
>         Attachments: zk1.tgz, zk2.tgz, zk3.tgz
>
>
> Steps to reproduce:
> 1. Create a three node cluster: Node1, Node2 and Node3.
>     Each node is a VM that runs:
>     1. ZK in a docker container
>     2. Two clients, A and B, that use ZK for group membership and leader 
> election. The clients create sequential ephemeral nodes when they come up. 
> 2. The three ZK servers running in the containers form an ensemble.
> 3. Power off/on Node2 and Node3 in a loop.
> 4. After a few iterations, the ephemeral nodes seen by the three nodes are 
> different.
> Here is the output of some four-letter commands with the ensemble in this 
> state:
> 1. conf:
> ZK 1:
> # echo conf| nc 10.0.0.1 1300
> clientPort=1300
> secureClientPort=-1
> dataDir=/moot/persistentStore/zkWorkspace/version-2
> dataDirSize=67293721
> dataLogDir=/moot/persistentStore/zkWorkspace/version-2
> dataLogSize=67293721
> tickTime=2000
> maxClientCnxns=60
> minSessionTimeout=4000
> maxSessionTimeout=40000
> serverId=1
> initLimit=100
> syncLimit=20
> electionAlg=3
> electionPort=1200
> quorumPort=1100
> peerType=0
> membership: 
> server.1=10.0.0.1:1100:1200:participant;10.0.0.1:1300;8e64c644-d0fa-414f-bab2-3c8c80364410
> server.2=10.0.0.2:1100:1200:participant;10.0.0.2:1300;38bf19b8-d4cb-4dac-b328-7bbf0ee1e2c4
> server.3=10.0.0.3:1100:1200:participant;10.0.0.3:1300;e1415d59-e857-43e6-ba9b-01daeb31a434
> ZK 2:
> # echo conf| nc 10.0.0.2 1300
> clientPort=1300
> secureClientPort=-1
> dataDir=/moot/persistentStore/zkWorkspace/version-2
> dataDirSize=1409480873
> dataLogDir=/moot/persistentStore/zkWorkspace/version-2
> dataLogSize=1409480873
> tickTime=2000
> maxClientCnxns=60
> minSessionTimeout=4000
> maxSessionTimeout=40000
> serverId=2
> initLimit=100
> syncLimit=20
> electionAlg=3
> electionPort=1200
> quorumPort=1100
> peerType=0
> membership: 
> server.1=10.0.0.1:1100:1200:participant;10.0.0.1:1300;8e64c644-d0fa-414f-bab2-3c8c80364410
> server.2=10.0.0.2:1100:1200:participant;10.0.0.2:1300;38bf19b8-d4cb-4dac-b328-7bbf0ee1e2c4
> server.3=10.0.0.3:1100:1200:participant;10.0.0.3:1300;e1415d59-e857-43e6-ba9b-01daeb31a434
> ZK 3:
> # echo conf| nc 10.0.0.3 1300
> clientPort=1300
> secureClientPort=-1
> dataDir=/moot/persistentStore/zkWorkspace/version-2
> dataDirSize=1409505467
> dataLogDir=/moot/persistentStore/zkWorkspace/version-2
> dataLogSize=1409505467
> tickTime=2000
> maxClientCnxns=60
> minSessionTimeout=4000
> maxSessionTimeout=40000
> serverId=3
> initLimit=100
> syncLimit=20
> electionAlg=3
> electionPort=1200
> quorumPort=1100
> peerType=0
> membership: 
> server.1=10.0.0.1:1100:1200:participant;10.0.0.1:1300;8e64c644-d0fa-414f-bab2-3c8c80364410
> server.2=10.0.0.2:1100:1200:participant;10.0.0.2:1300;38bf19b8-d4cb-4dac-b328-7bbf0ee1e2c4
> server.3=10.0.0.3:1100:1200:participant;10.0.0.3:1300;e1415d59-e857-43e6-ba9b-01daeb31a434
> 2. mntr:
> ZK 1:
> # echo mntr| nc 10.0.0.1 1300
> zk_version 3.5.1-alpha--1, built on 09/07/2016 00:34 GMT
> zk_avg_latency 0
> zk_max_latency 471
> zk_min_latency 0
> zk_packets_received 32556
> zk_packets_sent 32564
> zk_num_alive_connections 7
> zk_outstanding_requests 0
> zk_server_state leader
> zk_znode_count 58
> zk_watch_count 51
> zk_ephemerals_count 5
> zk_approximate_data_size 5251
> zk_open_file_descriptor_count 52
> zk_max_file_descriptor_count 1048576
> zk_followers 2
> zk_synced_followers 2
> zk_pending_syncs 0
> ZK 2:
> # echo mntr| nc 10.0.0.2 1300
> zk_version 3.5.1-alpha--1, built on 09/07/2016 00:34 GMT
> zk_avg_latency 1
> zk_max_latency 227
> zk_min_latency 0
> zk_packets_received 30905
> zk_packets_sent 30936
> zk_num_alive_connections 6
> zk_outstanding_requests 0
> zk_server_state follower
> zk_znode_count 58
> zk_watch_count 82
> zk_ephemerals_count 5
> zk_approximate_data_size 5251
> zk_open_file_descriptor_count 49
> zk_max_file_descriptor_count 1048576
> ZK 3:
> # echo mntr| nc 10.0.0.3 1300
> zk_version 3.5.1-alpha--1, built on 09/07/2016 00:34 GMT
> zk_avg_latency 4
> zk_max_latency 590
> zk_min_latency 0
> zk_packets_received 6192
> zk_packets_sent 6191
> zk_num_alive_connections 2
> zk_outstanding_requests 0
> zk_server_state follower
> zk_znode_count 64
> zk_watch_count 17
> zk_ephemerals_count 11
> zk_approximate_data_size 5806
> zk_open_file_descriptor_count 45
> zk_max_file_descriptor_count 1048576
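
In the mntr output above, ZK3 reports 6 more znodes and 6 more ephemerals 
than ZK1 and ZK2, which matches the six stale ephemeral nodes. A rough sketch 
of how such a comparison could be automated, assuming the raw mntr text from 
each server has already been captured (for example with "echo mntr | nc"). 
parse_mntr and differing_keys are hypothetical helper names, not ZooKeeper 
APIs.

```python
from typing import Dict, List, Optional


def parse_mntr(output: str) -> Dict[str, str]:
    """Parse `mntr` output: one key/value pair per line, split on the first whitespace run."""
    stats = {}
    for line in output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            stats[parts[0]] = parts[1]
    return stats


def differing_keys(
    per_server: Dict[str, Dict[str, str]], keys: List[str]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return {key: {server: value}} for the keys whose values differ across servers."""
    result = {}
    for key in keys:
        values = {server: stats.get(key) for server, stats in per_server.items()}
        if len(set(values.values())) > 1:
            result[key] = values
    return result


# Values copied from the mntr output in this report (other keys omitted).
per_server = {
    "ZK1": parse_mntr("zk_znode_count 58\nzk_ephemerals_count 5\nzk_approximate_data_size 5251"),
    "ZK2": parse_mntr("zk_znode_count 58\nzk_ephemerals_count 5\nzk_approximate_data_size 5251"),
    "ZK3": parse_mntr("zk_znode_count 64\nzk_ephemerals_count 11\nzk_approximate_data_size 5806"),
}
keys = ["zk_znode_count", "zk_ephemerals_count", "zk_approximate_data_size"]

for key, values in differing_keys(per_server, keys).items():
    print(key, values)
```

Only state-related keys are compared here; counters such as 
zk_packets_received are expected to differ between healthy servers.
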
> 3. dump showing the inconsistency:
> ZK 1:
> # echo dump| nc 10.0.0.1 1300
> SessionTracker dump:
> Session Sets (17)/(12):
> 0 expire at Tue Sep 20 18:22:35 UTC 2016:
> 0 expire at Tue Sep 20 18:22:37 UTC 2016:
> 0 expire at Tue Sep 20 18:22:39 UTC 2016:
> 0 expire at Tue Sep 20 18:22:41 UTC 2016:
> 0 expire at Tue Sep 20 18:22:43 UTC 2016:
> 0 expire at Tue Sep 20 18:22:45 UTC 2016:
> 0 expire at Tue Sep 20 18:22:49 UTC 2016:
> 0 expire at Tue Sep 20 18:22:51 UTC 2016:
> 0 expire at Tue Sep 20 18:22:53 UTC 2016:
> 0 expire at Tue Sep 20 18:22:55 UTC 2016:
> 0 expire at Tue Sep 20 18:22:57 UTC 2016:
> 4 expire at Tue Sep 20 18:22:59 UTC 2016:
>  0x100061435f7000d
>  0x10000d9e4460004
>  0x100061435f70002
>  0x10000d9e4460003
> 4 expire at Tue Sep 20 18:23:03 UTC 2016:
>  0x2000001141a0002
>  0x2000001141a0000
>  0x2000001141a0005
>  0x100061435f70010
> 1 expire at Tue Sep 20 18:23:07 UTC 2016:
>  0x2000001141a0001
> 1 expire at Tue Sep 20 18:23:09 UTC 2016:
>  0x100061435f70000
> 1 expire at Tue Sep 20 18:23:11 UTC 2016:
>  0x2000001141a000f
> 1 expire at Tue Sep 20 18:23:13 UTC 2016:
>  0x300000188c30001
> ephemeral nodes dump:
> Sessions with Ephemerals (5):
> 0x100061435f70000:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000064
> 0x2000001141a000f:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000066
> 0x2000001141a0001:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000065
> 0x2000001141a0000:
>  /moot/gmle/ActiveControllerCluster/member0000000065
> 0x2000001141a0005:
>  /moot/gmle/ActiveControllerCluster/member0000000066
> Connections dump:
> Connections Sets (5)/(10):
> 0 expire at Tue Sep 20 18:22:35 UTC 2016:
> 1 expire at Tue Sep 20 18:22:45 UTC 2016:
>  ip: /10.0.0.1:45591 sessionId: 0x0
> 0 expire at Tue Sep 20 18:22:55 UTC 2016:
> 5 expire at Tue Sep 20 18:23:05 UTC 2016:
>  ip: /10.0.0.3:34734 sessionId: 0x100061435f7000d
>  ip: /10.0.0.1:42963 sessionId: 0x10000d9e4460003
>  ip: /10.0.0.3:34739 sessionId: 0x100061435f70010
>  ip: /10.0.0.2:45750 sessionId: 0x100061435f70002
>  ip: /10.0.0.1:42961 sessionId: 0x10000d9e4460004
> 1 expire at Tue Sep 20 18:23:15 UTC 2016:
>  ip: /10.0.0.1:42964 sessionId: 0x100061435f70000
> ZK 2:
> # echo dump| nc 10.0.0.2 1300
> SessionTracker dump:
> Global Sessions(13):
> 0x10000d9e4460003 30000ms
> 0x10000d9e4460004 30000ms
> 0x100061435f70000 40000ms
> 0x100061435f70002 30000ms
> 0x100061435f7000d 30000ms
> 0x100061435f70010 30000ms
> 0x100061435f70584 4000ms
> 0x2000001141a0000 40000ms
> 0x2000001141a0001 40000ms
> 0x2000001141a0002 30000ms
> 0x2000001141a0005 40000ms
> 0x2000001141a000f 40000ms
> 0x300000188c30001 40000ms
> ephemeral nodes dump:
> Sessions with Ephemerals (5):
> 0x100061435f70000:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000064
> 0x2000001141a000f:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000066
> 0x2000001141a0001:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000065
> 0x2000001141a0000:
>  /moot/gmle/ActiveControllerCluster/member0000000065
> 0x2000001141a0005:
>  /moot/gmle/ActiveControllerCluster/member0000000066
> Connections dump:
> Connections Sets (4)/(6):
> 0 expire at Tue Sep 20 18:25:13 UTC 2016:
> 1 expire at Tue Sep 20 18:25:23 UTC 2016:
>  ip: /10.0.0.2:38021 sessionId: 0x0
> 1 expire at Tue Sep 20 18:25:33 UTC 2016:
>  ip: /10.0.0.2:35422 sessionId: 0x2000001141a0002
> 4 expire at Tue Sep 20 18:25:43 UTC 2016:
>  ip: /10.0.0.2:35419 sessionId: 0x2000001141a0001
>  ip: /10.0.0.1:59025 sessionId: 0x2000001141a0000
>  ip: /10.0.0.2:35427 sessionId: 0x2000001141a0005
>  ip: /10.0.0.3:56967 sessionId: 0x2000001141a000f
> ZK 3:
> # echo dump| nc 10.0.0.3 1300
> SessionTracker dump:
> Global Sessions(23):
> 0x10000d9e4460003 30000ms
> 0x10000d9e4460004 30000ms
> 0x100055a50b00001 30000ms
> 0x100055a50b00003 40000ms
> 0x100055a50b0000c 40000ms
> 0x100061435f70000 40000ms
> 0x100061435f70002 30000ms
> 0x100061435f7000d 30000ms
> 0x100061435f70010 30000ms
> 0x100061435f70585 4000ms
> 0x2000001141a0000 40000ms
> 0x2000001141a0001 40000ms
> 0x2000001141a0002 30000ms
> 0x2000001141a0005 40000ms
> 0x2000001141a000f 40000ms
> 0x200000130750000 40000ms
> 0x200000130750001 40000ms
> 0x200000130750002 30000ms
> 0x200000130750004 40000ms
> 0x20000013075000d 30000ms
> 0x3000000e4860000 30000ms
> 0x3000000e4860002 40000ms
> 0x300000188c30001 40000ms
> ephemeral nodes dump:
> Sessions with Ephemerals (11):
> 0x100061435f70000:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000064
> 0x3000000e4860002:
>  /moot/gmle/ActiveControllerCluster/member0000000027
> 0x100055a50b0000c:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000027
> 0x100055a50b00003:
>  /moot/gmle/ActiveControllerCluster/member0000000025
> 0x200000130750004:
>  /moot/gmle/ActiveControllerCluster/member0000000026
> 0x200000130750000:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000026
> 0x2000001141a000f:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000066
> 0x200000130750001:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000025
> 0x2000001141a0001:
>  /moot/gmle/ServiceDirectory/ActiveNodes/member0000000065
> 0x2000001141a0000:
>  /moot/gmle/ActiveControllerCluster/member0000000065
> 0x2000001141a0005:
>  /moot/gmle/ActiveControllerCluster/member0000000066
> Connections dump:
> Connections Sets (4)/(2):
> 0 expire at Tue Sep 20 18:25:40 UTC 2016:
> 1 expire at Tue Sep 20 18:25:50 UTC 2016:
>  ip: /10.0.0.3:52784 sessionId: 0x0
> 0 expire at Tue Sep 20 18:26:10 UTC 2016:
> 1 expire at Tue Sep 20 18:26:20 UTC 2016:
>  ip: /10.0.0.3:50222 sessionId: 0x300000188c30001
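
The stale entries can be picked out of the dump output by comparing the 
"Sessions with Ephemerals" section of a suspect server against the leader's. 
A minimal sketch, assuming the raw dump text is available without the "> " 
quote prefix. parse_ephemerals and extra_paths are hypothetical helper names, 
and the two inputs below are shortened excerpts of the dumps above.

```python
from typing import Dict, List


def parse_ephemerals(dump: str) -> Dict[str, List[str]]:
    """Map session id -> ephemeral paths from the ephemerals section of `dump` output."""
    owners: Dict[str, List[str]] = {}
    session = None
    in_section = False
    for raw in dump.splitlines():
        line = raw.strip()
        if line.startswith("Sessions with Ephemerals"):
            in_section = True
        elif line.startswith("Connections dump"):
            in_section = False
        elif in_section and line.startswith("0x") and line.endswith(":"):
            session = line[:-1]  # drop the trailing colon
            owners[session] = []
        elif in_section and line.startswith("/") and session is not None:
            owners[session].append(line)
    return owners


def extra_paths(reference: Dict[str, List[str]], suspect: Dict[str, List[str]]) -> List[str]:
    """Paths present on `suspect` but absent from `reference`, sorted."""
    ref_paths = {p for paths in reference.values() for p in paths}
    return sorted(p for paths in suspect.values() for p in paths if p not in ref_paths)


# Shortened excerpts of the ZK1 (leader) and ZK3 dumps from this report.
zk1_dump = """ephemeral nodes dump:
Sessions with Ephemerals (5):
0x2000001141a0000:
 /moot/gmle/ActiveControllerCluster/member0000000065
0x2000001141a0005:
 /moot/gmle/ActiveControllerCluster/member0000000066
Connections dump:
"""
zk3_dump = """ephemeral nodes dump:
Sessions with Ephemerals (11):
0x3000000e4860002:
 /moot/gmle/ActiveControllerCluster/member0000000027
0x100055a50b00003:
 /moot/gmle/ActiveControllerCluster/member0000000025
0x2000001141a0000:
 /moot/gmle/ActiveControllerCluster/member0000000065
0x2000001141a0005:
 /moot/gmle/ActiveControllerCluster/member0000000066
Connections dump:
"""

for path in extra_paths(parse_ephemerals(zk1_dump), parse_ephemerals(zk3_dump)):
    print(path)
```

The owning sessions of the extra paths (for example 0x100055a50b00003) appear 
in ZK3's Global Sessions list but not in ZK2's.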



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
