Hi All,

We have a 3-server ZooKeeper cluster running version 3.4.6. During testing we noticed that our cluster sometimes gets into a state where some data items are available only on the leader and not on any of the followers.
Once our cluster hit this state, we tried explicitly syncing the missing items on the followers (with the sync "/" and sync "/missingitem" commands in zkCli), but this did not retrieve the missing data items. However, when we try to create a missing data item explicitly, we get a NodeExists exception, while a "get" on the same path produces a NoNode exception! We even let the cluster run for tens of minutes (well past our syncLimit setting) and the missing items were still not synchronized. Also, if we log in manually to the leader or any follower and create new entries while the cluster is in this state, they show up immediately on all servers.

Has anyone observed this bizarre behavior before?

Our ZooKeeper calls in the code are wrapped with the Netflix Curator library. Our zoo.cfg looks like this:

---
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/zookeeper/data
clientPort=2181
autopurge.snapRetainCount=5
autopurge.purgeInterval=6
server.1=192.168.100.43:2888:3888
server.2=192.168.100.44:2888:3888
server.3=192.168.100.45:2888:3888
---

I'm trying to collect debug-level logs from the setup and will update this thread once I have them. Any help or input will be much appreciated.

Thanks!
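
P.S. For anyone who wants to compare the three servers side by side, here is a minimal sketch (not something we have run against the broken cluster) that sends the "srvr" four-letter command to each server's client port and prints its Mode, Zxid and Node count. The host list and port are taken from the zoo.cfg above; the function names four_letter_word, parse_srvr and report are made up for illustration, and the sketch assumes the client port is reachable from wherever it runs.

```python
import socket

# Addresses and client port taken from the zoo.cfg in this post.
SERVERS = ["192.168.100.43", "192.168.100.44", "192.168.100.45"]
CLIENT_PORT = 2181


def four_letter_word(host, port, cmd="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        # The server closes the connection once it has sent its reply.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr(text):
    """Turn the 'Key: value' lines of a srvr reply into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def report(servers=SERVERS, port=CLIENT_PORT):
    """Print one summary line per server (hypothetical helper)."""
    for host in servers:
        try:
            info = parse_srvr(four_letter_word(host, port))
        except OSError as exc:
            print(f"{host}: unreachable ({exc})")
            continue
        print(host, info.get("Mode"), info.get("Zxid"), info.get("Node count"))
```

Calling report() prints one line per server. If a follower reports a lower Node count than the leader while their Zxid values match, that would be consistent with the symptom described above.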
