Alex Rankin created CURATOR-439:
-----------------------------------
Summary: CuratorFrameworkState STARTED, but ZookeeperClient not
connected
Key: CURATOR-439
URL: https://issues.apache.org/jira/browse/CURATOR-439
Project: Apache Curator
Issue Type: Bug
Components: Framework
Affects Versions: 3.2.1
Reporter: Alex Rankin
Priority: Minor
I recently ran into a problem on some of our nodes, caused by network issues
between a service and Zookeeper. I have not yet been able to reproduce it,
but I'm still trying.
*+Setup+*
Five services using Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also
five nodes).
Network issues caused the services to disconnect from Zookeeper.
There's a check in our code to see if the Zookeeper connection is available
before sending a request:
{quote}
public boolean isConnected() {
    return curatorFramework.getZookeeperClient().isConnected();
}
{quote}
After the network issues were resolved, we noticed that all calls to Zookeeper
were still failing. Checking the logs, we saw that
{{CuratorFramework.getState()}} was reporting the state as STARTED, but
{{curatorFramework.getZookeeperClient().isConnected()}} was returning false.
Restarting the service fixed everything, but I obviously want to avoid this
issue in the future.
*+Problem+*
I couldn't find any documentation stating whether
{{CuratorZookeeperClient.isConnected()}} should be used, or whether
{{CuratorFramework.getState() == CuratorFrameworkState.STARTED}} (the
functionality of the deprecated {{CuratorFramework.isConnected()}}) would be
the better check. Alternatively, the two may be meant to be equivalent, in
which case there is a bug that let one be true while the other was false.
If my own check is wrong, and I shouldn't be using
{{CuratorZookeeperClient.isConnected()}}, then I can easily fix that. I wanted
to check the expected behaviour before diving too deep into this, in case this
is normal and I am just using Curator incorrectly.
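To make the question concrete, here is a minimal sketch of the distinction as I understand it: {{CuratorFrameworkState}} (LATENT, STARTED, STOPPED) describes the lifecycle of the client object, while {{isConnected()}} describes the current connection. The classes below ({{FakeClient}}, {{LifecycleState}}, {{canSendRequest}}) are hypothetical stand-ins written in plain Java, not Curator's own types, and the sketch does not attempt to explain why the connection failed to recover after the outage.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration only: stand-in types, NOT Curator's classes.
public class ConnectionCheckSketch {

    // Mirrors the three values of Curator's CuratorFrameworkState enum.
    enum LifecycleState { LATENT, STARTED, STOPPED }

    // Hypothetical client that tracks lifecycle and connectivity separately.
    static class FakeClient {
        private volatile LifecycleState state = LifecycleState.LATENT;
        private final AtomicBoolean connected = new AtomicBoolean(false);

        void start() {
            state = LifecycleState.STARTED;
            connected.set(true);
        }

        void close() {
            state = LifecycleState.STOPPED;
            connected.set(false);
        }

        // Simulates an outage or recovery; the lifecycle state is untouched.
        void setNetworkUp(boolean up) {
            if (state == LifecycleState.STARTED) {
                connected.set(up);
            }
        }

        LifecycleState getState() {
            return state;
        }

        boolean isConnected() {
            return connected.get();
        }
    }

    // Gate requests on both conditions: the client is started AND connected.
    static boolean canSendRequest(FakeClient client) {
        return client.getState() == LifecycleState.STARTED && client.isConnected();
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        client.start();
        System.out.println("after start:   canSend=" + canSendRequest(client));

        client.setNetworkUp(false);
        System.out.println("during outage: state=" + client.getState()
                + " connected=" + client.isConnected()
                + " canSend=" + canSendRequest(client));

        client.setNetworkUp(true);
        System.out.println("after recover: canSend=" + canSendRequest(client));
    }
}
```

Under this model, STARTED together with a false {{isConnected()}} is expected during an outage, so the two checks would not be equivalent. What the sketch cannot show is the behaviour we actually saw, where {{isConnected()}} stayed false after the network came back.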