[ 
https://issues.apache.org/jira/browse/CURATOR-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rankin updated CURATOR-439:
--------------------------------
    Description: 
I recently ran into a problem on some of our nodes, caused by network issues 
between a service and Zookeeper. I have not been able to reproduce it yet, but 
I'm still trying.

*+Setup+*
5 services using Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also 5 
nodes).

Network issues caused the services to disconnect from Zookeeper. 

There's a check in our code to see if the Zookeeper connection is available 
before sending a request:

{quote}public boolean isConnected() \{
    return curatorFramework.getZookeeperClient().isConnected();
\}
{quote}

After the network issues were resolved, we noticed that all calls to Zookeeper 
were still failing. Checking the logs, we saw that 
{{CuratorFramework.getState()}} was reporting the state as STARTED, but 
{{curatorFramework.getZookeeperClient().isConnected()}} was returning false. 
Restarting the service fixed everything, but I obviously want to avoid this 
issue in future.
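
To make the discrepancy concrete, here is a minimal, self-contained sketch of how the two checks could legitimately diverge. It assumes that {{getState()}} reflects only the client lifecycle (LATENT, STARTED, STOPPED) while the connection flag is tracked separately. {{ClientModel}}, {{setConnected}} and everything else in it are hypothetical names for illustration; this is not Curator code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model, NOT Curator code: lifecycle state and connection
// state are tracked independently, so one can report STARTED while the
// other reports "not connected".
class ClientModel {
    // Same constant names as Curator's CuratorFrameworkState enum,
    // redefined here so the sketch compiles without any dependency.
    enum LifecycleState { LATENT, STARTED, STOPPED }

    private volatile LifecycleState state = LifecycleState.LATENT;
    private final AtomicBoolean connected = new AtomicBoolean(false);

    void start() { state = LifecycleState.STARTED; }

    void close() {
        state = LifecycleState.STOPPED;
        connected.set(false);
    }

    // Assumed hook: called by the transport layer when the connection
    // to the server is established or dropped.
    void setConnected(boolean value) { connected.set(value); }

    LifecycleState getState() { return state; }

    boolean isConnected() { return connected.get(); }

    public static void main(String[] args) {
        ClientModel client = new ClientModel();
        client.start();
        client.setConnected(true);
        client.setConnected(false); // simulated network drop
        // The lifecycle is still STARTED although the connection is gone.
        System.out.println(client.getState() + " connected=" + client.isConnected());
    }
}
```

In this model nothing ever moves the lifecycle back from STARTED when the connection drops, which is the combination we saw in the logs. Whether Curator is expected to behave the same way is exactly what I am asking.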

*+Problem+*
I couldn't find any documentation stating which check should be used: 
{{CuratorZookeeperClient.isConnected()}}, or 
{{CuratorFramework.getState() == CuratorFrameworkState.STARTED}} (the 
functionality of the deprecated {{CuratorFramework.isStarted()}}). It is also 
unclear whether the two are meant to be equivalent, in which case there is a 
bug that let one be true while the other was false.

If my own check is wrong, and I shouldn't be using 
{{CuratorZookeeperClient.isConnected()}}, then I can easily fix that. I wanted 
to check the expected behaviour before diving too deep into this, in case this 
is normal and I am just using Curator incorrectly.
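
One alternative I am considering is to track connection state changes instead of polling. The sketch below is only an assumption about how that could look: {{ConnectionTracker}} and its methods are hypothetical, and the enum merely reuses the constant names of Curator's {{ConnectionState}} so that the example compiles on its own. In a real service, {{stateChanged}} would presumably be driven by a {{ConnectionStateListener}} registered through {{getConnectionStateListenable()}}.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper, NOT part of Curator: remembers the most recent
// connection state event and answers "may I send a request right now?".
class ConnectionTracker {
    // Same constant names as Curator's ConnectionState enum, redefined
    // locally so the sketch has no external dependency.
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

    // null means no event has been received yet.
    private final AtomicReference<State> last = new AtomicReference<>(null);

    // Assumed to be called from a connection state listener callback.
    void stateChanged(State newState) { last.set(newState); }

    // Design choice for this sketch: only CONNECTED and RECONNECTED count
    // as usable; READ_ONLY is rejected because writes would fail.
    boolean isConnected() {
        State s = last.get();
        return s == State.CONNECTED || s == State.RECONNECTED;
    }

    public static void main(String[] args) {
        ConnectionTracker tracker = new ConnectionTracker();
        tracker.stateChanged(State.CONNECTED);
        tracker.stateChanged(State.SUSPENDED);   // network issue begins
        System.out.println(tracker.isConnected());
        tracker.stateChanged(State.RECONNECTED); // network issue resolved
        System.out.println(tracker.isConnected());
    }
}
```

This only helps if a RECONNECTED event is actually delivered once the network recovers, which I have not been able to verify for the failure described above.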


> CuratorFrameworkState STARTED, but ZookeeperClient not connected
> ----------------------------------------------------------------
>
>                 Key: CURATOR-439
>                 URL: https://issues.apache.org/jira/browse/CURATOR-439
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 3.2.1
>            Reporter: Alex Rankin
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
