[ 
https://issues.apache.org/jira/browse/CURATOR-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218259#comment-16218259
 ] 

Alex Rankin commented on CURATOR-439:
-------------------------------------

From analysing the log files, it looks like the ConnectionState fluctuated 
between SUSPENDED and RECONNECTED a few times, and was LOST twice. The first 
time the connection was LOST, it RECONNECTED again afterwards. After the 
second time, there were no more ConnectionState changes.

It isn't clear from the documentation, but are we expected to close and restart 
the Curator instance if the ConnectionState is LOST? After looking through some 
other public codebases, it seems that this is the approach that others take.
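
To make the question concrete, here is a minimal sketch of the 
restart-on-LOST approach as I understand it. Everything in it is an 
assumption for illustration: ConnectionTracker, shouldRestartClient() and the 
local State enum are hypothetical names, and the enum is only a stand-in for 
the ConnectionState values seen in the logs so that the sketch runs without 
Curator on the classpath. In a real service, stateChanged() would presumably 
be driven by a connection state listener registered on the CuratorFramework 
instance.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch: remembers the connection state events reported so far.
final class ConnectionTracker {
    // Local stand-in for the ConnectionState values seen in the logs
    // (an assumption, so the sketch compiles without Curator).
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final List<State> history = new ArrayList<>();

    // Would be called from a connection state listener callback.
    synchronized void stateChanged(State newState) {
        history.add(newState);
    }

    // True only if the most recent event says the connection is up.
    synchronized boolean isConnected() {
        if (history.isEmpty()) {
            return false;
        }
        State last = history.get(history.size() - 1);
        return last == State.CONNECTED || last == State.RECONNECTED;
    }

    // Hypothetical policy: a trailing LOST is the cue to close and recreate the client.
    synchronized boolean shouldRestartClient() {
        return !history.isEmpty() && history.get(history.size() - 1) == State.LOST;
    }

    // Number of LOST events recorded so far.
    synchronized long lostCount() {
        return history.stream().filter(s -> s == State.LOST).count();
    }

    public static void main(String[] args) {
        ConnectionTracker tracker = new ConnectionTracker();
        // Illustrative sequence with the same shape as the one described above,
        // not a copy of the actual log entries.
        State[] illustrative = {
            State.CONNECTED, State.SUSPENDED, State.RECONNECTED,
            State.SUSPENDED, State.LOST, State.RECONNECTED,
            State.SUSPENDED, State.LOST
        };
        for (State s : illustrative) {
            tracker.stateChanged(s);
        }
        System.out.println("connected=" + tracker.isConnected()
            + " restart=" + tracker.shouldRestartClient()
            + " lost=" + tracker.lostCount());
    }
}
```

With a policy like this, the second LOST would trigger a restart instead of 
leaving the service waiting for a RECONNECTED that never arrives. Whether 
that is the intended usage is exactly what I am asking.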

> CuratorFrameworkState STARTED, but ZookeeperClient not connected
> ----------------------------------------------------------------
>
>                 Key: CURATOR-439
>                 URL: https://issues.apache.org/jira/browse/CURATOR-439
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 3.2.1
>            Reporter: Alex Rankin
>            Priority: Minor
>
> I recently ran into an issue on some of our nodes caused by network problems 
> between a service and Zookeeper. I have not been able to reproduce it yet, 
> but I'm still trying.
> *+Setup+*
> Five services using Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also 
> five nodes).
> Network issues caused the services to disconnect from Zookeeper. 
> There's a check in our code to see if the Zookeeper connection is available 
> before sending a request:
> {quote}
> public boolean isConnected() {
>     return curatorFramework.getZookeeperClient().isConnected();
> }
> {quote}
> After the network issues resolved, we noticed that all calls to Zookeeper 
> from 4 of the services were still failing (the fifth was fine). Checking the 
> logs, we saw that {{CuratorFramework.getState()}} was reporting the state as 
> STARTED, but {{curatorFramework.getZookeeperClient().isConnected();}} was 
> returning false. Restarting the service fixed everything, but obviously I 
> want to avoid this issue in future.
> *+Problem+*
> I couldn't find any documentation stating whether 
> {{CuratorZookeeperClient.isConnected()}} should be used, whether 
> {{CuratorFramework.getState() == CuratorFrameworkState.STARTED}} (the 
> functionality of the deprecated {{CuratorFramework.isConnected()}}) would be 
> the better check, or whether the two should be equivalent and there's a bug 
> that let one be true while the other was false.
> If my own check is wrong, and I shouldn't be using 
> {{CuratorZookeeperClient.isConnected()}}, then I can easily fix that. I 
> wanted to check the expected behaviour before diving too deep into this, in 
> case this is normal and I am just using Curator incorrectly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
