[ 
https://issues.apache.org/jira/browse/CURATOR-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447840#comment-16447840
 ] 

Alex Rankin commented on CURATOR-439:
-------------------------------------

Thanks [~randgalt] - I think the confusion just comes from the lack of good 
examples or explanation of the behaviour of Curator in different scenarios. We 
did have a {{ConnectionStateListener}}, but the following line in the 
documentation made us think there was more we should be doing:
{quote}Clients can monitor these changes and take appropriate action.
{quote}
Looking at other libraries ([like 
this|https://github.com/mitdbg/amoeba/blob/master/src/main/java/core/utils/CuratorUtils.java#L66]), 
people seemed to be checking that the ZK client was connected, so we thought 
that was good practice.

If I understand correctly, the following should be true:
 # {{ConnectionStateListener}} does not need to do anything - it can be used 
purely to log changes in the state of Curator, but no further action is needed. 
{{LOST}} or {{SUSPENDED}} connections should automatically become 
{{RECONNECTED}} when the network is back up.
 # I should not check {{getZookeeperClient().isConnected()}} before any action 
- just perform the action, and if the client isn't connected, it will connect 
(if possible).
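
To make the two points concrete, here is a minimal sketch in plain Java of the pattern as I understand it. It is not Curator code: the {{State}} enum, {{stateChanged}} and {{callWithRetry}} are hypothetical stand-ins, and the retry loop only imitates what I assume Curator's configured {{RetryPolicy}} does internally. The listener only records state changes, and the operation is attempted without any {{isConnected()}} gate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class NoPreCheckSketch {
    // Hypothetical stand-in for the states a ConnectionStateListener is told about.
    public enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    // Point 1: the listener only records (logs) state changes and takes no other action.
    public static final List<String> log = new ArrayList<>();

    public static void stateChanged(State newState) {
        log.add("connection state: " + newState);
    }

    // Point 2: no isConnected() gate. Attempt the call and, if it fails, retry
    // after a growing pause, giving up after maxRetries further attempts.
    public static <T> T callWithRetry(Callable<T> op, int maxRetries, long baseSleepMs)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted: surface the failure to the caller
                }
                Thread.sleep(baseSleepMs << attempt); // simple exponential backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        stateChanged(State.SUSPENDED);
        int[] calls = {0};
        // Simulated operation: fails twice (network down), then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("not connected");
            }
            return "ok";
        }, 5, 1);
        stateChanged(State.RECONNECTED);
        System.out.println(result + " after " + calls[0] + " attempts, log=" + log);
    }
}
```

The point of the sketch is that the caller never asks "am I connected?" first. A failure only reaches the caller once the retries are exhausted, which is where our own error handling would go.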

If I've got this right, then I'll make sure to close this ticket as "Not an 
Issue".

> CuratorFrameworkState STARTED, but ZookeeperClient not connected
> ----------------------------------------------------------------
>
>                 Key: CURATOR-439
>                 URL: https://issues.apache.org/jira/browse/CURATOR-439
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 3.2.1
>            Reporter: Alex Rankin
>            Priority: Major
>
> I recently ran into an issue on some of our nodes caused by network issues 
> between a service and Zookeeper. I have been unable to recreate them as of 
> yet, but I'm still trying.
> *+Setup+*
> Five services use Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also 
> five nodes).
> Network issues caused the services to disconnect from Zookeeper. 
> There's a check in our code to see if the Zookeeper connection is available 
> before sending a request:
> {code:java}
> public boolean isConnected() {
>     return curatorFramework.getZookeeperClient().isConnected();
> }
> {code}
> After the network issues resolved, we noticed that all calls to Zookeeper 
> from 4 of the services were still failing (the fifth was fine). Checking the 
> logs, we saw that {{CuratorFramework.getState()}} was reporting the state as 
> STARTED, but {{curatorFramework.getZookeeperClient().isConnected()}} was 
> returning false. Restarting the service fixed everything, but I obviously 
> want to avoid this issue in future.
> *+Problem+*
> I couldn't find any documentation stating whether the 
> {{CuratorZookeeperClient.isConnected()}} should be used, or if 
> {{CuratorFramework.getState() == CuratorFrameworkState.STARTED}} (the 
> functionality of the deprecated {{CuratorFramework.isStarted()}}) would be 
> the better check, or if these should both be equivalent, and there's a bug 
> that let one be true while the other was false.
> If my own check is wrong, and I shouldn't be using 
> {{CuratorZookeeperClient.isConnected()}}, then I can easily fix that. I 
> wanted to check the expected behaviour before diving too deep into this, in 
> case this is normal and I am just using Curator incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
