[
https://issues.apache.org/jira/browse/CURATOR-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Rankin updated CURATOR-439:
--------------------------------
Priority: Minor (was: Major)
Description:
I recently ran into a problem on some of our nodes, caused by network issues
between a service and Zookeeper. I have not yet been able to reproduce it,
but I'm still trying.
*+Setup+*
Five services using Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also
five nodes).
Network issues caused the services to disconnect from Zookeeper.
There's a check in our code to see if the Zookeeper connection is available
before sending a request:
{quote}public boolean isConnected() {
return curatorFramework.getZookeeperClient().isConnected();
}
{quote}
After the network issues resolved, we noticed that all calls to Zookeeper from
4 of the services were still failing (the fifth was fine). Checking the logs,
we saw that {{CuratorFramework.getState()}} was reporting the state as STARTED,
but {{curatorFramework.getZookeeperClient().isConnected();}} was returning
false. Restarting the service fixed everything, but I obviously want to avoid
this issue in future.
*+Problem+*
I couldn't find any documentation stating whether
{{CuratorZookeeperClient.isConnected()}} should be used, whether
{{CuratorFramework.getState() == CuratorFrameworkState.STARTED}} (the
functionality of the deprecated {{CuratorFramework.isConnected()}}) would be
the better check, or whether the two should be equivalent and a bug let one be
true while the other was false.
If my own check is wrong, and I shouldn't be using
{{CuratorZookeeperClient.isConnected()}}, then I can easily fix that. I wanted
to check the expected behaviour before diving too deep into this, in case this
is normal and I am just using Curator incorrectly.
+*Edit*+
This was a misunderstanding on my part. I'm leaving it open so that I can
submit a documentation/example update shortly to hopefully clarify things a bit
better for others.
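For anyone who lands here with the same confusion, here is a minimal sketch of
the distinction as I now understand it. {{CuratorFramework.getState()}} reports
the lifecycle of the client (LATENT, STARTED or STOPPED), not connectivity, so
STARTED alongside a disconnected client is expected during an outage.
Connectivity changes are delivered to a {{ConnectionStateListener}} registered
via {{getConnectionStateListenable().addListener(...)}}. To keep the example
runnable without the Curator jar, the {{State}} enum below is a hypothetical
stand-in for Curator's {{ConnectionState}}, and {{ConnectionTracker}} is an
illustrative name, not part of Curator.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative helper (not part of Curator): remembers the last connection
// state reported by a listener so callers can check it cheaply.
final class ConnectionTracker {

    // Hypothetical stand-in for org.apache.curator.framework.state.ConnectionState,
    // included only so this sketch compiles without the Curator dependency.
    enum State {
        CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY;

        // Mirrors ConnectionState.isConnected(): true for the states in which
        // a connection to Zookeeper exists.
        boolean isConnected() {
            return this == CONNECTED || this == RECONNECTED || this == READ_ONLY;
        }
    }

    // Starts as false: nothing is known until the first state change arrives.
    private final AtomicBoolean connected = new AtomicBoolean(false);

    // With real Curator this body would sit inside
    // ConnectionStateListener.stateChanged(CuratorFramework client, ConnectionState newState),
    // registered through client.getConnectionStateListenable().addListener(listener).
    void stateChanged(State newState) {
        connected.set(newState.isConnected());
    }

    // Connectivity check; deliberately independent of the lifecycle state
    // (LATENT/STARTED/STOPPED) returned by CuratorFramework.getState().
    boolean isConnected() {
        return connected.get();
    }

    public static void main(String[] args) {
        ConnectionTracker tracker = new ConnectionTracker();
        State[] events = {State.CONNECTED, State.SUSPENDED, State.RECONNECTED, State.LOST};
        for (State event : events) {
            tracker.stateChanged(event);
            System.out.println(event + " -> connected=" + tracker.isConnected());
        }
    }
}
```

The tracker reports connectivity only. Whether a request should be attempted
while the state is SUSPENDED is a separate decision that depends on the retry
policy in use.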
Component/s: Documentation (was: Framework)
Issue Type: Improvement (was: Bug)
> CuratorFrameworkState STARTED, but ZookeeperClient not connected
> ----------------------------------------------------------------
>
> Key: CURATOR-439
> URL: https://issues.apache.org/jira/browse/CURATOR-439
> Project: Apache Curator
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.2.1
> Reporter: Alex Rankin
> Priority: Minor