[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2019-03-28 Thread Jordan Zimmerman (JIRA)


[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804358#comment-16804358
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

>  This ticket will not be solved?

 We'd need a PR with the solution. I can help you if you take it on.

> Curator client fails when connecting to read-only ensemble
> --
>
> Key: CURATOR-355
> URL: https://issues.apache.org/jira/browse/CURATOR-355
> Project: Apache Curator
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.11.0
> Reporter: Benjamin Jaton
> Priority: Critical
> Attachments: test2.log
>
>
> ZK is 3.5.1-alpha.
> I have a 3-node ZK cluster with read-only mode enabled.
> 2 nodes are down, so the remaining one (QA-E8WIN11) is in read-only mode
> (verified by using the ZK API manually). All the machines of the ensemble can
> be pinged from the client.
> I'm using this piece of code:
> {code}
> Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
>     .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181")
>     .sessionTimeoutMs(45000).connectionTimeoutMs(15000)
>     .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true);
> CuratorFramework client = curatorClientBuilder.build();
> client.start();
> client.getZookeeperClient().blockUntilConnectedOrTimedOut();
> System.out.println("Successfully established the connection with ZooKeeper");
>
> client.getData().forPath("/");
> System.out.println("Done.");
> {code}
> When Curator picks the host that is up first, it goes through very quickly.
> When it picks the host that is down first (QA-E8WIN12), it seems to be stuck
> at the getData() call for a very long time, and then eventually fails with a
> ConnectionLossException (see attached log).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2019-03-28 Thread Ken Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804314#comment-16804314
 ] 

Ken Liu commented on CURATOR-355:
-

This ticket will not be solved? I am facing this issue too: when one of the ZK
nodes is down, I receive a ConnectionLossException and the function does not
work normally.



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-11 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566676#comment-15566676
 ] 

Benjamin Jaton commented on CURATOR-355:


Note that the code you provided using the TestingCluster class cannot be used
to reproduce the behavior stated in the bug: a local connection will be
actively refused if the port is not open, whereas in the original example the
TCP connection will time out.
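
The difference between a refused connection and one that times out can be seen with plain sockets. The following is a minimal sketch, not taken from Curator or ZooKeeper: the class and method names are hypothetical, and it assumes that nothing is listening on a local port that was just released. On most systems the attempt against the closed local port should end quickly with a refusal rather than consuming the full timeout, which is the case the TestingCluster recreation exercises.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper for illustration only; not part of Curator or ZooKeeper.
class RefusedVsTimeout {
    // Binds an ephemeral port and releases it, so nothing should be listening there.
    static int closedLocalPort() {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Attempts a TCP connection and reports how the attempt ended.
    static String attempt(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return "connected";
        } catch (ConnectException e) {
            return "refused";      // the peer actively rejected the connection
        } catch (SocketTimeoutException e) {
            return "timed out";    // no answer arrived within timeoutMs
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String outcome = attempt("127.0.0.1", closedLocalPort(), 5000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(outcome + " after " + elapsedMs + " ms");
    }
}
```

Reproducing the timeout case needs a host that silently drops packets, which cannot be simulated reliably on localhost, so the sketch only demonstrates the refusal side.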



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-11 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15565929#comment-15565929
 ] 

Benjamin Jaton commented on CURATOR-355:


Is there documentation somewhere that explains what connectionTimeout means
for Curator?
I thought it was the timeout for the connection to a specific node.

Also, I don't think ZK fails after 2/3 of a session. From my tests it seems to
fail at (sessionTimeout / nbServersInConnectionString).
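
As a rough check, the two candidate formulas can be evaluated against the settings from the report (sessionTimeoutMs=45000 and two hosts in the connect string). This is only a sketch of the arithmetic: the class and method names are hypothetical, and it does not call or reproduce any ZooKeeper or Curator code.

```java
// Hypothetical helper for illustration only; not part of ZooKeeper or Curator.
class ConnectEstimate {
    // Candidate from the tests: session timeout divided by the number of hosts.
    static int perHostMs(int sessionTimeoutMs, int hostCount) {
        return sessionTimeoutMs / hostCount;
    }

    // Candidate from the earlier reply: two thirds of the session timeout.
    static int twoThirdsMs(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 45000; // from the reported builder settings
        int hostCount = 2;            // QA-E8WIN11 and QA-E8WIN12
        System.out.println(perHostMs(sessionTimeoutMs, hostCount)); // prints 22500
        System.out.println(twoThirdsMs(sessionTimeoutMs));          // prints 30000
    }
}
```

The first value, 22.5 seconds, is the one closer to the roughly 22 seconds observed with the plain ZK client elsewhere in this thread.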



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-11 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15565020#comment-15565020
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

ConnectionTimeout is a Curator concept. You should set it to whatever you need. 
ZooKeeper fails a heartbeat after 2/3 of a session as you have seen.



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563713#comment-15563713
 ] 

Benjamin Jaton commented on CURATOR-355:


So when I connect using the ZK API directly with sessionTimeout=45000, and it
picks the server that is NOT started first, it takes the ZK client API 22
seconds (45/2?) to try the second server, which then works and I get my
connection.

In contrast, Curator seems to wait only connectionTimeout=15000 in
blockUntilConnectedOrTimedOut(), so it seems to be failing because it stops
trying too early.
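
The timing argument can be sketched as follows. It assumes, based only on the observation above, that the client moves on to the next host after about 22 seconds; the class and method names are hypothetical and nothing here is Curator or ZooKeeper code.

```java
// Hypothetical helper for illustration only; not part of ZooKeeper or Curator.
class HostReach {
    // Host 0 is tried immediately; host i is assumed to be tried after i * perHostMs.
    // Returns true if that attempt starts before the caller stops waiting.
    static boolean reachedWithin(int hostIndex, int perHostMs, int waitMs) {
        return (long) hostIndex * perHostMs < waitMs;
    }

    public static void main(String[] args) {
        int perHostMs = 22000;           // switch-over time observed with the plain ZK client
        int connectionTimeoutMs = 15000; // from the reported builder settings
        // Is the second host (index 1) even attempted before the wait ends?
        System.out.println(reachedWithin(1, perHostMs, connectionTimeoutMs)); // prints false
    }
}
```

Under that assumption, a wait of 15 seconds ends before the second host is tried, whereas any wait longer than 22 seconds would include the attempt against it.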



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563539#comment-15563539
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

Yes - though it's ZooKeeper doing the actual connecting. 



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563532#comment-15563532
 ] 

Benjamin Jaton commented on CURATOR-355:


Just to clarify: in this case one of the ZK nodes is still running, so the
Curator client should successfully connect to it, and
blockUntilConnectedOrTimedOut() should return true.



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563510#comment-15563510
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

It's one of those things where I can't know whether people are depending on it.
Maybe we can add a System property to switch to the new behavior. That would be
OK.




[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563495#comment-15563495
 ] 

Benjamin Jaton commented on CURATOR-355:


Does it need a backport? A fix to the existing mechanism would be enough; there
is no need to change the whole thing if that can be avoided.



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563375#comment-15563375
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

It would be hard to back port to 2.x. 



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563014#comment-15563014
 ] 

Benjamin Jaton commented on CURATOR-355:


Sounds good, I will check the 3.x release. But I won't be able to use it for
existing deployments. Is there any chance of a fix for version 2.x?



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15562989#comment-15562989
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

Firstly, the call to 
`client.getZookeeperClient().blockUntilConnectedOrTimedOut();` is unnecessary 
as Curator does this internally. 

Curator 3.0 has better connection timeout behavior than Curator 2.0. In 2.0, 
the connection timeout is applied for each iteration of the Retry Policy. So, 
in this case, you'd expect `getData()` to wait 15 seconds * 3, plus 5 seconds * 
3 for a total of one minute. In my recreation of your test that's exactly what 
I see:

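```
// Imports assume the curator-framework, curator-client and curator-test artifacts.
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryNTimes;
import org.apache.curator.test.TestingCluster;

// Enable read-only mode, then stop two of the three test servers.
System.setProperty("readonlymode.enabled", "true");
TestingCluster cluster = new TestingCluster(3);
cluster.getServers().get(0).stop();
cluster.getServers().get(1).stop();

// Same client settings as in the original report.
CuratorFrameworkFactory.Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
    .connectString(cluster.getConnectString())
    .sessionTimeoutMs(45000).connectionTimeoutMs(15000)
    .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true);

CuratorFramework client = curatorClientBuilder.build();
client.start();
client.getZookeeperClient().blockUntilConnectedOrTimedOut();
System.out.println("Successfully established the connection with ZooKeeper");

client.getData().forPath("/");
System.out.println("Done.");
```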

With Curator 3.0, the time improves to just 15 seconds * 2 - the connection 
timeout number twice. Once for the `blockUntilConnectedOrTimedOut()` and once 
for the `getData()`. Note: `blockUntilConnectedOrTimedOut()` in all cases 
would've returned `false` implying you should not continue.
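
The wait times quoted above can be reproduced with a few lines of arithmetic. This is a sketch of the calculation as described in this comment, not of Curator's internals; the class and method names are hypothetical.

```java
// Hypothetical helper for illustration only; not part of Curator.
class WaitEstimate {
    // Curator 2.x as described above: every retry waits the full connection
    // timeout and then sleeps for the retry policy's interval.
    static int curator2xMs(int connectionTimeoutMs, int retries, int sleepBetweenRetriesMs) {
        return retries * connectionTimeoutMs + retries * sleepBetweenRetriesMs;
    }

    // Curator 3.0 as described above: the connection timeout elapses once per
    // blocking call (blockUntilConnectedOrTimedOut() and getData()).
    static int curator3xMs(int connectionTimeoutMs, int blockingCalls) {
        return connectionTimeoutMs * blockingCalls;
    }

    public static void main(String[] args) {
        // connectionTimeoutMs=15000 and RetryNTimes(3, 5000) from the reported settings
        System.out.println(curator2xMs(15000, 3, 5000)); // prints 60000
        System.out.println(curator3xMs(15000, 2));       // prints 30000
    }
}
```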



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Benjamin Jaton (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15562967#comment-15562967
 ] 

Benjamin Jaton commented on CURATOR-355:


I used Curator 2.11.0.



[jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble

2016-10-10 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15562950#comment-15562950
 ] 

Jordan Zimmerman commented on CURATOR-355:
--

Is this Curator 3.x? or 2.x?
