From my brief digging, my feeling is that the Java way of doing it is
better: StaticHostProvider is the only one that advances the pointer and
hands out addresses, and the caller doesn't do any of this. But this may
be too much of a change for the C client.
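
To make the idea concrete, here is a rough sketch of what "only the
provider moves the pointer" could look like in C. All of the names below
(host_provider_t, host_provider_next, next_with_double_advance) are made
up for illustration and are not the actual zookeeper.c API; the second
function is only a toy model of the double advance described in the
issue, not a copy of the real code path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the client's address list. */
typedef struct {
    const char **addrs;  /* server addresses, e.g. "host:port" strings */
    size_t count;        /* number of entries in addrs */
    size_t next;         /* index of the address to hand out next */
} host_provider_t;

/*
 * Single-owner design: this is the only function that moves the cursor.
 * Callers ask for an address and never touch hp->next themselves.
 */
static const char *host_provider_next(host_provider_t *hp)
{
    assert(hp != NULL && hp->count > 0);
    const char *addr = hp->addrs[hp->next];
    hp->next = (hp->next + 1) % hp->count;
    return addr;
}

/*
 * Toy model of the reported behaviour: the address is handed out (one
 * advance) and the error path then advances the cursor a second time.
 * With two servers, the two advances cancel out and the same server is
 * returned on every call.
 */
static const char *next_with_double_advance(host_provider_t *hp)
{
    const char *addr = host_provider_next(hp);
    hp->next = (hp->next + 1) % hp->count;  /* extra advance on failure */
    return addr;
}
```

With a two-entry list {S1, S2}, repeated calls to host_provider_next
alternate S1, S2, S1, and so on, while next_with_double_advance returns
S1 every time, which matches the "keeps repeating S1" pattern in the
report.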

On Jul 6, 2016 03:53, "Flavio Junqueira (JIRA)" <[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/ZOOKEEPER-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364131#comment-15364131
> ]
>
> Flavio Junqueira commented on ZOOKEEPER-2466:
> ---------------------------------------------
>
> [~shralex] Good catch, it is exactly the same problem. The description
> is about a list of two servers, but it is an issue in general that we skip
> one server of the list every time.
>
> [~hanm] The test case isn't related to reconfiguration, that's correct.
> However, zh->reconfig is set to 1 initially according to the logic we have
> implemented. That's what I observed while tracing the execution. The fact
> that it is set to 1 initially actually changes the lists we are getting the
> server addresses from (there are _old and _new lists in the handle).
>
> There isn't much in the output, but here is a sample:
>
> {noformat}
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1027: Client
> environment:zookeeper.version=zookeeper C client 3.5.2
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1031: Client environment:
> host.name=fpj-test-apache-01
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1038: Client environment:
> os.name=Linux
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1039: Client
> environment:os.arch=4.4.0-28-generic
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1040: Client
> environment:os.version=#47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1048: Client environment:
> user.name=fpj
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1056: Client
> environment:user.home=/root
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@log_env@1068: Client
> environment:user.dir=/home/fpj/code/zookeeper-3.5.2-alpha/src/c
> 2016-07-05 18:35:50,174:42240:ZOO_INFO@zookeeper_init_internal@1111:
> Initiating client connection, host=127.0.0.1:22182,127.0.0.1:22181
> sessionTimeout=10000 watcher=0x447050 sessionId=0 sessionPasswd=<null>
> context=0x7ffcc708fec0 flags=0
> 2016-07-05 18:35:51,174:42240:ZOO_WARN@get_next_server_in_reconfig@1256:
> [OLD] count=0 capacity=0 next=0 hasnext=0
> 2016-07-05 18:35:51,174:42240:ZOO_WARN@get_next_server_in_reconfig@1259:
> [NEW] count=2 capacity=16 next=0 hasnext=1
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1268:
> Using next from NEW=127.0.0.1:22182
> 2016-07-05 18:35:51,175:42240:ZOO_ERROR@handle_socket_error_msg@2353:
> Socket [127.0.0.1:22182] zk retcode=-4, errno=111(Connection refused):
> server refused to accept the client
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1256:
> [OLD] count=0 capacity=0 next=0 hasnext=0
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1259:
> [NEW] count=2 capacity=16 next=1 hasnext=1
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1268:
> Using next from NEW=127.0.0.1:22181
> 2016-07-05 18:35:51,175:42240:ZOO_ERROR@handle_socket_error_msg@2353:
> Socket [127.0.0.1:22181] zk retcode=-4, errno=111(Connection refused):
> server refused to accept the client
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1256:
> [OLD] count=0 capacity=0 next=0 hasnext=0
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1259:
> [NEW] count=2 capacity=16 next=2 hasnext=0
> 2016-07-05 18:35:51,175:42240:ZOO_WARN@get_next_server_in_reconfig@1279:
> Failed to find either new or old
> 2016-07-05 18:35:51,175:42240:ZOO_ERROR@handle_socket_error_msg@2353:
> Socket [127.0.0.1:22182] zk retcode=-4, errno=111(Connection refused):
> server refused to accept the client
> 2016-07-05 18:35:51,175:42240:ZOO_ERROR@handle_socket_error_msg@2353:
> Socket [127.0.0.1:22182] zk retcode=-4, errno=111(Connection refused):
> server refused to accept the client
> 2016-07-05 18:35:51,176:42240:ZOO_ERROR@handle_socket_error_msg@2353:
> Socket [127.0.0.1:22182] zk retcode=-4, errno=111(Connection refused):
> server refused to accept the client
> 2016-07-05 18:35:51,176:42240:ZOO_ERROR@handle_socket_error_msg@2353:
> Socket [127.0.0.1:22182] zk retcode=-4, errno=111(Connection refused):
> server refused to accept the client
> <This line keeps repeating>
> {noformat}
>
> No server seems to be up for the client to connect to, and I don't
> understand why, but I've focused mostly on why the address stays the
> same after some point rather than alternating between the two addresses.
>
> > Client skips servers when trying to connect
> > -------------------------------------------
> >
> >                 Key: ZOOKEEPER-2466
> >                 URL:
> https://issues.apache.org/jira/browse/ZOOKEEPER-2466
> >             Project: ZooKeeper
> >          Issue Type: Bug
> >          Components: c client
> >            Reporter: Flavio Junqueira
> >            Assignee: Flavio Junqueira
> >            Priority: Critical
> >             Fix For: 3.5.3, 3.6.0
> >
> >
> > I've been looking at {{Zookeeper_simpleSystem::testFirstServerDown}} and
> > I observed the following behavior. The list of servers to connect contains
> > two servers, let's call them S1 and S2. The client never connects, but the
> > odd bit is the sequence of servers that the client tries to connect to:
> > {noformat}
> > S1
> > S2
> > S1
> > S1
> > S1
> > <keeps repeating S1>
> > {noformat}
> > It intrigued me that S2 is only tried once and never again. Checking the
> > code, here is what happens. Initially, {{zh->reconfig}} is 1, so in
> > {{zoo_cycle_next_server}} we return an address from
> > {{get_next_server_in_reconfig}}, which is taken from {{zh->addrs_new}} in
> > this test case. The attempt to connect fails, and {{handle_error}} is
> > invoked in the error handling path. {{handle_error}} actually invokes
> > {{addrvec_next}} which changes the address pointer to the next server on
> > the list.
> > After two attempts, it decides that it has tried all servers in
> > {{zoo_cycle_next_server}} and sets {{zh->reconfig}} to zero. Once
> > {{zh->reconfig == 0}}, we have that each call to {{zoo_cycle_next_server}}
> > moves the address pointer to the next server in {{zh->addrs}}. But, given
> > that {{handle_error}} also moves the pointer to the next server, we end up
> > moving the pointer ahead twice upon every failed attempt to connect, which
> > is wrong.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
