[
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15309252#comment-15309252
]
Michael Han commented on ZOOKEEPER-2152:
----------------------------------------
The root cause of the failures in testMigrateOrNot() has been identified as
follows.
I think our reconfiguration tests assume an invariant that does not always
hold: that the server the client is currently connected to is uniquely
determined by the calls to cycleNextServer implemented in our tests (which
calls zoo_cycle_next_server). This assumption is not true, because
cycleNextServer is not the only place where zoo_cycle_next_server gets called:
zookeeper_interest in the client IO thread calls it as well. Our
reconfiguration client tests do not have a real server set up, so the client
ends up cycling to the next server on each reconnect attempt:
{code}
// No need to delay -- grab the next server and attempt connection
zoo_cycle_next_server(zh);
{code}
The end result is that calls to zoo_cycle_next_server from both our tests and
the ZK IO thread randomize the client's notion of its currently connected
server. Since most of our tests rely on this state, they fail or pass at
random, depending on timing. This also explains why the MT tests failed more
often than the ST tests.
I'll prepare a patch. My current idea is to set zh->delay in our tests, which
effectively disables the zoo_cycle_next_server call in the ZK IO thread.
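To illustrate the interference described above, here is a minimal,
single-threaded sketch of the problem. It does not use the real client code:
fake_handle, fake_cycle_next_server, fake_io_reconnect and
test_cycle_with_io_interference are hypothetical names made up for this
example, and the delay field only models the idea of suppressing the cycle in
the reconnect path; the actual zh->delay handling may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the client handle; NOT the real zhandle_t. */
typedef struct {
    size_t next;   /* index of the server that will be picked next */
    size_t count;  /* number of servers in the list */
    int    delay;  /* non-zero: reconnect path waits instead of cycling */
} fake_handle;

/* Models a round-robin cycle: return the current pick, advance the cursor. */
static size_t fake_cycle_next_server(fake_handle *h) {
    assert(h->count > 0);
    size_t chosen = h->next;
    h->next = (h->next + 1) % h->count;
    return chosen;
}

/* Models one reconnect attempt in the IO thread when no server is reachable.
 * Assumption: with delay set, the attempt does not touch the cursor. */
static void fake_io_reconnect(fake_handle *h) {
    if (h->delay) {
        return;
    }
    fake_cycle_next_server(h);
}

/* Test-side cycle, preceded by io_attempts reconnect attempts that happened
 * to run in between. In the real MT tests this count depends on timing. */
static size_t test_cycle_with_io_interference(fake_handle *h, int io_attempts) {
    for (int i = 0; i < io_attempts; i++) {
        fake_io_reconnect(h);
    }
    return fake_cycle_next_server(h);
}
```

With three servers and delay left at 0, two test-side cycles would be expected
to pick indices 0 and then 1, but two interleaved reconnect attempts make the
second pick wrap around to 0 again. With delay set to non-zero, the picks
depend only on the test's own calls.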
> Intermittent failure in TestReconfig.cc
> ---------------------------------------
>
> Key: ZOOKEEPER-2152
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2152
> Project: ZooKeeper
> Issue Type: Sub-task
> Components: c client
> Reporter: Michi Mutsuzaki
> Assignee: Michael Han
> Labels: reconfiguration
> Fix For: 3.6.0
>
>
> I'm seeing this failure in the c client test once in a while:
> {noformat}
> [exec]
> /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/c/tests/TestReconfig.cc:474:
> Assertion: assertion failed [Expression: found != string::npos,
> 10.10.10.4:2004 not in newComing list]
> {noformat}
> https://builds.apache.org/job/ZooKeeper-trunk/2640/console