[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282248#comment-15282248
 ] 

Michael Han commented on ZOOKEEPER-2152:
----------------------------------------

There is a set of failures in TestReconfig.cc that can be reproduced locally 
(Ubuntu 14.04) against trunk. I ran TestReconfig 100 times in test_mt mode and 
100 times in test_st mode. test_st is more stable (2 failures out of 100), while 
test_mt fails more often (17 failures out of 100). The failures are:

{code:title=TestReconfig.cc:154: Assertion: assertion failed [Expression: 
false] (15 failures out of 100 in test_mt) |borderStyle=solid}
   // Else we've looped around!
   else if (first == next)
   {
     CPPUNIT_ASSERT(false);
   }
{code}

{code:title=TestReconfig.cc:474: Assertion: assertion failed [Expression: found 
!= string::npos, 10.10.10.4:2004 not in newComing list]
 (1 failure out of 100 in test_mt) |borderStyle=solid}
    // Assert next server is in the 'new' list
    size_t found = newComing.find(next);
    CPPUNIT_ASSERT_MESSAGE(next + " not in newComing list", found != string::npos);
{code}

{code:title=TestReconfig.cc:183: Assertion: equality assertion failed 
[Expected: 1, Actual  : 0]
 (1 failure out of 100 in test_mt) |borderStyle=solid}
    void setServersAndVerifyReconfig(const string new_hosts, bool is_reconfig)
    {
        setServers(new_hosts);
        CPPUNIT_ASSERT_EQUAL(is_reconfig, isReconfig());
    }
{code}

{code:title=TestReconfig.cc:381: Assertion: assertion failed [Expression: 
numClientsPerHost.at(index) >= lowerboundClientsPerServer(numClients, 
numServers)]
 (1 failure out of 100 in test_st) |borderStyle=solid}
     CPPUNIT_ASSERT(numClientsPerHost.at(index) >= lowerboundClientsPerServer(numClients, numServers));
     numClientsPerHost.at(index) = 0; // prepare for next test
{code}

{code:title=TestReconfig.cc:573: Assertion: assertion failed [Expression: 
numClientsPerHost.at(i) >= lowerboundClientsPerServer(numClients, numServers)]
 (1 failure out of 100 in test_st) |borderStyle=solid}
     for (int i = 0; i < numServers; i++) {
         CPPUNIT_ASSERT(numClientsPerHost.at(i) <= upperboundClientsPerServer(numClients, numServers));
         CPPUNIT_ASSERT(numClientsPerHost.at(i) >= lowerboundClientsPerServer(numClients, numServers));
         numClientsPerHost.at(i) = 0; // prepare for next test
     }
{code}

The failures align with what [~suda] has observed, though I see a much 
lower reproduction rate (17%) compared to 50%. In particular, the failure 
described in the JIRA has a reproduction rate of 1%. Based on these results, I 
think the reconfig C client tests are more fragile in threaded mode. I will 
investigate further the state of the objects when the tests fail.
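
For anyone who wants to measure a reproduction rate the same way, here is a minimal sketch of a repeat-and-count harness. It is only an illustration: RunStats, repeatRuns and commandPasses are hypothetical names, not part of the ZooKeeper test suite, and the sketch assumes the test command signals failure through a non-zero exit status.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <string>

// Tally of repeated runs of one test (hypothetical helper type).
struct RunStats {
    int runs = 0;
    int failures = 0;

    // Failure rate as a percentage of all runs; 0 when nothing was run.
    double failureRate() const {
        return runs == 0 ? 0.0 : 100.0 * failures / runs;
    }
};

// Calls runOnce `runs` times and counts how often it reports failure.
// runOnce must return true when a single run passed.
RunStats repeatRuns(int runs, const std::function<bool()>& runOnce) {
    assert(runs >= 0);
    RunStats stats;
    for (int i = 0; i < runs; ++i) {
        ++stats.runs;
        if (!runOnce()) {
            ++stats.failures;
        }
    }
    return stats;
}

// Assumption: the command passes when std::system returns 0.
// The command string is supplied by the caller; none is hard-coded here.
bool commandPasses(const std::string& command) {
    return std::system(command.c_str()) == 0;
}
```

A call such as repeatRuns(100, [&] { return commandPasses(cmd); }), with cmd set to whatever command launches the test in the desired mode, would give the failure count and rate for that mode. Taking the runner as a callback keeps the counting logic independent of how the test is actually started.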

> Intermittent failure in TestReconfig.cc
> ---------------------------------------
>
>                 Key: ZOOKEEPER-2152
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2152
>             Project: ZooKeeper
>          Issue Type: Sub-task
>          Components: c client
>            Reporter: Michi Mutsuzaki
>            Assignee: Michael Han
>              Labels: reconfiguration
>             Fix For: 3.6.0
>
>
> I'm seeing this failure in the c client test once in a while:
> {noformat}
> [exec] 
> /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/c/tests/TestReconfig.cc:474:
>  Assertion: assertion failed [Expression: found != string::npos, 
> 10.10.10.4:2004 not in newComing list]
> {noformat}
> https://builds.apache.org/job/ZooKeeper-trunk/2640/console



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
