[jira] Commented: (DERBY-4186) After failover, test fails when it succeeds in connecting early to failed over slave

Dag H. Wanvik (JIRA) Fri, 24 Apr 2009 16:52:52 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702626#action_12702626
 ]


Dag H. Wanvik commented on DERBY-4186:
--------------------------------------

My initial analysis was not entirely correct. Looking at the log file, I see 
that setting up the master never succeeded in the cases where we see 08004.C.7.
This in turn led stopMaster to fail (there is no master yet!), but the 
operation does not throw because of this piece of code in 
MasterController.tearDownNetwork, called from MasterController.stopMaster:

            try {
                ReplicationMessage mesg =
                    new ReplicationMessage(ReplicationMessage.TYPE_STOP, null);
                transmitter.sendMessage(mesg);
            } catch (IOException ioe) {}   // <************ java.net.ConnectException: Connection refused
            try {
                transmitter.tearDown();
            } catch (IOException ioe) {}

The end result is that the slave is still listening when the test gets 
around to calling waitForSQLState (see the issue description), so we 
naturally get 08004.C.7 CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE. 
But the test is also wrong: it should expect success here.
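
To make the effect of the empty catch blocks concrete, here is a minimal, 
self-contained sketch. It is not Derby code: the Transmitter interface, 
the class name and both helper methods are hypothetical stand-ins. It only 
shows that a caller cannot tell a refused connection from a clean stop 
when the IOException is dropped, and what a variant that reports the first 
failure could look like.

```java
import java.io.IOException;
import java.net.ConnectException;

class SwallowedStopDemo {
    /** Hypothetical stand-in for the transmitter used by the master. */
    interface Transmitter {
        void sendStop() throws IOException;
        void tearDown() throws IOException;
    }

    /** Mirrors the quoted pattern: both failures are silently dropped. */
    static void tearDownSwallowing(Transmitter t) {
        try { t.sendStop(); } catch (IOException ioe) {}
        try { t.tearDown(); } catch (IOException ioe) {}
    }

    /** Still attempts both steps, but returns the first failure (or null). */
    static IOException tearDownReporting(Transmitter t) {
        IOException first = null;
        try { t.sendStop(); } catch (IOException ioe) { first = ioe; }
        try { t.tearDown(); } catch (IOException ioe) {
            if (first == null) { first = ioe; }
        }
        return first;
    }

    /** A transmitter whose peer was never reached, as in the failing runs. */
    static Transmitter unreachable() {
        return new Transmitter() {
            public void sendStop() throws IOException {
                throw new ConnectException("Connection refused");
            }
            public void tearDown() {}
        };
    }

    public static void main(String[] args) {
        // Returns normally, so the caller sees nothing unusual.
        tearDownSwallowing(unreachable());
        // The reporting variant hands the ConnectException back.
        System.out.println("reported: " + tearDownReporting(unreachable()));
    }
}
```

Whether stopMaster should actually surface such a failure is a separate 
design question; the sketch only illustrates why nothing is thrown today.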

Now the next question is, why does the test think starting the master worked? 
It calls the method ReplicationRun.startMaster to
achieve this.






> After failover, test fails when it succeeds in connecting early to failed 
> over slave
> ------------------------------------------------------------------------------------
>
>                 Key: DERBY-4186
>                 URL: https://issues.apache.org/jira/browse/DERBY-4186
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication, Test
>    Affects Versions: 10.6.0.0
>            Reporter: Dag H. Wanvik
>
> Occasionally I see this error in ReplicationRun_Local_3_p3:
> 1) testReplication_Local_3_p3_StateNegativeTests(org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p3)junit.framework.AssertionFailedError: Expected SQLState'08004', but got connection!
>       at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.waitForSQLState(ReplicationRun.java:332)
>       at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p3.testReplication_Local_3_p3_StateNegativeTests(ReplicationRun_Local_3_p3.java:170)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:105)
>       at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
>       at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
>       at junit.extensions.TestSetup.run(TestSetup.java:25)
> In the code, after a stopMaster is given to the master (which should lead to 
> failover), the test expects to see CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE 
> (08004.C.7), which will only succeed if the test gets to try to connect 
> before the failover has started. This seems wrong. If the failover has 
> completed, it should expect a successful connect (which boots the database, 
> btw, since it's shut down after a successful failover).
> Quote from code:
> waitForSQLState("08004", 100L, 20, // 08004.C.7 - CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE
>                 slaveDatabasePath + FS + slaveDbSubPath + FS + replicatedDb,
>                 slaveServerHost, slaveServerPort); // _failOver above fails...
> There is a race between the failover on the slave and the test here, I think.
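
One way a test could tolerate the race described in the issue is to poll 
until it sees either the expected SQLState or a successful connection, and 
let the caller decide which outcomes are acceptable. The sketch below 
assumes that approach. It is not the real ReplicationRun.waitForSQLState: 
the Attempt interface, the Outcome enum and the method name are 
hypothetical, and the connection attempt is abstracted so the example runs 
without a database.

```java
import java.sql.SQLException;

class ConnectPoll {
    /** Hypothetical connection attempt; throws SQLException on failure. */
    interface Attempt {
        void connect() throws SQLException;
    }

    enum Outcome { CONNECTED, EXPECTED_STATE, TIMED_OUT }

    /**
     * Tries to connect up to 'tries' times, sleeping 'sleepMs' between
     * attempts. Stops early on a successful connect or on the expected
     * SQLState; any other SQLState leads to another attempt.
     */
    static Outcome waitForStateOrConnect(String expectedState, long sleepMs,
                                         int tries, Attempt attempt) {
        for (int i = 0; i < tries; i++) {
            try {
                attempt.connect();
                return Outcome.CONNECTED;
            } catch (SQLException se) {
                if (expectedState.equals(se.getSQLState())) {
                    return Outcome.EXPECTED_STATE;
                }
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException ie) {
                // Preserve the interrupt and give up waiting.
                Thread.currentThread().interrupt();
                return Outcome.TIMED_OUT;
            }
        }
        return Outcome.TIMED_OUT;
    }
}
```

With a helper of this shape, a test that runs after failover could accept 
both CONNECTED and EXPECTED_STATE and treat only TIMED_OUT as a failure.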

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
