[
https://issues.apache.org/jira/browse/DERBY-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dag H. Wanvik updated DERBY-4175:
---------------------------------
Attachment: derby-4175-2.diff
Uploading a second version of this patch, which shows that there are actually
two intermediate states we could encounter before we reach the end state in
step 3. I don't think this behavior is well documented, if at all. The state
labelled b) in the code comment is a bit murky.
The new code uses a loop to wait until the final expected end state is reached.
Uncomment the printlns inside the loop to see what happens.
This patch should make this instability go away. Running regressions now.
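
For readers without the diff at hand, the wait loop can be sketched roughly as
below. This is only an illustration, not the code in derby-4175-2.diff: the
class EndStateWaiter, the method waitForState and its parameters are
hypothetical names, and the attempt is modelled as a Supplier that returns the
SQLSTATE seen on each try, where the real test would issue a stopSlave
connection attempt.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the Derby test framework.
class EndStateWaiter {

    /**
     * Repeats the attempt until it reports the expected SQLSTATE or the
     * retry budget is used up. Returns the last SQLSTATE observed, so the
     * caller can assert on it and report what was actually seen.
     */
    static String waitForState(Supplier<String> attempt, String expected,
                               int maxTries, long sleepMillis) {
        String last = null;
        for (int i = 0; i < maxTries; i++) {
            last = attempt.get();
            // System.out.println("try " + i + ": " + last); // uncomment to trace
            if (expected.equals(last)) {
                return last; // reached the final expected end state
            }
            try {
                Thread.sleep(sleepMillis); // give the slave time to move on
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return last;
            }
        }
        return last; // budget exhausted; caller decides whether this is a failure
    }
}
```

A caller would then assert that the returned value equals the expected end
state (XRE11 in this test) rather than relying on a fixed sleep being long
enough on every machine.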
> See XRE42 in
> ReplicationRun_Local_StateTest_part1_1._testPostStartedMasterAndSlave_StopSlave,
> XRE11 expected
> ------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-4175
> URL: https://issues.apache.org/jira/browse/DERBY-4175
> Project: Derby
> Issue Type: Bug
> Components: Regression Test Failure, Replication
> Environment: Solaris 2008.11. snv_111 (x86) on trunk.
> Reporter: Dag H. Wanvik
> Priority: Minor
> Attachments: derby-4175-2.diff, derby-4175.diff, derby-4175.stat
>
>
> The test expects REPLICATION_DB_NOT_BOOTED (XRE11), but sees
> REPLICATION_SLAVE_SHUTDOWN_OK (XRE42):
> 1)
> testReplication_Local_StateTest_part1_1(org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_StateTest_part1_1)junit.framework.AssertionFailedError:
>
> jdbc:derby://localhost:4527//export/home/dag/java/sb/tests/derby-3417a-replicationTests.ReplicationSuite-sb4.jars.sane-1.6.0_13-21079/db_slave/wombat;stopSlave=true
> failed: -1 XRE42 DERBY SQL error: SQLCODE: -1, SQLSTATE: XRE42, SQLERRMC:
> /export/home/dag/java/sb/tests/derby-3417a-replicationTests.ReplicationSuite-sb4.jars.sane-1.6.0_13-21079/db_slave/wombat^TXRE42
> at
> org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_StateTest_part1_1._testPostStartedMasterAndSlave_StopSlave(ReplicationRun_Local_StateTest_part1_1.java:226)
> at
> org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_StateTest_part1_1.testReplication_Local_StateTest_part1_1(ReplicationRun_Local_StateTest_part1_1.java:130)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at
> org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:105)
> at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
> at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
> at junit.extensions.TestSetup.run(TestSetup.java:25)
> I think this is a race condition: when the slave receives a message to
> shut down (this is what happens here when the master's server is
> shut down) it takes some time for this to happen, and in the
> meantime a stopSlave on the slave will get
> REPLICATION_SLAVE_SHUTDOWN_OK.
> In the code, there is a sleep just ahead of the failing stopSlave to
> avoid this scenario:
> // Take down master - slave connection:
> killMaster(masterServerHost, masterServerPort);
> Thread.sleep(5000L); // TEMPORARY to see if slave sees that master is gone!
> and I guess on my laptop, the 5 seconds was not enough. I think it
> would be better to treat both states as acceptable here than to make
> the test brittle. If this is a bug - that we sometimes see
> REPLICATION_SLAVE_SHUTDOWN_OK - and it may well be, since ahead of the
> master stop, we would see SLAVE_OPERATION_DENIED_WHILE_CONNECTED
> (XRE41), I think - then this should be logged as a separate issue.
> In contrast, I think that if connection to the master is *lost*, a
> stopSlave on slave would see REPLICATION_SLAVE_SHUTDOWN_OK as the
> normal response.