[jira] [Updated] (DERBY-5643) Occasional hangs in replication tests on Linux

Knut Anders Hatlen (Updated) (JIRA) Tue, 13 Mar 2012 08:35:03 -0700

     [ https://issues.apache.org/jira/browse/DERBY-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Knut Anders Hatlen updated DERBY-5643:
--------------------------------------

    Attachment: higher-timeout.diff

Attaching a new patch (higher-timeout.diff) that raises the default server 
startup timeout in NetworkServerTestSetup to four minutes. If the server 
starts successfully but takes more than one minute to do so, the setup calls 
alarm(...) to flag that something isn't quite right. The test only fails if 
the server still hasn't started after four minutes.
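
To illustrate the two-threshold idea (warn after one limit, fail only after a 
longer one), here is a minimal sketch. It is not the code from the patch: the 
class StartupWait, the method waitForStartup and all of its parameters are 
hypothetical names, and the ping is passed in as a callback so the example 
does not depend on Derby. The alarm text mirrors the messages quoted further 
down in this comment.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical helper, not part of Derby or of higher-timeout.diff.
public class StartupWait {

    /**
     * Polls ping until it succeeds or failMs milliseconds have elapsed.
     * If startup succeeds but took longer than warnMs, a message is passed
     * to alarm and the method still returns true. Returns false only when
     * the hard limit failMs is reached (or the thread is interrupted).
     */
    public static boolean waitForStartup(BooleanSupplier ping,
                                         long warnMs, long failMs,
                                         long sleepMs,
                                         Consumer<String> alarm) {
        final long start = System.nanoTime();
        while (true) {
            boolean up = ping.getAsBoolean();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            if (up) {
                if (elapsedMs > warnMs) {
                    // Slow but successful: report it, don't fail.
                    alarm.accept("Very slow server startup: "
                            + elapsedMs + " ms");
                }
                return true;
            }
            if (elapsedMs >= failMs) {
                return false; // hard timeout reached
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With warnMs set to 60000 and failMs set to 240000, this would correspond to 
the one-minute warning and four-minute failure limits described above.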

The patch also makes the replication tests and the compatibility tests use 
NetworkServerTestSetup's helper methods to ping the server while waiting for it 
to start. This makes it possible to remove some duplicate logic from those 
tests, and makes the tests behave in a more consistent manner.

I've run derbyall, suites.All and the compatibility tests successfully with the 
patch. I've also had the replication tests running in a loop for four hours on 
a machine where I saw frequent hangs before. No hangs so far, but I've seen two 
occurrences of the alarm message, which indicates that the condition that 
previously made the tests hang did happen:

ALARM: Very slow server startup: 189735 ms
ALARM: Very slow server startup: 189850 ms
                
> Occasional hangs in replication tests on Linux
> ----------------------------------------------
>
>                 Key: DERBY-5643
>                 URL: https://issues.apache.org/jira/browse/DERBY-5643
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication, Test
>    Affects Versions: 10.9.0.0
>            Reporter: Knut Anders Hatlen
>         Attachments: higher-timeout.diff, thread-dump.txt, waitFor-2.diff, 
> waitFor.diff
>
>
> We occasionally see hangs in the replication tests on Linux. For example 
> here: 
> http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/sles/1298470-suitesAll_diff.txt
> This test run was stuck in tearDown() after 
> ReplicationRun_Local_Derby4910.testSlaveWaitsForMaster(). (Waiting for 
> Thread.join() to return.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
