[
https://issues.apache.org/jira/browse/DERBY-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226041#comment-13226041
]
Knut Anders Hatlen commented on DERBY-5643:
-------------------------------------------
I ran ReplicationSuite in a loop on one of the machines where this was seen,
and after a couple of iterations it got stuck. I ran the suite with
derby.tests.repltrace=true, and the trace indicated that the slave server took
more than 3 minutes to come up and accept connections.

ReplicationRun.startServer() pings the server for 75 seconds before it gives
up, so it gave up before the server was up. The attempt to shut down the slave
server also failed, because a connection to the server could not be
established. tearDown() ended up waiting for the server to stop, and the server
of course didn't stop since it never received the shutdown command.
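
The failure mode described above (a startup wait that is too short, followed by an unbounded wait in tearDown()) can be illustrated with a minimal sketch. This is not the ReplicationRun code: the class StartupProbe and its methods are hypothetical names invented for illustration, and a plain TCP connect stands in for the real server ping. It only shows the general technique of polling against a deadline and joining a thread with a timeout, so that a server which never receives its shutdown command cannot block the test indefinitely.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Derby's test framework. */
final class StartupProbe {

    /**
     * Polls the condition until it holds or timeoutMillis has elapsed.
     * Returns true if the condition held, false if the deadline passed first.
     */
    static boolean pollUntil(BooleanSupplier condition,
                             long timeoutMillis,
                             long intervalMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(intervalMillis);
        }
    }

    /** Assumed stand-in for a server ping: can a TCP connection be opened? */
    static boolean canConnect(String host, int port, int connectTimeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), connectTimeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /**
     * Bounded join: waits at most timeoutMillis for the thread to finish.
     * timeoutMillis must be positive, because Thread.join(0) waits forever.
     * Returns false if the thread is still alive afterwards.
     */
    static boolean joinWithTimeout(Thread t, long timeoutMillis)
            throws InterruptedException {
        t.join(timeoutMillis);
        return !t.isAlive();
    }
}
```

A caller could combine the two pieces as pollUntil(() -> StartupProbe.canConnect(host, port, 1000), timeoutMillis, 500) and treat a false result from either pollUntil or joinWithTimeout as a test failure to report, instead of waiting on. Note that a longer startup timeout would only hide the symptom; it does not explain why the server is slow to accept connections.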

So the question is: Why does it take 3 minutes for the server to start
accepting connections?

> Occasional hangs in replication tests on Linux
> ----------------------------------------------
>
> Key: DERBY-5643
> URL: https://issues.apache.org/jira/browse/DERBY-5643
> Project: Derby
> Issue Type: Bug
> Components: Replication, Test
> Affects Versions: 10.9.0.0
> Reporter: Knut Anders Hatlen
> Attachments: thread-dump.txt
>
>
> We occasionally see hangs in the replication tests on Linux. For example
> here:
> http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/sles/1298470-suitesAll_diff.txt
> This test run was stuck in tearDown() after
> ReplicationRun_Local_Derby4910.testSlaveWaitsForMaster(). (Waiting for
> Thread.join() to return.)