[
https://issues.apache.org/jira/browse/QPID-6560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Keith Wall updated QPID-6560:
-----------------------------
Summary: [Java Broker] BDB HA JE environment close on intruder detection
might block the execution of VHN children tasks thus causing unnecessary delays
in shutdown of ReplicatedEnvironmentFacade executors (was: [Java Broker] BDB
HA JE environment close on intruder detection might block the execution of VHN
children tasks thus causing unecessary delays in shutdown of
ReplicatedEnvironmentFacade executors)
> [Java Broker] BDB HA JE environment close on intruder detection might block
> the execution of VHN children tasks thus causing unnecessary delays in
> shutdown of ReplicatedEnvironmentFacade executors
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: QPID-6560
> URL: https://issues.apache.org/jira/browse/QPID-6560
> Project: Qpid
> Issue Type: Bug
> Affects Versions: 0.32
> Reporter: Alex Rudyy
> Assignee: Alex Rudyy
> Fix For: qpid-java-6.0
>
>
> On intruder detection, a task to close the VHN children and set the VHN
> state to ERRORED is scheduled in the Broker configuration thread. Immediately
> after scheduling the task, ReplicatedEnvironmentFacade.close() is invoked.
> The ReplicatedEnvironmentFacade executors are shut down in the close method.
> If any of the ReplicatedEnvironmentFacade executors has pending work (tasks
> to run) and that work needs to be performed synchronously in the VHN
> configuration thread or the Broker configuration thread (blocking the
> ReplicatedEnvironmentFacade executor threads), the executor shutdown is
> blocked and eventually times out.
> Test BDBHAVirtualHostNodeRestTest.testIntruderProtection fails sporadically
> as indicated by stack trace below:
> {noformat}
> junit.framework.AssertionFailedError: Attribute state did not reach expected
> value within permitted timeout 5000ms. expected:<ERRORED> but was:<ACTIVE>
> at junit.framework.Assert.fail(Assert.java:57)
> at junit.framework.Assert.failNotEquals(Assert.java:329)
> at junit.framework.Assert.assertEquals(Assert.java:78)
> at junit.framework.TestCase.assertEquals(TestCase.java:244)
> at org.apache.qpid.systest.rest.QpidRestTestCase.waitForAttributeChanged(QpidRestTestCase.java:117)
> at org.apache.qpid.server.store.berkeleydb.replication.BDBHAVirtualHostNodeRestTest.testIntruderProtection(BDBHAVirtualHostNodeRestTest.java:311)
> {noformat}
> The log analysis showed that the issue occurs in the following scenario:
> * a 2-node cluster is created
> * an intruder node is connected
> * node1 is shut down by intruder protection
> * node2 intruder protection is triggered and a task to close the VHN children
> is scheduled in the Broker configuration thread. At the same time, a STATE
> event is issued by JE on the transition from REPLICA to UNKNOWN (as the
> majority is lost). The state change logic is invoked in the
> ReplicatedEnvironmentFacade StateChange executor, which in turn performs the
> VH close in the VHN configuration thread and blocks until the VH close is
> completed.
> * As a result, the VHN configuration thread is performing the VHN children
> close caused by intruder protection, the StateChange executor thread is
> waiting for completion of the VH close task, which is scheduled as a separate
> task, and the Broker configuration thread is performing REF.close, waiting
> for shutdown of the StateChange executor. When the task to close the VHN
> children is complete, it schedules a task in the Broker configuration thread
> to close the configuration store. The latter can only be performed after the
> intruder protection logic is completed.
> * Thus, we have an effective deadlock, where tasks block each other's threads.
> It seems that REF.close in the intruder protection functionality is not only
> redundant but harmful, as it causes the effective deadlock. The deadlock
> is resolved by the timeout on waiting for the task executor shutdown.
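
The blocking pattern described above can be reproduced with plain
java.util.concurrent executors. The sketch below is not Qpid code: the class
name, method name and the two single-threaded executors are hypothetical
stand-ins (one for a configuration thread, one for the StateChange executor),
and the three-party cycle from the description is reduced to two threads. It
only illustrates why awaiting executor shutdown from the configuration thread
cannot finish before the timeout when the pending work needs that same thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class EffectiveDeadlockSketch {

    /**
     * Runs a close-like task on a single "configuration" thread and reports
     * whether the "state change" executor terminated within the timeout.
     * All names here are illustrative assumptions, not Qpid APIs.
     */
    static boolean closeFromConfigThread(long timeoutMillis) throws Exception {
        ExecutorService configThread = Executors.newSingleThreadExecutor();
        ExecutorService stateChangeExecutor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> close = configThread.submit(() -> {
                // Pending work on the state change executor: it schedules a
                // task on the configuration thread and waits synchronously.
                stateChangeExecutor.submit(() -> {
                    Future<?> vhClose = configThread.submit(() -> { });
                    try {
                        vhClose.get(); // cannot run: config thread is busy below
                    } catch (InterruptedException | ExecutionException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                // The close-like step: shut the executor down and wait for it
                // while still occupying the configuration thread.
                stateChangeExecutor.shutdown();
                return stateChangeExecutor.awaitTermination(
                        timeoutMillis, TimeUnit.MILLISECONDS);
            });
            return close.get();
        } finally {
            // Clean up so the sketch always exits.
            stateChangeExecutor.shutdownNow();
            configThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("terminated in time: " + closeFromConfigThread(500));
    }
}
```

In this sketch the wait ends only because awaitTermination gives up, which
mirrors the "resolved by timeout" behaviour reported in the issue. Not
awaiting the executor shutdown from the thread that the pending work depends
on removes the cycle.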