Alex Rudyy created QPID-6464:
--------------------------------
Summary: HA hangs for duration of replica consistency policy
timeout if the node changes role whilst recovery is in flight
Key: QPID-6464
URL: https://issues.apache.org/jira/browse/QPID-6464
Project: Qpid
Issue Type: Bug
Components: Java Broker
Affects Versions: 6.0 [Java]
Reporter: Alex Rudyy
Assignee: Alex Rudyy
When a node transitions out of the Master state, some pending in-flight
configuration change tasks may still be executed after the node has moved into
the Detached or Replica state, before it has synchronized with the new Master
or before a new Master has been elected. When executing such tasks, the BDB HA
store can try to create a JE transaction. Creating the JE transaction can hang
for the timeout interval configured in TimeConsistencyPolicy. The database
write operation that follows then fails with a ReplicaWriteException and the
transaction is aborted.
The store should fail immediately in this case rather than pointlessly waiting
for some time before failing anyway. Moreover, the pointless wait blocks the
configuration thread, making it impossible to perform any other configuration
tasks.
Here is the stack trace for the problem:
{noformat}
at com.sleepycat.je.rep.txn.ReadonlyTxn.<init>(ReadonlyTxn.java:39)
at com.sleepycat.je.rep.impl.RepImpl.createRepUserTxn(RepImpl.java:924)
at com.sleepycat.je.txn.Txn.createUserTxn(Txn.java:301)
at com.sleepycat.je.txn.TxnManager.txnBegin(TxnManager.java:182)
at com.sleepycat.je.dbi.EnvironmentImpl.txnBegin(EnvironmentImpl.java:2366)
at com.sleepycat.je.Environment.beginTransactionInternal(Environment.java:1437)
at com.sleepycat.je.Environment.beginTransaction(Environment.java:1319)
{noformat}
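To illustrate the fail-fast behaviour requested above, here is a minimal sketch of a guard that checks the node role before a configuration change task is allowed to start a transaction. It is not the broker's actual fix: NodeState, FailFastGuard and runIfMaster are hypothetical names invented for this example, and the state supplier stands in for whatever the store uses to learn its current role (in JE that could plausibly be ReplicatedEnvironment.getState(), which is an assumption here).

```java
import java.util.function.Supplier;

/** Hypothetical node roles, loosely mirroring the states named in this report. */
enum NodeState { MASTER, REPLICA, DETACHED, UNKNOWN }

/** Hypothetical guard that rejects configuration changes when the node is not Master. */
final class FailFastGuard {
    // Assumption: some component can report the node's current role on demand.
    private final Supplier<NodeState> stateSupplier;

    FailFastGuard(Supplier<NodeState> stateSupplier) {
        this.stateSupplier = stateSupplier;
    }

    /**
     * Runs the task only if the node currently reports MASTER.
     * Otherwise throws at once, before any transaction would be started,
     * so the calling thread is never parked on a consistency timeout.
     */
    void runIfMaster(Runnable configurationChange) {
        NodeState state = stateSupplier.get();
        if (state != NodeState.MASTER) {
            throw new IllegalStateException(
                "Node is in " + state + " state; configuration change rejected");
        }
        configurationChange.run();
    }
}
```

A check like this only narrows the window: the role can still change between the check and the transaction begin, so the store would still need to handle ReplicaWriteException on the write path.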