[
https://issues.apache.org/jira/browse/QPID-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Conway updated QPID-5974:
------------------------------
Description:
Description of problem:
qpid-txtest2 AMQP0-10 transactional & durable transfer operation can bring down
whole qpid HA. Note no brokers were killed, just the txtest was run.
To reproduce:
3 node cluster
whlie qpid-txtest2 -b 20.0.20.200 --tx-count 500 --queues 10 --messages-per-tx
10 --total-messages 1000 --durable 1
Result:
Test fails. Broker logs show critical and error messages like this:
{noformat}
[root@dhcp-lab-A ~]# grep -E 'error|critical' ~qpidd/qpidd.log
2014-07-24 14:10:33 [Protocol] error Connection
qpid.192.168.6.246:5672-192.168.6.247:34210 timed out: closing
[root@dhcp-lab-B ~]# grep -E 'error|critical' ~qpidd/qpidd.log
2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1:
Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x6da3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:23 [Protocol] error Connection
qpid.ha.link.09e80392-0c79-4239-a1d0-ea5b53c71bd9 closed by error: Backup of
tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x6da3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed
prematurely, rollback
2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-1:
Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x7a42)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:38 [Protocol] error Connection
qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x7a42)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-10:
Replication failed: Queue tx-test2-10: async_dequeue() failed: jexception
0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
pending transaction. (drid=0x7a43)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:38 [Protocol] error Connection
qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x7a43)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:40 [HA] critical Shutting down: Backup of tx-test2-7:
Replication failed: Queue tx-test2-7: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x7a49)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:40 [Protocol] error Connection
qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x7a49)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:40 [Broker] error Could not find dequeued message on commit
2014-07-24 14:11:10 [HA] error Backup: Joining active cluster, cannot be
promoted.
[root@dhcp-lab-C ~]# grep -E 'error|critical' ~qpidd/qpidd.log
2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1:
Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x53a3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:23 [Protocol] error Connection
qpid.ha.link.1bb57f0a-48db-460c-9260-0f5b353e4bd1 closed by error: Backup of
tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x53a3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction 243b4279: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction 4f4a25df: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction 80cbe9af: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction a3ed917a: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction b7a4b9a0: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction b9ba9995: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction cbd0d6bf: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction e127288a: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction eb43e683: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction f29196c1: Destroyed
prematurely, rollback
2014-07-24 14:10:53 [HA] error Backup: Still catching up, cannot be promoted.
{noformat}
was:
Description of problem:
qpid-txtest2 AMQP0-10 transactional & durable transfer operation can bring down
whole qpid HA.
It was multiple times seen that qpid-txtest2 transfer operation suddenly failed:
######################### init queues and add messages
[root@mrg-qe-10 clients]# txtest 192.168.6.251 1 0 0 500 10 10 1000 1
######################### start stansfers (no kill)
[root@mrg-qe-10 clients]# txtest 192.168.6.251 0 1 0 500 10 10 1000 1
Transfering from tx-test2-1 to tx-test2-2
Transfering from tx-test2-2 to tx-test2-3
Transfering from tx-test2-3 to tx-test2-4
Transfering from tx-test2-4 to tx-test2-5
Transfering from tx-test2-5 to tx-test2-6
Transfering from tx-test2-6 to tx-test2-7
Transfering from tx-test2-7 to tx-test2-8
Transfering from tx-test2-8 to tx-test2-9
Transfering from tx-test2-9 to tx-test2-10
Transfering from tx-test2-10 to tx-test2-1
Transaction 10 of 500 committed successfully
Transaction 10 of 500 committed successfully
Transaction 10 of 500 committed successfully
Transaction 10 of 500 committed successfully
2014-07-24 08:10:39 [Client] warning Connection
[192.168.6.1:58590-192.168.6.251:5672] closed
2014-07-24 08:10:39 [Client] warning Connection
[192.168.6.1:58586-192.168.6.251:5672] closed
2014-07-24 08:10:39 [Client] warning Connection
[192.168.6.1:58584-192.168.6.251:5672] closed
2014-07-24 08:10:39 [Unspecified] error Transfer tx-test2-3 to tx-test2-2
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:39 [Unspecified] error Transfer tx-test2-7 to tx-test2-6
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58587-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-4 to tx-test2-3
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58589-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-6 to tx-test2-5
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58592-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-9 to tx-test2-8
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58594-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-1 to tx-test2-10
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58593-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-10 to tx-test2-9
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58585-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-2 to tx-test2-1
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58591-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-8 to tx-test2-7
interrupted: Failed to connect (reconnect disabled)
2014-07-24 08:10:40 [Client] warning Connection
[192.168.6.1:58588-192.168.6.251:5672] closed
2014-07-24 08:10:40 [Unspecified] error Transfer tx-test2-5 to tx-test2-4
interrupted: Failed to connect (reconnect disabled)
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Transaction aborted due to transport failure
Failed to connect (reconnect disabled)
Transfer agents failed
[root@mrg-qe-10 clients]#
[root@mrg-qe-10 clients]# echo $?
2
Surprisingly although no failure was inserted HA qpid cluster ends up in
following state:
[root@dhcp-lab-A ~]# clustat
Cluster Status for dtests_ha @ Thu Jul 24 14:17:57 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
192.168.6.246 1 Online, Local, rgmanager
192.168.6.247 2 Online, rgmanager
192.168.6.248 3 Online, rgmanager
Service Name Owner (Last)
State
------- ---- ----- ------
-----
service:qpidd_1 192.168.6.246
started
service:qpidd_2 192.168.6.247
started
service:qpidd_3 192.168.6.248
started
service:qpidd_primary (192.168.6.248)
stopped
[root@dhcp-lab-A ~]# pidof qpidd
26137
[root@dhcp-lab-A ~]# qpid-ha status --all
192.168.6.246:5672 joining
192.168.6.247:5672 joining
192.168.6.248:5672 joining
so brokers are up all in joining, HA not available.
Version-Release number of selected component (if applicable):
How reproducible:
10-50% (to be precised)
Steps to Reproduce:
1. start-up the 3-node HA cluster with virt-ip
2. init the queues and data
txtest 192.168.6.251 1 0 0 500 10 10 1000 1
3. transfer among queues
txtest 192.168.6.251 0 1 0 500 10 10 1000 1
Actual results:
qpid-txtest2 failing.
Expected results:
qpid-txtest2 not failing.
Additional info:
function txtest()
{
./qpid_txtest2 -b $1 --init $2 --transfer $3 --check $4 \
--tx-count $5 --queues $6 --messages-per-tx $7
--total-messages $8 --durable $9
}
qpidd.log inspection highlighted unexpected linearstore critical exceptions:
[root@dhcp-lab-A ~]# grep -E 'error|critical' ~qpidd/qpidd.log
2014-07-24 14:10:33 [Protocol] error Connection
qpid.192.168.6.246:5672-192.168.6.247:34210 timed out: closing
[root@dhcp-lab-B ~]# grep -E 'error|critical' ~qpidd/qpidd.log
2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1:
Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x6da3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:23 [Protocol] error Connection
qpid.ha.link.09e80392-0c79-4239-a1d0-ea5b53c71bd9 closed by error: Backup of
tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x6da3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed
prematurely, rollback
2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-1:
Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x7a42)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:38 [Protocol] error Connection
qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x7a42)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-10:
Replication failed: Queue tx-test2-10: async_dequeue() failed: jexception
0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
pending transaction. (drid=0x7a43)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:38 [Protocol] error Connection
qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x7a43)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:40 [HA] critical Shutting down: Backup of tx-test2-7:
Replication failed: Queue tx-test2-7: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x7a49)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:40 [Protocol] error Connection
qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x7a49)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:40 [Broker] error Could not find dequeued message on commit
2014-07-24 14:11:10 [HA] error Backup: Joining active cluster, cannot be
promoted.
[root@dhcp-lab-C ~]# grep -E 'error|critical' ~qpidd/qpidd.log
2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1:
Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02
wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a pending
transaction. (drid=0x53a3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
2014-07-24 14:10:23 [Protocol] error Connection
qpid.ha.link.1bb57f0a-48db-460c-9260-0f5b353e4bd1 closed by error: Backup of
tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked
by a pending transaction. (drid=0x53a3)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed
prematurely, rollback
2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction 243b4279: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction 4f4a25df: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction 80cbe9af: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction a3ed917a: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction b7a4b9a0: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction b9ba9995: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction cbd0d6bf: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction e127288a: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction eb43e683: Destroyed
prematurely, rollback
2014-07-24 14:10:35 [HA] error Backup of transaction f29196c1: Destroyed
prematurely, rollback
2014-07-24 14:10:53 [HA] error Backup: Still catching up, cannot be promoted.
> HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)
> ----------------------------------------------------------
>
> Key: QPID-5974
> URL: https://issues.apache.org/jira/browse/QPID-5974
> Project: Qpid
> Issue Type: Bug
> Components: C++ Clustering
> Affects Versions: 0.28
> Reporter: Alan Conway
> Assignee: Alan Conway
>
> Description of problem:
> qpid-txtest2 AMQP0-10 transactional & durable transfer operation can bring
> down whole qpid HA. Note no brokers were killed, just the txtest was run.
> To reproduce:
> 3 node cluster
> whlie qpid-txtest2 -b 20.0.20.200 --tx-count 500 --queues 10
> --messages-per-tx 10 --total-messages 1000 --durable 1
> Result:
> Test fails. Broker logs show critical and error messages like this:
> {noformat}
> [root@dhcp-lab-A ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:33 [Protocol] error Connection
> qpid.192.168.6.246:5672-192.168.6.247:34210 timed out: closing
> [root@dhcp-lab-B ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1:
> Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
> pending transaction. (drid=0x6da3)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:23 [Protocol] error Connection
> qpid.ha.link.09e80392-0c79-4239-a1d0-ea5b53c71bd9 closed by error: Backup of
> tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID
> locked by a pending transaction. (drid=0x6da3)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-1:
> Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
> pending transaction. (drid=0x7a42)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:38 [Protocol] error Connection
> qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
> tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID
> locked by a pending transaction. (drid=0x7a42)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-10:
> Replication failed: Queue tx-test2-10: async_dequeue() failed: jexception
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
> pending transaction. (drid=0x7a43)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:38 [Protocol] error Connection
> qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
> tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue() failed:
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID
> locked by a pending transaction. (drid=0x7a43)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:40 [HA] critical Shutting down: Backup of tx-test2-7:
> Replication failed: Queue tx-test2-7: async_dequeue() failed: jexception
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
> pending transaction. (drid=0x7a49)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:40 [Protocol] error Connection
> qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of
> tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue() failed:
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID
> locked by a pending transaction. (drid=0x7a49)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:40 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:11:10 [HA] error Backup: Joining active cluster, cannot be
> promoted.
> [root@dhcp-lab-C ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1:
> Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a
> pending transaction. (drid=0x53a3)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:23 [Protocol] error Connection
> qpid.ha.link.1bb57f0a-48db-460c-9260-0f5b353e4bd1 closed by error: Backup of
> tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed:
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID
> locked by a pending transaction. (drid=0x53a3)
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 243b4279: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 4f4a25df: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 80cbe9af: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction a3ed917a: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction b7a4b9a0: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction b9ba9995: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction cbd0d6bf: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction e127288a: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction eb43e683: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction f29196c1: Destroyed
> prematurely, rollback
> 2014-07-24 14:10:53 [HA] error Backup: Still catching up, cannot be promoted.
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]