-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14565/
-----------------------------------------------------------

(Updated Oct. 11, 2013, 6:20 p.m.)


Review request for qpid, Andrew Stitcher, Gordon Sim, Kenneth Giusti, and Steve 
Huston.


Changes
-------

Corrected typos pointed out by reviewers
Fixed a deadlock


Summary (updated)
-----------------

QPID-5139: HA Fix deadlock between Primary and PrimaryTxObserver


Bugs: QPID-5139
    https://issues.apache.org/jira/browse/QPID-5139


Repository: qpid


Description (updated)
-------

QPID-5139: HA Fix deadlock between Primary and PrimaryTxObserver

- Primary::removeReplica - move cancel outside of lock.
- PrimaryTxObserver::prepare - move de-duplication outside lock.

QPID-5139: HA transactions block a thread, can deadlock the broker

PrimaryTxObserver::prepare used to block pending responses from each backup. 
With
concurrent transactions this can deadlock the broker: once all worker threads
are blocked in prepare, responses from backups cannot be received.

This commit generalizes the async completion mechanism for messages to allow
async completion of arbitrary commands. It leaves the special-case code for
messages undisturbed but adds a second path (starting from
SessionState::handleCommand) for async completion of other commands.
In particular it implements tx.commit to allow async completion.

TxBuffer is now an AsyncCompletion and commitLocal() is split into
- startCommit() called by SemanticState::commit()
- endCommit() called when the commit command completes

QPID-5139: Make TxBuffer inherit from AsyncCompletion.

Switched from shared_ptr to intrusive_ptr for TxBuffer and DtxBuffer.

QPID-5139: Add unit test for deadlock caused by blocking HA commit.


QPID-5139: Add timeout argument to python messaging.Session.commit.


Diffs (updated)
-----

  /trunk/qpid/cpp/src/CMakeLists.txt 1530798 
  /trunk/qpid/cpp/src/qpid/broker/AsyncCommandCallback.h PRE-CREATION 
  /trunk/qpid/cpp/src/qpid/broker/AsyncCommandCallback.cpp PRE-CREATION 
  /trunk/qpid/cpp/src/qpid/broker/BrokerObserver.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/BrokerObservers.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/DtxBuffer.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/DtxManager.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/DtxManager.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/broker/DtxWorkRecord.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/DtxWorkRecord.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/broker/RecoverableMessageImpl.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/RecoveryManagerImpl.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/broker/SemanticState.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/SemanticState.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/broker/SessionState.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/SessionState.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/broker/TxBuffer.h 1530798 
  /trunk/qpid/cpp/src/qpid/broker/TxBuffer.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/ha/Primary.h 1530798 
  /trunk/qpid/cpp/src/qpid/ha/Primary.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/ha/PrimaryTxObserver.h 1530798 
  /trunk/qpid/cpp/src/qpid/ha/PrimaryTxObserver.cpp 1530798 
  /trunk/qpid/cpp/src/qpid/ha/TxReplicator.h 1530798 
  /trunk/qpid/cpp/src/tests/DtxWorkRecordTest.cpp 1530798 
  /trunk/qpid/cpp/src/tests/TransactionObserverTest.cpp 1530798 
  /trunk/qpid/cpp/src/tests/TxBufferTest.cpp 1530798 
  /trunk/qpid/cpp/src/tests/ha_test.py 1530798 
  /trunk/qpid/cpp/src/tests/ha_tests.py 1530798 
  /trunk/qpid/python/qpid/messaging/endpoints.py 1530798 

Diff: https://reviews.apache.org/r/14565/diff/


Testing
-------

Passes full make test
HA unit test to verify we don't block.


Thanks,

Alan Conway

Reply via email to