-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14213/
-----------------------------------------------------------
(Updated Sept. 20, 2013, 6:21 p.m.)

Review request for qpid, Andrew Stitcher, Chug Rolke, Gordon Sim, and Steve Huston.

Repository: qpid

Description (updated)
-------

QPID-5139: HA transactions block a thread, can deadlock the broker

PrimaryTxObserver::prepare blocks pending responses from each backup. With concurrent transactions this can deadlock the broker: once all worker threads are blocked in prepare, responses from backups cannot be received.

The solution is as follows:
- before blocking in prepare, start a new worker thread.
- after blocking in prepare, stop a worker thread.

Given N initial broker threads, this ensures that there are always N worker threads that are not blocked in tx-prepare.

An alternative solution would be to make the prepare complete asynchronously. I believe this approach would be more complex, more risky and would be specific to the 0-10 protocol.

TODO: implement for Windows and other pollers.

Diffs (updated)
-----

  /trunk/qpid/cpp/src/CMakeLists.txt 1524570
  /trunk/qpid/cpp/src/qpid/broker/Broker.h 1524570
  /trunk/qpid/cpp/src/qpid/broker/Broker.cpp 1524570
  /trunk/qpid/cpp/src/qpid/ha/PrimaryTxObserver.cpp 1524570
  /trunk/qpid/cpp/src/qpid/sys/Poller.h 1524570
  /trunk/qpid/cpp/src/qpid/sys/PollerThreads.h PRE-CREATION
  /trunk/qpid/cpp/src/qpid/sys/PollerThreads.cpp PRE-CREATION
  /trunk/qpid/cpp/src/qpid/sys/epoll/EpollPoller.cpp 1524570
  /trunk/qpid/cpp/src/tests/ha_test.py 1524570
  /trunk/qpid/cpp/src/tests/ha_tests.py 1524570
  /trunk/qpid/cpp/src/tests/test_store.cpp 1524570

Diff: https://reviews.apache.org/r/14213/diff/

Testing
-------

New ha_tests.py unit test: starts a broker with 2 threads and runs 10 concurrent transactions. It fails reliably before the fix is applied and passed more than 300 iterations after the fix. Full ctest passes.

Thanks,

Alan Conway
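
Illustration (not part of the patch): the thread accounting described above can be sketched roughly as follows. This is a minimal model that only counts threads and starts none. The names WorkerPoolModel, BlockingScope, beginBlocking and endBlocking are hypothetical and are not taken from PollerThreads.h or any other qpid source. It assumes a C++11 compiler.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of a worker thread pool. It tracks counts only,
// to show the invariant; a real pool would start and stop threads here.
class WorkerPoolModel {
  public:
    explicit WorkerPoolModel(int initial) : total(initial), blocked(0) {}

    // Called just before a worker blocks: add a replacement worker.
    void beginBlocking() {
        std::lock_guard<std::mutex> l(lock);
        ++total;
        ++blocked;
    }

    // Called just after the worker unblocks: retire one worker.
    void endBlocking() {
        std::lock_guard<std::mutex> l(lock);
        --total;
        --blocked;
    }

    int totalThreads() const {
        std::lock_guard<std::mutex> l(lock);
        return total;
    }

    // Workers that are free to process I/O, e.g. responses from backups.
    int unblocked() const {
        std::lock_guard<std::mutex> l(lock);
        return total - blocked;
    }

  private:
    mutable std::mutex lock;
    int total;
    int blocked;
};

// RAII guard: the pool is restored even if the blocking call throws.
class BlockingScope {
  public:
    explicit BlockingScope(WorkerPoolModel& p) : pool(p) { pool.beginBlocking(); }
    ~BlockingScope() { pool.endBlocking(); }
    BlockingScope(const BlockingScope&) = delete;
    BlockingScope& operator=(const BlockingScope&) = delete;

  private:
    WorkerPoolModel& pool;
};

// Usage: with N = 2 initial threads and 3 transactions blocked at once,
// the number of unblocked workers stays at N throughout.
inline void demoInvariant() {
    WorkerPoolModel pool(2);
    {
        BlockingScope tx1(pool);
        BlockingScope tx2(pool);
        BlockingScope tx3(pool);
        assert(pool.totalThreads() == 5);
        assert(pool.unblocked() == 2);
    }
    assert(pool.totalThreads() == 2);
}
```

The guard form is one possible design choice: tying the "stop a worker" step to scope exit means an exception or early return from the blocking call cannot leave an extra thread running.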
