-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14213/
-----------------------------------------------------------

Review request for qpid, Andrew Stitcher, Gordon Sim, and Steve Huston.


Repository: qpid


Description
-------

QPID-5139: HA transactions block a thread, can deadlock the broker

PrimaryTxObserver::prepare blocks while waiting for a response from each backup. With concurrent transactions this can deadlock the broker: once all worker threads are blocked in prepare, responses from backups cannot be received.

The solution is as follows:
- before blocking in prepare, start a new worker thread.
- once prepare unblocks, stop a worker thread.

This ensures that there are always more worker threads than pending transactions, and also that we do not grow the worker thread pool by more than the number of concurrent transactions.

An alternative solution would be to make the prepare complete asynchronously. I believe this approach would be more complex, more risky and would be specific to the 0-10 protocol.

TODO: implement for Windows and other pollers. Any hints much appreciated!!


Diffs
-----

  /trunk/qpid/cpp/src/CMakeLists.txt 1524063
  /trunk/qpid/cpp/src/qpid/broker/Broker.h 1524063
  /trunk/qpid/cpp/src/qpid/broker/Broker.cpp 1524063
  /trunk/qpid/cpp/src/qpid/ha/PrimaryTxObserver.cpp 1524063
  /trunk/qpid/cpp/src/qpid/sys/Poller.h 1524063
  /trunk/qpid/cpp/src/qpid/sys/PollerThreads.h PRE-CREATION
  /trunk/qpid/cpp/src/qpid/sys/PollerThreads.cpp PRE-CREATION
  /trunk/qpid/cpp/src/qpid/sys/epoll/EpollPoller.cpp 1524063
  /trunk/qpid/cpp/src/tests/ha_test.py 1524063
  /trunk/qpid/cpp/src/tests/ha_tests.py 1524063
  /trunk/qpid/cpp/src/tests/test_store.cpp 1524063

Diff: https://reviews.apache.org/r/14213/diff/


Testing
-------

New ha_tests.py unit test: starts a broker with 2 threads and runs 10 concurrent transactions. It fails reliably before the fix is applied and passed more than 300 iterations after the fix. Full ctest passes.


Thanks,

Alan Conway
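
For reviewers who want to experiment with the idea outside the broker, the approach in the description (start a worker before blocking in prepare, stop one once it unblocks) can be sketched roughly as below. This is a minimal standalone illustration, not the code in the diff: WorkerPool, BlockingScope and runConcurrentTransactions are hypothetical names, the pool is a plain task queue rather than qpid's Poller, and a std::promise stands in for the backup responses.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool standing in for the broker's worker threads (NOT qpid's Poller API).
class WorkerPool {
public:
    explicit WorkerPool(unsigned n) { while (n-- > 0) addThread(); }
    ~WorkerPool() {
        std::unique_lock<std::mutex> lock(mutex_);
        stopping_ = true;
        cond_.notify_all();
        while (!threads_.empty()) {  // loop: a running task may still add a thread
            std::thread t = std::move(threads_.back());
            threads_.pop_back();
            lock.unlock();
            t.join();
            lock.lock();
        }
    }
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(std::move(task));
        cond_.notify_one();
    }
    void addThread() {
        std::lock_guard<std::mutex> lock(mutex_);
        threads_.emplace_back([this] { run(); });
    }
    // Ask one worker to exit the next time it is between tasks.
    void removeThread() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++toStop_;
        cond_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            while (!stopping_ && toStop_ == 0 && tasks_.empty()) cond_.wait(lock);
            if (toStop_ > 0) { --toStop_; return; }
            if (tasks_.empty()) return;  // stopping and queue drained
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            lock.unlock();
            task();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<std::function<void()>> tasks_;
    std::vector<std::thread> threads_;  // exited threads stay here until joined
    unsigned toStop_ = 0;
    bool stopping_ = false;
};

// RAII guard: add a worker before blocking, remove one after unblocking.
class BlockingScope {
public:
    explicit BlockingScope(WorkerPool& pool) : pool_(pool) { pool_.addThread(); }
    ~BlockingScope() { pool_.removeThread(); }
    BlockingScope(const BlockingScope&) = delete;
private:
    WorkerPool& pool_;
};

// Each "transaction" blocks until a "backup response" task, queued on the
// same pool, has run. Returns the number of completed transactions.
int runConcurrentTransactions(unsigned workers, int transactions) {
    assert(workers > 0);
    std::atomic<int> completed(0);
    WorkerPool pool(workers);
    for (int i = 0; i < transactions; ++i) {
        pool.post([&pool, &completed] {
            auto response = std::make_shared<std::promise<void>>();
            std::future<void> ready = response->get_future();
            pool.post([response] { response->set_value(); });
            BlockingScope scope(pool);  // without this, all workers can block
            ready.wait();               // stands in for the blocking prepare
            ++completed;
        });
    }
    while (completed.load() < transactions)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return completed.load();
}
```

Calling runConcurrentTransactions(2, 10) mirrors the shape of the new test (2 threads, 10 concurrent transactions). With the BlockingScope line removed, the same call would be expected to hang, because both workers block before any response task can run. The sketch never reclaims exited std::thread objects until the pool is destroyed; a real implementation would need to handle that, along with the poller integration that the TODO mentions.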
