ThreadManager crashing bugs
---------------------------
Key: THRIFT-488
URL: https://issues.apache.org/jira/browse/THRIFT-488
Project: Thrift
Issue Type: Bug
Components: Library (C++)
Affects Versions: 0.1
Environment: Mac OS X 10.5.6, Xcode
Reporter: Rush Manbert
Attachments: ThreadManager.cpp.patch
The ThreadManager::Impl and the ThreadManager::Worker classes work together to
execute client threads. There are race conditions between the two classes that
can cause Bus Error exceptions in certain cases. In general, these occur when a
ThreadManager instance is destructed, so might not be seen in long running
programs. They happen frequently enough, though, that looped repetitions of
ThreadManagerTests::blockTest() (part of the concurrency_test program) fail
quite often.
These errors are generally not seen with the current version of
ThreadManagerTests::blockTest() due to errors in the test itself that cause
failures at a far higher frequency. In order to see them, you need to apply the
patches that are attached to THRIFT-487
(https://issues.apache.org/jira/browse/THRIFT-487).
Test procedure:
1) Apply the patch from THRIFT-487 for the Tests.cpp file.
2) Run make in lib/cpp in order to rebuild concurrency_test
3) Run concurrency_test with the command line argument "thread-manager" and
observe that the test hangs in no time.
4) Apply the patch from THRIFT-487 for the ThreadManagerTests.h file.
5) Run make in lib/cpp
6) Run concurrency_test as before. Observe that now it runs for longer
(generally) and usually fails with an assert in Monitor.cpp. This failure is
because of one of the bugs in ThreadManager.
7) Apply the attached patch file for ThreadManager.cpp
8) Run make in lib/cpp
9) Run concurrency_test as before. It should just run, and run, and run.
Note that there is a possible path through the original
ThreadManager::Worker::run() method where active never becomes true. In
practice, exercising this code path is difficuly. The way that I exercised it
was to edit line 322 in the patched version of ThreadManager.cpp. I changed the
for statement to read:
for (size_t ix = 0; ix < value + 1; ix++)
so that the ThreadManager always created more workers than were needed. That
extra worker caused quite a bit of trouble until I moved his handling up to the
top of the run() method. I don't understand how this situation could occur in
real life, but this code appears to handle it correctly.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.