ThreadManager crashing bugs
---------------------------

                 Key: THRIFT-488
                 URL: https://issues.apache.org/jira/browse/THRIFT-488
             Project: Thrift
          Issue Type: Bug
          Components: Library (C++)
    Affects Versions: 0.1
         Environment: Mac OS X 10.5.6, Xcode
            Reporter: Rush Manbert
         Attachments: ThreadManager.cpp.patch

The ThreadManager::Impl and ThreadManager::Worker classes work together to 
execute client threads. There are race conditions between the two classes that 
can cause Bus Error exceptions in certain cases. In general, these occur when a 
ThreadManager instance is destructed, so they might not be seen in long-running 
programs. They happen frequently enough, though, that looped repetitions of 
ThreadManagerTests::blockTest() (part of the concurrency_test program) fail 
quite often.

These errors are generally not seen with the current version of 
ThreadManagerTests::blockTest() due to errors in the test itself that cause 
failures at a far higher frequency. In order to see them, you need to apply the 
patches that are attached to THRIFT-487 
(https://issues.apache.org/jira/browse/THRIFT-487).
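
As a rough illustration of the kind of destruction-time race described above, 
here is a standalone sketch using only the C++11 standard library. It is not 
the Thrift code and not the attached patch; TinyManager, workerLoop and 
createAndDestroy are hypothetical names. The point it tries to show is that the 
workers use the manager's mutex and condition variable right up until they 
return, so the destructor has to wait for every worker before those members go 
away.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread manager; NOT Thrift's ThreadManager.
class TinyManager {
 public:
  TinyManager(std::size_t workers, std::atomic<std::size_t>& exited)
      : exited_(exited) {
    for (std::size_t ix = 0; ix < workers; ix++) {
      threads_.emplace_back([this] { workerLoop(); });
    }
  }

  // Workers touch mutex_ and wake_ until they return. If the destructor
  // let those members be destroyed first, a late worker would be using
  // freed state, which is the sort of crash this issue is about.
  ~TinyManager() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    wake_.notify_all();
    for (std::thread& t : threads_) {
      t.join();  // blocks until that worker has left workerLoop()
    }
  }

 private:
  void workerLoop() {
    std::unique_lock<std::mutex> lock(mutex_);
    // Sleep until the manager asks the workers to stop.
    wake_.wait(lock, [this] { return stopping_; });
    exited_++;  // record a clean exit while still holding the lock
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  bool stopping_ = false;
  std::atomic<std::size_t>& exited_;
  std::vector<std::thread> threads_;
};

// Builds a manager, destroys it immediately, and reports how many
// workers finished cleanly before the destructor returned.
std::size_t createAndDestroy(std::size_t workers) {
  std::atomic<std::size_t> exited(0);
  {
    TinyManager manager(workers, exited);
  }  // destructor runs here
  return exited.load();
}
```

Calling createAndDestroy(4) in a loop is the same idea as the looped 
blockTest() runs: every worker should be accounted for on every iteration.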

Test procedure:
1) Apply the patch from THRIFT-487 for the Tests.cpp file.
2) Run make in lib/cpp to rebuild concurrency_test.
3) Run concurrency_test with the command line argument "thread-manager" and 
observe that the test hangs almost immediately.
4) Apply the patch from THRIFT-487 for the ThreadManagerTests.h file.
5) Run make in lib/cpp.
6) Run concurrency_test as before. Observe that it now generally runs for 
longer and usually fails with an assert in Monitor.cpp. This failure is caused 
by one of the bugs in ThreadManager.
7) Apply the attached patch file for ThreadManager.cpp.
8) Run make in lib/cpp.
9) Run concurrency_test as before. It should just run, and run, and run.

Note that there is a possible path through the original 
ThreadManager::Worker::run() method where active never becomes true. In 
practice, exercising this code path is difficult. The way that I exercised it 
was to edit line 322 in the patched version of ThreadManager.cpp. I changed the 
for statement to read:

    for (size_t ix = 0; ix < value + 1; ix++)

so that the ThreadManager always created one more worker than was needed. That 
extra worker caused quite a bit of trouble until I moved its handling up to the 
top of the run() method. I don't understand how this situation could occur in 
real life, but the patched code appears to handle it correctly.
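
To make the "extra worker" case concrete, here is a small standalone sketch of 
a worker that checks whether it is surplus before doing anything else. This is 
not the code from the attached patch. PoolState, workerRun, Outcome and 
startWorkers are hypothetical names, and the registration check is my 
assumption about what handling the worker "at the top of run()" could look 
like.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state; the field names are illustrative only.
struct PoolState {
  std::mutex mutex;
  std::size_t workerCount = 0;     // workers that registered as active
  std::size_t workerMaxCount = 0;  // workers the pool actually wants
  std::size_t surplusExits = 0;    // workers that were never needed
};

struct Outcome {
  std::size_t registered;
  std::size_t surplus;
};

void workerRun(PoolState& state) {
  {
    std::lock_guard<std::mutex> lock(state.mutex);
    // Surplus check first: a worker that is not needed never becomes
    // active, so it must leave before touching any other bookkeeping.
    if (state.workerCount >= state.workerMaxCount) {
      state.surplusExits++;
      return;
    }
    state.workerCount++;  // this worker is now active
  }
  // A real worker would loop here, pulling tasks from a queue.
}

// Starts `started` workers for a pool that only wants `wanted` of them,
// mirroring the "value + 1" experiment described above.
Outcome startWorkers(std::size_t wanted, std::size_t started) {
  PoolState state;
  state.workerMaxCount = wanted;
  std::vector<std::thread> threads;
  for (std::size_t ix = 0; ix < started; ix++) {
    threads.emplace_back(workerRun, std::ref(state));
  }
  for (std::thread& t : threads) {
    t.join();
  }
  // All workers have been joined, so reading without the lock is safe.
  return Outcome{state.workerCount, state.surplusExits};
}
```

With startWorkers(3, 4), three workers register and the fourth exits through 
the surplus branch, whichever thread happens to run last.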

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
