[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #16 from redi at gcc dot gnu dot org 2010-09-10 09:55 --- There certainly is a race condition: there's no ordering between pthread_cancel and pthread_testcancel so the main thread can run f2(50) before thread2 calls pthread_cancel, which is why you see it sometimes run beyond the cancellation point. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
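One way to remove that race is to impose the missing ordering explicitly. The sketch below is not the attached test program: `cancel_was_requested`, `ordered_worker` and `cancel_is_seen_at_testcancel` are hypothetical names, and the handshake through an atomic flag is only one possible approach. The worker keeps cancellation disabled while it waits, so the only place the request can be acted on is the pthread_testcancel call.

```cpp
#include <pthread.h>
#include <sched.h>
#include <atomic>
#include <cassert>

// Hypothetical flag: set by the cancelling thread after pthread_cancel returns.
std::atomic<bool> cancel_was_requested{false};
// Set only if the worker runs past its cancellation point.
std::atomic<bool> ran_past_testcancel{false};

extern "C" void* ordered_worker(void*) {
    int ignored;
    // No cancellation can be acted on while we wait for the request.
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &ignored);
    while (!cancel_was_requested.load()) {
        sched_yield();
    }
    // The request is now known to be pending, so this call must act on it.
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &ignored);
    pthread_testcancel();
    ran_past_testcancel.store(true);  // reached only if cancellation was missed
    return nullptr;
}

bool cancel_is_seen_at_testcancel() {
    cancel_was_requested.store(false);
    ran_past_testcancel.store(false);

    pthread_t tid;
    if (pthread_create(&tid, nullptr, ordered_worker, nullptr) != 0) {
        return false;
    }
    pthread_cancel(tid);               // request first...
    cancel_was_requested.store(true);  // ...then let the worker proceed

    void* result = nullptr;
    const int join_rc = pthread_join(tid, &result);
    assert(join_rc == 0);
    (void)join_rc;
    return result == PTHREAD_CANCELED && !ran_past_testcancel.load();
}
```

Calling `cancel_is_seen_at_testcancel()` repeatedly should give the same answer every time, because the worker cannot reach pthread_testcancel until after the request has been made.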
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #17 from redi at gcc dot gnu dot org 2010-09-10 10:11 --- (In reply to comment #15)
> In particular, it does not appear that the thread is being reliably cancelled
> at the pthread_testcancel call - sometimes f2 seems to run beyond the
> pthread_testcancel,

As I said above, that's consistent with f2(50) executing before pthread_cancel.

> which causes the throw to execute, and results in an abort (seems to want to
> act like an uncaught exception propagated out). If you comment out the throw,
> f2 will sometimes continue to construct additional objects past 50. I have
> also noticed that sometimes a bunch of the Y objects get destructed, but then
> the program suddenly summarily exits.

I think that's because f2(50) leaves cancellation enabled and writing to cout is a cancellation point, so the exit happens when some ~Y destructor coincides with thread2 calling pthread_cancel.

> I also tried setting the cancellation type to asynchronous, but that doesn't
> make any difference - sometimes it works, sometimes it doesn't. It's very
> unpredictable.

Yes, race conditions tend to have unpredictable results.

If I change the condition in f2 to (i >= 50) and disable cancellation again after the call to pthread_testcancel then I get more predictable behaviour, because that ensures that the only cancellation points are the calls to pthread_testcancel in f2, which still occur in f2(51), f2(52) etc., i.e. cancellation still occurs at the intended place even if f2(50) happens before the call to pthread_cancel. That seems to validate my theory that the cancel happens after f2(50), and so takes effect at the first cancellation point after the cancellation request.

I don't think there's a gcc or glibc bug here, just non-portable code with indeterminate behaviour.
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
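The modification described in this comment can be sketched roughly as follows. This is not the attached program: `step`, `stepping_thread`, `last_step` and `cancel_lands_at_testcancel` are hypothetical stand-ins, and the Y objects, the throw and the output to cout are left out. Cancellation is enabled only around the explicit pthread_testcancel call, so a request that arrives at any time takes effect at the next such call with an index of 50 or more.

```cpp
#include <pthread.h>
#include <sched.h>
#include <atomic>
#include <cassert>

// Hypothetical: highest index the stepping thread has reached so far.
std::atomic<long> last_step{-1};

// Hypothetical stand-in for f2: cancellable only at the explicit check.
void step(long i) {
    last_step.store(i);
    if (i >= 50) {
        int ignored;
        pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &ignored);
        pthread_testcancel();  // the only cancellation point
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &ignored);
    }
}

extern "C" void* stepping_thread(void*) {
    int ignored;
    // Everything outside step()'s explicit check runs with cancellation off.
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &ignored);
    for (long i = 0;; ++i) {
        step(i);
    }
    return nullptr;  // not reached
}

bool cancel_lands_at_testcancel() {
    last_step.store(-1);

    pthread_t tid;
    if (pthread_create(&tid, nullptr, stepping_thread, nullptr) != 0) {
        return false;
    }
    // Wait until the thread is stepping; after that the request may
    // arrive before or after step(50), it does not matter which.
    while (last_step.load() < 0) {
        sched_yield();
    }
    pthread_cancel(tid);

    void* result = nullptr;
    const int join_rc = pthread_join(tid, &result);
    assert(join_rc == 0);
    (void)join_rc;
    // The thread can only have stopped inside a step with index >= 50.
    return result == PTHREAD_CANCELED && last_step.load() >= 50;
}
```

The index at which the thread stops still varies from run to run, but it is always one where pthread_testcancel was called, which matches the "more predictable behaviour" reported above.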
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #11 from mikedalpee at enginsol dot com 2010-09-02 13:43 --- Created an attachment (id=21663) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21663&action=view) Workaround for the problem. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #12 from mikedalpee at enginsol dot com 2010-09-02 13:44 --- Created an attachment (id=21664) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21664&action=view) output of the workaround -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #13 from mikedalpee at enginsol dot com 2010-09-02 13:46 --- Simply adding an additional scope to the catch part of the function f2 try block caused the program to execute properly. So is this a code generation bug or a glibc bug? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #14 from mikedalpee at enginsol dot com 2010-09-03 00:05 --- Created an attachment (id=21679) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21679&action=view) Program that demonstrates the bug -- mikedalpee at enginsol dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #21633|0                           |1
        is obsolete|                            |
  Attachment #21635|0                           |1
        is obsolete|                            |
  Attachment #21637|0                           |1
        is obsolete|                            |
  Attachment #21663|0                           |1
        is obsolete|                            |
  Attachment #21664|0                           |1
        is obsolete|                            |

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #15 from mikedalpee at enginsol dot com 2010-09-03 00:24 --- I have spent a lot more time playing around with this, and after running the exception1 program numerous times, the behaviour oscillates between working correctly and not working at all. So there appears to be some sort of race condition during the stack unwinding process that results from a thread cancellation. In particular, it does not appear that the thread is being reliably cancelled at the pthread_testcancel call - sometimes f2 seems to run beyond the pthread_testcancel, which causes the throw to execute, and results in an abort (seems to want to act like an uncaught exception propagated out). If you comment out the throw, f2 will sometimes continue to construct additional objects past 50. I have also noticed that sometimes a bunch of the Y objects get destructed, but then the program suddenly summarily exits. I also tried setting the cancellation type to asynchronous, but that doesn't make any difference - sometimes it works, sometimes it doesn't. It's very unpredictable. In any case, I need someone to tell me if this is the appropriate place to have aired this particular bug, or would it be better placed in the glibc bugzilla. I really do need this to work reliably so my 400K+ SLOC port can have a chance to work without requiring a major redesign of the underlying thread framework. FYI, this has been working flawlessly under Solaris/Sun Studio C++ for the last 10 years. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #1 from mikedalpee at enginsol dot com 2010-09-01 13:13 --- Created an attachment (id=21633) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21633&action=view) Program that demonstrates the bug -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #2 from mikedalpee at enginsol dot com 2010-09-01 13:13 --- Created an attachment (id=21634) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21634&action=view) script that builds the bug program -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #3 from mikedalpee at enginsol dot com 2010-09-01 13:16 --- Created an attachment (id=21635) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21635&action=view) output of the bug program -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #4 from mikedalpee at enginsol dot com 2010-09-01 13:27 --- Created an attachment (id=21637) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21637&action=view) Expected output of bug program - generated on Solaris 9 using Sun Studio 12. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #5 from mikedalpee at enginsol dot com 2010-09-01 13:32 --- This bug occurs across all versions of the compiler I have tested - 4.3, 4.4, 4.5, and 4.6. The bug is preventing me from porting software, because correct destructor execution in a cancelled thread is fundamental to the proper functioning of this software. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #6 from rguenth at gcc dot gnu dot org 2010-09-01 13:40 --- I am sure this is more a pthread implementation issue, so a glibc bug on sourceware.org/bugzilla would be more appropriate. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
-- paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|blocker                     |normal

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #7 from paolo dot carlini at oracle dot com 2010-09-01 13:53 --- Likewise about ICC. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #8 from pinskia at gcc dot gnu dot org 2010-09-02 00:44 --- Doing:

    catch (int i)
    {
        Guard g(ioSync);
        cout << "Caught " << i << endl << flush;
        sched_yield();
        pthread_testcancel();
    }

fixes the issue. Note there is a blog entry about POSIX thread cancel and C++ exceptions by Uli somewhere. There was a huge discussion on a mailing list about it too. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #9 from mikedalpee at enginsol dot com 2010-09-02 01:13 --- That fix didn't change the behaviour one bit for me - was there more to it than just moving the two lines from where they were to the exception handler? Also, as I am new to this venue, could you please tell me where I can find the reference material you cited? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479
[Bug c++/45479] Exceptions not delivered properly after thread cancellation
--- Comment #10 from pinskia at gcc dot gnu dot org 2010-09-02 01:19 --- http://lmgtfy.com/?q=posix+thread+cancel+C%2B%2B+exceptions the third link is an interesting news group entry. http://udrepper.livejournal.com/21541.html etc. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45479