http://bugzilla.novell.com/show_bug.cgi?id=528830
User [email protected] added comment http://bugzilla.novell.com/show_bug.cgi?id=528830#c7 --- Comment #7 from Romain Tartière <[email protected]> 2009-08-14 06:30:16 MDT --- I tried to malloc the thread data to see what happen but it did not do the trick (maybe I was doing this wrong though)... However, I have descovered that the tests fails each time WaitForSingleObjectEx returns WAIT_IO_COMPLETION 2 times. Printing some info like that... -----------8<-------------------------------- 2165 while ((res = WaitForSingleObjectEx (thread_handle, INFINITE, TRUE) == WAIT_IO_COMPLETION)) { 2166 if (mono_thread_has_appdomain_ref (mono_thread_current (), domain) && (mono_thread_interruption_requested ())) 2167 /* The unload thread tries to abort us */ 2168 /* The icall wrapper will execute the abort */ 2169 CloseHandle (thread_handle); 2170 printf("XXXXXX\n"); 2171 return; 2172 } 2173 printf("WaitForSingleObjectEx returned %d (expected %d).\n domain->state = %d (expected = %d)\n", res, WAIT_OBJECT_0, domain->state, MONO_APPDOMAIN_UNLOADED); -----------8<-------------------------------- Produces this output for a passing test -----------8<-------------------------------- UNLOAD STARTING FOR Test-unload0 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload1 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload2 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload3 (0x8424c30) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload4 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload5 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload6 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload7 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload8 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload9 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test2 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-is-finalizing (0x8424d20) IN THREAD 0x29601040. FINALIZING IN DOMAIN appdomain-unload.exe: False FINALIZING IN DOMAIN Test-is-finalizing: True Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test3 (0x8424e10) IN THREAD 0x29601040. Thread aborted correctly. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR test_0_unload_with_threadpool (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test3 (0x8424e10) IN THREAD 0x29601040. XXXXXX UNLOAD STARTING FOR DeadInvokeTest (0x8424b40) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) Regression tests: 8 ran, 0 failed in Tests -----------8<-------------------------------- .. and this for a failing one... -----------8<-------------------------------- UNLOAD STARTING FOR Test-unload0 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload1 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload2 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload3 (0x8424c30) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload4 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload5 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload6 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload7 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload8 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-unload9 (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test2 (0x8424e10) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test-is-finalizing (0x8424d20) IN THREAD 0x29601040. FINALIZING IN DOMAIN appdomain-unload.exe: False FINALIZING IN DOMAIN Test-is-finalizing: True Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test3 (0x8424e10) IN THREAD 0x29601040. Thread aborted correctly. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR test_0_unload_with_threadpool (0x8424d20) IN THREAD 0x29601040. Waiting result: domain->state = 3 (expected = 3) UNLOAD STARTING FOR Test3 (0x8424e10) IN THREAD 0x29601040. XXXXXX UNLOAD STARTING FOR DeadInvokeTest (0x8424b40) IN THREAD 0x29601040. XXXXXX Stacktrace: -----------8<-------------------------------- Is there some condition it would be usefull to test (e.g. try to have some static counter somewhere and if it's the second time we have the condition, abort and have a nice backtrace from a location we should not have reached and track down how we came here?). I have another evidence of a probable race condition, from time to time, mono hangs after the message saying all tests passed. Attaching the debugger to the process, I can see a dead lock (or at least I _think_ it is one): -----------8<-------------------------------- (gdb) thread apply all bt Thread 4 (Thread 0x29601040 (LWP 100362)): #0 0x28716197 in __error () from /lib/libthr.so.3 #1 0x28715d78 in __error () from /lib/libthr.so.3 #2 0x2965b4a0 in ?? () #3 0x00000008 in ?? () #4 0x00000001 in ?? () #5 0x2965b480 in ?? () #6 0x00000000 in ?? () #7 0xbfbfe85c in ?? () #8 0xbfbfe518 in ?? () #9 0x287146cf in pthread_setcancelstate () from /lib/libthr.so.3 #10 0x28713f6d in pthread_cond_signal () from /lib/libthr.so.3 #11 0x082414ef in _wapi_handle_timedwait_signal_handle (handle=0x5800, timeout=0x0, alertable=1, poll=0) at ../../../mono/mono/io-layer/handles.c:1610 #12 0x08241217 in _wapi_handle_wait_signal (poll=0) at ./../../mono/mono/io-layer/handles.c:1539 #13 0x082591ff in WaitForMultipleObjectsEx (numobjects=3, handles=0x29650400, waitall=1, timeout=4294967295, alertable=0) at ./../../mono/mono/io-layer/wait.c:724 #14 0x0821d6f5 in wait_for_tids (wait=0x29650400, timeout=4294967295) at ./../../mono/mono/metadata/threads.c:2675 #15 0x0821e4d0 in mono_thread_manage () at ./../../mono/mono/metadata/threads.c:2963 #16 0x080e5d8a in mono_main (argc=2, argv=0xbfbfe850) at ./../../mono/mono/mini/driver.c:1676 #17 0x080594ac in main (argc=0, argv=0x0) at ../../../mono/mono/mini/main.c:34 Thread 3 (Thread 0x29601260 (LWP 100270)): #0 0x287d83cf in nanosleep () from /lib/libc.so.7 #1 0x2870dd0b in nanosleep () from /lib/libthr.so.3 #2 0x0823ba65 in collection_thread (unused=0x0) at ./../../mono/mono/io-layer/collection.c:34 #3 0x2870b6ff in pthread_getprio () from /lib/libthr.so.3 #4 0x00000000 in ?? () Thread 2 (Thread 0x29601370 (LWP 100292)): #0 0x28716197 in __error () from /lib/libthr.so.3 #1 0x28715b89 in __error () from /lib/libthr.so.3 #2 0x296066ac in ?? () #3 0x0000000f in ?? () #4 0x00000000 in ?? () #5 0x00000000 in ?? () #6 0x00000000 in ?? () #7 0x28717b4c in ?? () from /lib/libthr.so.3 #8 0xbf9ede98 in ?? () #9 0x2870aefc in sem_wait () from /lib/libthr.so.3 Previous frame identical to this frame (corrupt stack?) Thread 1 (Thread 0x29ea3470 (LWP 100375)): #0 0x287e695b in gettimeofday () from /lib/libc.so.7 #1 0x287d839d in time () from /lib/libc.so.7 #2 0x0823ff44 in _wapi_handle_ref (handle=0x5801) at ./../../mono/mono/io-layer/handles.c:1006 #3 0x08258758 in _wapi_handle_lock_handle (handle=0x5801) at handles-private.h:259 #4 0x0825850d in WaitForSingleObjectEx (handle=0x5801, timeout=4294967295, alertable=1) at ../../../mono/mono/io-layer/wait.c:149 #5 0x082244c2 in async_invoke_thread (data=0x0) at ./../../mono/mono/metadata/threadpool.c:1438 #6 0x082188fa in start_wrapper (data=0x2a001040) at ./../../mono/mono/metadata/threads.c:657 #7 0x0825a8aa in thread_start_routine (args=0x2966e4f0) at ./../../mono/mono/io-layer/wthreads.c:286 #8 0x0827ac9e in GC_start_routine (arg=0x84489e0) at ./../mono/libgc/pthread_support.c:1390 #9 0x2870b6ff in pthread_getprio () from /lib/libthr.so.3 #10 0xbf3e7fec in ?? () (gdb) thread 4 [Switching to thread 4 (Thread 0x29601040 (LWP 100362))]#0 0x28716197 in __error () from /lib/libthr.so.3 (gdb) f 13 #13 0x082591ff in WaitForMultipleObjectsEx (numobjects=3, handles=0x29650400, waitall=1, timeout=4294967295, alertable=0) at ./../../mono/mono/io-layer/wait.c:724 724 ret = _wapi_handle_wait_signal (poll); (gdb) p handles $1 = (gpointer *) 0x29650400 (gdb) p handles[0] $2 = 0x5851 (gdb) p handles[1] $3 = 0x5852 (gdb) p handles[2] $4 = 0x5859 (gdb) p handles[3] $5 = 0x0 (gdb) -----------8<-------------------------------- So thread 4 and thread 1 are both waiting for the same resource (0x5851) infinitely. I am not sure that both problems are related but they both occur when mono is terminating, thus I did not open another bug. I keep a core file of mono hanging on this deadlock if you want me to dig in certain area (I don't really know where to look at). Thank you for your suggestions :-) -- Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug. _______________________________________________ mono-bugs maillist - [email protected] http://lists.ximian.com/mailman/listinfo/mono-bugs
