#3295: Null deref by threaded runtime scheduler
---------------------------------------------------------+------------------
    Reporter:  A1kmm                                     |        Owner:  
simonmar
        Type:  bug                                       |       Status:  new   
  
    Priority:  high                                      |    Milestone:  
6.12.1  
   Component:  Runtime System                            |      Version:  6.11  
  
    Severity:  major                                     |   Resolution:        
  
    Keywords:  crash, nullderef, threaded, parallel, GC  |   Difficulty:  
Unknown 
    Testcase:                                            |           Os:  Linux 
  
Architecture:  ia64                                      |  
---------------------------------------------------------+------------------
Changes (by simonmar):

  * priority:  normal => high
  * difficulty:  => Unknown
  * owner:  => simonmar
  * milestone:  => 6.12.1

Old description:

> Using ghc and runtime built from HEAD on Tuesday (although this has been
> an issue on older builds I tried as well), the ghc runtime crashes on a
> null deref.
>
> The system is a 24-processor Intel Xeon 2.66 GHz shared memory system
> running Linux 2.6.16.60-0.34-smp in 64 bit mode (from SUSE 10 SP2).
>
> This only happens when the program is compiled with -threaded (threaded
> runtime), but doesn't happen with the threaded debug runtime.
>
> It also only happens when +RTS -N2 or a greater number of threads is
> passed to the runtime. It doesn't happen when -g1 is passed to turn off
> distributed garbage collection.
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1241594176 (LWP 20751)]
> 0x000000000067262c in schedule (initialCapability=<value optimized out>,
> task=0x994370) at rts/Schedule.c:672
> 672         return (waiting_for_gc ||
> (gdb) list
> 667         //   - another thread is initiating a GC
> 668         //   - another Task is returning from a foreign call
> 669         //   - the thread at the head of the run queue cannot be run
> 670         //     by this Task (it is bound to another Task, or it is
> unbound
> 671         //     and this task it bound).
> 672         return (waiting_for_gc ||
> 673                 cap->returning_tasks_hd != NULL ||
> 674                 (!emptyRunQueue(cap) && (task->tso == NULL
> 675                                          ? cap->run_queue_hd->bound
> != NULL
> 676                                          : cap->run_queue_hd->bound
> != task)));
> (gdb) print cap
> $1 = (Capability *) 0x0
> (gdb) print *task
> $2 = {id = 1241594176, cap = 0x0, stopped = 7160552, suspended_tso = 0x0,
> tso = 0x7fccc4, stat = NoStatus, ret = 0x0, cond = {__data = {__lock = 1,
> __futex = 0,
>       __total_seq = 1, __wakeup_seq = 0, __woken_seq = 0, __mutex =
> 0x84e340, __nwaiters = 10033872, __broadcast_seq = 0},
>     __size = "\001\000\000\000\000\000\000\000\001", '\0' <repeats 23
> times>, "@�\204\000\000\000\000\000�\032\231\000\000\000\000", __align =
> 1}, lock = {__data = {
>       __lock = 0, __count = 1, __owner = 0, __nusers = 0, __kind = 22668,
> __spins = 0, __list = {__prev = 0x5793, __next = 0x85afd}},
>     __size = "\000\000\000\000\001", '\0' <repeats 11 times>,
> "\214X\000\000\000\000\000\000\223W\000\000\000\000\000\000�Z\b\000\000\000\000",
> __align = 4294967296},
>   wakeup = 547420, elapsedtimestart = 202, muttimestart = 0, mut_time =
> 0, mut_etime = 0, gc_time = 0, gc_etime = 10033296, prev = 0x994370, next
> = 0x0, return_link = 0x0,
>   all_link = 0x0, prev_stack = 0x9945c0}
> (gdb) info threads
> * 21 Thread 1241594176 (LWP 20751)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x994370) at
> rts/Schedule.c:672
>   20 Thread 1233201472 (LWP 20750)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9b4930) at
> rts/Schedule.c:672
>   19 Thread 1224808768 (LWP 20749)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9b2df0) at
> rts/Schedule.c:672
>   18 Thread 1216416064 (LWP 20748)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9b12b0) at
> rts/Schedule.c:672
>   17 Thread 1208023360 (LWP 20747)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9af770) at
> rts/Schedule.c:672
>   16 Thread 1199630656 (LWP 20746)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9adc30) at
> rts/Schedule.c:672
>   15 Thread 1191237952 (LWP 20745)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9ac0f0) at
> rts/Schedule.c:672
>   14 Thread 1182845248 (LWP 20744)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9aa5b0) at
> rts/Schedule.c:672
>   13 Thread 1174452544 (LWP 20743)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a8a70) at
> rts/Schedule.c:672
>   12 Thread 1166059840 (LWP 20742)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a6f30) at
> rts/Schedule.c:672
>   11 Thread 1157667136 (LWP 20741)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a53f0) at
> rts/Schedule.c:672
>   10 Thread 1149274432 (LWP 20740)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a38b0) at
> rts/Schedule.c:672
>   9 Thread 1140881728 (LWP 20739)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a1d70) at
> rts/Schedule.c:672
>   8 Thread 1132489024 (LWP 20738)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a0230) at
> rts/Schedule.c:672
>   7 Thread 1124096320 (LWP 20737)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x99e6f0) at
> rts/Schedule.c:672
>   6 Thread 1115703616 (LWP 20736)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x99cbb0) at
> rts/Schedule.c:672
>   5 Thread 1107310912 (LWP 20735)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x99b070) at
> rts/Schedule.c:672
>   4 Thread 1098918208 (LWP 20734)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x999530) at
> rts/Schedule.c:672
>   3 Thread 1090525504 (LWP 20733)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9979f0) at
> rts/Schedule.c:672
>   2 Thread 1082132800 (LWP 20732)  0x00002aae2fa22548 in
> __lll_mutex_lock_wait () from /lib64/libpthread.so.0
>   1 Thread 46927615298704 (LWP 20729)  0x00002aae2fa20553 in
> pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>
> The values of *task and the thread in which the crash occurs is not
> always consistent, but the stacktrace is. For example, another crash...
>
> (gdb) bt
> #0  0x000000000067262c in schedule (initialCapability=<value optimized
> out>, task=0x9b4930) at rts/Schedule.c:672
> #1  0x0000000000673205 in workerStart (task=0x9915e0) at
> rts/Schedule.c:2033
> #2  0x00002aae0cd1a143 in start_thread () from /lib64/libpthread.so.0
> #3  0x00002aae0ceee8cd in clone () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
> (gdb) print *task
> $6 = {id = 1233201472, cap = 0x0, stopped = 11791697, suspended_tso =
> 0x0, tso = 0x25ed0f1, stat = NoStatus, ret = 0x0, cond = {__data =
> {__lock = 1, __futex = 0,
>       __total_seq = 1, __wakeup_seq = 0, __woken_seq = 0, __mutex =
> 0x84e800, __nwaiters = 10033872, __broadcast_seq = 0},
>     __size = "\001\000\000\000\000\000\000\000\001", '\0' <repeats 24
> times>, "�\204\000\000\000\000\000�\032\231\000\000\000\000", __align =
> 1}, lock = {__data = {__lock = 0,
>       __count = 1, __owner = 0, __nusers = 0, __kind = 875, __spins = 0,
> __list = {__prev = 0x33a, __next = 0x2da2b}},
>     __size = "\000\000\000\000\001", '\0' <repeats 11 times>,
> "k\003\000\000\000\000\000\000:\003\000\000\000\000\000\000+�\002\000\000\000\000",
> __align = 4294967296},
>   wakeup = 186902, elapsedtimestart = 24, muttimestart = 0, mut_time = 0,
> mut_etime = 0, gc_time = 0, gc_etime = 10033296, prev = 0x9b4930, next =
> 0x0, return_link = 0x0,
>   all_link = 0x0, prev_stack = 0x9b4b80}
> (gdb) info threads
>   22 Thread 1249986880 (LWP 20846)  0x00002aae0cd20548 in
> __lll_mutex_lock_wait () from /lib64/libpthread.so.0
>   21 Thread 1241594176 (LWP 20845)  0x00002aae0cd1e1c6 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> * 20 Thread 1233201472 (LWP 20844)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9b4930) at
> rts/Schedule.c:672
>   19 Thread 1224808768 (LWP 20843)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9b2df0) at
> rts/Schedule.c:672
>   18 Thread 1216416064 (LWP 20842)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9b12b0) at
> rts/Schedule.c:672
>   17 Thread 1208023360 (LWP 20841)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9af770) at
> rts/Schedule.c:672
>   16 Thread 1199630656 (LWP 20840)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9adc30) at
> rts/Schedule.c:672
>   15 Thread 1191237952 (LWP 20839)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9ac0f0) at
> rts/Schedule.c:672
>   14 Thread 1182845248 (LWP 20838)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9aa5b0) at
> rts/Schedule.c:672
>   13 Thread 1174452544 (LWP 20837)  gcWorkerThread (cap=<value optimized
> out>) at includes/SpinLock.h:45
>   12 Thread 1166059840 (LWP 20836)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a6f30) at
> rts/Schedule.c:672
>   11 Thread 1157667136 (LWP 20835)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a53f0) at
> rts/Schedule.c:672
>   10 Thread 1149274432 (LWP 20834)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a38b0) at
> rts/Schedule.c:672
>   9 Thread 1140881728 (LWP 20833)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a1d70) at
> rts/Schedule.c:672
>   8 Thread 1132489024 (LWP 20832)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9a0230) at
> rts/Schedule.c:672
>   7 Thread 1124096320 (LWP 20831)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x99e6f0) at
> rts/Schedule.c:672
>   6 Thread 1115703616 (LWP 20830)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x99cbb0) at
> rts/Schedule.c:672
>   5 Thread 1107310912 (LWP 20829)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x99b070) at
> rts/Schedule.c:672
>   4 Thread 1098918208 (LWP 20827)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x999530) at
> rts/Schedule.c:672
>   3 Thread 1090525504 (LWP 20826)  0x000000000067262c in schedule
> (initialCapability=<value optimized out>, task=0x9979f0) at
> rts/Schedule.c:672
>   2 Thread 1082132800 (LWP 20825)  0x00002aae0cee86e2 in select () from
> /lib64/libc.so.6
>   1 Thread 46927031233680 (LWP 20824)  0x00002aae0cd1c2ef in
> pthread_mutex_lock () from /lib64/libpthread.so.0
>
> valgrind memcheck does not report any errors prior to the NULL deref:
> ==20983== Thread 2:
> ==20983== Invalid read of size 8
> ==20983==    at 0x67262C: schedule (Schedule.c:672)
> ==20983==    by 0x673204: workerStart (Schedule.c:2033)
> ==20983==    by 0x58E4142: start_thread (in /lib64/libpthread-2.4.so)
> ==20983==    by 0x5AB78CC: clone (in /lib64/libc-2.4.so)
> ==20983==  Address 0x1d0 is not stack'd, malloc'd or (recently) free'd
> ==20983==
> ==20983== Process terminating with default action of signal 11 (SIGSEGV)
>
> valgrind helgrind reports numerous possible data race conditions,
> including one between schedule and GarbageCollect just prior to the
> crash:
> ==21106== Possible data race during read of size 8 at 0x5ccc300 by thread
> #5
> ==21106==    at 0x672612: schedule (Schedule.c:367)
> ==21106==    by 0x673204: workerStart (Schedule.c:2033)
> ==21106==    by 0x4A24A1E: mythread_wrapper (hg_intercepts.c:194)
> ==21106==    by 0x58E7142: start_thread (in /lib64/libpthread-2.4.so)
> ==21106==    by 0x5ABA8CC: clone (in /lib64/libc-2.4.so)
> ==21106==  This conflicts with a previous write of size 8 by thread #1
> ==21106==    at 0x679B8D: GarbageCollect (SpinLock.h:55)
> ==21106==    by 0x6712D0: scheduleDoGC (Schedule.c:1522)
> ==21106==    by 0x672C77: schedule (Schedule.c:621)
> ==21106==    by 0x66FF84: real_main (RtsMain.c:68)
> ==21106==    by 0x67009D: hs_main (RtsMain.c:117)
> ==21106==    by 0x5A17183: (below main) (in /lib64/libc-2.4.so)
>
> I am still working on whether I can make a small program to reproduce
> this.

New description:

 Using ghc and runtime built from HEAD on Tuesday (although this has been
 an issue on older builds I tried as well), the ghc runtime crashes on a
 null deref.

 The system is a 24-processor Intel Xeon 2.66 GHz shared memory system
 running Linux 2.6.16.60-0.34-smp in 64 bit mode (from SUSE 10 SP2).

 This only happens when the program is compiled with -threaded (threaded
 runtime), but doesn't happen with the threaded debug runtime.

 It also only happens when +RTS -N2 or a greater number of threads is
 passed to the runtime. It doesn't happen when -g1 is passed to turn off
 distributed garbage collection.
 {{{
 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 1241594176 (LWP 20751)]
 0x000000000067262c in schedule (initialCapability=<value optimized out>,
 task=0x994370) at rts/Schedule.c:672
 672         return (waiting_for_gc ||
 (gdb) list
 667         //   - another thread is initiating a GC
 668         //   - another Task is returning from a foreign call
 669         //   - the thread at the head of the run queue cannot be run
 670         //     by this Task (it is bound to another Task, or it is
 unbound
 671         //     and this task it bound).
 672         return (waiting_for_gc ||
 673                 cap->returning_tasks_hd != NULL ||
 674                 (!emptyRunQueue(cap) && (task->tso == NULL
 675                                          ? cap->run_queue_hd->bound !=
 NULL
 676                                          : cap->run_queue_hd->bound !=
 task)));
 (gdb) print cap
 $1 = (Capability *) 0x0
 (gdb) print *task
 $2 = {id = 1241594176, cap = 0x0, stopped = 7160552, suspended_tso = 0x0,
 tso = 0x7fccc4, stat = NoStatus, ret = 0x0, cond = {__data = {__lock = 1,
 __futex = 0,
       __total_seq = 1, __wakeup_seq = 0, __woken_seq = 0, __mutex =
 0x84e340, __nwaiters = 10033872, __broadcast_seq = 0},
     __size = "\001\000\000\000\000\000\000\000\001", '\0' <repeats 23
 times>, "@�\204\000\000\000\000\000�\032\231\000\000\000\000", __align =
 1}, lock = {__data = {
       __lock = 0, __count = 1, __owner = 0, __nusers = 0, __kind = 22668,
 __spins = 0, __list = {__prev = 0x5793, __next = 0x85afd}},
     __size = "\000\000\000\000\001", '\0' <repeats 11 times>,
 
"\214X\000\000\000\000\000\000\223W\000\000\000\000\000\000�Z\b\000\000\000\000",
 __align = 4294967296},
   wakeup = 547420, elapsedtimestart = 202, muttimestart = 0, mut_time = 0,
 mut_etime = 0, gc_time = 0, gc_etime = 10033296, prev = 0x994370, next =
 0x0, return_link = 0x0,
   all_link = 0x0, prev_stack = 0x9945c0}
 (gdb) info threads
 * 21 Thread 1241594176 (LWP 20751)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x994370) at
 rts/Schedule.c:672
   20 Thread 1233201472 (LWP 20750)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9b4930) at
 rts/Schedule.c:672
   19 Thread 1224808768 (LWP 20749)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9b2df0) at
 rts/Schedule.c:672
   18 Thread 1216416064 (LWP 20748)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9b12b0) at
 rts/Schedule.c:672
   17 Thread 1208023360 (LWP 20747)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9af770) at
 rts/Schedule.c:672
   16 Thread 1199630656 (LWP 20746)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9adc30) at
 rts/Schedule.c:672
   15 Thread 1191237952 (LWP 20745)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9ac0f0) at
 rts/Schedule.c:672
   14 Thread 1182845248 (LWP 20744)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9aa5b0) at
 rts/Schedule.c:672
   13 Thread 1174452544 (LWP 20743)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a8a70) at
 rts/Schedule.c:672
   12 Thread 1166059840 (LWP 20742)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a6f30) at
 rts/Schedule.c:672
   11 Thread 1157667136 (LWP 20741)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a53f0) at
 rts/Schedule.c:672
   10 Thread 1149274432 (LWP 20740)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a38b0) at
 rts/Schedule.c:672
   9 Thread 1140881728 (LWP 20739)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a1d70) at
 rts/Schedule.c:672
   8 Thread 1132489024 (LWP 20738)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a0230) at
 rts/Schedule.c:672
   7 Thread 1124096320 (LWP 20737)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x99e6f0) at
 rts/Schedule.c:672
   6 Thread 1115703616 (LWP 20736)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x99cbb0) at
 rts/Schedule.c:672
   5 Thread 1107310912 (LWP 20735)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x99b070) at
 rts/Schedule.c:672
   4 Thread 1098918208 (LWP 20734)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x999530) at
 rts/Schedule.c:672
   3 Thread 1090525504 (LWP 20733)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9979f0) at
 rts/Schedule.c:672
   2 Thread 1082132800 (LWP 20732)  0x00002aae2fa22548 in
 __lll_mutex_lock_wait () from /lib64/libpthread.so.0
   1 Thread 46927615298704 (LWP 20729)  0x00002aae2fa20553 in
 pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
 }}}
 The values of *task and the thread in which the crash occurs is not always
 consistent, but the stacktrace is. For example, another crash...
 {{{
 (gdb) bt
 #0  0x000000000067262c in schedule (initialCapability=<value optimized
 out>, task=0x9b4930) at rts/Schedule.c:672
 #1  0x0000000000673205 in workerStart (task=0x9915e0) at
 rts/Schedule.c:2033
 #2  0x00002aae0cd1a143 in start_thread () from /lib64/libpthread.so.0
 #3  0x00002aae0ceee8cd in clone () from /lib64/libc.so.6
 #4  0x0000000000000000 in ?? ()
 (gdb) print *task
 $6 = {id = 1233201472, cap = 0x0, stopped = 11791697, suspended_tso = 0x0,
 tso = 0x25ed0f1, stat = NoStatus, ret = 0x0, cond = {__data = {__lock = 1,
 __futex = 0,
       __total_seq = 1, __wakeup_seq = 0, __woken_seq = 0, __mutex =
 0x84e800, __nwaiters = 10033872, __broadcast_seq = 0},
     __size = "\001\000\000\000\000\000\000\000\001", '\0' <repeats 24
 times>, "�\204\000\000\000\000\000�\032\231\000\000\000\000", __align =
 1}, lock = {__data = {__lock = 0,
       __count = 1, __owner = 0, __nusers = 0, __kind = 875, __spins = 0,
 __list = {__prev = 0x33a, __next = 0x2da2b}},
     __size = "\000\000\000\000\001", '\0' <repeats 11 times>,
 
"k\003\000\000\000\000\000\000:\003\000\000\000\000\000\000+�\002\000\000\000\000",
 __align = 4294967296},
   wakeup = 186902, elapsedtimestart = 24, muttimestart = 0, mut_time = 0,
 mut_etime = 0, gc_time = 0, gc_etime = 10033296, prev = 0x9b4930, next =
 0x0, return_link = 0x0,
   all_link = 0x0, prev_stack = 0x9b4b80}
 (gdb) info threads
   22 Thread 1249986880 (LWP 20846)  0x00002aae0cd20548 in
 __lll_mutex_lock_wait () from /lib64/libpthread.so.0
   21 Thread 1241594176 (LWP 20845)  0x00002aae0cd1e1c6 in
 pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
 * 20 Thread 1233201472 (LWP 20844)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9b4930) at
 rts/Schedule.c:672
   19 Thread 1224808768 (LWP 20843)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9b2df0) at
 rts/Schedule.c:672
   18 Thread 1216416064 (LWP 20842)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9b12b0) at
 rts/Schedule.c:672
   17 Thread 1208023360 (LWP 20841)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9af770) at
 rts/Schedule.c:672
   16 Thread 1199630656 (LWP 20840)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9adc30) at
 rts/Schedule.c:672
   15 Thread 1191237952 (LWP 20839)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9ac0f0) at
 rts/Schedule.c:672
   14 Thread 1182845248 (LWP 20838)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9aa5b0) at
 rts/Schedule.c:672
   13 Thread 1174452544 (LWP 20837)  gcWorkerThread (cap=<value optimized
 out>) at includes/SpinLock.h:45
   12 Thread 1166059840 (LWP 20836)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a6f30) at
 rts/Schedule.c:672
   11 Thread 1157667136 (LWP 20835)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a53f0) at
 rts/Schedule.c:672
   10 Thread 1149274432 (LWP 20834)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a38b0) at
 rts/Schedule.c:672
   9 Thread 1140881728 (LWP 20833)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a1d70) at
 rts/Schedule.c:672
   8 Thread 1132489024 (LWP 20832)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9a0230) at
 rts/Schedule.c:672
   7 Thread 1124096320 (LWP 20831)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x99e6f0) at
 rts/Schedule.c:672
   6 Thread 1115703616 (LWP 20830)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x99cbb0) at
 rts/Schedule.c:672
   5 Thread 1107310912 (LWP 20829)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x99b070) at
 rts/Schedule.c:672
   4 Thread 1098918208 (LWP 20827)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x999530) at
 rts/Schedule.c:672
   3 Thread 1090525504 (LWP 20826)  0x000000000067262c in schedule
 (initialCapability=<value optimized out>, task=0x9979f0) at
 rts/Schedule.c:672
   2 Thread 1082132800 (LWP 20825)  0x00002aae0cee86e2 in select () from
 /lib64/libc.so.6
   1 Thread 46927031233680 (LWP 20824)  0x00002aae0cd1c2ef in
 pthread_mutex_lock () from /lib64/libpthread.so.0
 }}}
 valgrind memcheck does not report any errors prior to the NULL deref:
 {{{
 ==20983== Thread 2:
 ==20983== Invalid read of size 8
 ==20983==    at 0x67262C: schedule (Schedule.c:672)
 ==20983==    by 0x673204: workerStart (Schedule.c:2033)
 ==20983==    by 0x58E4142: start_thread (in /lib64/libpthread-2.4.so)
 ==20983==    by 0x5AB78CC: clone (in /lib64/libc-2.4.so)
 ==20983==  Address 0x1d0 is not stack'd, malloc'd or (recently) free'd
 ==20983==
 ==20983== Process terminating with default action of signal 11 (SIGSEGV)
 }}}
 valgrind helgrind reports numerous possible data race conditions,
 including one between schedule and GarbageCollect just prior to the crash:
 {{{
 ==21106== Possible data race during read of size 8 at 0x5ccc300 by thread
 #5
 ==21106==    at 0x672612: schedule (Schedule.c:367)
 ==21106==    by 0x673204: workerStart (Schedule.c:2033)
 ==21106==    by 0x4A24A1E: mythread_wrapper (hg_intercepts.c:194)
 ==21106==    by 0x58E7142: start_thread (in /lib64/libpthread-2.4.so)
 ==21106==    by 0x5ABA8CC: clone (in /lib64/libc-2.4.so)
 ==21106==  This conflicts with a previous write of size 8 by thread #1
 ==21106==    at 0x679B8D: GarbageCollect (SpinLock.h:55)
 ==21106==    by 0x6712D0: scheduleDoGC (Schedule.c:1522)
 ==21106==    by 0x672C77: schedule (Schedule.c:621)
 ==21106==    by 0x66FF84: real_main (RtsMain.c:68)
 ==21106==    by 0x67009D: hs_main (RtsMain.c:117)
 ==21106==    by 0x5A17183: (below main) (in /lib64/libc-2.4.so)
 }}}

 I am still working on whether I can make a small program to reproduce
 this.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3295#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to