#4303: Segfault in RTS (apparently only MacOS)   FFI related?
-------------------------------+--------------------------------------------
    Reporter:  patperry        |        Owner:  igloo        
        Type:  bug             |       Status:  new          
    Priority:  highest         |    Milestone:  7.2.1        
   Component:  Runtime System  |      Version:  6.12.3       
    Keywords:                  |     Testcase:               
   Blockedby:                  |   Difficulty:               
          Os:  MacOS X         |     Blocking:               
Architecture:  x86             |      Failure:  Runtime crash
-------------------------------+--------------------------------------------

Comment(by simonmar):

 I've been looking at this, and have been able to reproduce it, but I don't
 know what the problem is.  Here is what I've discovered:

 The program does not seem to crash when run with `-V0`, and it crashes
 slightly more often with `-C0.001`.  This suggests that the problem is
 somehow related to timer interrupts, which would explain why it is still
 non-deterministic even with the non-threaded RTS.

 When `StgReturn` is called, the C stack is supposed to look something like
 this:

 {{{
 (gdb) p32 $edx-64
 0xbffff99c:     0x120ab0b <schedule+948>
 0xbffff998:     0xbffff9e8
 0xbffff994:     0xbffffb20
 0xbffff990:     0x24
 0xbffff98c:     0x2cfe000
 0xbffff988:     0xbbbbbbbb
 0xbffff984:     0xbbbbbbbb
 0xbffff980:     0xbbbbbbbb
 0xbffff97c:     0xbbbbbbbb
 0xbffff978:     0xbbbbbbbb
 0xbffff974:     0xbbbbbbbb
 0xbffff970:     0xbbbbbbbb
 0xbffff96c:     0xbffff998
 0xbffff968:     0x0
 0xbffff964:     0x0
 0xbffff960:     0xbffffb20
 0xbffff95c:     0xbbbbbbbb
 0xbffff958:     0xbbbbbbbb
 0xbffff954:     0xbbbbbbbb
 0xbffff950:     0xbbbbbbbb
 0xbffff94c:     0xbbbbbbbb
 0xbffff948:     0xbbbbbbbb
 0xbffff944:     0xbbbbbbbb
 0xbffff940:     0xbbbbbbbb
 0xbffff93c:     0xbbbbbbbb
 0xbffff938:     0xbbbbbbbb
 0xbffff934:     0xbbbbbbbb
 0xbffff930:     0xbbbbbbbb
 0xbffff92c:     0xbbbbbbbb
 0xbffff928:     0xbbbbbbbb
 0xbffff924:     0xbbbbbbbb
 0xbffff920:     0xbbbbbbbb
 }}}

 (I added a memset to fill in the unused space with `0xbbbbbbbb` so we can
 easily see what has been written).
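
 As a rough aid for reading dumps like these, the following Python sketch
 lists the words that no longer hold the fill pattern.  It is not part of
 the RTS or of gdb; the helper names `parse_dump` and `written_words` are
 made up for illustration, and it assumes the dump is pasted in as text in
 the `0xADDR:     0xVALUE` layout shown above.

```python
import re

FILL = 0xBBBBBBBB  # the pattern written by the memset described above

# Matches the start of a dump line: "0xADDR:     0xVALUE"; anything after
# the value (such as a symbol name) is ignored.
LINE = re.compile(r"^\s*(0x[0-9a-fA-F]+):\s+(0x[0-9a-fA-F]+)")


def parse_dump(text):
    """Parse dump text into a list of (address, value) pairs."""
    words = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            words.append((int(m.group(1), 16), int(m.group(2), 16)))
    return words


def written_words(words, fill=FILL):
    """Return the (address, value) pairs that differ from the fill pattern."""
    return [(addr, value) for addr, value in words if value != fill]


# A few lines copied from the expected (non-crashing) stack dump.
sample = """
0xbffff970:     0xbbbbbbbb
0xbffff96c:     0xbffff998
0xbffff968:     0x0
0xbffff964:     0x0
0xbffff960:     0xbffffb20
0xbffff95c:     0xbbbbbbbb
"""

for addr, value in written_words(parse_dump(sample)):
    print(f"{addr:#010x}: {value:#010x}")
```

 Running the same helper over a crashing dump would show which addresses
 were written to in addition to the saved registers.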

 The 4 words around `0xbffff960` are the saved values of some registers,
 which are restored by `StgReturn` before returning to the RTS.  Now, when
 the crash happens, the stack looks something like this:

 {{{
 (gdb) p32 $edx-64
 0xbffffa0c:     0x120ab0b <schedule+948>
 0xbffffa08:     0xbffffa58
 0xbffffa04:     0xbffffb98
 0xbffffa00:     0x0
 0xbffff9fc:     0x1
 0xbffff9f8:     0xbbbbbbbb
 0xbffff9f4:     0xbbbbbbbb
 0xbffff9f0:     0xbbbbbbbb
 0xbffff9ec:     0xbf8a6d01
 0xbffff9e8:     0xa6d01a6d
 0xbffff9e4:     0xbbbbbbbb
 0xbffff9e0:     0xbbbbbbbb
 0xbffff9dc:     0xc0001771
 0xbffff9d8:     0x1a6d01a
 0xbffff9d4:     0xbf7a6d01
 0xbffff9d0:     0xbffffb98
 0xbffff9cc:     0xbf7a6d01
 0xbffff9c8:     0xa6d01a64
 0xbffff9c4:     0xbbbbbbbb
 0xbffff9c0:     0xbbbbbbbb
 0xbffff9bc:     0xbbbbbbbb
 0xbffff9b8:     0xbbbbbbbb
 0xbffff9b4:     0xbbbbbbbb
 0xbffff9b0:     0xbbbbbbbb
 0xbffff9ac:     0xbbbbbbbb
 0xbffff9a8:     0xbbbbbbbb
 0xbffff9a4:     0xbbbbbbbb
 0xbffff9a0:     0xbbbbbbbb
 0xbffff99c:     0xbbbbbbbb
 0xbffff998:     0xbbbbbbbb
 0xbffff994:     0xbf7a6d01
 0xbffff990:     0xa6d01a67
 }}}

 i.e. those register values have been overwritten by junk.  In fact, the
 junk looks a lot like floating-point values to me.
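
 One way to check that guess is to reinterpret a pair of adjacent words as
 a single IEEE 754 double.  The Python sketch below does this for one pair
 from the dump above.  It assumes that the two words really are the halves
 of one 64-bit double and that, as on x86, the low half sits at the lower
 address; the helper name `words_to_double` is made up for illustration.

```python
import struct


def words_to_double(low, high):
    """Reinterpret two 32-bit words as one little-endian IEEE 754 double.

    `low` is the word at the lower address, `high` the word 4 bytes above it.
    """
    return struct.unpack("<d", struct.pack("<II", low, high))[0]


# Adjacent words from the crashing dump:
#   0xbffff9c8 holds 0xa6d01a64 (lower address)
#   0xbffff9cc holds 0xbf7a6d01 (higher address)
value = words_to_double(0xA6D01A64, 0xBF7A6D01)
print(value)  # a small negative number, roughly -0.0065
```

 A plausible-looking magnitude is consistent with the floating-point
 guess, but it does not by itself prove where the values came from.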

 The exact values are different each time, although I've seen similar
 values crop up.  The exact addresses that have been overwritten also
 differ.  Here's another sample:

 {{{
 (gdb) p32 $edx-64
 0xbffffa0c:     0x120ab0b <schedule+948>
 0xbffffa08:     0xbffffa58
 0xbffffa04:     0xbffffba0
 0xbffffa00:     0x0
 0xbffff9fc:     0x4a39000
 0xbffff9f8:     0xbbbbbbbb
 0xbffff9f4:     0xbbbbbbbb
 0xbffff9f0:     0xbbbbbbbb
 0xbffff9ec:     0xbbbbbbbb
 0xbffff9e8:     0xbbbbbbbb
 0xbffff9e4:     0xbbbbbbbb
 0xbffff9e0:     0xbbbbbbbb
 0xbffff9dc:     0xc0001771
 0xbffff9d8:     0x1a6d01a
 0xbffff9d4:     0xbf7a6d01
 0xbffff9d0:     0xbffffba0
 0xbffff9cc:     0xbf7a6d01
 0xbffff9c8:     0xa6d01a64
 0xbffff9c4:     0xbbbbbbbb
 0xbffff9c0:     0xbbbbbbbb
 0xbffff9bc:     0xbbbbbbbb
 0xbffff9b8:     0xbbbbbbbb
 0xbffff9b4:     0xbbbbbbbb
 0xbffff9b0:     0xbbbbbbbb
 0xbffff9ac:     0xbbbbbbbb
 0xbffff9a8:     0xbbbbbbbb
 0xbffff9a4:     0xbbbbbbbb
 0xbffff9a0:     0xbbbbbbbb
 0xbffff99c:     0xbbbbbbbb
 0xbffff998:     0xbbbbbbbb
 0xbffff994:     0xbbbbbbbb
 0xbffff990:     0xbbbbbbbb
 }}}

 So something is corrupting memory, in an unpredictable way.

 Is there any way to get the size of the example down at all?

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4303#comment:29>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
