#5003: test 4038 (ghci) fails on OS X 64 bit.
---------------------------------+------------------------------------------
    Reporter:  gwright           |        Owner:  gwright      
        Type:  bug               |       Status:  new          
    Priority:  high              |    Milestone:  7.2.1        
   Component:  GHCi              |      Version:  7.0.2        
    Keywords:                    |     Testcase:               
   Blockedby:                    |   Difficulty:               
          Os:  Unknown/Multiple  |     Blocking:               
Architecture:  x86_64 (amd64)    |      Failure:  Runtime crash
---------------------------------+------------------------------------------

Comment(by gwright):

 More information.  Great chunks of stack are being gobbled by
 `StgRunIsImplementedInAssembler` (in `rts/StgCRun.c`).  I set a watchpoint
 on `$rsp` an ran it between two breakpoints of `stg_makeStablePtrzh`.
 Here's the interesting piece:
 {{{
 Watchpoint 2: $rsp

 Old value = (void *) 0x7fff5f407b70
 New value = (void *) 0x7fff5f407b68
 0x0000000100330200 in StgRunIsImplementedInAssembler ()
 1: x/i $rip  0x100330200 <StgRunIsImplementedInAssembler>:      sub
 $0x4038,%rsp
 Continuing.
 Watchpoint 2: $rsp

 Old value = (void *) 0x7fff5f407b68
 New value = (void *) 0x7fff5f403b30
 0x0000000100330207 in StgRunIsImplementedInAssembler ()
 1: x/i $rip  0x100330207 <StgRunIsImplementedInAssembler+7>:    mov
 %rsp,%rax
 Continuing.
 }}}
 The stack pointer is being moved by a bit more than 16 kbytes by the `sub`
 instruction.

 On x86_64, the code for `StgRunIsImplementedInAssembler` is
 {{{
 static void GNUC3_ATTRIBUTE(used)
 StgRunIsImplementedInAssembler(void)
 {
     __asm__ volatile (
         /*
          * save callee-saves registers on behalf of the STG code.
          */
         ".globl " STG_RUN "\n"
         STG_RUN ":\n\t"
         "subq %0, %%rsp\n\t"
         "movq %%rsp, %%rax\n\t"
         "addq %0-48, %%rax\n\t"
         "movq %%rbx,0(%%rax)\n\t"
         "movq %%rbp,8(%%rax)\n\t"
         "movq %%r12,16(%%rax)\n\t"
         "movq %%r13,24(%%rax)\n\t"
         "movq %%r14,32(%%rax)\n\t"
         "movq %%r15,40(%%rax)\n\t"
         /*
          * Set BaseReg
          */
         "movq %%rsi,%%r13\n\t"
         /*
          * grab the function argument from the stack, and jump to it.
          */
         "movq %%rdi,%%rax\n\t"
         "jmp *%%rax\n\t"

         ".globl " STG_RETURN "\n"
         STG_RETURN ":\n\t"

         "movq %%rbx, %%rax\n\t"   /* Return value in R1  */

         /*
          * restore callee-saves registers.  (Don't stomp on %%rax!)
          */
         "movq %%rsp, %%rdx\n\t"
         "addq %0-48, %%rdx\n\t"
         "movq 0(%%rdx),%%rbx\n\t"       /* restore the registers saved
 above */
         "movq 8(%%rdx),%%rbp\n\t"
         "movq 16(%%rdx),%%r12\n\t"
         "movq 24(%%rdx),%%r13\n\t"
         "movq 32(%%rdx),%%r14\n\t"
         "movq 40(%%rdx),%%r15\n\t"
         "addq %0, %%rsp\n\t"
         "retq"

         : : "i"(RESERVED_C_STACK_BYTES+48+8 /*stack frame size*/));
     /*
        HACK alert!

        The x86_64 ABI specifies that on a procedure call, %rsp is
        aligned on a 16-byte boundary + 8.  That is, the first
        argument on the stack after the return address will be
        16-byte aligned.

        Which should be fine: RESERVED_C_STACK_BYTES+48 is a multiple
        of 16 bytes.

        BUT... when we do a C-call from STG land, gcc likes to put the
        stack alignment adjustment in the prolog.  eg. if we're calling
        a function with arguments in regs, gcc will insert 'subq $8,%rsp'
        in the prolog, to keep %rsp aligned (the return address is 8
        bytes, remember).  The mangler throws away the prolog, so we
        lose the stack alignment.

        The hack is to add this extra 8 bytes to our %rsp adjustment
        here, so that throughout STG code, %rsp is 16-byte aligned,
        ready for a C-call.

        A quick way to see if this is wrong is to compile this code:

           main = System.Exit.exitWith ExitSuccess

        And run it with +RTS -sstderr.  The stats code in the RTS, in
        particular statsPrintf(), relies on the stack alignment because
        it saves the %xmm regs on the stack, so it'll fall over if the
        stack isn't aligned, and calling exitWith from Haskell invokes
        shutdownHaskellAndExit using a C call.

        Future gcc releases will almost certainly break this hack...
     */
 }
 }}}
 Well, we're warned that something might go wrong here.

 When watching `$rsp` I saw the initial `subq` instruction but I never saw
 an `addq` instruction popping that big chunk of memory off the stack.

 The next question is whether these large blocks should be pushed, in which
 case the underlying error is elsewhere, or if the value 0x4038 subtracted
 from `$rsp` is wrong.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5003#comment:16>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to