On 4/22/18 3:17 AM, Cym13 wrote:
On Sunday, 22 April 2018 at 05:29:30 UTC, Mike Franklin wrote:
On Sunday, 22 April 2018 at 00:41:34 UTC, Nicholas Wilson wrote:

You're not using the C library version of it, the compiler does the stack space reservation inline for you. There is no way around this.

I'm not convinced.  I did some no-runtime testing and eventually found the implementation in druntime here: https://github.com/dlang/druntime/blob/master/src/rt/alloca.d

Mike

The first assertion ("the C library isn't called") is easily apparent from
that assembly dump. The second is interesting but not so evident.

It might be clearer looking at actual assembly.

The doSomething function starts as such:

; sym._D4test11doSomethingFmZv (int arg_1h);
     ; prologue, puts the old stack pointer on the stack
       0x563d809095ec      55             push rbp
       0x563d809095ed      488bec         mov rbp, rsp
     ; allocate stack memory
       0x563d809095f0      4883ec20       sub rsp, 0x20
     ; set up arguments for the alloca call
     ; the 0x20 stored in local_18h is the size of the current stack
     ; allocation; rcx receives the address of that slot
       0x563d809095f4      48c745e82000.  mov qword [local_18h], 0x20 ; 32
       0x563d809095fc      48ffc7         inc rdi
       0x563d809095ff      48897de0       mov qword [local_20h], rdi
       0x563d80909603      488d4de8       lea rcx, [local_18h]
     ; calls alloca
       0x563d80909607      e830010000     call sym.__alloca

The alloca function works as such:

;-- __alloca:
     ; Note how we don't create a stack frame by "push rbp;mov rbp,rsp"
     ; Those instructions could be inlined, it's not a function per se
     ;
     ; At that point rcx points to the size of the calling function's stack
     ; frame and rdi holds how much we want to add
       0x563d8090973c      4889ca         mov rdx, rcx
       0x563d8090973f      4889f8         mov rax, rdi
     ; Round rax up to 16 bytes
       0x563d80909742      4883c00f       add rax, 0xf
       0x563d80909746      24f0           and al, 0xf0
       0x563d80909748      4885c0         test rax, rax
   ,=< 0x563d8090974b      7505           jne 0x563d80909752
   |   0x563d8090974d      b810000000     mov eax, 0x10
   `-> 0x563d80909752      4889c6         mov rsi, rax
     ; Do the subtraction in rax which holds the new address
       0x563d80909755      48f7d8         neg rax
       0x563d80909758      4801e0         add rax, rsp
     ; Check for overflows
   ,=< 0x563d8090975b      7321           jae 0x563d8090977e
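   | ; Replace the old stack pointer with the new one, grow the recorded
   | ; frame size, and move whatever sits below the caller's locals
   | ; (here the return address) down to the new top of the stack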
   |   0x563d8090975d      4889e9         mov rcx, rbp
   |   0x563d80909760      4829e1         sub rcx, rsp
   |   0x563d80909763      482b0a         sub rcx, qword [rdx]
   |   0x563d80909766      480132         add qword [rdx], rsi
   |   0x563d80909769      4889c4         mov rsp, rax
   |   0x563d8090976c      4801c8         add rax, rcx
   |   0x563d8090976f      4889e7         mov rdi, rsp
   |   0x563d80909772      4801e6         add rsi, rsp
   |   0x563d80909775      48c1e903       shr rcx, 3
   |   0x563d80909779      f348a5         rep movsq qword [rdi], qword ptr [rsi]
  ,==< 0x563d8090977c      eb02           jmp 0x563d80909780
  |`-> 0x563d8090977e      31c0           xor eax, eax
  |  ; Done!
  `--> 0x563d80909780      c3             ret

So as you can see, alloca isn't really a function in that it doesn't create a
stack frame. It also needs help from the compiler to set up its arguments,
since it needs the current allocation size (passed by address in rcx at the
start of alloca), which isn't a parameter known to the programmer. The
compiler has to detect that __alloca call and set up an additional argument
by itself. Alloca then just ("just") modifies the calling frame.
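
To make the bookkeeping easier to follow, here is a rough C model of what the
dump above appears to do. It is only a sketch of my reading of the assembly,
not the druntime source: the Stack struct, round_up16, model_alloca and
ALLOCA_FAIL are all made-up names, the stack is a plain byte array indexed
from 0, and the copy is done in bytes rather than qwords.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical failure marker; the real routine returns null instead. */
#define ALLOCA_FAIL UINT64_MAX

/* Toy stack: "addresses" are indices into mem and the stack grows down. */
typedef struct {
    uint8_t  mem[256];
    uint64_t rsp;   /* stack pointer on entry (points at the return address) */
    uint64_t rbp;   /* caller's frame pointer */
} Stack;

/* add rax, 0xf / and al, 0xf0, with a zero result replaced by 16. */
static uint64_t round_up16(uint64_t n)
{
    uint64_t r = (n + 15) & ~(uint64_t)15;
    return r ? r : 16;
}

/* frame_size plays the role of the slot whose address is passed in rcx. */
static uint64_t model_alloca(Stack *s, uint64_t *frame_size, uint64_t n)
{
    uint64_t size = round_up16(n);

    /* Models the jae check: the new stack pointer must not wrap around. */
    if (size > s->rsp)
        return ALLOCA_FAIL;

    /* Bytes sitting below the caller's locals, e.g. the return address. */
    assert(s->rbp >= s->rsp + *frame_size);
    uint64_t below = s->rbp - s->rsp - *frame_size;

    uint64_t new_rsp = s->rsp - size;
    *frame_size += size;                 /* add qword [rdx], rsi */

    /* rep movsq: slide those bytes down to the new top of the stack. */
    memmove(&s->mem[new_rsp], &s->mem[s->rsp], below);
    s->rsp = new_rsp;

    /* The fresh block starts right above the bytes that were moved. */
    return new_rsp + below;
}
```

With rbp at 200, a 0x20-byte frame and the return address at 160, asking for
5 bytes in this model moves the return address to 144, bumps the recorded
frame size to 48 and hands back the 16-byte block starting at 152, directly
below the caller's locals.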


(I really hope I didn't mess something up)

Thanks, I didn't realize there was an implementation outside the compiler. I had thought the compiler did all this for you.

I also didn't realize there was an actual function (stack frame or no stack frame, you are calling and returning).

Literally, I thought alloca just bumped the stack pointer and loaded the result into your target. Seems really complex for what it's doing, but maybe that's because it's a function call that's not really normal.

-Steve
