Re: [tip:x86/asm] objtool: Track DRAP separately from callee-saved registers

hpa Fri, 11 Aug 2017 10:31:02 -0700

On August 11, 2017 9:57:13 AM PDT, Josh Poimboeuf <[email protected]> wrote:
>On Fri, Aug 11, 2017 at 09:22:11AM -0700, Andy Lutomirski wrote:
>> On Fri, Aug 11, 2017 at 5:13 AM, tip-bot for Josh Poimboeuf
>> <[email protected]> wrote:
>> > Commit-ID:  bf4d1a83758368c842c94cab9661a75ca98bc848
>> > Gitweb:     http://git.kernel.org/tip/bf4d1a83758368c842c94cab9661a75ca98bc848
>> > Author:     Josh Poimboeuf <[email protected]>
>> > AuthorDate: Thu, 10 Aug 2017 16:37:26 -0500
>> > Committer:  Ingo Molnar <[email protected]>
>> > CommitDate: Fri, 11 Aug 2017 14:06:15 +0200
>> >
>> > objtool: Track DRAP separately from callee-saved registers
>> >
>> > When GCC realigns a function's stack, it sometimes uses %r13 as the DRAP
>> > register, like:
>> >
>> >   push  %r13
>> >   lea   0x10(%rsp), %r13
>> >   and   $0xfffffffffffffff0, %rsp
>> >   pushq -0x8(%r13)
>> >   push  %rbp
>> >   mov   %rsp, %rbp
>> >   push  %r13
>> >   ...
>> >   mov   -0x8(%rbp),%r13
>> >   leaveq
>> >   lea   -0x10(%r13), %rsp
>> >   pop   %r13
>> >   retq
>> >
>> 
>> I have a couple questions, mainly to help me understand.
>> 
>> Question 1: What does DRAP stand for?  Duplicate Return Address
>> Pointer?  Dynamic ReAlignment Pointer?  I tried searching and got
>> nothing.
>
>It seems to be a GCC invention which stands for:
>
>  Dynamic Realign Argument Pointer.
>
>I don't think it's documented anywhere, but there's at least some
>comments about it in the GCC sources if you search for DRAP.
>
>> Question 2: What's up with the resulting stack layout?  It seems we have:
>> 
>> caller's last stack slot  <-- r13 in function body points here
>> return address
>> old r13
>> [possible padding for alignment]
>> return address, duplicated (for naive unwinder's benefit?)
>> old rbp  <-- rbp in body points here
>> new r13, i.e. pointer to caller's last stack slot
>> 
>> Now we have the function body, and r13 is free for use in here because
>> it's saved.
>> 
>> In the epilogue, we recover r13, use leaveq (hmm, shorter than pop
>> %rbp but does more work than needed), restore the old r13, and return.
>> 
>> I don't get it, though.  gcc only ever uses that inner r13 with an
>> offset.  The code would be considerably shorter if the second
>> instruction were just mov %rsp, %r13.  That would change the push to
>> pushq 0x8(%rsp) and the third-to-last instruction to mov %r13, %rsp,
>> saving something like 8 bytes of code.
>
>I don't know why it doesn't do it the way you suggest, but I'm glad it
>doesn't because I think it would make the DWARF/ORC data even more
>complicated.  Here it's "simple", because r13 == DWARF CFA.
>
>> I also don't get why any of this is needed.  Couldn't the compiler
>> just do push %rbp; mov %rsp, %rbp; and $0xfffffffffffffff0, %rsp and
>> be done with it?
>
>Good question.  I wish it did just use the frame pointer, because
>dealing with DRAP has been a headache.
>
>> I compiled this:
>> 
>> void func()
>> {
>>     int var __attribute__((aligned(32)));
>>     asm volatile ("" :: "m" (var));
>> }
>> 
>> and got:
>> 
>> func:
>>     leaq    8(%rsp), %r10
>>     andq    $-32, %rsp
>>     pushq    -8(%r10)
>>     pushq    %rbp
>>     movq    %rsp, %rbp
>>     pushq    %r10
>>     popq    %r10
>>     popq    %rbp
>>     leaq    -8(%r10), %rsp
>>     ret
>> 
>> Which is better than the crud you pasted, since it at least uses a
>> caller-saved reg (r10), but we still have the nasty addressing modes
>> *and* an unnecessary push and pop of r10.
>> 
>> I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81825 and maybe
>> some GCC person has a clue what's going on.
>
>I've found that, when it does this DRAP pattern, most of the time it
>uses r10.  The r13 version seems to be more rare.  I can provide a
>real-world r13 example if that would help.


One could logically assume GCC picks %r10 whenever a clobbered register is
sufficient.  It would make sense to do that renaming fairly late in the game.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
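
For readers following the stack layout question in the quoted thread, here is a minimal sketch in C that replays the r13 prologue and epilogue from the commit message on a toy stack. It is only an illustration, not anything from objtool or GCC: the names drap_machine, drap_result and simulate_drap are made up for this example, memory is a small array of 8-byte slots, and the function body is assumed to do nothing except clobber r13. It shows why r13 equals the DWARF CFA (entry %rsp + 8) in the body and that the epilogue restores the caller's registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 64 /* toy stack: mem[i] models the 8-byte slot at address i * 8 */

/* Hypothetical toy machine: only the registers the sequence touches. */
struct drap_machine {
    uint64_t mem[STACK_WORDS];
    uint64_t rsp, rbp, r13;
};

/* Values observed in the function body and after the final retq. */
struct drap_result {
    uint64_t drap;       /* r13 in the body (the DRAP register) */
    uint64_t body_rbp;   /* rbp in the body */
    uint64_t dup_ret;    /* duplicated return address at 8(%rbp) */
    uint64_t saved_drap; /* DRAP value stored at -8(%rbp) */
    uint64_t ret_target; /* address popped by retq */
    uint64_t final_rsp, final_rbp, final_r13;
};

static void push(struct drap_machine *m, uint64_t v)
{
    m->rsp -= 8;
    m->mem[m->rsp / 8] = v;
}

static uint64_t pop(struct drap_machine *m)
{
    uint64_t v = m->mem[m->rsp / 8];
    m->rsp += 8;
    return v;
}

static uint64_t load(const struct drap_machine *m, uint64_t addr)
{
    return m->mem[addr / 8];
}

/* entry_rsp is %rsp at function entry, i.e. it points at the return address. */
struct drap_result simulate_drap(uint64_t entry_rsp, uint64_t ret_addr,
                                 uint64_t old_rbp, uint64_t old_r13)
{
    struct drap_machine m;
    struct drap_result r;

    /* Assumption of this sketch: 8-byte aligned, with room for the prologue. */
    assert(entry_rsp % 8 == 0 && entry_rsp >= 64 && entry_rsp / 8 < STACK_WORDS);
    memset(&m, 0, sizeof(m));
    m.rsp = entry_rsp;
    m.rbp = old_rbp;
    m.r13 = old_r13;
    m.mem[entry_rsp / 8] = ret_addr;   /* pushed by the caller's call insn */

    push(&m, m.r13);                   /* push  %r13 */
    m.r13 = m.rsp + 0x10;              /* lea   0x10(%rsp), %r13 */
    m.rsp &= ~(uint64_t)0xf;           /* and   $0xfffffffffffffff0, %rsp */
    push(&m, load(&m, m.r13 - 0x8));   /* pushq -0x8(%r13) */
    push(&m, m.rbp);                   /* push  %rbp */
    m.rbp = m.rsp;                     /* mov   %rsp, %rbp */
    push(&m, m.r13);                   /* push  %r13 */

    r.drap = m.r13;
    r.body_rbp = m.rbp;
    r.dup_ret = load(&m, m.rbp + 0x8);
    r.saved_drap = load(&m, m.rbp - 0x8);

    m.r13 = 0;                         /* body is free to clobber r13 */

    m.r13 = load(&m, m.rbp - 0x8);     /* mov   -0x8(%rbp), %r13 */
    m.rsp = m.rbp;                     /* leaveq: mov %rbp, %rsp ... */
    m.rbp = pop(&m);                   /*         ... then pop %rbp */
    m.rsp = m.r13 - 0x10;              /* lea   -0x10(%r13), %rsp */
    m.r13 = pop(&m);                   /* pop   %r13 */
    r.ret_target = pop(&m);            /* retq */

    r.final_rsp = m.rsp;
    r.final_rbp = m.rbp;
    r.final_r13 = m.r13;
    return r;
}
```

Calling simulate_drap() with an entry %rsp whose low four bits are 0x8 (the usual state right after a call) needs no padding; an entry %rsp that ends in 0x0 exercises the "[possible padding for alignment]" slot from the layout in Question 2. In both cases the returned drap field should equal entry_rsp + 8.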
