On Tue, Sep 19, 2017 at 03:25:59PM -0700, Alexander Potapenko wrote:
> On Tue, Sep 19, 2017 at 2:55 PM, Alexander Potapenko <gli...@google.com> 
> wrote:
> > On Tue, Sep 19, 2017 at 11:45 AM, Josh Poimboeuf <jpoim...@redhat.com> 
> > wrote:
> >> For inline asm statements which have a CALL instruction, we list the
> >> stack pointer as a constraint to convince GCC to ensure the frame
> >> pointer is set up first:
> >>
> >>   static inline void foo()
> >>   {
> >>         register void *__sp asm(_ASM_SP);
> >>         asm("call bar" : "+r" (__sp))
> >>   }
> >>
> >> Unfortunately, that pattern causes clang to corrupt the stack pointer.
> >>
> >> There's actually an easier way to achieve the same goal in GCC, without
> >> causing trouble for clang.  If we declare the stack pointer register
> >> variable as a global variable, and remove the constraint altogether,
> >> that convinces GCC to always set up the frame pointer before inserting
> >> *any* inline asm.
> >>
> >> It basically acts as if *every* inline asm statement has a CALL
> >> instruction.  It's a bit overkill, but the performance impact should be
> >> negligible.
> >>
> >> Here are the vmlinux .text size differences with the following configs
> >> on GCC:
> >>
> >> - defconfig
> >> - defconfig without frame pointers
> >> - Fedora distro config
> >> - Fedora distro config without frame pointers
> >>
> >>         defconfig       defconfig-nofp  distro          distro-nofp
> >> before  9796300         9466764         9076191         8789745
> >> after   9796941         9462859         9076381         8785325
> >>
> >> With frame pointers, the text size increases slightly.  Without frame
> >> pointers, the text size decreases, and a little more significantly.
> >>
> >> Reported-by: Matthias Kaehlcke <m...@chromium.org>
> >> Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com>
> >> ---
> >>  arch/x86/include/asm/alternative.h               |  3 +--
> >>  arch/x86/include/asm/asm.h                       |  9 ++++++++
> >>  arch/x86/include/asm/mshyperv.h                  | 27 
> >> ++++++++++--------------
> >>  arch/x86/include/asm/paravirt_types.h            | 14 ++++++------
> >>  arch/x86/include/asm/preempt.h                   | 15 +++++--------
> >>  arch/x86/include/asm/processor.h                 |  6 ++----
> >>  arch/x86/include/asm/rwsem.h                     |  6 +++---
> >>  arch/x86/include/asm/uaccess.h                   |  5 ++---
> >>  arch/x86/include/asm/xen/hypercall.h             |  5 ++---
> >>  arch/x86/kvm/emulate.c                           |  3 +--
> >>  arch/x86/kvm/vmx.c                               |  4 +---
> >>  arch/x86/mm/fault.c                              |  3 +--
> >>  tools/objtool/Documentation/stack-validation.txt | 20 +++++++++++++-----
> >>  13 files changed, 60 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/arch/x86/include/asm/alternative.h 
> >> b/arch/x86/include/asm/alternative.h
> >> index 1b020381ab38..7aeb1cef4204 100644
> >> --- a/arch/x86/include/asm/alternative.h
> >> +++ b/arch/x86/include/asm/alternative.h
> >> @@ -218,10 +218,9 @@ static inline int alternatives_text_reserved(void 
> >> *start, void *end)
> >>  #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, 
> >> feature2,   \
> >>                            output, input...)                               
> >>    \
> >>  {                                                                         
> >>    \
> >> -       register void *__sp asm(_ASM_SP);                                  
> >>    \
> >>         asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", 
> >> feature1,\
> >>                 "call %P[new2]", feature2)                                 
> >>    \
> >> -               : output, "+r" (__sp)                                      
> >>    \
> >> +               : output                                                   
> >>    \
> >>                 : [old] "i" (oldfunc), [new1] "i" (newfunc1),              
> >>    \
> >>                   [new2] "i" (newfunc2), ## input);                        
> >>    \
> >>  }
> >> diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
> >> index 676ee5807d86..ff8921d4615b 100644
> >> --- a/arch/x86/include/asm/asm.h
> >> +++ b/arch/x86/include/asm/asm.h
> >> @@ -132,4 +132,13 @@
> >>  /* For C file, we already have NOKPROBE_SYMBOL macro */
> >>  #endif
> >>
> >> +#ifndef __ASSEMBLY__
> >> +/*
> >> + * This variable declaration has the side effect of forcing GCC to always 
> >> set
> >> + * up the frame pointer before inserting any inline asm.  This is 
> >> necessary
> >> + * because some inline asm statements have CALL instructions.
> >> + */
> >> +register unsigned int __sp_unused asm("esp");

> > Shouldn't this be "asm(_ASM_SP)"?

> Answering my own question, this can't be _ASM_SP because of the
> realmode code, which is built with -m16 whereas _ASM_SP is "rsp".
> The patch works fine with "esp" though - most certainly declaring a
> ESP variable is enough to reserve the whole RSP in an x86_64 build.

Right, clang failing to build the realmode code was exactly why I did
that.

-- 
Josh

Reply via email to