On Tue, Sep 19, 2017 at 03:25:59PM -0700, Alexander Potapenko wrote: > On Tue, Sep 19, 2017 at 2:55 PM, Alexander Potapenko <gli...@google.com> > wrote: > > On Tue, Sep 19, 2017 at 11:45 AM, Josh Poimboeuf <jpoim...@redhat.com> > > wrote: > >> For inline asm statements which have a CALL instruction, we list the > >> stack pointer as a constraint to convince GCC to ensure the frame > >> pointer is set up first: > >> > >> static inline void foo() > >> { > >> register void *__sp asm(_ASM_SP); > >> asm("call bar" : "+r" (__sp)) > >> } > >> > >> Unfortunately, that pattern causes clang to corrupt the stack pointer. > >> > >> There's actually an easier way to achieve the same goal in GCC, without > >> causing trouble for clang. If we declare the stack pointer register > >> variable as a global variable, and remove the constraint altogether, > >> that convinces GCC to always set up the frame pointer before inserting > >> *any* inline asm. > >> > >> It basically acts as if *every* inline asm statement has a CALL > >> instruction. It's a bit overkill, but the performance impact should be > >> negligible. > >> > >> Here are the vmlinux .text size differences with the following configs > >> on GCC: > >> > >> - defconfig > >> - defconfig without frame pointers > >> - Fedora distro config > >> - Fedora distro config without frame pointers > >> > >> defconfig defconfig-nofp distro distro-nofp > >> before 9796300 9466764 9076191 8789745 > >> after 9796941 9462859 9076381 8785325 > >> > >> With frame pointers, the text size increases slightly. Without frame > >> pointers, the text size decreases, and a little more significantly. > >> > >> Reported-by: Matthias Kaehlcke <m...@chromium.org> > >> Signed-off-by: Josh Poimboeuf <jpoim...@redhat.com> > >> --- > >> arch/x86/include/asm/alternative.h | 3 +-- > >> arch/x86/include/asm/asm.h | 9 ++++++++ > >> arch/x86/include/asm/mshyperv.h | 27 > >> ++++++++++-------------- > >> arch/x86/include/asm/paravirt_types.h | 14 ++++++------ > >> arch/x86/include/asm/preempt.h | 15 +++++-------- > >> arch/x86/include/asm/processor.h | 6 ++---- > >> arch/x86/include/asm/rwsem.h | 6 +++--- > >> arch/x86/include/asm/uaccess.h | 5 ++--- > >> arch/x86/include/asm/xen/hypercall.h | 5 ++--- > >> arch/x86/kvm/emulate.c | 3 +-- > >> arch/x86/kvm/vmx.c | 4 +--- > >> arch/x86/mm/fault.c | 3 +-- > >> tools/objtool/Documentation/stack-validation.txt | 20 +++++++++++++----- > >> 13 files changed, 60 insertions(+), 60 deletions(-) > >> > >> diff --git a/arch/x86/include/asm/alternative.h > >> b/arch/x86/include/asm/alternative.h > >> index 1b020381ab38..7aeb1cef4204 100644 > >> --- a/arch/x86/include/asm/alternative.h > >> +++ b/arch/x86/include/asm/alternative.h > >> @@ -218,10 +218,9 @@ static inline int alternatives_text_reserved(void > >> *start, void *end) > >> #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, > >> feature2, \ > >> output, input...) > >> \ > >> { > >> \ > >> - register void *__sp asm(_ASM_SP); > >> \ > >> asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", > >> feature1,\ > >> "call %P[new2]", feature2) > >> \ > >> - : output, "+r" (__sp) > >> \ > >> + : output > >> \ > >> : [old] "i" (oldfunc), [new1] "i" (newfunc1), > >> \ > >> [new2] "i" (newfunc2), ## input); > >> \ > >> } > >> diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h > >> index 676ee5807d86..ff8921d4615b 100644 > >> --- a/arch/x86/include/asm/asm.h > >> +++ b/arch/x86/include/asm/asm.h > >> @@ -132,4 +132,13 @@ > >> /* For C file, we already have NOKPROBE_SYMBOL macro */ > >> #endif > >> > >> +#ifndef __ASSEMBLY__ > >> +/* > >> + * This variable declaration has the side effect of forcing GCC to always > >> set > >> + * up the frame pointer before inserting any inline asm. This is > >> necessary > >> + * because some inline asm statements have CALL instructions. > >> + */ > >> +register unsigned int __sp_unused asm("esp");
> > Shouldn't this be "asm(_ASM_SP)"? > Answering my own question, this can't be _ASM_SP because of the > realmode code, which is built with -m16 whereas _ASM_SP is "rsp". > The patch works fine with "esp" though - most certainly declaring a > ESP variable is enough to reserve the whole RSP in an x86_64 build. Right, clang failing to build the realmode code was exactly why I did that. -- Josh