Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-21 Thread Borislav Petkov
On Tue, Jul 21, 2015 at 09:39:23PM -0700, Andy Lutomirski wrote:
> So your shiny perf profile shows cumulative time in whatever called
> them.  Sure, this is arguably silly if we're stuck with frame
> pointers.

You can count 20ish cycles tops for any one of them.

At least that was the result the last time I compared POPCNT performance
against the __sw_hweight* versions. IOW, POPCNT didn't show any
improvement over those software versions.
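
To give the cycle estimate above something concrete to attach to, here is a minimal sketch of a bit-parallel population count of the kind a software hweight fallback typically uses. It is an illustration only: the names `sw_hweight32`, `naive_hweight32` and `checked_hweight32` are made up for this note and are not the kernel's functions, and nothing here measures cycles.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-parallel population count: a short, branch-free sequence of
 * shifts, masks and one multiply. Illustrative stand-in only. */
static unsigned int sw_hweight32(uint32_t w)
{
    w -= (w >> 1) & 0x55555555u;                      /* 2-bit sums */
    w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
    w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
    return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Reference version: clear the lowest set bit until none remain. */
static unsigned int naive_hweight32(uint32_t w)
{
    unsigned int n = 0;

    while (w) {
        w &= w - 1;
        n++;
    }
    return n;
}

/* Cross-check the fast version against the reference. */
static unsigned int checked_hweight32(uint32_t w)
{
    unsigned int fast = sw_hweight32(w);

    assert(fast == naive_hweight32(w));
    return fast;
}
```

The function is a leaf with no loops and no calls, which is consistent with the point made in the thread that there is little inside it to unwind.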

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-21 Thread Andy Lutomirski
On Tue, Jul 21, 2015 at 9:25 PM, Borislav Petkov wrote:
> On Tue, Jul 21, 2015 at 05:13:12PM -0700, Andy Lutomirski wrote:
>> Enough for oopses, perhaps, but maybe not enough for perf.
>>
>> It sounds like you want CFI unwinding :)
>
> What would you want to unwind in those __sw_hweight* almost-trivial,
> tail functions?

So your shiny perf profile shows cumulative time in whatever called
them.  Sure, this is arguably silly if we're stuck with frame
pointers.

--Andy


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-21 Thread Borislav Petkov
On Tue, Jul 21, 2015 at 05:13:12PM -0700, Andy Lutomirski wrote:
> Enough for oopses, perhaps, but maybe not enough for perf.
> 
> It sounds like you want CFI unwinding :)

What would you want to unwind in those __sw_hweight* almost-trivial,
tail functions?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-21 Thread Andy Lutomirski
On Jul 18, 2015 9:13 PM, "Borislav Petkov" wrote:
>
> On Sat, Jul 18, 2015 at 10:57:14AM -0500, Josh Poimboeuf wrote:
> > Currently, when stackvalidate sees an ALTERNATIVE, it assumes that
> > either code path is possible, so it follows both paths in parallel.
> >
> > If I understand right, you're proposing that stackvalidate should only
> > follow the POPCNT path and never follow the !POPCNT path?
>
> Actually, you don't even need to follow the POPCNT case either because
> it is a single instruction - no stack operations there.
>
> So yeah, either that or special-case the case where the original insn is
> CALL and the replacement is a POPCNT and ignore those CALL locations.
>
> The advantage is that the burden is put on the tool and not by adding
> markers to kernel code paths.
>
> > In general, I agree, and I like the original patch much better.  IMO, it
> > achieved the goal of keeping the kernel code clean, while fixing the
> > frame pointer bug.
>
> And I think that in that case, adding that rSP dependency is too much
> because even though it fixes the "bug", it is very very unlikely any
> stack trace will have __sw_hweight* in it for reasons pointed out
> earlier and also because those functions can't fail and they get
> integral types as args which can't fail when deref-fing either. And even
> if they do, they don't call any other functions so rIP pointing to them
> is already enough.

Enough for oopses, perhaps, but maybe not enough for perf.

It sounds like you want CFI unwinding :)

--Andy


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-18 Thread Borislav Petkov
On Sat, Jul 18, 2015 at 10:57:14AM -0500, Josh Poimboeuf wrote:
> Currently, when stackvalidate sees an ALTERNATIVE, it assumes that
> either code path is possible, so it follows both paths in parallel.
> 
> If I understand right, you're proposing that stackvalidate should only
> follow the POPCNT path and never follow the !POPCNT path?

Actually, you don't even need to follow the POPCNT case either because
it is a single instruction - no stack operations there.

So yeah, either that or special-case the case where the original insn is
CALL and the replacement is a POPCNT and ignore those CALL locations.

The advantage is that the burden is put on the tool and not by adding
markers to kernel code paths.

> In general, I agree, and I like the original patch much better.  IMO, it
> achieved the goal of keeping the kernel code clean, while fixing the
> frame pointer bug.

And I think that in that case, adding that rSP dependency is too much
because even though it fixes the "bug", it is very very unlikely any
stack trace will have __sw_hweight* in it for reasons pointed out
earlier and also because those functions can't fail and they get
integral types as args which can't fail when deref-fing either. And even
if they do, they don't call any other functions so rIP pointing to them
is already enough.

So even if we're not 100% correct wrt stack traces in this case, I think
that's ok.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-18 Thread Josh Poimboeuf
On Sat, Jul 18, 2015 at 04:56:29PM +0200, Borislav Petkov wrote:
> On Sat, Jul 18, 2015 at 08:44:15AM -0500, Josh Poimboeuf wrote:
> > Ok, so would you rather add a whitelist to tell stackvalidate to
> > ignore it?  Something like this?
> 
> I tried it and maybe I'm missing something but that doesn't work:
> 
> $ make drivers/gpu/drm/i915/intel_ringbuffer.o
>   CHK include/config/kernel.release
>   CHK include/generated/uapi/linux/version.h
>   CHK include/generated/utsrelease.h
>   CHK include/generated/timeconst.h
>   CHK include/generated/bounds.h
>   CHK include/generated/asm-offsets.h
>   CALL scripts/checksyscalls.sh
>   CC  drivers/gpu/drm/i915/intel_ringbuffer.o
> ./arch/x86/include/asm/arch_hweight.h: Assembler messages:
> ./arch/x86/include/asm/arch_hweight.h:31: Error: symbol `.Ltemp32' is already defined
> ./arch/x86/include/asm/arch_hweight.h:31: Error: symbol `.Ltemp32' is already defined
> ./arch/x86/include/asm/arch_hweight.h:31: Error: symbol `.Ltemp32' is already defined
> scripts/Makefile.build:258: recipe for target 'drivers/gpu/drm/i915/intel_ringbuffer.o' failed
> make[1]: *** [drivers/gpu/drm/i915/intel_ringbuffer.o] Error 1
> Makefile:1528: recipe for target 'drivers/gpu/drm/i915/intel_ringbuffer.o' failed
> make: *** [drivers/gpu/drm/i915/intel_ringbuffer.o] Error 2

Yeah, it doesn't actually support this particular example yet.  I was
just trying to figure out if that's what you were proposing.

> Also, that label temp32 could be more descriptive.

Yeah, that's from:

  ".Ltemp" __stringify(__LINE__) ":;"

Which was intended to give a unique ID for each use of the macro, but
apparently that didn't work as planned here.
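
One plausible reading of the failure, which the thread itself does not confirm, is that `__LINE__` is expanded where the macro sits inside the inline function in the header, so every inlined copy of that function emits the same label text into one assembly file. The sketch below shows only the preprocessor side of that; `STRINGIFY`, `TEMP_LABEL`, `label_in_helper` and the two call-site functions are hypothetical names standing in for the real macro and its users.

```c
#include <assert.h>
#include <string.h>

#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Hypothetical stand-in for the label-generating macro quoted above. */
#define TEMP_LABEL ".Ltemp" STRINGIFY(__LINE__)

/* __LINE__ is fixed by where TEMP_LABEL appears in this helper,
 * not by where the helper is later called or inlined. */
static inline const char *label_in_helper(void)
{
    return TEMP_LABEL;
}

/* Two independent users of the helper, as two inlined copies would be. */
static const char *first_call_site(void)  { return label_in_helper(); }
static const char *second_call_site(void) { return label_in_helper(); }

/* Returns 1 when both users end up with the identical label text. */
static int labels_collide(void)
{
    const char *a = first_call_site();
    const char *b = second_call_site();

    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

GCC's extended asm also offers the `%=` format specifier, which emits a number unique to each asm instance; whether that fits this macro is not something the thread settles.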

> so you see that a CALL instruction gets replaced with a POPCNT and
> the feature bit used is 4*32+23 which is X86_FEATURE_POPCNT. This
> information is enough to detect that particular case and add the offset
> ".long 661b - ." to the list of instructions which stackvalidate should
> ignore.

Currently, when stackvalidate sees an ALTERNATIVE, it assumes that
either code path is possible, so it follows both paths in parallel.

If I understand right, you're proposing that stackvalidate should only
follow the POPCNT path and never follow the !POPCNT path?

> Anyway, this is what I'd do.
> 
> IMNSVHO, we must be very conservative and not add some
> markers/helpers/etc to code only so that tools can do their job. Not if
> it can be helped. Instead, tools should do the hard work and we should
> keep kernel code clean.

In general, I agree, and I like the original patch much better.  IMO, it
achieved the goal of keeping the kernel code clean, while fixing the
frame pointer bug.

If you insist on breaking stack traces on !POPCNT, I can probably add
some intelligence to stackvalidate to look for !POPCNT and ignore it.
It seems less "clean" to me, though.

-- 
Josh


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-18 Thread Borislav Petkov
On Sat, Jul 18, 2015 at 08:44:15AM -0500, Josh Poimboeuf wrote:
> Ok, so would you rather add a whitelist to tell stackvalidate to
> ignore it?  Something like this?

I tried it and maybe I'm missing something but that doesn't work:

$ make drivers/gpu/drm/i915/intel_ringbuffer.o
  CHK include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CHK include/generated/timeconst.h
  CHK include/generated/bounds.h
  CHK include/generated/asm-offsets.h
  CALL scripts/checksyscalls.sh
  CC  drivers/gpu/drm/i915/intel_ringbuffer.o
./arch/x86/include/asm/arch_hweight.h: Assembler messages:
./arch/x86/include/asm/arch_hweight.h:31: Error: symbol `.Ltemp32' is already defined
./arch/x86/include/asm/arch_hweight.h:31: Error: symbol `.Ltemp32' is already defined
./arch/x86/include/asm/arch_hweight.h:31: Error: symbol `.Ltemp32' is already defined
scripts/Makefile.build:258: recipe for target 'drivers/gpu/drm/i915/intel_ringbuffer.o' failed
make[1]: *** [drivers/gpu/drm/i915/intel_ringbuffer.o] Error 1
Makefile:1528: recipe for target 'drivers/gpu/drm/i915/intel_ringbuffer.o' failed
make: *** [drivers/gpu/drm/i915/intel_ringbuffer.o] Error 2

Also, that label temp32 could be more descriptive.

Regardless of the above, I don't like the idea of adding some
compile-time checking and thus obfuscating what is already non-obvious
code.

And since your tool is already parsing ELF files and all that other fun,
what I'd do is make that checking out-of-line *without* adding any new
code to the kernel.

In this particular case, you have:

#APP
# 28 "./arch/x86/include/asm/arch_hweight.h" 1
661:
call __sw_hweight32
662:
.skip -(((6651f-6641f)-(662b-661b)) > 0) * ((6651f-6641f)-(662b-661b)),0x90
663:
.pushsection .altinstructions,"a"
 .long 661b - .
 .long 6641f - .
 .word ( 4*32+23)
 .byte 663b-661b
 .byte 6651f-6641f
 .byte 663b-662b
.popsection
.pushsection .altinstr_replacement, "ax"
6641:
.byte 0xf3,0x40,0x0f,0xb8,0xc7
6651:
.popsection
# 0 "" 2

so you see that a CALL instruction gets replaced with a POPCNT and
the feature bit used is 4*32+23 which is X86_FEATURE_POPCNT. This
information is enough to detect that particular case and add the offset
".long 661b - ." to the list of instructions which stackvalidate should
ignore.
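
As a rough illustration of the check described above, here is a sketch of how a tool might recognise "CALL replaced by POPCNT under the POPCNT feature bit". It assumes the tool has already resolved the `.altinstructions` offsets into byte pointers. `struct alt_entry`, `is_popcnt` and `is_call_to_popcnt` are hypothetical names, not stackvalidate's actual data structures, and the decoder only handles the 64-bit encoding shown in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEATURE_WORD(f) ((f) / 32)
#define FEATURE_BIT(f)  ((f) % 32)
#define FEATURE_POPCNT  (4 * 32 + 23)  /* value taken from the listing above */

/* Hypothetical, already-resolved view of one alternatives entry. */
struct alt_entry {
    const uint8_t *orig;  /* original instruction bytes */
    size_t orig_len;
    const uint8_t *repl;  /* replacement instruction bytes */
    size_t repl_len;
    uint16_t feature;     /* the .word field */
};

/* POPCNT is encoded as F3 [REX] 0F B8 /r; REX is only valid in 64-bit code. */
static int is_popcnt(const uint8_t *insn, size_t len)
{
    size_t i = 0;

    if (len < 4 || insn[i++] != 0xf3)
        return 0;
    if ((insn[i] & 0xf0) == 0x40)  /* optional REX prefix */
        i++;
    if (len - i < 3)               /* need 0F, B8 and a ModRM byte */
        return 0;
    return insn[i] == 0x0f && insn[i + 1] == 0xb8;
}

/* True when a near CALL (opcode E8) is patched to POPCNT under the POPCNT bit. */
static int is_call_to_popcnt(const struct alt_entry *e)
{
    assert(e != NULL);
    return e->feature == FEATURE_POPCNT &&
           e->orig_len >= 1 && e->orig[0] == 0xe8 &&
           is_popcnt(e->repl, e->repl_len);
}

/* Sample entry mirroring the __arch_hweight32 listing (call target zeroed). */
static const uint8_t call_insn[]   = { 0xe8, 0x00, 0x00, 0x00, 0x00 };
static const uint8_t popcnt_insn[] = { 0xf3, 0x40, 0x0f, 0xb8, 0xc7 };
static const struct alt_entry hweight32_site = {
    call_insn, sizeof(call_insn), popcnt_insn, sizeof(popcnt_insn),
    FEATURE_POPCNT
};
```

A tool built along these lines would skip the original CALL site for entries where `is_call_to_popcnt()` returns true, keeping the special case in the tool rather than in the header.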

Anyway, this is what I'd do.

IMNSVHO, we must be very conservative and not add some
markers/helpers/etc to code only so that tools can do their job. Not if
it can be helped. Instead, tools should do the hard work and we should
keep kernel code clean.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-18 Thread Josh Poimboeuf
On Sat, Jul 18, 2015 at 07:05:36AM +0200, Borislav Petkov wrote:
> On Fri, Jul 17, 2015 at 12:32:20PM -0500, Josh Poimboeuf wrote:
> > Well, but this isn't some whitelist code to make stackvalidate happy.
> > 
> > It's actually a real runtime frame pointer bug, and the rsp dependency
> > is real.  If it does the call without first creating the stack frame
> > then it breaks frame pointer based stack traces.
> 
> I think we can live with the stack trace being a little wrong in those
> __sw_* variants. And besides, we're talking about the very very small
> percentage of machines (which keeps getting smaller) which don't
> support POPCNT. And from those, only for the cases where the arg is not
> __builtin_constant_p() because there we do the __const_hweight* thing.
> 
> I'd prefer to not clutter the code more in that case.

Ok, so would you rather add a whitelist to tell stackvalidate to
ignore it?  Something like this?

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index 9686c3d..d604691 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_HWEIGHT_H
 #define _ASM_X86_HWEIGHT_H
 
+#include 
+
 #ifdef CONFIG_64BIT
 /* popcnt %edi, %eax -- redundant REX prefix for alignment */
 #define POPCNT32 ".byte 0xf3,0x40,0x0f,0xb8,0xc7"
@@ -25,7 +27,9 @@ static inline unsigned int __arch_hweight32(unsigned int w)
 {
    unsigned int res = 0;
 
-   asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
+   asm (ALTERNATIVE(STACKVALIDATE_IGNORE_INSN
+                    "call __sw_hweight32",
+                    POPCNT32, X86_FEATURE_POPCNT)
         : "="REG_OUT (res)
         : REG_IN (w));
 
@@ -50,7 +54,9 @@ static inline unsigned long __arch_hweight64(__u64 w)
    return  __arch_hweight32((u32)w) +
            __arch_hweight32((u32)(w >> 32));
 #else
-   asm (ALTERNATIVE("call __sw_hweight64", POPCNT64, X86_FEATURE_POPCNT)
+   asm (ALTERNATIVE(STACKVALIDATE_IGNORE_INSN
+                    "call __sw_hweight64",
+                    POPCNT64, X86_FEATURE_POPCNT)
         : "="REG_OUT (res)
         : REG_IN (w));
 #endif /* CONFIG_X86_32 */

-- 
Josh


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-17 Thread Borislav Petkov
On Fri, Jul 17, 2015 at 12:32:20PM -0500, Josh Poimboeuf wrote:
> Well, but this isn't some whitelist code to make stackvalidate happy.
> 
> It's actually a real runtime frame pointer bug, and the rsp dependency
> is real.  If it does the call without first creating the stack frame
> then it breaks frame pointer based stack traces.

I think we can live with the stack trace being a little wrong in those
__sw_* variants. And besides, we're talking about the very very small
percentage of machines (which keeps getting smaller) which don't
support POPCNT. And from those, only for the cases where the arg is not
__builtin_constant_p() because there we do the __const_hweight* thing.

I'd prefer to not clutter the code more in that case.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-17 Thread Josh Poimboeuf
On Fri, Jul 17, 2015 at 07:17:26PM +0200, Borislav Petkov wrote:
> On Fri, Jul 17, 2015 at 11:47:20AM -0500, Josh Poimboeuf wrote:
> > If __arch_hweight32() or __arch_hweight64() is inlined at the beginning
> > of a function, gcc can insert the call instruction before setting up a
> > stack frame, which breaks frame pointer convention if
> > CONFIG_FRAME_POINTER is enabled and can result in a bad stack trace.
> > 
> > Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
> > listing the stack pointer as an output operand for the inline asm
> > statement.
> > 
> > Signed-off-by: Josh Poimboeuf 
> > ---
> >  arch/x86/include/asm/arch_hweight.h | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
> > index 9686c3d..e438a0d 100644
> > --- a/arch/x86/include/asm/arch_hweight.h
> > +++ b/arch/x86/include/asm/arch_hweight.h
> > @@ -23,10 +23,11 @@
> >   */
> >  static inline unsigned int __arch_hweight32(unsigned int w)
> >  {
> > +   register void *__sp asm("esp");
> > unsigned int res = 0;
> >  
> > asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
> > -: "="REG_OUT (res)
> > +: "="REG_OUT (res), "+r" (__sp)
> >  : REG_IN (w));
> >  
> > return res;
> > @@ -44,6 +45,7 @@ static inline unsigned int __arch_hweight8(unsigned int w)
> >  
> >  static inline unsigned long __arch_hweight64(__u64 w)
> >  {
> > +   register void __maybe_unused *__sp asm("rsp");
> > unsigned long res = 0;
> >  
> >  #ifdef CONFIG_X86_32
> > @@ -51,7 +53,7 @@ static inline unsigned long __arch_hweight64(__u64 w)
> > __arch_hweight32((u32)(w >> 32));
> >  #else
> > asm (ALTERNATIVE("call __sw_hweight64", POPCNT64, X86_FEATURE_POPCNT)
> > -: "="REG_OUT (res)
> > +: "="REG_OUT (res), "+r" (__sp)
> >  : REG_IN (w));
> >  #endif /* CONFIG_X86_32 */
> 
> Eeew, useless code so that some compile-time validation is done. Let's
> not add this clutter please.
> 
> In this particular case, the majority of CPUs out there will get POPCNT
> patched in and that CALL is gone. And for the remaining cases where we
> do end up using the __sw_* variants, I'd prefer to rather not do the
> validation instead of polluting the code with that fake rsp dependency.

Well, but this isn't some whitelist code to make stackvalidate happy.

It's actually a real runtime frame pointer bug, and the rsp dependency
is real.  If it does the call without first creating the stack frame
then it breaks frame pointer based stack traces.
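
To make the failure mode concrete, here is a small portable simulation of a frame-pointer walk. It does not touch real registers; `struct frame`, `unwind` and `trace_has` are hypothetical names for this sketch, and it assumes the called function builds a frame record of its own. The call chain is h() -> g() -> f() -> leaf(), with f() standing in for a function that has `__arch_hweight32()` inlined at its start.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One frame record, as laid down by "push %rbp; mov %rsp,%rbp". */
struct frame {
    const struct frame *saved_fp;  /* caller's frame record */
    const char *return_to;         /* function the return address points into */
};

/* Walk the chain the way a frame-pointer unwinder does. */
static size_t unwind(const char *ip, const struct frame *fp,
                     const char **trace, size_t max)
{
    size_t n = 0;

    assert(trace != NULL);
    if (n < max)
        trace[n++] = ip;
    for (; fp != NULL && n < max; fp = fp->saved_fp)
        trace[n++] = fp->return_to;
    return n;
}

/* Returns 1 if 'name' shows up in the trace taken inside leaf(). */
static int trace_has(int f_has_frame, const char *name)
{
    const struct frame h = { NULL, "start" };
    const struct frame g = { &h, "h" };
    const struct frame f = { &g, "g" };
    /* If f() calls before building its frame, the frame pointer still
     * refers to g's record, so leaf() saves that one instead. */
    const struct frame leaf = { f_has_frame ? &f : &g, "f" };
    const char *trace[8];
    size_t i, n = unwind("leaf", &leaf, trace, 8);

    for (i = 0; i < n; i++)
        if (strcmp(trace[i], name) == 0)
            return 1;
    return 0;
}
```

In this model the frameless case still reports leaf() and f(), but g() drops out of the trace, because the return address into g() sits next to a frame record that f() never created.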

-- 
Josh


Re: [RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-17 Thread Borislav Petkov
On Fri, Jul 17, 2015 at 11:47:20AM -0500, Josh Poimboeuf wrote:
> If __arch_hweight32() or __arch_hweight64() is inlined at the beginning
> of a function, gcc can insert the call instruction before setting up a
> stack frame, which breaks frame pointer convention if
> CONFIG_FRAME_POINTER is enabled and can result in a bad stack trace.
> 
> Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
> listing the stack pointer as an output operand for the inline asm
> statement.
> 
> Signed-off-by: Josh Poimboeuf 
> ---
>  arch/x86/include/asm/arch_hweight.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
> index 9686c3d..e438a0d 100644
> --- a/arch/x86/include/asm/arch_hweight.h
> +++ b/arch/x86/include/asm/arch_hweight.h
> @@ -23,10 +23,11 @@
>   */
>  static inline unsigned int __arch_hweight32(unsigned int w)
>  {
> + register void *__sp asm("esp");
>   unsigned int res = 0;
>  
>   asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
> -  : "="REG_OUT (res)
> +  : "="REG_OUT (res), "+r" (__sp)
>: REG_IN (w));
>  
>   return res;
> @@ -44,6 +45,7 @@ static inline unsigned int __arch_hweight8(unsigned int w)
>  
>  static inline unsigned long __arch_hweight64(__u64 w)
>  {
> + register void __maybe_unused *__sp asm("rsp");
>   unsigned long res = 0;
>  
>  #ifdef CONFIG_X86_32
> @@ -51,7 +53,7 @@ static inline unsigned long __arch_hweight64(__u64 w)
>   __arch_hweight32((u32)(w >> 32));
>  #else
>   asm (ALTERNATIVE("call __sw_hweight64", POPCNT64, X86_FEATURE_POPCNT)
> -  : "="REG_OUT (res)
> +  : "="REG_OUT (res), "+r" (__sp)
>: REG_IN (w));
>  #endif /* CONFIG_X86_32 */

Eeew, useless code so that some compile-time validation is done. Let's
not add this clutter please.

In this particular case, the majority of CPUs out there will get POPCNT
patched in and that CALL is gone. And for the remaining cases where we
do end up using the __sw_* variants, I'd prefer to rather not do the
validation instead of polluting the code with that fake rsp dependency.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


[RFC PATCH 04/21] x86/hweight: Add stack frame dependency for __arch_hweight*()

2015-07-17 Thread Josh Poimboeuf
If __arch_hweight32() or __arch_hweight64() is inlined at the beginning
of a function, gcc can insert the call instruction before setting up a
stack frame, which breaks frame pointer convention if
CONFIG_FRAME_POINTER is enabled and can result in a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the inline asm
statement.

Signed-off-by: Josh Poimboeuf 
---
 arch/x86/include/asm/arch_hweight.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index 9686c3d..e438a0d 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -23,10 +23,11 @@
  */
 static inline unsigned int __arch_hweight32(unsigned int w)
 {
+   register void *__sp asm("esp");
    unsigned int res = 0;
 
    asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
-        : "="REG_OUT (res)
+        : "="REG_OUT (res), "+r" (__sp)
         : REG_IN (w));
 
    return res;
@@ -44,6 +45,7 @@ static inline unsigned int __arch_hweight8(unsigned int w)
 
 static inline unsigned long __arch_hweight64(__u64 w)
 {
+   register void __maybe_unused *__sp asm("rsp");
    unsigned long res = 0;
 
 #ifdef CONFIG_X86_32
@@ -51,7 +53,7 @@ static inline unsigned long __arch_hweight64(__u64 w)
           __arch_hweight32((u32)(w >> 32));
 #else
    asm (ALTERNATIVE("call __sw_hweight64", POPCNT64, X86_FEATURE_POPCNT)
-        : "="REG_OUT (res)
+        : "="REG_OUT (res), "+r" (__sp)
         : REG_IN (w));
 #endif /* CONFIG_X86_32 */
 
-- 
2.1.0
