* Alexander Potapenko <[email protected]> wrote:

> On Fri, Apr 5, 2019 at 11:39 AM Ingo Molnar <[email protected]> wrote:
> >
> >
> > * Alexander Potapenko <[email protected]> wrote:
> >
> > > 1. Use memory clobber in bitops that touch arbitrary memory
> > >
> > > Certain bit operations that read/write bits take a base pointer and an
> > > arbitrarily large offset to address the bit relative to that base.
> > > Inline assembly constraints aren't expressive enough to tell the
> > > compiler that the assembly directive is going to touch a specific memory
> > > location of unknown size, therefore we have to use the "memory" clobber
> > > to indicate that the assembly is going to access memory locations other
> > > than those listed in the inputs/outputs.
> > > To indicate that BTR/BTS instructions don't necessarily touch the first
> > > sizeof(long) bytes of the argument, we also move the address to assembly
> > > inputs.
> > >
> > > This particular change leads to size increase of 124 kernel functions in
> > > a defconfig build. For some of them the diff is in NOP operations, other
> > > end up re-reading values from memory and may potentially slow down the
> > > execution. But without these clobbers the compiler is free to cache
> > > the contents of the bitmaps and use them as if they weren't changed by
> > > the inline assembly.
> > >
> > > 2. Use byte-sized arguments for operations touching single bytes.
> > >
> > > Passing a long value to ANDB/ORB/XORB instructions makes the compiler
> > > treat sizeof(long) bytes as being clobbered, which isn't the case. This
> > > may theoretically lead to worse code in the case of heavy optimization.
> > >
> > > Signed-off-by: Alexander Potapenko <[email protected]>
> > > Cc: Dmitry Vyukov <[email protected]>
> > > Cc: Paul E. McKenney <[email protected]>
> > > Cc: H. Peter Anvin <[email protected]>
> > > Cc: Peter Zijlstra <[email protected]>
> > > Cc: James Y Knight <[email protected]>
> > > ---
> > >  v2:
> > >   -- renamed the patch
> > >   -- addressed comment by Peter Zijlstra: don't use "+m" for functions
> > >   returning void
> > >   -- fixed input types for operations touching single bytes
> > > ---
> > >  arch/x86/include/asm/bitops.h | 41 +++++++++++++++--------------------
> > >  1 file changed, 18 insertions(+), 23 deletions(-)
> >
> > I'm wondering what the primary motivation for the patch is:
> >
> >  - Does it fix an actual miscompilation, or only a theoretical 
> > miscompilation?
> Depends on what we name an actual miscompilation.
> I've built a defconfig kernel and looked through some of the functions
> generated by GCC 7.3.0 with and without this clobber, and didn't spot
> any miscompilations.
> However there is a (trivial) theoretical case where this code leads to
> miscompilation: https://lkml.org/lkml/2019/3/28/393 using just GCC
> 8.3.0 with -O2.
> It isn't hard to imagine someone writes such a function in the kernel someday.
> 
> So the primary motivation is to fix an existing misuse of the asm
> directive, which happens to work in certain configurations now, but
> isn't guaranteed to work under different circumstances.

Thanks, that's all the context this patch needed: the miscompilation is 
real, demonstrated, and the pattern of your testcase doesn't look overly 
weird.

Also the 'cost' side of your patch appears to be pretty low, the 
defconfig-64 bloat from the stricter asm() constraints appears to be very 
small and not measurable in .text section size with bog standard GCC 7.3:

  text       data     bss      dec        hex      filename        sha1
  19565909   4974784  1826888  26367581   192565d  vmlinux.before  
07f91fa36d2b477b245c7fee283dd3b7
  19565909   4974784  1826888  26367581   192565d  vmlinux.after   
51a3f9a5fec4c29198953c06672a61a5

The allmodconfig-64 .text bloat appears to be zero as well:

  text       data     bss      dec        hex      filename        sha1
  27058207   34467402 30863436 92389045   581beb5  vmlinux.before  
b38c470330f38779ab0be08fd7d90053
  27058207   34467402 30863436 92389045   581beb5  vmlinux.after   
36c99be7cbebc13899ae22ced32fa2ec

Note that defconfig/allmodconfig is only a fraction of the kernel's 
complexity, especially if we consider the myriads of build time options 
in the Kconfig space plus the compiler variants out there.

Anyway, I agree that this needs fixing, so I have queued up your 
constraint fixes for x86/urgent, with a -stable tag.

We'll probably not push it to Linus until next week (-rc5 time), to make 
sure there are no surprises, and also to allow for any late review 
feedback.

Thanks,

        Ingo

Reply via email to