Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-29 Thread Nathan Chancellor
On Tue, Oct 22, 2019 at 03:57:09AM -0500, Segher Boessenkool wrote:
> On Mon, Oct 21, 2019 at 10:15:29PM -0700, Nathan Chancellor wrote:
> > On Fri, Oct 18, 2019 at 03:02:10PM -0500, Segher Boessenkool wrote:
> > > I think the proper solution is for the kernel to *do* use -ffreestanding,
> > > and then somehow tell the kernel that memcpy etc. are the standard
> > > functions.  A freestanding GCC already requires memcpy, memmove, memset,
> > > memcmp, and sometimes abort to exist and do the standard thing; why cannot
> > > programs then also rely on it to be the standard functions.
> > > 
> > > What exact functions are the reason the kernel does not use 
> > > -ffreestanding?
> > > Is it just memcpy?  Is more wanted?
> > 
> > I think Linus summarized it pretty well here:
> > 
> > https://lore.kernel.org/lkml/CAHk-=wi-epJZfBHDbKKDZ64us7WkF=lpufhvybmzsteo8q0...@mail.gmail.com/
> 
> GCC recognises __builtin_memcpy (or any other __builtin) just fine even
> with -ffreestanding.
> 
> So the kernel wants a warning (or error) whenever a call to one of these
> library functions is generated by the compiler without the user asking
> for it directly (via a __builtin)?  And that is all that is needed for
> the kernel to use -ffreestanding?
> 
> That shouldn't be hard.  Anything missing here?
> 
> 
> Segher

Yes, I suppose that would be good enough.

I don't know if there are any other optimizations that are missed out on
by using -ffreestanding. It would probably be worth asking other kernel
developers on a separate thread (or the one I linked above).

Would be nice to get this shored up soon since our PowerPC builds have
been broken since the beginning of August :/

Cheers,
Nathan


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-22 Thread Segher Boessenkool
On Mon, Oct 21, 2019 at 10:15:29PM -0700, Nathan Chancellor wrote:
> On Fri, Oct 18, 2019 at 03:02:10PM -0500, Segher Boessenkool wrote:
> > I think the proper solution is for the kernel to *do* use -ffreestanding,
> > and then somehow tell the kernel that memcpy etc. are the standard
> > functions.  A freestanding GCC already requires memcpy, memmove, memset,
> > memcmp, and sometimes abort to exist and do the standard thing; why cannot
> > programs then also rely on it to be the standard functions.
> > 
> > What exact functions are the reason the kernel does not use -ffreestanding?
> > Is it just memcpy?  Is more wanted?
> 
> I think Linus summarized it pretty well here:
> 
> https://lore.kernel.org/lkml/CAHk-=wi-epJZfBHDbKKDZ64us7WkF=lpufhvybmzsteo8q0...@mail.gmail.com/

GCC recognises __builtin_memcpy (or any other __builtin) just fine even
with -ffreestanding.

So the kernel wants a warning (or error) whenever a call to one of these
library functions is generated by the compiler without the user asking
for it directly (via a __builtin)?  And that is all that is needed for
the kernel to use -ffreestanding?

That shouldn't be hard.  Anything missing here?


Segher


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-21 Thread Nathan Chancellor
On Fri, Oct 18, 2019 at 03:02:10PM -0500, Segher Boessenkool wrote:
> On Fri, Oct 18, 2019 at 12:00:22PM -0700, Nathan Chancellor wrote:
> > Just as an FYI, there was some more discussion around the availablity
> > and use of bcmp in this LLVM bug which spawned
> > commit 5f074f3e192f ("lib/string.c: implement a basic bcmp").
> > 
> > https://bugs.llvm.org/show_bug.cgi?id=41035#c13
> > 
> > I believe this is the proper solution but I am fine with whatever works,
> > I just want our CI to be green without any out of tree patches again...
> 
> I think the proper solution is for the kernel to *do* use -ffreestanding,
> and then somehow tell the kernel that memcpy etc. are the standard
> functions.  A freestanding GCC already requires memcpy, memmove, memset,
> memcmp, and sometimes abort to exist and do the standard thing; why cannot
> programs then also rely on it to be the standard functions.
> 
> What exact functions are the reason the kernel does not use -ffreestanding?
> Is it just memcpy?  Is more wanted?
> 
> 
> Segher

I think Linus summarized it pretty well here:

https://lore.kernel.org/lkml/CAHk-=wi-epJZfBHDbKKDZ64us7WkF=lpufhvybmzsteo8q0...@mail.gmail.com/

Cheers,
Nathan


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-18 Thread Segher Boessenkool
On Fri, Oct 18, 2019 at 12:00:22PM -0700, Nathan Chancellor wrote:
> Just as an FYI, there was some more discussion around the availablity
> and use of bcmp in this LLVM bug which spawned
> commit 5f074f3e192f ("lib/string.c: implement a basic bcmp").
> 
> https://bugs.llvm.org/show_bug.cgi?id=41035#c13
> 
> I believe this is the proper solution but I am fine with whatever works,
> I just want our CI to be green without any out of tree patches again...

I think the proper solution is for the kernel to *do* use -ffreestanding,
and then somehow tell the kernel that memcpy etc. are the standard
functions.  A freestanding GCC already requires memcpy, memmove, memset,
memcmp, and sometimes abort to exist and do the standard thing; why cannot
programs then also rely on it to be the standard functions.

What exact functions are the reason the kernel does not use -ffreestanding?
Is it just memcpy?  Is more wanted?


Segher


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-18 Thread Nathan Chancellor
On Mon, Oct 14, 2019 at 02:11:41PM -0500, Segher Boessenkool wrote:
> On Mon, Oct 14, 2019 at 08:56:12AM -0700, Nick Desaulniers wrote:
> > On Mon, Oct 14, 2019 at 2:35 AM Segher Boessenkool
> >  wrote:
> > >
> > > On Sun, Oct 13, 2019 at 07:51:01PM -0700, Nathan Chancellor wrote:
> > > > r374662 gives LLVM the ability to convert certain loops into a reference
> > > > to bcmp as an optimization; this breaks prom_init_check.sh:
> > >
> > > When/why does LLVM think this is okay?  This function has been removed
> > > from POSIX over a decade ago (and before that it always was marked as
> > > legacy).
> > 
> > Segher, do you have links for any of the above? If so, that would be
> > helpful to me.
> 
> Sure!
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xsh_chap03.html
> 
> Older versions are harder to find online, unfortunately.  But there is
> 
> https://kernel.org/pub/linux/docs/man-pages/man-pages-posix/
> 
> in which man3p/bcmp.3p says:
> 
> FUTURE DIRECTIONS
>This function may be withdrawn in a future version.
> 
> Finally, the Linux man pages say (man bcmp):
> 
> CONFORMING TO
>4.3BSD.   This  function   is   deprecated   (marked   as   LEGACY   in
>POSIX.1-2001): use memcmp(3) in new programs.  POSIX.1-2008 removes the
>specification of bcmp().
> 
> 
> > I'm arguing against certain transforms that assume that
> > one library function is faster than another, when such claims are
> > based on measurements from one stdlib implementation.
> 
> Wow.  The difference between memcmp and bcmp is trivial (just the return
> value is different, and that costs hardly anything to add).  And memcmp
> is guaranteed to exist since C89/C90 at least.
> 
> > The rationale for why it was added was that memcmp takes a measurable
> > amount of time in Google's fleet, and most calls to memcmp don't care
> > about the position of the mismatch; bcmp is lower overhead (or at
> > least for our libc implementation, not sure about others).
> 
> You just have to do the read of the last words you compare as big-endian,
> and then you can just subtract the two words, convert that to "int" (which
> is very inconvenient to do, but hardly expensive), and there you go.
> 
> Or on x86 use the bswap insn, or something like it.
> 
> Or, if you use GCC, it has __builtin_memcmp but also __builtin_memcmp_eq,
> and those are automatically used, too.
> 
> 
> Segher

Just as an FYI, there was some more discussion around the availablity
and use of bcmp in this LLVM bug which spawned
commit 5f074f3e192f ("lib/string.c: implement a basic bcmp").

https://bugs.llvm.org/show_bug.cgi?id=41035#c13

I believe this is the proper solution but I am fine with whatever works,
I just want our CI to be green without any out of tree patches again...

Cheers,
Nathan


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-14 Thread Segher Boessenkool
On Mon, Oct 14, 2019 at 08:56:12AM -0700, Nick Desaulniers wrote:
> On Mon, Oct 14, 2019 at 2:35 AM Segher Boessenkool
>  wrote:
> >
> > On Sun, Oct 13, 2019 at 07:51:01PM -0700, Nathan Chancellor wrote:
> > > r374662 gives LLVM the ability to convert certain loops into a reference
> > > to bcmp as an optimization; this breaks prom_init_check.sh:
> >
> > When/why does LLVM think this is okay?  This function has been removed
> > from POSIX over a decade ago (and before that it always was marked as
> > legacy).
> 
> Segher, do you have links for any of the above? If so, that would be
> helpful to me.

Sure!

https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xsh_chap03.html

Older versions are harder to find online, unfortunately.  But there is

https://kernel.org/pub/linux/docs/man-pages/man-pages-posix/

in which man3p/bcmp.3p says:

FUTURE DIRECTIONS
   This function may be withdrawn in a future version.

Finally, the Linux man pages say (man bcmp):

CONFORMING TO
   4.3BSD.   This  function   is   deprecated   (marked   as   LEGACY   in
   POSIX.1-2001): use memcmp(3) in new programs.  POSIX.1-2008 removes the
   specification of bcmp().


> I'm arguing against certain transforms that assume that
> one library function is faster than another, when such claims are
> based on measurements from one stdlib implementation.

Wow.  The difference between memcmp and bcmp is trivial (just the return
value is different, and that costs hardly anything to add).  And memcmp
is guaranteed to exist since C89/C90 at least.

> The rationale for why it was added was that memcmp takes a measurable
> amount of time in Google's fleet, and most calls to memcmp don't care
> about the position of the mismatch; bcmp is lower overhead (or at
> least for our libc implementation, not sure about others).

You just have to do the read of the last words you compare as big-endian,
and then you can just subtract the two words, convert that to "int" (which
is very inconvenient to do, but hardly expensive), and there you go.

Or on x86 use the bswap insn, or something like it.

Or, if you use GCC, it has __builtin_memcmp but also __builtin_memcmp_eq,
and those are automatically used, too.


Segher


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-14 Thread Nick Desaulniers
On Mon, Oct 14, 2019 at 2:35 AM Segher Boessenkool
 wrote:
>
> On Sun, Oct 13, 2019 at 07:51:01PM -0700, Nathan Chancellor wrote:
> > r374662 gives LLVM the ability to convert certain loops into a reference
> > to bcmp as an optimization; this breaks prom_init_check.sh:
>
> When/why does LLVM think this is okay?  This function has been removed
> from POSIX over a decade ago (and before that it always was marked as
> legacy).

Segher, do you have links for any of the above? If so, that would be
helpful to me. I'm arguing against certain transforms that assume that
one library function is faster than another, when such claims are
based on measurements from one stdlib implementation. (There's others
in the pipeline I'm not too thrilled about, too).

The rationale for why it was added was that memcmp takes a measurable
amount of time in Google's fleet, and most calls to memcmp don't care
about the position of the mismatch; bcmp is lower overhead (or at
least for our libc implementation, not sure about others).
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-14 Thread Segher Boessenkool
On Sun, Oct 13, 2019 at 07:51:01PM -0700, Nathan Chancellor wrote:
> r374662 gives LLVM the ability to convert certain loops into a reference
> to bcmp as an optimization; this breaks prom_init_check.sh:

When/why does LLVM think this is okay?  This function has been removed
from POSIX over a decade ago (and before that it always was marked as
legacy).


Segher


[PATCH v4 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-10-13 Thread Nathan Chancellor
r374662 gives LLVM the ability to convert certain loops into a reference
to bcmp as an optimization; this breaks prom_init_check.sh:

  CALLarch/powerpc/kernel/prom_init_check.sh
Error: External symbol 'bcmp' referenced from prom_init.c
make[2]: *** [arch/powerpc/kernel/Makefile:196: prom_init_check] Error 1

bcmp is defined in lib/string.c as a wrapper for memcmp so this could be
added to the whitelist. However, commit 450e7dd4001f ("powerpc/prom_init:
don't use string functions from lib/") copied memcmp as prom_memcmp to
avoid KASAN instrumentation so having bcmp be resolved to regular memcmp
would break that assumption. Furthermore, because the compiler is the
one that inserted bcmp, we cannot provide something like prom_bcmp.

To prevent LLVM from being clever with optimizations like this, use
-ffreestanding to tell LLVM we are not hosted so it is not free to make
transformations like this.

Link: https://github.com/ClangBuiltLinux/linux/issues/647
Link: 
https://github.com/llvm/llvm-project/commit/76cdcf25b883751d83402baea6316772aa73865c
Reviewed-by: Nick Desaulneris 
Signed-off-by: Nathan Chancellor 
---

v1 -> v3:

* New patch in the series

v3 -> v4:

* Rebase on v5.4-rc3.

* Add Nick's reviewed-by tag.

* Update the LLVM commit reference to the latest applied version (r374662)
  as it was originally committed as r370454, reverted in r370788, and
  reapplied as r374662.

 arch/powerpc/kernel/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index f1f362146135..7f0ee465dfb6 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -21,7 +21,7 @@ CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 
-CFLAGS_prom_init.o += $(call cc-option, -fno-stack-protector)
+CFLAGS_prom_init.o += $(call cc-option, -fno-stack-protector) -ffreestanding
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
-- 
2.23.0