> Date: Sun, 25 Dec 2016 09:40:05 +1100
> From: Jonathan Gray <[email protected]>
>
> On Sat, Dec 24, 2016 at 05:07:11PM +0100, Mark Kettenis wrote:
> > > Date: Sat, 24 Dec 2016 00:08:35 +0100 (CET)
> > > From: Mark Kettenis <[email protected]>
> > >
> > > We already do this on some architectures, but not on amd64 for
> > > example. The main reason is that this disables memcpy() optimizations
> > > that have a measurable impact on the network stack performance.
> > >
> > > We can get those optimizations back by doing:
> > >
> > > #define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
> > >
> > > I verified that gcc still does proper bounds checking on
> > > __builtin_memcpy(), so we don't lose that.
> > >
> > > The nice thing about this solution is that we can choose explicitly
> > > which optimizations we want. And as you can see the kernel makefile
> > > gets simpler ;).
> > >
> > > Of course the real reason why I'm looking into this is that clang
> > > makes it really hard to build kernels without -ffreestanding.
> > >
> > > The diff below implements this strategy, and enables the optimizations
> > > for memcpy() and memset(). We can add others if we think there is a
> > > benefit. I've tested the diff on amd64. We may need to put an #undef
> > > memcpy somewhere for platforms that use the generic C code for memcpy.
> > >
> > > Thoughts?
> >
> > So those #undefs are necessary. New diff below. Tested on armv7,
> > hppa and sparc64 now as well.

macppc tested now as well.

> I agree this is the way we want to go. It also avoids having
> to expand the list to -fno-builtin-free etc for newer versions of gcc.
>
> Why build the memcpy/memset in libkern at all if we go this route?

__builtin_memcpy() may still expand to an explicit memcpy() call if
the compiler decides not to inline it.

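To illustrate that point, here is a minimal userland sketch (not part of
the diff). The names kcopy, hdr_copy and buf_copy are made up for the
example; kcopy only stands in for the memcpy macro added to systm.h, so
that the sketch does not clash with libc. Whether the compiler inlines a
given copy depends on the compiler, target and optimization level, so
the comments describe what it is allowed to do, not what it is
guaranteed to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the memcpy macro from the diff. */
#define kcopy(d, s, n)	__builtin_memcpy((d), (s), (n))

struct hdr {
	unsigned int a;
	unsigned int b;
};

/*
 * Small, compile-time constant size: the compiler is free to expand
 * this into a couple of loads and stores instead of a function call.
 */
static inline void
hdr_copy(struct hdr *dst, const struct hdr *src)
{
	kcopy(dst, src, sizeof(*dst));
}

/*
 * Size only known at run time: the compiler may decide not to inline
 * and emit an ordinary call to the memcpy symbol instead.  In userland
 * libc provides that symbol; in the kernel an out-of-line memcpy still
 * has to be linked in.
 */
static inline void *
buf_copy(void *dst, const void *src, size_t len)
{
	return kcopy(dst, src, len);
}
```

Both helpers behave the same from the caller's point of view; the only
difference is whether the generated code ends up referencing memcpy.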
> > Index: sys/systm.h
> > ===================================================================
> > RCS file: /cvs/src/sys/sys/systm.h,v
> > retrieving revision 1.119
> > diff -u -p -r1.119 systm.h
> > --- sys/systm.h 24 Sep 2016 18:35:52 -0000 1.119
> > +++ sys/systm.h 24 Dec 2016 16:05:48 -0000
> > @@ -306,6 +306,9 @@ extern int (*mountroot)(void);
> >
> > #include <lib/libkern/libkern.h>
> >
> > +#define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
> > +#define memset(b, c, n) __builtin_memset((b), (c), (n))
> > +
> > #if defined(DDB) || defined(KGDB)
> > /* debugger entry points */
> > void Debugger(void); /* in DDB only */
> > Index: lib/libkern/memcpy.c
> > ===================================================================
> > RCS file: /cvs/src/sys/lib/libkern/memcpy.c,v
> > retrieving revision 1.3
> > diff -u -p -r1.3 memcpy.c
> > --- lib/libkern/memcpy.c 12 Jun 2013 16:44:22 -0000 1.3
> > +++ lib/libkern/memcpy.c 24 Dec 2016 16:05:48 -0000
> > @@ -32,6 +32,8 @@
> > #include <sys/types.h>
> > #include <sys/systm.h>
> >
> > +#undef memcpy
> > +
> > /*
> > * This is designed to be small, not fast.
> > */
> > Index: lib/libkern/memset.c
> > ===================================================================
> > RCS file: /cvs/src/sys/lib/libkern/memset.c,v
> > retrieving revision 1.7
> > diff -u -p -r1.7 memset.c
> > --- lib/libkern/memset.c 10 Jun 2014 04:16:57 -0000 1.7
> > +++ lib/libkern/memset.c 24 Dec 2016 16:05:48 -0000
> > @@ -39,6 +39,8 @@
> > #include <sys/systm.h>
> > #include <lib/libkern/libkern.h>
> >
> > +#undef memset
> > +
> > #define wsize sizeof(u_int)
> > #define wmask (wsize - 1)
> >
> > Index: arch/amd64/conf/Makefile.amd64
> > ===================================================================
> > RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
> > retrieving revision 1.74
> > diff -u -p -r1.74 Makefile.amd64
> > --- arch/amd64/conf/Makefile.amd64 29 Nov 2016 09:08:34 -0000 1.74
> > +++ arch/amd64/conf/Makefile.amd64 24 Dec 2016 16:05:49 -0000
> > @@ -29,9 +29,7 @@ CWARNFLAGS= -Werror -Wall -Wimplicit-fun
> >
> > CMACHFLAGS= -mcmodel=kernel -mno-red-zone -mno-sse2 -mno-sse -mno-3dnow \
> > -mno-mmx -msoft-float -fno-omit-frame-pointer
> > -CMACHFLAGS+= -fno-builtin-printf -fno-builtin-snprintf \
> > - -fno-builtin-vsnprintf -fno-builtin-log \
> > - -fno-builtin-log2 -fno-builtin-malloc ${NOPIE_FLAGS}
> > +CMACHFLAGS+= -ffreestanding ${NOPIE_FLAGS}
> > .if ${IDENT:M-DNO_PROPOLICE}
> > CMACHFLAGS+= -fno-stack-protector
> > .endif
> >
>