> Date: Wed, 28 Dec 2016 08:29:05 +0100
> From: Martin Pieuchot <m...@openbsd.org>
> 
> On 28/12/16(Wed) 01:05, Jeremie Courreges-Anglas wrote:
> > Mark Kettenis <mark.kette...@xs4all.nl> writes:
> > 
> > >> Date: Sat, 24 Dec 2016 00:08:35 +0100 (CET)
> > >> From: Mark Kettenis <mark.kette...@xs4all.nl>
> > >> 
> > >> We already do this on some architectures, but not on amd64 for
> > >> example.  The main reason is that this disables memcpy() optimizations
> > >> that have a measurable impact on the network stack performance.
> > >> 
> > >> We can get those optimizations back by doing:
> > >> 
> > >> #define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
> > >> 
> > >> I verified that gcc still does proper bounds checking on
> > >> __builtin_memcpy(), so we don't lose that.
> > >> 
> > >> The nice thing about this solution is that we can choose explicitly
> > >> which optimizations we want.  And as you can see the kernel makefile
> > >> gets simpler ;).
> > >> 
> > >> Of course the real reason why I'm looking into this is that clang
> > >> makes it really hard to build kernels without -ffreestanding.
> > >> 
> > >> The diff below implements this strategy, and enables the optimizations
> > >> for memcpy() and memset().  We can add others if we think there is a
> > >> benefit.  I've tested the diff on amd64.  We may need to put an #undef
> > >> memcpy somewhere for platforms that use the generic C code for memcpy.
> > >> 
> > >> Thoughts?
> > >
> > > So those #undefs are necessary.  New diff below.  Tested on armv7,
> > > hppa and sparc64 now as well.
> > 
> > I think this is the way to go; can't help with tests on other archs, though.
> > ok jca@ fwiw
> 
> For the archives, Hrvoje Popovski measured a performance impact when using
> a kernel with this diff to forward packets.  I guess we're missing some
> defines.

The most likely candidate is memmove().  Here is a diff that adds the
define for it, plus the matching #undef in libkern's memmove.c.

Index: arch/amd64/conf/Makefile.amd64
===================================================================
RCS file: /cvs/src/sys/arch/amd64/conf/Makefile.amd64,v
retrieving revision 1.74
diff -u -p -r1.74 Makefile.amd64
--- arch/amd64/conf/Makefile.amd64      29 Nov 2016 09:08:34 -0000      1.74
+++ arch/amd64/conf/Makefile.amd64      28 Dec 2016 21:48:52 -0000
@@ -29,9 +29,7 @@ CWARNFLAGS=   -Werror -Wall -Wimplicit-fun
 
 CMACHFLAGS=    -mcmodel=kernel -mno-red-zone -mno-sse2 -mno-sse -mno-3dnow \
                -mno-mmx -msoft-float -fno-omit-frame-pointer
-CMACHFLAGS+=   -fno-builtin-printf -fno-builtin-snprintf \
-               -fno-builtin-vsnprintf -fno-builtin-log \
-               -fno-builtin-log2 -fno-builtin-malloc ${NOPIE_FLAGS}
+CMACHFLAGS+=   -ffreestanding ${NOPIE_FLAGS}
 .if ${IDENT:M-DNO_PROPOLICE}
 CMACHFLAGS+=   -fno-stack-protector
 .endif
Index: lib/libkern/memcpy.c
===================================================================
RCS file: /cvs/src/sys/lib/libkern/memcpy.c,v
retrieving revision 1.3
diff -u -p -r1.3 memcpy.c
--- lib/libkern/memcpy.c        12 Jun 2013 16:44:22 -0000      1.3
+++ lib/libkern/memcpy.c        28 Dec 2016 21:48:53 -0000
@@ -32,6 +32,8 @@
 #include <sys/types.h>
 #include <sys/systm.h>
 
+#undef memcpy
+
 /*
  * This is designed to be small, not fast.
  */
Index: lib/libkern/memmove.c
===================================================================
RCS file: /cvs/src/sys/lib/libkern/memmove.c,v
retrieving revision 1.1
diff -u -p -r1.1 memmove.c
--- lib/libkern/memmove.c       11 Jun 2013 18:04:41 -0000      1.1
+++ lib/libkern/memmove.c       28 Dec 2016 21:48:53 -0000
@@ -32,6 +32,8 @@
 #include <sys/types.h>
 #include <sys/systm.h>
 
+#undef memmove
+
 /*
  * This is designed to be small, not fast.
  */
Index: lib/libkern/memset.c
===================================================================
RCS file: /cvs/src/sys/lib/libkern/memset.c,v
retrieving revision 1.7
diff -u -p -r1.7 memset.c
--- lib/libkern/memset.c        10 Jun 2014 04:16:57 -0000      1.7
+++ lib/libkern/memset.c        28 Dec 2016 21:48:53 -0000
@@ -39,6 +39,8 @@
 #include <sys/systm.h>
 #include <lib/libkern/libkern.h>
 
+#undef memset
+
 #define        wsize   sizeof(u_int)
 #define        wmask   (wsize - 1)
 
Index: sys/systm.h
===================================================================
RCS file: /cvs/src/sys/sys/systm.h,v
retrieving revision 1.120
diff -u -p -r1.120 systm.h
--- sys/systm.h 19 Dec 2016 08:36:50 -0000      1.120
+++ sys/systm.h 28 Dec 2016 21:48:53 -0000
@@ -330,6 +330,10 @@ extern int (*mountroot)(void);
 
 #include <lib/libkern/libkern.h>
 
+#define memcpy(d, s, n)                __builtin_memcpy((d), (s), (n))
+#define memmove(d, s, n)       __builtin_memmove((d), (s), (n))
+#define memset(b, c, n)                __builtin_memset((b), (c), (n))
+
 #if defined(DDB) || defined(KGDB)
 /* debugger entry points */
 void   Debugger(void); /* in DDB only */
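
For anyone reading along in the archives, here is a rough userland sketch
of why the #undef lines are needed once systm.h carries these defines.  It
is not part of the diff.  kmove() and overlap_shift_ok() are made-up names
used only for illustration (kmove stands in for the libkern C memmove so
the sketch does not clash with libc), and it assumes gcc or clang for the
__builtin_* functions.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the systm.h hunk: calls expand to compiler builtins. */
#define memcpy(d, s, n)		__builtin_memcpy((d), (s), (n))
#define memmove(d, s, n)	__builtin_memmove((d), (s), (n))
#define memset(b, c, n)		__builtin_memset((b), (c), (n))

/*
 * Hypothetical stand-in for the generic C routine in libkern.  In the
 * kernel the define would arrive via <sys/systm.h>; it is spelled out
 * here only to keep the sketch in one file.
 */
#define kmove(d, s, n)		__builtin_memmove((d), (s), (n))

/*
 * Without this #undef the function definition below would itself be
 * macro-expanded and fail to compile.
 */
#undef kmove

void *
kmove(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (d < s) {
		/* Destination is below the source: copy forwards. */
		while (n-- > 0)
			*d++ = *s++;
	} else {
		/* Destination is above the source: copy backwards. */
		d += n;
		s += n;
		while (n-- > 0)
			*--d = *--s;
	}
	return dst;
}

/* Hypothetical check: the macro path and the function path agree. */
int
overlap_shift_ok(void)
{
	char a[8] = "abcdef";
	char b[8] = "abcdef";

	memmove(a + 1, a, 6);	/* expands to __builtin_memmove() */
	kmove(b + 1, b, 6);	/* ordinary function call after the #undef */

	assert(a[7] == '\0');	/* the copy stayed inside the buffer */
	return __builtin_memcmp(a, b, sizeof(a)) == 0 &&
	    a[1] == 'a' && a[6] == 'f';
}
```

Callers keep writing memmove(); only the file that implements the
out-of-line function has to drop the macro first.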
