Mark Kettenis <[email protected]> writes:

>> Date: Sat, 24 Dec 2016 00:08:35 +0100 (CET)
>> From: Mark Kettenis <[email protected]>
>> 
>> We already do this on some architectures, but not on amd64 for
>> example.  The main reason is that this disables memcpy() optimizations
>> that have a measurable impact on the network stack performance.
>> 
>> We can get those optimizations back by doing:
>> 
>> #define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
>> 
>> I verified that gcc still does proper bounds checking on
>> __builtin_memcpy(), so we don't lose that.
>> 
>> The nice thing about this solution is that we can choose explicitly
>> which optimizations we want.  And as you can see the kernel makefile
>> gets simpler ;).
>> 
>> Of course the real reason why I'm looking into this is that clang
>> makes it really hard to build kernels without -ffreestanding.
>> 
>> The diff below implements this strategy and enables the optimizations
>> for memcpy() and memset().  We can add others if we think there is a
>> benefit.  I've tested the diff on amd64.  We may need to put an #undef
>> memcpy somewhere for platforms that use the generic C code for memcpy.
>> 
>> Thoughts?
>
> So those #undefs are necessary.  New diff below.  Tested on armv7,
> hppa and sparc64 now as well.

I think this is the way to go; I can't help with testing on other
archs, though.
ok jca@ fwiw
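
For anyone following along, here is a rough sketch of the idea as I
understand it, not the actual diff.  The names k_memcpy, struct hdr and
copy_hdr are made up so the example builds in userland without clashing
with libc; in the kernel the macro would be named memcpy as quoted
above.  It also shows why a file holding the generic C implementation
needs the #undef: without it, the macro would expand inside the
function definition itself and the file would not compile.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's memcpy() prototype. */
void *k_memcpy(void *, const void *, size_t);

/*
 * Route calls to the compiler builtin.  With -ffreestanding the
 * compiler no longer treats a plain memcpy() call as special, so the
 * macro opts back in to that one optimization explicitly.
 */
#define k_memcpy(d, s, n) __builtin_memcpy((d), (s), (n))

/* Hypothetical small fixed-size structure, e.g. a packet header. */
struct hdr {
	unsigned int src;
	unsigned int dst;
};

/* The size is a compile-time constant, so the copy can be inlined. */
void
copy_hdr(struct hdr *to, const struct hdr *from)
{
	k_memcpy(to, from, sizeof(*to));
}

/*
 * A platform that uses the generic C implementation must drop the
 * macro before defining the real function.
 */
#undef k_memcpy

void *
k_memcpy(void *d, const void *s, size_t n)
{
	unsigned char *dp = d;
	const unsigned char *sp = s;

	/* Plain byte-by-byte forward copy. */
	while (n-- > 0)
		*dp++ = *sp++;
	return d;
}
```

Whether the copy in copy_hdr() really ends up inlined depends on the
compiler, the optimization level and the target, so treat that part as
an expectation rather than a guarantee.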

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
