Jakub Jelinek wrote:
>On Thu, Apr 12, 2018 at 04:30:07PM +0000, Wilco Dijkstra wrote:
>> Jakub Jelinek wrote:
>> Frankly I don't see why it is a P1 regression. Do you have a benchmark that
>That is how regression priorities are defined.
How can one justify considering this a release blocker without hard numbers?
If this is a 1% regression on a large body of code it would be very serious, if
not so much.
>> >> So generally it's a good idea to change mempcpy into memcpy by default.
>> >> It's not slower than calling mempcpy even if you have a fast
>> >> implementation, it's faster if you use an up to date GLIBC which calls
>> >> memcpy, and it's significantly better when using an old GLIBC.
>> > mempcpy is quite good on many targets even in old GLIBCs.
>> Only true if with "many" you mean x86, x86_64 and IIRC sparc.
> Depending on what you mean old, I see e.g. in 2010 power7 mempcpy got added,
> in 2013 other power versions, in 2016 s390*, etc. Doing a decent mempcpy
> isn't hard if you have asm version of memcpy and one spare register.
More mempcpy implementations have indeed been added in recent years, but almost
all add an extra copy of the memcpy code rather than using a single combined
implementation.
That means it is still better to call memcpy (which is frequently used and thus
likely in L1/L2) rather than mempcpy (which is more likely to be cold and thus
not cached).
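
For reference, here is a minimal sketch of the source-level equivalence the
transformation relies on: mempcpy (dst, src, n) returns dst + n, so the same
result can be had from a memcpy call plus an add. The names copy_and_advance
and join2 are hypothetical, made up for illustration only. This is not the
actual GCC expansion code, just the idea expressed in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a GLIBC or GCC function): returns what
   mempcpy (dst, src, n) would return, using only memcpy.  */
static void *copy_and_advance(void *dst, const void *src, size_t n)
{
    /* memcpy returns dst; adding n gives the one-past-the-end pointer.  */
    return (char *)memcpy(dst, src, n) + n;
}

/* Hypothetical usage: concatenate two strings into buf (capacity cap),
   the typical pattern where code calls mempcpy.  */
static char *join2(char *buf, size_t cap, const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    assert(la + lb + 1 <= cap);        /* room for both strings plus NUL */
    char *p = copy_and_advance(buf, a, la);
    p = copy_and_advance(p, b, lb);
    *p = '\0';
    return buf;
}
```

The only extra cost at the call site is keeping n live across the call and one
add afterwards, which is the trade-off being weighed against calling a
separate, possibly cold, mempcpy entry point.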