Re: [QUESTION,RFC] cacheable_memcpy() versus memcpy() == 8% improvement on FTP throughput
On Wed, 2015-02-11 at 08:53 +0100, leroy christophe wrote:
> In the powerpc32 architecture there is a function called
> cacheable_memcpy(), which does the same thing as memcpy() but uses the
> dcbz/dcbt instructions for an optimised copy (just like
> __copy_tofrom_user()).
>
> What seems strange is that it is used almost nowhere (only in
> drivers/net/ethernet/ibm/emac/core.c).
>
> As a test, I replaced all memcpy() calls in include/linux/skbuff.h and
> net/core/skbuff.c with cacheable_memcpy() and got around 8% improvement
> in FTP throughput on an MPC885.
>
> What could be done to generalise the use of cacheable_memcpy() instead
> of memcpy() whenever possible?
>
> Indeed, in order to use cacheable_memcpy(), we need:
> * The destination to be cacheable
> * The source and destination to not overlap on the same cachelines
>
> Could we check, when calling memcpy(), whether the destination is
> cacheable, and if so redirect the call to cacheable_memcpy()? How can
> we check that?

Additionally, we could have a P8 implementation that uses unaligned vectors.

Adding Anton to the CC list.

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[QUESTION,RFC] cacheable_memcpy() versus memcpy() == 8% improvement on FTP throughput
In the powerpc32 architecture there is a function called cacheable_memcpy(), which does the same thing as memcpy() but uses the dcbz/dcbt instructions for an optimised copy (just like __copy_tofrom_user()).

What seems strange is that it is used almost nowhere (only in drivers/net/ethernet/ibm/emac/core.c).

As a test, I replaced all memcpy() calls in include/linux/skbuff.h and net/core/skbuff.c with cacheable_memcpy() and got around 8% improvement in FTP throughput on an MPC885.

What could be done to generalise the use of cacheable_memcpy() instead of memcpy() whenever possible?

Indeed, in order to use cacheable_memcpy(), we need:
* The destination to be cacheable
* The source and destination to not overlap on the same cachelines

Could we check, when calling memcpy(), whether the destination is cacheable, and if so redirect the call to cacheable_memcpy()? How can we check that?

Christophe
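
The two preconditions listed in the question can be sketched as a small dispatch check in plain C. This is only an illustration, not kernel code: the names ranges_share_cacheline(), choose_copy_path() and assume_cacheable() are hypothetical, the 16-byte line size is an assumption that would have to match the target core, and the cacheability test is left as a caller-supplied predicate because the thread does not say how it could be determined. The cache-line check matters because dcbz zeroes a whole destination line before it is filled, so a source byte sharing that line would be lost.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16-byte cache lines; adjust to the target core. */
#define CACHE_LINE_SIZE 16u

/* Hypothetical predicate type: does [dst, dst + n) lie in cacheable memory? */
typedef bool (*cacheable_pred)(const void *dst, size_t n);

enum copy_path { COPY_GENERIC, COPY_CACHEABLE };

/* True if the source and destination ranges touch at least one common cache line. */
static bool ranges_share_cacheline(const void *dst, const void *src, size_t n)
{
    if (n == 0)
        return false;

    uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t d_first = (uintptr_t)dst & mask;
    uintptr_t d_last = ((uintptr_t)dst + n - 1) & mask;
    uintptr_t s_first = (uintptr_t)src & mask;
    uintptr_t s_last = ((uintptr_t)src + n - 1) & mask;

    /* Two closed intervals of line addresses intersect. */
    return d_first <= s_last && s_first <= d_last;
}

/* Placeholder predicate for illustration only: treats every destination as cacheable. */
static bool assume_cacheable(const void *dst, size_t n)
{
    (void)dst;
    (void)n;
    return true;
}

/* Pick the cacheable path only when both preconditions hold; otherwise fall back. */
static enum copy_path choose_copy_path(const void *dst, const void *src, size_t n,
                                       cacheable_pred dst_is_cacheable)
{
    if (dst_is_cacheable == NULL || !dst_is_cacheable(dst, n))
        return COPY_GENERIC;
    if (ranges_share_cacheline(dst, src, n))
        return COPY_GENERIC;
    return COPY_CACHEABLE;
}
```

A caller would branch on the result of choose_copy_path(), using cacheable_memcpy() for COPY_CACHEABLE and plain memcpy() otherwise. Whether the extra check costs less than it saves on small copies is an open question that this sketch does not address.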