Cc'ing in Liang Li, who did the original avx2 code.

Dave

* Richard Henderson (r...@twiddle.net) wrote:
> Patches 1-3 remove the use of ifunc from the implementation.
>
> Patch 5 adjusts the x86 implementation a bit more to take
> advantage of ptest (in sse4.1) and unaligned accesses (in avx1).
>
> Patches 2 and 6 are the result of my conversation with Vijaya
> Kumar with respect to ThunderX.
>
> Patch 7 is the result of seeing some really really horrible code
> produced for ppc64le (gcc 4.9 and mainline).
>
> This has had limited testing.  What I don't know is the best way
> to benchmark this -- the only way I know to trigger this is via
> the console, by hand, which doesn't make for reasonable timing.
>
>
> r~
>
>
> Richard Henderson (7):
>   cutils: Remove SPLAT macro
>   cutils: Export only buffer_is_zero
>   cutils: Rearrange buffer_is_zero acceleration
>   cutils: Add generic prefetch
>   cutils: Rewrite x86 buffer zero checking
>   cutils: Rewrite aarch64 buffer zero checking
>   cutils: Rewrite ppc buffer zero checking
>
>  configure             |  21 +-
>  include/qemu/cutils.h |   2 -
>  migration/ram.c       |   2 +-
>  migration/rdma.c      |   5 +-
>  util/cutils.c         | 526 +++++++++++++++++++++++++++++++++-----------------
>  5 files changed, 352 insertions(+), 204 deletions(-)
>
> --
> 2.7.4
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK