commit 5d48b3120a651eee088cced1b4cffd8a264722c6
Author: Matthew Dillon <dil...@apollo.backplane.com>
Date:   Sat May 5 21:52:37 2018 -0700
kernel - Refactor bcmp, bcopy, bzero, memset

* For now, continue to use stosq/stosb, movsq/movsb, and cmpsq/cmpsb
  sequences, which are well optimized on both AMD and Intel. Do not
  just use the '*b' string op: while it is optimized on Intel, it is
  not optimized on AMD.

* Note that two string ops in a row result in a serious pessimization.
  To fix this, for now, conditionalize the trailing movsb, stosb, or
  cmpsb op so it is only executed when the remaining count is
  non-zero. That is, assume nominal 8-byte alignment.

* Refactor pagezero() to use a movq/addq/jne sequence. This is
  significantly faster than movsq on AMD and only very slightly slower
  than movsq on Intel.

* Also use the above adjusted kernel code in libc for these functions,
  with minor modifications. Since we are copying the code wholesale,
  replace the copyright for the related files in libc.

* Refactor libc's memset() to replicate the fill byte across all 64
  bits and then use code similar to bzero().

Reported-by: mjg_ (info on pessimizations)

Summary of changes:
 lib/libc/x86_64/string/bcmp.S      |  53 +++++++++++++-----
 lib/libc/x86_64/string/bcopy.S     |  87 +++++++++++++++---------------
 lib/libc/x86_64/string/bzero.S     |  69 +++++++++++++-----------
 lib/libc/x86_64/string/memset.S    | 107 +++++++++++++++++++++----------------
 sys/platform/pc64/x86_64/support.s |  40 +++++++++++---
 5 files changed, 212 insertions(+), 144 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/5d48b3120a651eee088cced1b4cffd8a264722c6

--
DragonFly BSD source repository
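The memset() change relies on replicating the fill byte into every byte lane of a 64-bit word, so stores can proceed 8 bytes at a time. A minimal C sketch of that step (the function name is hypothetical; the actual committed code is x86-64 assembly):

```c
#include <stdint.h>

/* Sketch of the byte-replication step: multiplying the fill byte by
 * 0x0101010101010101 copies it into all eight byte lanes (no carries,
 * since the byte times each 0x01 lane stays below 0x100), after which
 * memset() can store 64 bits per iteration, bzero()-style. */
static uint64_t replicate_byte(unsigned char c)
{
    return (uint64_t)c * 0x0101010101010101ULL;
}
```

For example, replicate_byte(0xAB) evaluates to 0xABABABABABABABAB.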
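The "two string ops in a row" fix can be illustrated in C: do the bulk of the work in 8-byte units (the stosq/movsq/cmpsq part) and enter the byte-granularity tail (the stosb part) only when the remaining count is non-zero. A hedged sketch, assuming an 8-byte-aligned destination; the function name is hypothetical and the real code is assembly:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the refactored bzero() control flow: clear 8 bytes per
 * iteration, then branch over the byte loop entirely when the length
 * is a multiple of 8, so the second "string op" never issues in the
 * common aligned case. Assumes dst is suitably aligned for uint64_t. */
static void bzero_sketch(void *dst, size_t len)
{
    uint64_t *q = dst;
    size_t nquads = len >> 3;

    while (nquads--)
        *q++ = 0;          /* corresponds to rep stosq */

    len &= 7;
    if (len) {             /* conditionalized: skipped when count is 0 */
        unsigned char *b = (unsigned char *)q;
        while (len--)
            *b++ = 0;      /* corresponds to rep stosb */
    }
}
```

The conditional branch around the tail is the point of the bullet: when the length is quad-aligned, only one "string op" worth of work executes.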
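The pagezero() bullet replaces the string instruction with a plain store/advance/branch loop (movq/addq/jne). A C sketch of the equivalent structure, under the assumption of a 4096-byte, 8-byte-aligned page (the function name is hypothetical):

```c
#include <stdint.h>

#define PAGE_SIZE 4096  /* assumed x86-64 page size */

/* Sketch of the movq/addq/jne pagezero() loop: a plain 8-byte store,
 * pointer advance, and conditional branch instead of a string op. */
static void pagezero_sketch(void *page)
{
    uint64_t *p = page;
    uint64_t *end = p + PAGE_SIZE / sizeof(uint64_t);

    do {
        *p++ = 0;       /* movq $0,(%rdi); addq $8,%rdi */
    } while (p != end); /* cmpq/jne */
}
```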