Optimize pg_memory_is_all_zeros() in memutils.h pg_memory_is_all_zeros() is currently implemented to do only a byte-per-byte comparison. While being sufficient for its existing callers for pgstats entries, it could lead to performance regressions should it be used for larger memory areas, like 8kB blocks, or even future commits.
This commit optimizes the implementation of this function to be more efficient for larger sizes, written in a way so that compilers can optimize the code. This is portable across 32b and 64b architectures. The implementation handles three cases, depending on the size of the input provided: * If less than sizeof(size_t), do a simple byte-by-byte comparison. * If between sizeof(size_t) and (sizeof(size_t) * 8 - 1): ** Phase 1: byte-by-byte comparison, until the pointer is aligned. ** Phase 2: size_t comparisons, with aligned pointers, up to the last aligned location possible. ** Phase 3: byte-by-byte comparison, until the end location. * If more than (sizeof(size_t) * 8) bytes, this is the same as case 2 except that an additional phase is placed between Phase 1 and Phase 2, with 8 * sizeof(size_t) comparisons using bitwise OR, to encourage compilers to use SIMD instructions if available. The last improvement proves to be at least 3 times faster than the size_t comparisons, which is something currently used for the all-zero page check in PageIsVerifiedExtended(). The optimization tricks that would encourage the use of SIMD instructions have been suggested by David Rowley. Author: Bertrand Drouvot Reviewed-by: Michael Paquier, Ranier Vilela Discussion: https://postgr.es/m/caaphdvq7p-jgfhgtxupqhavg-qsdvuhywaex9m8_mnorfei...@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/5be1dabd2ae0cf48d927aad363c4b65507e38b25 Modified Files -------------- src/include/utils/memutils.h | 119 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 116 insertions(+), 3 deletions(-)