Use AVX2 for calculating page checksums where available

We already rely on autovectorization for computing page checksums,
but on x86 we can get a further several-fold performance increase by
annotating pg_checksum_block() with a function target attribute for
the AVX2 instruction set extension. Not only does that use 256-bit
registers, it can also use vector multiplication rather than the
vector shifts and adds used in SSE2.

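As a rough illustration of the technique (not the committed code), the
sketch below compiles the same autovectorizable loop twice, once with
the default target and once under a function target attribute for AVX2,
and picks one through a function pointer on first call. All names here
(mix_default, mix_avx2, mix_choose, mix_block, MIX_LANES) are
hypothetical, the mixing step is only a stand-in for the real checksum
arithmetic, and __builtin_cpu_supports() is a GCC/Clang builtin used
here as an assumption in place of PostgreSQL's own CPU detection.

```c
#include <stddef.h>
#include <stdint.h>

#define N_LANES 32            /* independent partial sums, easy to vectorize */
#define MIX_PRIME 16777619u   /* FNV prime, used here only as an example constant */

/* Loop shared by both variants; the compiler vectorizes it per target. */
#define MIX_LANES(sums, words, nwords)                                \
    do {                                                              \
        for (size_t i = 0; i + N_LANES <= (nwords); i += N_LANES)     \
            for (size_t j = 0; j < N_LANES; j++) {                    \
                uint32_t t = (sums)[j] ^ (words)[i + j];              \
                (sums)[j] = (t * MIX_PRIME) ^ (t >> 17);              \
            }                                                         \
    } while (0)

/* Combine the per-lane sums into one value. */
static uint32_t fold(const uint32_t *sums)
{
    uint32_t result = 0;
    for (size_t j = 0; j < N_LANES; j++)
        result ^= sums[j];
    return result;
}

/* Baseline variant, built with whatever the default target allows. */
static uint32_t mix_default(const uint32_t *words, size_t nwords)
{
    uint32_t sums[N_LANES] = {0};
    MIX_LANES(sums, words, nwords);
    return fold(sums);
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Same source, but the compiler may use AVX2 instructions in this function. */
__attribute__((target("avx2")))
static uint32_t mix_avx2(const uint32_t *words, size_t nwords)
{
    uint32_t sums[N_LANES] = {0};
    MIX_LANES(sums, words, nwords);
    return fold(sums);
}
#endif

static uint32_t mix_choose(const uint32_t *words, size_t nwords);

/* Starts out pointing at the chooser, which replaces itself on first use. */
static uint32_t (*mix_block)(const uint32_t *, size_t) = mix_choose;

static uint32_t mix_choose(const uint32_t *words, size_t nwords)
{
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx2"))
        mix_block = mix_avx2;
    else
#endif
        mix_block = mix_default;
    return mix_block(words, nwords);
}
```

Callers always go through mix_block(); the first call pays for the CPU
check, and every later call is a plain indirect call, which is
negligible next to a multi-kilobyte input.
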
Similar to other hardware-specific paths, we set a function pointer
on first use. We don't bother to avoid this on platforms without AVX2
since the overhead of indirect calls doesn't matter for multi-kilobyte
inputs. However, we do arrange so that only core has the function
pointer mechanism. External programs will continue to build a normal
static function and don't need to be aware of this.

This matters most when using io_uring since in that case the checksum
computation is not done in parallel by IO workers.

Co-authored-by: Matthew Sterrett <[email protected]>
Co-authored-by: Andrew Kim <[email protected]>
Reviewed-by: Oleg Tselebrovskiy <[email protected]>
Tested-by: Ants Aasma <[email protected]>
Tested-by: Stepan Neretin <[email protected]> (earlier version)
Discussion: https://postgr.es/m/CA+vA85_5GTu+HHniSbvvP+8k3=xZO=we84npwikyxztqvpf...@mail.gmail.com
Discussion: https://postgr.es/m/20250911054220.3784-1-root%40ip-172-31-36-228.ec2.internal

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5e13b0f240397b210a0af11f83204d0b4f1713c2

Modified Files
--------------
config/c-compiler.m4                     | 25 +++++++++++++++
configure                                | 44 ++++++++++++++++++++++++++
configure.ac                             |  9 ++++++
meson.build                              | 27 ++++++++++++++++
src/backend/storage/page/checksum.c      | 44 +++++++++++++++++++++++++-
src/include/pg_config.h.in               |  3 ++
src/include/port/pg_cpu.h                |  3 ++
src/include/storage/checksum_block.inc.c | 42 +++++++++++++++++++++++++
src/include/storage/checksum_impl.h      | 53 ++++++++++++--------------------
src/port/pg_cpu_x86.c                    |  4 +++
10 files changed, 219 insertions(+), 35 deletions(-)
