Optimize popcount functions with ARM Neon intrinsics.

This commit introduces Neon implementations of pg_popcount{32,64},
pg_popcount(), and pg_popcount_masked().  As in simd.h, we assume
that all available AArch64 hardware supports Neon, so we don't need
any new configure-time or runtime checks.  Some compilers already
emit Neon instructions for these functions, but our hand-rolled
implementations for pg_popcount() and pg_popcount_masked() performed
better in testing, likely due to better instruction-level
parallelism.
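
To illustrate the technique the message describes, here is a minimal sketch of a Neon-based population count. It is not the code added in src/port/pg_popcount_aarch64.c; the names sketch_popcount(), sketch_popcount_masked(), and byte_popcount() are hypothetical, and the unrolling factor (two 16-byte vectors per iteration with independent accumulators) is an assumption chosen only to show how instruction-level parallelism might be exposed. On non-AArch64 builds the sketch falls back to a plain byte loop so that it still compiles.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: count set bits in one byte (portable). */
static inline uint64_t
byte_popcount(uint8_t b)
{
	uint64_t	n = 0;

	while (b)
	{
		b &= (uint8_t) (b - 1);	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical sketch: count set bits in buf after ANDing each byte
 * with mask.  Not the PostgreSQL implementation.
 */
static uint64_t
sketch_popcount_masked(const char *buf, size_t bytes, uint8_t mask)
{
	const uint8_t *p = (const uint8_t *) buf;
	uint64_t	popcnt = 0;

#if defined(__aarch64__)
	/* Two independent accumulators, so the two chains can overlap. */
	uint64x2_t	acc0 = vdupq_n_u64(0);
	uint64x2_t	acc1 = vdupq_n_u64(0);
	uint8x16_t	m = vdupq_n_u8(mask);

	while (bytes >= 32)
	{
		/* vcntq_u8 yields a per-byte bit count (0..8) in each lane. */
		uint8x16_t	c0 = vcntq_u8(vandq_u8(vld1q_u8(p), m));
		uint8x16_t	c1 = vcntq_u8(vandq_u8(vld1q_u8(p + 16), m));

		/* Widen 8 -> 16 -> 32 bits, then accumulate into 64-bit lanes. */
		acc0 = vpadalq_u32(acc0, vpaddlq_u16(vpaddlq_u8(c0)));
		acc1 = vpadalq_u32(acc1, vpaddlq_u16(vpaddlq_u8(c1)));

		p += 32;
		bytes -= 32;
	}

	/* Horizontal sum of both accumulators. */
	popcnt += vaddvq_u64(vaddq_u64(acc0, acc1));
#endif

	/* Remaining bytes (or all bytes on non-AArch64 builds). */
	while (bytes > 0)
	{
		popcnt += byte_popcount((uint8_t) (*p & mask));
		p++;
		bytes--;
	}

	return popcnt;
}

/* Hypothetical unmasked wrapper: a mask of 0xFF keeps every bit. */
static uint64_t
sketch_popcount(const char *buf, size_t bytes)
{
	return sketch_popcount_masked(buf, bytes, 0xFF);
}
```

Because every byte lane is widened before accumulation, the per-lane counts cannot overflow regardless of buffer length. Whether this particular unrolling beats compiler-generated code would have to be measured; the commit message only reports that the hand-rolled versions performed better in the authors' testing.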
Author: "chiranmoy.bhattacha...@fujitsu.com" <chiranmoy.bhattacha...@fujitsu.com>
Reviewed-by: John Naylor <johncnaylo...@gmail.com>
Discussion: https://postgr.es/m/010101936e4aaa70-b474ab9e-b9ce-474d-a3ba-a3dc223d295c-000000%40us-west-2.amazonses.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6be53c27673a5fca64a00a684c36c29db6ca33a5

Modified Files
--------------
src/include/port/pg_bitutils.h |   9 ++
src/port/Makefile              |   1 +
src/port/meson.build           |   1 +
src/port/pg_bitutils.c         |  22 +++--
src/port/pg_popcount_aarch64.c | 208 +++++++++++++++++++++++++++++++++++++++++
5 files changed, 235 insertions(+), 6 deletions(-)