On Wed, Feb 19, 2025 at 09:31:50AM +0000, chiranmoy.bhattacha...@fujitsu.com wrote:
>> Hm. Any idea why that is? I wonder if the compiler isn't using as many
>> SVE registers as it could for this.
>
> Not sure, we tried forcing loop unrolling using the below line in the MakeFile
> but the results are the same.
>
> pg_popcount_sve.o: CFLAGS += ${CFLAGS_UNROLL_LOOPS} -march=native

Interesting. I do see different assembly with the 2 and 4 register
versions, but I didn't get to testing it on a machine with SVE support
today.

Besides some additional benchmarking, I might make some small adjustments
to the patch. But overall, it seems to be in decent shape.

--
nathan