> -----Original Message-----
> From: Nathan Bossart <[email protected]>
> Sent: Friday, March 15, 2024 8:06 AM
> To: Amonson, Paul D <[email protected]>
> Cc: Andres Freund <[email protected]>; Alvaro Herrera <[email protected]
> ip.org>; Shankaran, Akash <[email protected]>; Noah Misch
> <[email protected]>; Tom Lane <[email protected]>; Matthias van de
> Meent <[email protected]>; pgsql-
> [email protected]
> Subject: Re: Popcount optimization using AVX512
>
> Which test suite did you run? Those numbers seem potentially
> indistinguishable from noise, which probably isn't great for such a large
> patch set.
I ran...
psql -c "select bitcount(column) from table;"
...in a loop with "column" widths of 84, 4096, 8192, and 16384 containing
random data. The DB has 1 million rows. In the loop, before calling the
select, I have code to clear all system caches. If I omit the code to clear
system caches, the margin of error remains the same but the improvement
changes from 1.2% to 14.6% (much less I/O when cached data is available).
> I ran John Naylor's test_popcount module [0] with the following command on
> an i7-1195G7:
>
> time psql postgres -c 'select drive_popcount(10000000, 1024)'
>
> Without your patches, this seems to take somewhere around 8.8 seconds.
> With your patches, it takes 0.6 seconds. (I re-compiled and re-ran the tests
> a couple of times because I had a difficult time believing the amount of
> improvement.)
When I tested the code outside Postgres in a micro benchmark, I got 200-300%
improvements. Your results are interesting, as they imply more than a 300%
improvement. Let me do some research on the benchmark you referenced. However,
in all cases it seems that there is no regression, so should we move forward
on merging while I run some more local tests?
Thanks,
Paul