RE: Popcount optimization using AVX512

Amonson, Paul D Fri, 15 Mar 2024 08:31:47 -0700

> -----Original Message-----
> From: Nathan Bossart <[email protected]>
> Sent: Friday, March 15, 2024 8:06 AM
> To: Amonson, Paul D <[email protected]>
> Cc: Andres Freund <[email protected]>; Alvaro Herrera <[email protected]
> ip.org>; Shankaran, Akash <[email protected]>; Noah Misch
> <[email protected]>; Tom Lane <[email protected]>; Matthias van de
> Meent <[email protected]>; pgsql-
> [email protected]
> Subject: Re: Popcount optimization using AVX512
> 
> Which test suite did you run?  Those numbers seem potentially
> indistinguishable from noise, which probably isn't great for such a large
> patch set.


I ran...
        psql -c "select bitcount(column) from table;"
...in a loop with "column" widths of 84, 4096, 8192, and 16384 containing
random data. The DB has 1 million rows. In the loop, before calling the
select, I have code to clear all system caches. If I omit the code to clear
system caches, the margin of error remains the same but the improvement
changes from 1.2% to 14.6% (much less I/O when cached data is available).
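
A measurement loop of this kind could be sketched roughly as below. This is only an illustration and not the script used for the numbers above: `time_runs`, `summarize`, and the `run_query` / `clear_caches` callables are hypothetical names, and how the caches are cleared is left entirely to the caller.

```python
import statistics
import time


def time_runs(run_query, clear_caches=None, repeats=10):
    """Time run_query() `repeats` times and return the samples in seconds.

    run_query and clear_caches are hypothetical callables supplied by the
    caller, e.g. a wrapper that shells out to psql. Cache clearing happens
    outside the timed region.
    """
    samples = []
    for _ in range(repeats):
        if clear_caches is not None:
            clear_caches()
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of the timings."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, stdev
```

Comparing the difference between the baseline and patched means against the standard deviations gives a rough sense of whether a small improvement is distinguishable from noise.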

> I ran John Naylor's test_popcount module [0] with the following command on
> an i7-1195G7:
> 
>       time psql postgres -c 'select drive_popcount(10000000, 1024)'
> 
> Without your patches, this seems to take somewhere around 8.8 seconds.
> With your patches, it takes 0.6 seconds.  (I re-compiled and re-ran the tests
> a couple of times because I had a difficult time believing the amount of
> improvement.)

When I tested the code outside postgres in a micro benchmark, I got 200-300%
improvements. Your results are interesting, as they imply more than a 300%
improvement. Let me do some research on the benchmark you referenced. However,
in all cases it seems that there is no regression, so should we move forward on
merging while I run some more local tests?
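
For reference, the quoted timings can be converted into comparable figures with a few lines of arithmetic. The helper names below are hypothetical, and "percent improvement" is taken here to mean throughput gain, which is an assumption about how the percentages in this thread are defined.

```python
def speedup_factor(before_s, after_s):
    """How many times faster the patched run is."""
    return before_s / after_s


def percent_improvement(before_s, after_s):
    """Throughput gain in percent (assumed definition): 100% = twice as fast."""
    return (before_s / after_s - 1.0) * 100.0


def percent_time_saved(before_s, after_s):
    """Reduction in wall-clock time, in percent of the baseline."""
    return (1.0 - after_s / before_s) * 100.0


# Timings quoted above for drive_popcount(10000000, 1024), in seconds.
before, after = 8.8, 0.6
print(f"speedup:     {speedup_factor(before, after):.1f}x")
print(f"improvement: {percent_improvement(before, after):.0f}%")
print(f"time saved:  {percent_time_saved(before, after):.1f}%")
```

Under that definition, 8.8 seconds down to 0.6 seconds is roughly a 14.7x speedup, well beyond the 200-300% seen in the standalone micro benchmark.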

Thanks,
Paul
