RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-05-17 Thread Amonson, Paul D
Hi, forgive the top-post but I have not seen any response to this post? Thanks, Paul > -Original Message- > From: Amonson, Paul D > Sent: Wednesday, May 1, 2024 8:56 AM > To: pgsql-hackers@lists.postgresql.org > Cc: Nathan Bossart ; Shankaran, Akash > > Subject:

Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-05-01 Thread Amonson, Paul D
Hi, Comparing the current SSE4.2 implementation of the CRC32C algorithm in Postgres to an optimized AVX-512 algorithm [0], we observed significant gains: an average speedup of ~6.6X, measured on 3 different Intel products. Details below. The AVX-512

RE: Popcount optimization using AVX512

2024-03-29 Thread Amonson, Paul D
> A counterexample is the CRC32C code. AFAICT we assume the presence of > CPUID in that code (and #error otherwise). I imagine it's probably safe to > assume the compiler understands CPUID if it understands AVX512 intrinsics, > but that is still mostly a guess. If AVX-512 intrinsics are

RE: Popcount optimization using AVX512

2024-03-29 Thread Amonson, Paul D
> On Thu, Mar 28, 2024 at 11:10:33PM +0100, Alvaro Herrera wrote: > > We don't do MSVC via autoconf/Make. We used to have a special build > > framework for MSVC which parsed Makefiles to produce "solution" files, > > but it was removed as soon as Meson was mature enough to build. See > > commit

RE: Popcount optimization using AVX512

2024-03-29 Thread Amonson, Paul D
> -Original Message- > > Cool. I think we should run the benchmarks again to be safe, though. Ok, sure go ahead. :) > >> I forgot to mention that I also want to understand whether we can > >> actually assume availability of XGETBV when CPUID says we support > >> AVX512: > > > > You

RE: Popcount optimization using AVX512

2024-03-28 Thread Amonson, Paul D
> -Original Message- > From: Amonson, Paul D > Sent: Thursday, March 28, 2024 3:03 PM > To: Nathan Bossart > ... > I will review the new patch to see if there is anything that jumps out at me. I see in the meson.build you added the new file twice? @@ -7,6 +7,7

RE: Popcount optimization using AVX512

2024-03-28 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Thursday, March 28, 2024 2:39 PM > To: Amonson, Paul D > > * The latest patch set from Paul Amonson appeared to support MSVC in the > meson build, but not the autoconf one. I don't have much expertise here,

RE: Popcount optimization using AVX512

2024-03-27 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Wednesday, March 27, 2024 3:00 PM > To: Amonson, Paul D > > ... (I realize that I'm essentially > recanting much of my previous feedback, which I apologize for.) It happens. LOL As long as the algorithm for AVX

RE: Popcount optimization using AVX512

2024-03-25 Thread Amonson, Paul D
> -Original Message- > From: Amonson, Paul D > Sent: Monday, March 25, 2024 8:20 AM > To: Tom Lane > Cc: David Rowley ; Nathan Bossart > ; Andres Freund ; Alvaro > Herrera ; Shankaran, Akash > ; Noah Misch ; Matthias > van de Meent ; pgsql- > hack...@list

RE: Popcount optimization using AVX512

2024-03-25 Thread Amonson, Paul D
> -Original Message- > From: Tom Lane > Sent: Monday, March 25, 2024 8:12 AM > To: Amonson, Paul D > Cc: David Rowley ; Nathan Bossart > Subject: Re: Popcount optimization using AVX512 >... > Just for a note --- the cfbot will re-test existing patches every so of

RE: Popcount optimization using AVX512

2024-03-25 Thread Amonson, Paul D
> -Original Message- > From: Amonson, Paul D > Sent: Thursday, March 21, 2024 12:18 PM > To: David Rowley > Cc: Nathan Bossart ; Andres Freund I am re-posting the patches as CI for Mac failed (CI error not code/test error). The patches are the same as last time. Than

RE: Popcount optimization using AVX512

2024-03-21 Thread Amonson, Paul D
> -Original Message- > From: David Rowley > Sent: Wednesday, March 20, 2024 5:28 PM > To: Amonson, Paul D > Cc: Nathan Bossart ; Andres Freund > > I'm not sure about this "extern negates inline" comment. It seems to me the > compiler is perfectl

RE: Popcount optimization using AVX512

2024-03-20 Thread Amonson, Paul D
> -Original Message- > From: David Rowley > Sent: Tuesday, March 19, 2024 9:26 PM > To: Amonson, Paul D > > AMD's Zen4 also has AVX512, so it's misleading to indicate it's an Intel only > instruction. Also, writing the date isn't necessary as we have "git bl

RE: Popcount optimization using AVX512

2024-03-19 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > > Committed. Thanks for the suggestion and for reviewing! > > Paul, I suspect your patches will need to be rebased after commit cc4826d. > Would you mind doing so? Changed in this patch set. * Rebased. * Direct *slow* calls via macros as

RE: Popcount optimization using AVX512

2024-03-18 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Monday, March 18, 2024 2:08 PM > To: David Rowley > Cc: Amonson, Paul D ; Andres Freund >... > > The only reason I left it out was because I couldn't convince myself that it > wasn't dead code, given we assum

RE: Popcount optimization using AVX512

2024-03-18 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Monday, March 18, 2024 9:20 AM > ... > I don't think David was suggesting that we need to remove the runtime checks > for AVX512. IIUC he was pointing out that most of the performance gain is > from removing the function call overhead,

RE: Popcount optimization using AVX512

2024-03-18 Thread Amonson, Paul D
Bossart > Sent: Monday, March 18, 2024 8:29 AM > To: David Rowley > Cc: Amonson, Paul D ; Andres Freund > ; Alvaro Herrera ; Shankaran, > Akash ; Noah Misch ; > Tom Lane ; Matthias van de Meent > ; pgsql-hackers@lists.postgresql.org > Subject: Re: Popcount optimization using A

RE: Popcount optimization using AVX512

2024-03-15 Thread Amonson, Paul D
> -Original Message- > From: Amonson, Paul D > Sent: Friday, March 15, 2024 8:31 AM > To: Nathan Bossart ... > When I tested the code outside postgres in a micro benchmark I got 200- > 300% improvements. Your results are interesting, as it implies more than > 300% i

RE: Popcount optimization using AVX512

2024-03-15 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Friday, March 15, 2024 8:06 AM > To: Amonson, Paul D > Cc: Andres Freund ; Alvaro Herrera ; Shankaran, Akash ; Noah Misch > ; Tom Lane ; Matthias van de > Meent ; pgsql- > hack...@lists.postgresql.

RE: Popcount optimization using AVX512

2024-03-14 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Monday, March 11, 2024 6:35 PM > To: Amonson, Paul D > Thanks. There's no need to wait to post the AVX portion. I recommend using > "git format-patch" to construct the patch set for the lists. Afte

RE: Popcount optimization using AVX512

2024-03-13 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Wednesday, March 13, 2024 9:39 AM > To: Amonson, Paul D > +extern int pg_popcount32_slow(uint32 word); extern int > +pg_popcount64_slow(uint64 word); > > +/* In pg_popcnt_*_accel source file. */ extern i

RE: Popcount optimization using AVX512

2024-03-11 Thread Amonson, Paul D
> -Original Message- > From: Nathan Bossart > Sent: Thursday, March 7, 2024 1:36 PM > Subject: Re: Popcount optimization using AVX512 I will be splitting the request into 2 patches. I am attaching the first patch (refactoring only) and I updated the commitfest entry to match this patch.

RE: Popcount optimization using AVX512

2024-03-05 Thread Amonson, Paul D
-Original Message- >From: Nathan Bossart >Sent: Tuesday, March 5, 2024 8:38 AM >To: Amonson, Paul D >Cc: Andres Freund ; Alvaro Herrera >; Shankaran, Akash ; Noah >Misch ; Tom Lane ; Matthias van de >Meent ; >pgsql-hackers@lists.postgresql.org >Subject

RE: Popcount optimization using AVX512

2024-03-05 Thread Amonson, Paul D
ould apply and build. It succeeded. Thanks, Paul -Original Message- From: Nathan Bossart Sent: Monday, March 4, 2024 2:21 PM To: Amonson, Paul D Cc: Andres Freund ; Alvaro Herrera ; Shankaran, Akash ; Noah Misch ; Tom Lane ; Matthias van de Meent ; pgsql-hackers@lists.postgresql.org S

RE: Popcount optimization using AVX512

2024-03-04 Thread Amonson, Paul D
be picked up by a committer, given it has been reviewed by multiple committers so far? The scope of the change is pretty contained as well. [0] https://wiki.postgresql.org/wiki/Submitting_a_Patch Thanks, Paul -Original Message- From: Nathan Bossart Sent: Friday, March 1, 2024 1:45 P

RE: Popcount optimization using AVX512

2024-02-27 Thread Amonson, Paul D
. Both meson and autoconf are updated with the new refactor. I am attaching the new patch. Paul -Original Message- From: Amonson, Paul D Sent: Monday, February 26, 2024 9:57 AM To: Amonson, Paul D ; Andres Freund Cc: Alvaro Herrera ; Shankaran, Akash ; Nathan Bossart ; Noah Misch

RE: Popcount optimization using AVX512

2024-02-26 Thread Amonson, Paul D
someone with Windows/MSVC experience help me? * Code: https://github.com/paul-amonson/postgresql/tree/popcnt_patch * CI build: https://cirrus-ci.com/task/4927666021728256 Thanks, Paul -Original Message- From: Amonson, Paul D Sent: Wednesday, February 21, 2024 9:36 AM To: Andres Freund

RE: Popcount optimization using AVX512

2024-02-21 Thread Amonson, Paul D
://cirrus-ci.com/task/4927666021728256. Thanks, Paul -Original Message- From: Andres Freund Sent: Monday, February 12, 2024 12:37 PM To: Amonson, Paul D Cc: Alvaro Herrera ; Shankaran, Akash ; Nathan Bossart ; Noah Misch ; Tom Lane ; Matthias van de Meent ; pgsql-hackers

RE: Popcount optimization using AVX512

2024-02-12 Thread Amonson, Paul D
expert in meson, but splitting might add complexity to meson.build. Could you elaborate if there are other benefits to the split file approach? Paul -Original Message- From: Andres Freund Sent: Friday, February 9, 2024 10:35 AM To: Amonson, Paul D Cc: Alvaro Herrera ; Shankaran, Akash

RE: Popcount optimization using AVX512

2024-02-09 Thread Amonson, Paul D
by the OS or hypervisor even if the CPU supports AVX512. The big change is adding all old and new build support to meson. I am new to meson/ninja so please review carefully. Thanks, Paul -Original Message- From: Alvaro Herrera Sent: Wednesday, February 7, 2024 2:13 AM To: Amonson, Paul D Cc
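
To illustrate the point about the OS or hypervisor disabling AVX512 even when the CPU supports it, here is a minimal sketch in C of the decision logic. It is not the code from the patch. The function name and macro names are hypothetical, and the caller is assumed to have already read the raw register values (CPUID leaf 1 ECX, CPUID leaf 7 subleaf 0 EBX, and XCR0 via XGETBV). The bit positions follow the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions as documented in the Intel SDM. */
#define CPUID1_ECX_OSXSAVE  (UINT32_C(1) << 27) /* OS has enabled XSAVE/XGETBV */
#define CPUID7_EBX_AVX512F  (UINT32_C(1) << 16) /* CPU implements AVX-512 Foundation */

#define XCR0_SSE        (UINT64_C(1) << 1)  /* XMM state */
#define XCR0_AVX        (UINT64_C(1) << 2)  /* upper halves of YMM */
#define XCR0_OPMASK     (UINT64_C(1) << 5)  /* k0-k7 mask registers */
#define XCR0_ZMM_HI256  (UINT64_C(1) << 6)  /* upper halves of ZMM0-15 */
#define XCR0_HI16_ZMM   (UINT64_C(1) << 7)  /* ZMM16-31 */

#define XCR0_AVX512_STATE \
    (XCR0_SSE | XCR0_AVX | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)

/*
 * Return true only if the CPU reports AVX-512F AND the OS (or hypervisor)
 * has enabled saving of all the register state AVX-512 needs.
 */
bool
avx512_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx, uint64_t xcr0)
{
    /* Without OSXSAVE, XGETBV is not available and XCR0 is meaningless. */
    if ((cpuid1_ecx & CPUID1_ECX_OSXSAVE) == 0)
        return false;

    /* The CPU itself must implement AVX-512 Foundation. */
    if ((cpuid7_ebx & CPUID7_EBX_AVX512F) == 0)
        return false;

    /* Every required state component must be enabled in XCR0. */
    return (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
}
```

Keeping the check as a pure function of the three register values means it can be exercised without AVX-512 hardware. A real runtime check would also test the specific feature bit for the instruction being used, not only AVX-512F.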

RE: Popcount optimization using AVX512

2024-02-06 Thread Amonson, Paul D
1:49 AM To: Shankaran, Akash Cc: Nathan Bossart ; Noah Misch ; Amonson, Paul D ; Tom Lane ; Matthias van de Meent ; pgsql-hackers@lists.postgresql.org Subject: Re: Popcount optimization using AVX512 On 2024-Jan-25, Shankaran, Akash wrote: > With the updated patch, we observed significant im