Re: support for SSE2 intrinsics
On Thu, Aug 4, 2022 at 12:38 PM Masahiko Sawada wrote: > > I also think it's a good start. There is a typo in the commit message: > > s/hepler/helper/ > > The rest looks good to me. Fixed, and pushed, thanks to you both! I'll polish a small patch I have that actually uses this. -- John Naylor EDB: http://www.enterprisedb.com
Re: support for SSE2 intrinsics
Hi, On Wed, Aug 3, 2022 at 2:01 PM John Naylor wrote: > > > On Tue, Aug 2, 2022 at 11:53 PM Nathan Bossart > wrote: > > I did a bit of cross-checking, and AFAICT this is a reasonable starting > > point. emmintrin.h appears to be sufficient for one of my patches that > > makes use of SSE2 instructions. That being said, I imagine it'll be > > especially important to keep an eye on the buildfarm when this change is > > committed. > > Thanks for checking! Here's a concrete patch for testing. I also think it's a good start. There is a typo in the commit message: s/hepler/helper/ The rest looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: support for SSE2 intrinsics
On Wed, Aug 03, 2022 at 12:00:39PM +0700, John Naylor wrote: > Thanks for checking! Here's a concrete patch for testing. LGTM -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: support for SSE2 intrinsics
On Tue, Aug 2, 2022 at 11:53 PM Nathan Bossart wrote: > I did a bit of cross-checking, and AFAICT this is a reasonable starting > point. emmintrin.h appears to be sufficient for one of my patches that > makes use of SSE2 instructions. That being said, I imagine it'll be > especially important to keep an eye on the buildfarm when this change is > committed. Thanks for checking! Here's a concrete patch for testing. -- John Naylor EDB: http://www.enterprisedb.com From 0e01cc8f5f1c2c27631953091a1d657c5e5486be Mon Sep 17 00:00:00 2001 From: John Naylor Date: Wed, 3 Aug 2022 11:07:40 +0700 Subject: [PATCH v1] Support SSE2 intrinsics where available SSE2 vector instructions are part of the spec for the 64-bit x86 architecture. Until now we have relied on the compiler to autovectorize in some limited situations, but some useful coding idioms can only be expressed explicitly via compiler intrinsics. To this end, add a header that defines USE_SSE2 when available. While x86-only for now, we can add other architectures in the future. This will also be the intended place for low-level hepler functions that use vector operations. Reviewed by Nathan Bossart Discussion: https://www.postgresql.org/message-id/CAFBsxsE2G_H_5Wbw%2BNOPm70-BK4xxKf86-mRzY%3DL2sLoQqM%2B-Q%40mail.gmail.com --- src/include/port/simd.h | 30 ++ 1 file changed, 30 insertions(+) create mode 100644 src/include/port/simd.h diff --git a/src/include/port/simd.h b/src/include/port/simd.h new file mode 100644 index 00..a571e79f57 --- /dev/null +++ b/src/include/port/simd.h @@ -0,0 +1,30 @@ +/*- + * + * simd.h + * Support for platform-specific vector operations. + * + * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/port/simd.h + * + *- + */ +#ifndef SIMD_H +#define SIMD_H + +/* + * SSE2 instructions are part of the spec for the 64-bit x86 ISA. We assume + * that compilers targeting this architecture understand SSE2 intrinsics. + * + * We use emmintrin.h rather than the comprehensive header immintrin.h in + * order to exclude extensions beyond SSE2. This is because MSVC, at least, + * will allow the use of intrinsics that haven't been enabled at compile + * time. + */ +#if (defined(__x86_64__) || defined(_M_AMD64)) +#include +#define USE_SSE2 +#endif + +#endif /* SIMD_H */ -- 2.36.1
Re: support for SSE2 intrinsics
On Tue, Aug 02, 2022 at 05:22:52PM +0700, John Naylor wrote: > Given all this, the anti-climax is: it seems we can start with something > like src/include/port/simd.h with: > > #if (defined(__x86_64__) || defined(_M_AMD64)) > #include > #define USE_SSE2 > #endif > > (plus a comment summarizing the above) > > That we can include into other files, and would be the place to put helper > functions. Thoughts? +1 I did a bit of cross-checking, and AFAICT this is a reasonable starting point. emmintrin.h appears to be sufficient for one of my patches that makes use of SSE2 instructions. That being said, I imagine it'll be especially important to keep an eye on the buildfarm when this change is committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
support for SSE2 intrinsics
Recently there have been several threads where the problem at hand lends itself to using SSE2 SIMD intrinsics. These are convenient because on 64-bit x86 the instructions are always present and so don't need a runtime check. To integrate them into our code base, we will need to take some measures for portability, but after looking around it seems fairly lightweight: 1. Compiler invocation and symbols Since SSE2 is part of the AMD64 spec, gcc enables it always: $ gcc -dM -E- < /dev/null | grep SSE | sort $ gcc -dM -E -msse2 - < /dev/null | grep SSE | sort #define __MMX_WITH_SSE__ 1 #define __SSE__ 1 #define __SSE2__ 1 #define __SSE2_MATH__ 1 #define __SSE_MATH__ 1 Passing -m32 discards the "MATH" macros but keeps the rest: $ gcc -dM -E -m32 - < /dev/null | grep SSE | sort #define __SSE__ 1 #define __SSE2__ 1 Clang behaves similarly. MSVC doesn't define __SSE2__ (although it does define __AVX__ etc), but we can just test for _M_X64 or _M_AMD64 (they are equivalent according to [1], and we have both in our code base already). We could test for __SSE2__ for 32-bit gcc-alikes in the build farm, but I don't think that would tell us anything interesting, so we can just test for __x86_64__. 2. The intrinsics header >From Peter Cordes on StackOverflow [2]: ``` immintrin.h is portable across all compilers, and includes all Intel SIMD intrinsics, and some scalar extensions like BMI2 _pdep_u32. (For AMD SSE4a and XOP (Bulldozer-family only, dropped for Zen), you need to include a different header as well.) The only reason I can think of for including specifically would be if you're using MSVC and want to leave intrinsics undefined for ISA extensions you don't want to depend on. ``` It seems then that MSVC will compile intrinsics without prompting, so to be safe we'd need to take the latter advice and use . 3. Support for SSE2 intrinsics This seems to be well-nigh universal AFAICT and doesn't need to be tested for at configure time. A quick search doesn't turn up anything weird for Msys or Cygwin. From [2] again, gcc older than 4.4 can generate poor code, but there is no mention that correctness is a problem. 4. Helper functions In a couple proposed patches, there has been some interest in abstracting some SIMD functionality into functions to hide implementation details away. I agree there are cases where that would help readability and avoid duplication. Given all this, the anti-climax is: it seems we can start with something like src/include/port/simd.h with: #if (defined(__x86_64__) || defined(_M_AMD64)) #include #define USE_SSE2 #endif (plus a comment summarizing the above) That we can include into other files, and would be the place to put helper functions. Thoughts? [1] https://docs.microsoft.com/en-us/archive/blogs/reiley/macro-revisited [2] https://stackoverflow.com/questions/56049110/including-the-correct-intrinsic-header -- John Naylor EDB: http://www.enterprisedb.com