Stuart Henderson <s...@spacehopper.org> wrote:
> On 2024/02/18 13:58, Evan Silberman wrote:
> > Stuart Henderson <s...@spacehopper.org> wrote:
> > > This is probably worth a try. I've asked if upstream can update it
> > 
> > Hi Stuart,
> > 
> > Unclear if necessary, but not sufficient. Turns out there's one more
> > unconditional AVX512 check in text's own C code, cbits/measure_off.c:
> > 
> > #if !((defined(__apple_build_version__) && __apple_build_version__ <= 
> > 10001145) \
> >       || (defined(__clang_major__) && __clang_major__ <= 6)) && 
> > !defined(__STDC_NO_ATOMICS__)
> > #define COMPILER_SUPPORTS_AVX512
> > #endif
> > 
> > Undefing this gets me good results:
> > 
> > text $ cabal repl
> > Build profile: -w ghc-9.2.7 -O1
> > In order, the following will be built (use -v for more details):
> >  - text-2.0.2 (lib) (first run)
> > Preprocessing library for text-2.0.2..
> > GHCi, version 9.2.7: https://www.haskell.org/ghc/  :? for help
> > [ 1 of 46] Compiling Data.Text.Array  ( src/Data/Text/Array.hs, interpreted 
> > )
> >  [ snip ]
> >  [46 of 46] Compiling Data.Text.Lazy.IO ( src/Data/Text/Lazy/IO.hs, 
> > interpreted )
> > Ok, 46 modules loaded.
> > ghci> take 1 $ pack "AA"
> > "A"
> > 
> > whew.
> > 
> > I _think_ the right patch for lang/ghc looks like this but I haven't
> > tested this exact thing in situ.
> > 
> > blob - /dev/null
> > blob + 282c4467d3eed9211a7b3c505248569d342863b4 (mode 644)
> > --- /dev/null
> > +++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c
> > @@ -0,0 +1,13 @@
> > +Disable AVX512 instructions, not supported on OpenBSD
> > +Index: libraries/text/cbits/measure_off.c
> > +--- libraries/text/cbits/measure_off.c.orig
> > ++++ libraries/text/cbits/measure_off.c
> > +@@ -34,7 +34,7 @@
> > +   Disable AVX-512 instructions as they are most likely not supported
> > +   on the hardware running clang-6.
> > + */
> > +-#if !((defined(__apple_build_version__) && __apple_build_version__ <= 
> > 10001145) \
> > ++#if !defined(__OpenBSD__) && !((defined(__apple_build_version__) && 
> > __apple_build_version__ <= 10001145) \
> > +       || (defined(__clang_major__) && __clang_major__ <= 6)) && 
> > !defined(__STDC_NO_ATOMICS__)
> > + #define COMPILER_SUPPORTS_AVX512
> > + #endif
> > 
> 
> That's not the right check, it should test whether it works rather than
> what the OS is.

Something like this? I'm out of my depth and heavily pattern-matching
against the fix to simdutf and other references. Genuinely no idea if
I'm using inline asm correctly, etc. Works on my machine, however.

--- /dev/null
+++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c
@@ -0,0 +1,22 @@
+Don't attempt to use avx512 kernels when the OS doesn't support them
+Index: libraries/text/cbits/measure_off.c
+--- libraries/text/cbits/measure_off.c.orig
++++ libraries/text/cbits/measure_off.c
+@@ -44,12 +44,16 @@
+ bool has_avx512_vl_bw() {
+ #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || 
defined(__clang_major__)
+   uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
++  uint64_t xcr0;
+   __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
++  __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0));
+   // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
+   const bool has_avx512_bw = ebx & (1 << 30);
+   const bool has_avx512_vl = ebx & (1 << 31);
++  // XCR0 bits 5, 6, and 7
++  const bool avx512_os_enabled = (xcr0 & 0xE0) == 0xE0;
+   // printf("cpuid=%d=cpuid\n", has_avx512_bw && has_avx512_vl);
+-  return has_avx512_bw && has_avx512_vl;
++  return has_avx512_bw && has_avx512_vl && avx512_os_enabled;
+ #else
+   return false;
+ #endif

Reply via email to