Stuart Henderson <s...@spacehopper.org> wrote: > On 2024/02/18 13:58, Evan Silberman wrote: > > Stuart Henderson <s...@spacehopper.org> wrote: > > > This is probably worth a try. I've asked if upstream can update it > > > > Hi Stuart, > > > > Unclear if necessary, but not sufficient. Turns out there's one more > > unconditional AVX512 check in text's own C code, cbits/measure_off.c: > > > > #if !((defined(__apple_build_version__) && __apple_build_version__ <= > > 10001145) \ > > || (defined(__clang_major__) && __clang_major__ <= 6)) && > > !defined(__STDC_NO_ATOMICS__) > > #define COMPILER_SUPPORTS_AVX512 > > #endif > > > > Undefing this gets me good results: > > > > text $ cabal repl > > Build profile: -w ghc-9.2.7 -O1 > > In order, the following will be built (use -v for more details): > > - text-2.0.2 (lib) (first run) > > Preprocessing library for text-2.0.2.. > > GHCi, version 9.2.7: https://www.haskell.org/ghc/ :? for help > > [ 1 of 46] Compiling Data.Text.Array ( src/Data/Text/Array.hs, interpreted > > ) > > [ snip ] > > [46 of 46] Compiling Data.Text.Lazy.IO ( src/Data/Text/Lazy/IO.hs, > > interpreted ) > > Ok, 46 modules loaded. > > ghci> take 1 $ pack "AA" > > "A" > > > > whew. > > > > I _think_ the right patch for lang/ghc looks like this but I haven't > > tested this exact thing in situ. > > > > blob - /dev/null > > blob + 282c4467d3eed9211a7b3c505248569d342863b4 (mode 644) > > --- /dev/null > > +++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c > > @@ -0,0 +1,13 @@ > > +Disable AVX512 instructions, not supported on OpenBSD > > +Index: libraries/text/cbits/measure_off.c > > +--- libraries/text/cbits/measure_off.c.orig > > ++++ libraries/text/cbits/measure_off.c > > +@@ -34,7 +34,7 @@ > > + Disable AVX-512 instructions as they are most likely not supported > > + on the hardware running clang-6. > > + */ > > +-#if !((defined(__apple_build_version__) && __apple_build_version__ <= > > 10001145) \ > > ++#if !defined(__OpenBSD__) && !((defined(__apple_build_version__) && > > __apple_build_version__ <= 10001145) \ > > + || (defined(__clang_major__) && __clang_major__ <= 6)) && > > !defined(__STDC_NO_ATOMICS__) > > + #define COMPILER_SUPPORTS_AVX512 > > + #endif > > > > That's not the right check, it should test whether it works rather than > what the OS is.
Something like this? I'm out of my depth and heavily pattern-matching against the fix to simdutf and other references. Genuinely no idea if I'm using inline asm correctly, etc. Works on my machine, however. --- /dev/null +++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c @@ -0,0 +1,22 @@ +Don't attempt to use avx512 kernels when the OS doesn't support them +Index: libraries/text/cbits/measure_off.c +--- libraries/text/cbits/measure_off.c.orig ++++ libraries/text/cbits/measure_off.c +@@ -44,12 +44,16 @@ + bool has_avx512_vl_bw() { + #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__) + uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0; ++ uint64_t xcr0; + __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx); ++ __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0)); + // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features + const bool has_avx512_bw = ebx & (1 << 30); + const bool has_avx512_vl = ebx & (1 << 31); ++ // XCR0 bits 5, 6, and 7 ++ const bool avx512_os_enabled = (xcr0 & 0xE0) == 0xE0; + // printf("cpuid=%d=cpuid\n", has_avx512_bw && has_avx512_vl); +- return has_avx512_bw && has_avx512_vl; ++ return has_avx512_bw && has_avx512_vl && avx512_os_enabled; + #else + return false; + #endif