Re: -current Haskell ports aborting with SIGILL
James Cook writes:

> On Mon, Apr 22, 2024 at 11:21:36AM GMT, Greg Steuck wrote:
>> James, if you'd like to play with this on -current, please remove both
>> patch-libraries_text_simdutf_simdutf_h and
>> patch-libraries_text_cbits_measure_off_c
>>
>> This should make the offending check disappear and the pre-existing
>> checks should correctly report the OS doesn't support these instructions
>> on your machine. We can then confirm that avx-512 is working for people
>> who previously had problems.
>
> It works. I upgraded to -current and built ghc (now version 9.6.5)
> after deleting patches/patch-libraries_text_cbits_measure_off_c and
> patches/patch-libraries_text_simdutf_simdutf_h. All three examples
> now work (i.e. no SIGILL):

Excellent, thanks James!

Antoine, would you be willing to repeat this on an avx512 machine so
that we can fix this in -current?

>
> - T.take 1 $ T.pack "aa" in ghci
> - pandoc
> - cabal-bundler
>
> This is on the old AMD Phenom machine.
>
> If you're interested in fixing it on 7.5 (I don't know how much
> effort patching -stable ports is) please let me know if there's any
> more testing that would be helpful for the patch I sent earlier
> that makes has_avx512_vl_bw unconditionally return false. I can
> still boot this machine to 7.5 with a USB stick.

I'd be hesitant to do more work on this as you are already unblocked. I
suspect the number of people who use ghc-built packages on such CPUs is
low enough in our user base that the effort is not likely to be well
spent.

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
On Mon, Apr 22, 2024 at 11:21:36AM GMT, Greg Steuck wrote:
> Stuart Henderson writes:
>
> > On 2024/04/22 10:30, Greg Steuck wrote:
> >> > If it would help, I could update my old AMD machine to -current
> >> > and check ghc works with the two patches removed, once I've finished
> >> > trying out the patch I just sent for 7.5.
> >>
> >> Thanks James for working through this. Yes, we need the new development
> >> to happen on -current ports with -current base system. We'd also want a
> >> more complicated patch than the one you just sent because base supports
> >> avx-512 now.
> >
> > We probably don't need any patches for this in -current now that avx-512
> > opcodes are supported by the OS.
>
> James, if you'd like to play with this on -current, please remove both
> patch-libraries_text_simdutf_simdutf_h and
> patch-libraries_text_cbits_measure_off_c
>
> This should make the offending check disappear and the pre-existing
> checks should correctly report the OS doesn't support these instructions
> on your machine. We can then confirm that avx-512 is working for people
> who previously had problems.
>
> Thanks
> Greg

It works. I upgraded to -current and built ghc (now version 9.6.5)
after deleting patches/patch-libraries_text_cbits_measure_off_c and
patches/patch-libraries_text_simdutf_simdutf_h. All three examples
now work (i.e. no SIGILL):

- T.take 1 $ T.pack "aa" in ghci
- pandoc
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson writes:

> On 2024/04/22 10:30, Greg Steuck wrote:
>> > If it would help, I could update my old AMD machine to -current
>> > and check ghc works with the two patches removed, once I've finished
>> > trying out the patch I just sent for 7.5.
>>
>> Thanks James for working through this. Yes, we need the new development
>> to happen on -current ports with -current base system. We'd also want a
>> more complicated patch than the one you just sent because base supports
>> avx-512 now.
>
> We probably don't need any patches for this in -current now that avx-512
> opcodes are supported by the OS.

James, if you'd like to play with this on -current, please remove both
patch-libraries_text_simdutf_simdutf_h and
patch-libraries_text_cbits_measure_off_c

This should make the offending check disappear and the pre-existing
checks should correctly report the OS doesn't support these instructions
on your machine. We can then confirm that avx-512 is working for people
who previously had problems.

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
On 2024/04/22 10:30, Greg Steuck wrote:
> > If it would help, I could update my old AMD machine to -current
> > and check ghc works with the two patches removed, once I've finished
> > trying out the patch I just sent for 7.5.
>
> Thanks James for working through this. Yes, we need the new development
> to happen on -current ports with -current base system. We'd also want a
> more complicated patch than the one you just sent because base supports
> avx-512 now.

We probably don't need any patches for this in -current now that avx-512
opcodes are supported by the OS.
Re: -current Haskell ports aborting with SIGILL
James Cook writes:

> The line you linked to comes after a check for cpuid_bit::osxsave,
> so I don't think it would get reached on machines that don't have
> xgetbv, i.e. it should be fine.

Cool, so maybe we need a patch which does this? I also just noticed that
you patched libraries/text/cbits/measure_off.c whereas I was looking at
libraries/text/simdutf/simdutf.h hence my confusion. The latter is
shipped disabled, so we don't have to keep patching it:

https://gitlab.haskell.org/ghc/ghc/-/blob/master/hadrian/src/Settings/Packages.hs#L197

> Similarly, the existing patch-libraries_text_simdutf_simdutf_h in
> ports doesn't seem to cause a problem with my AMD machine. At least,
> now I have pandoc working; not sure if I actually exercised that code
> in simdutf.h.

Yeah, looks like this is dead code from our POV.

> If it would help, I could update my old AMD machine to -current
> and check ghc works with the two patches removed, once I've finished
> trying out the patch I just sent for 7.5.

Thanks James for working through this. Yes, we need the new development
to happen on -current ports with -current base system. We'd also want a
more complicated patch than the one you just sent because base supports
avx-512 now.

https://codeberg.org/openbsd/src/commit/c0f33c9875c4ab47e986b698610630b6cbf21c6c

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
On Sun, Apr 21, 2024 at 09:03:30PM +, James Cook wrote:
> On Sun, Apr 21, 2024 at 11:49:46AM -0700, Greg Steuck wrote:
> > Stuart Henderson writes:
> >
> > > This is in the avx512 checks in the text library again, I think it must
> > > be patch-libraries_text_cbits_measure_off_c (the simdutf one doesn't
> > > explicitly check for xgetbv but it does check for osxsave so I think
> > > wouldn't have executed the xgetbv opcode on this cpu).
> > >
> > > As -current does now have avx512 support in the kernel we probably
> > > should be able to remove that patch, but it needs testing on an avx512
> > > machine as well as that old Phenom.
> >
> > Sadly I have neither nearby. Furthermore, simdutf upstream doesn't have
> > a fix for this issue either. So this will have to be original work.
> >
> > https://github.com/simdutf/simdutf/blob/master/include/simdutf/internal/isadetection.h#L232
> >
> > Thanks
> > Greg
>
> The following patch fixes at least the ghci example on my AMD machine:
>
> $ ghci
> GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help
> ghci> import qualified Data.Text as T
> ghci> T.take 1 $ T.pack "aa"
> "a"

Confirmed that the following are also fixed, which were broken before
for me:

- cabal-bundler (I just invoked it with no arguments)
- building the pandoc port
- pandoc /dev/null

-- 
James
Re: -current Haskell ports aborting with SIGILL
On Sun, Apr 21, 2024 at 11:49:46AM -0700, Greg Steuck wrote:
> Stuart Henderson writes:
>
> > This is in the avx512 checks in the text library again, I think it must
> > be patch-libraries_text_cbits_measure_off_c (the simdutf one doesn't
> > explicitly check for xgetbv but it does check for osxsave so I think
> > wouldn't have executed the xgetbv opcode on this cpu).
> >
> > As -current does now have avx512 support in the kernel we probably
> > should be able to remove that patch, but it needs testing on an avx512
> > machine as well as that old Phenom.
>
> Sadly I have neither nearby. Furthermore, simdutf upstream doesn't have
> a fix for this issue either. So this will have to be original work.
>
> https://github.com/simdutf/simdutf/blob/master/include/simdutf/internal/isadetection.h#L232
>
> Thanks
> Greg

The line you linked to comes after a check for cpuid_bit::osxsave,
so I don't think it would get reached on machines that don't have
xgetbv, i.e. it should be fine.

Similarly, the existing patch-libraries_text_simdutf_simdutf_h in
ports doesn't seem to cause a problem with my AMD machine. At least,
now I have pandoc working; not sure if I actually exercised that code
in simdutf.h.

If it would help, I could update my old AMD machine to -current
and check ghc works with the two patches removed, once I've finished
trying out the patch I just sent for 7.5.

-- 
James
Re: -current Haskell ports aborting with SIGILL
On Sun, Apr 21, 2024 at 11:49:46AM -0700, Greg Steuck wrote: > Stuart Henderson writes: > > > This is in the avx512 checks in the text library again, I think it must > > be patch-libraries_text_cbits_measure_off_c (the simdutf one doesn't > > explicitly check for xgetbv but it does check for osxsave so I think > > wouldn't have executed the xgetbv opcode on this cpu). > > > > As -current does now have avx512 support in the kernel we probably > > should be able to remove that patch, but it needs testing on an avx512 > > machine as well as that old Phenom. > > Sadly I have neither nearby. Furthermore, sumdutf upstream doesn't have > a fix for this issue either. So this will have to be original work. > > https://github.com/simdutf/simdutf/blob/master/include/simdutf/internal/isadetection.h#L232 > > Thanks > Greg The following patch fixes at least the ghci example on my AMD machine: $ ghci GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help ghci> import qualified Data.Text as T ghci> T.take 1 $ T.pack "aa" "a" It simply patches has_avx512_vl_bw to always return false. Would this be a good candidate for the stable branch, which doesn't support AVX-512 anyway? I will try building pandoc now with the new ghc to see if it's fixed. (I'm a little fuzzy on how libraries are included... hopefully pandoc will take the patched library from the updated ghc? Or does it not work like that?) 
-- 
James

Index: Makefile
===
RCS file: /cvs/ports/lang/ghc/Makefile,v
diff -u -p -u -p -r1.223.2.1 Makefile
--- Makefile 21 Mar 2024 20:52:54 - 1.223.2.1
+++ Makefile 21 Apr 2024 20:59:59 -
@@ -14,7 +14,7 @@ USE_NOEXECONLY = Yes
 USE_NOBTCFI = Yes
 
 GHC_VERSION = 9.6.4
-REVISION = 2
+REVISION = 3
 DISTNAME = ghc-${GHC_VERSION}
 CATEGORIES = lang devel
 HOMEPAGE = https://www.haskell.org/ghc/
Index: patches/patch-libraries_text_cbits_measure_off_c
===
RCS file: /cvs/ports/lang/ghc/patches/patch-libraries_text_cbits_measure_off_c,v
diff -u -p -u -p -r1.2 patch-libraries_text_cbits_measure_off_c
--- patches/patch-libraries_text_cbits_measure_off_c 23 Feb 2024 19:45:04 - 1.2
+++ patches/patch-libraries_text_cbits_measure_off_c 21 Apr 2024 20:59:59 -
@@ -1,23 +1,24 @@
-Don't attempt to use avx512 kernels when the OS doesn't support them
+Don't attempt to use avx512 kernels, because OpenBSD 7.5 doesn't support them
 
 Index: libraries/text/cbits/measure_off.c
 --- libraries/text/cbits/measure_off.c.orig
 +++ libraries/text/cbits/measure_off.c
-@@ -44,12 +44,16 @@
+@@ -42,17 +42,8 @@
+ 
+ #if defined(__x86_64__) && defined(COMPILER_SUPPORTS_AVX512)
  bool has_avx512_vl_bw() {
- #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__)
-   uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
-+  uint64_t xcr0;
-   __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
-   // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
-+  __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0));
-   const bool has_avx512_bw = ebx & (1 << 30);
-   const bool has_avx512_vl = ebx & (1 << 31);
-+  // XCR0 bits 5, 6, and 7
-+  const bool avx512_os_enabled = (xcr0 & 0xE0) == 0xE0;
-   // printf("cpuid=%d=cpuid\n", has_avx512_bw && has_avx512_vl);
+-#if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__)
+-  uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
+-  __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
+-  // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
+-  const bool has_avx512_bw = ebx & (1 << 30);
+-  const bool has_avx512_vl = ebx & (1 << 31);
+-  // printf("cpuid=%d=cpuid\n", has_avx512_bw && has_avx512_vl);
 -  return has_avx512_bw && has_avx512_vl;
-+  return has_avx512_bw && has_avx512_vl && avx512_os_enabled;
- #else
+-#else
++  /* OpenBSD 7.5 doesn't support AVX-512. */
    return false;
+-#endif
+ }
  #endif
+
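
For reference, the decision that the earlier ports patch encoded (and that this diff replaces with an unconditional false) can be written as a pure function of two register values. This is only an illustrative sketch: the helper and macro names are made up for this note and do not exist in the text library or in ports. The bit positions are the ones named in the patch itself (CPUID leaf 7 EBX bits 30 and 31, XCR0 bits 5, 6 and 7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits checked by has_avx512_vl_bw().
 * Unsigned constants are used here so that bit 31 is a well-defined shift. */
#define CPUID7_EBX_AVX512BW (UINT32_C(1) << 30)
#define CPUID7_EBX_AVX512VL (UINT32_C(1) << 31)

/* XCR0 bits 5, 6 and 7 (opmask, ZMM_Hi256, Hi16_ZMM): the OS saves AVX-512 state. */
#define XCR0_AVX512_STATE UINT64_C(0xE0)

/* Hypothetical helper: AVX-512 VL+BW kernels are usable only if the CPU
 * advertises both features AND the OS has enabled all three state components. */
static bool avx512_vl_bw_usable(uint32_t cpuid7_ebx, uint64_t xcr0) {
  const bool has_bw = (cpuid7_ebx & CPUID7_EBX_AVX512BW) != 0;
  const bool has_vl = (cpuid7_ebx & CPUID7_EBX_AVX512VL) != 0;
  const bool os_enabled = (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;
  return has_bw && has_vl && os_enabled;
}

/* A few worked cases. */
static void avx512_vl_bw_usable_examples(void) {
  const uint32_t both = CPUID7_EBX_AVX512BW | CPUID7_EBX_AVX512VL;
  assert(avx512_vl_bw_usable(both, UINT64_C(0xE7)));   /* CPU and OS agree */
  assert(!avx512_vl_bw_usable(both, UINT64_C(0x07)));  /* OS saves only x87/SSE/AVX */
  assert(!avx512_vl_bw_usable(CPUID7_EBX_AVX512BW, UINT64_C(0xE7))); /* no VL */
}
```

The sketch deliberately leaves out how xcr0 is obtained, since on a CPU without xgetbv (like the Phenom in this thread) reading it is exactly what raises SIGILL.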
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson writes:

> This is in the avx512 checks in the text library again, I think it must
> be patch-libraries_text_cbits_measure_off_c (the simdutf one doesn't
> explicitly check for xgetbv but it does check for osxsave so I think
> wouldn't have executed the xgetbv opcode on this cpu).
>
> As -current does now have avx512 support in the kernel we probably
> should be able to remove that patch, but it needs testing on an avx512
> machine as well as that old Phenom.

Sadly I have neither nearby. Furthermore, simdutf upstream doesn't have
a fix for this issue either. So this will have to be original work.

https://github.com/simdutf/simdutf/blob/master/include/simdutf/internal/isadetection.h#L232

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
On 2024/04/19 17:58, Greg Steuck wrote:
> James Cook writes:
>
> > Here are some results of debugging with lldb.
> >
> >
> > With cabal-bundler and pandoc, it seems to be the xgetbv instruction
> > itself:
> >
> >
> > $ lldb /usr/local/bin/cabal-bundler
> > (lldb) target create "/usr/local/bin/cabal-bundler"
> > Current executable set to '/usr/local/bin/cabal-bundler' (x86_64).
> > (lldb) run
> > Process 90738 launched: '/usr/local/bin/cabal-bundler' (x86_64)
> > Process 90738 stopped
> > * thread #1, stop reason = signal SIGILL
> > frame #0: 0x004c12ba cabal-bundler`___lldb_unnamed_symbol522 + 90
> > cabal-bundler`___lldb_unnamed_symbol522:
> > -> 0x4c12ba <+90>: xgetbv
>
> Unless I'm missing something, xgetbv is not available in your CPU.
>
> cpu0: AMD Phenom(tm) II X3 710 Processor, 2611.95 MHz, 10-04-02, patch 01db
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,HWPSTATE,ITSC
>
> XGETBV1 is missing in the above and my cursory reading of
> https://en.wikipedia.org/wiki/X86-64#AMD64 supports this conclusion.
>
> Somebody will have to adapt the checking code to be conditional on this
> instruction presence if it's deemed important enough to support this CPU.

This is in the avx512 checks in the text library again, I think it must
be patch-libraries_text_cbits_measure_off_c (the simdutf one doesn't
explicitly check for xgetbv but it does check for osxsave so I think
wouldn't have executed the xgetbv opcode on this cpu).

As -current does now have avx512 support in the kernel we probably
should be able to remove that patch, but it needs testing on an avx512
machine as well as that old Phenom.
Re: -current Haskell ports aborting with SIGILL
James Cook writes:

> Here are some results of debugging with lldb.
>
>
> With cabal-bundler and pandoc, it seems to be the xgetbv instruction
> itself:
>
>
> $ lldb /usr/local/bin/cabal-bundler
> (lldb) target create "/usr/local/bin/cabal-bundler"
> Current executable set to '/usr/local/bin/cabal-bundler' (x86_64).
> (lldb) run
> Process 90738 launched: '/usr/local/bin/cabal-bundler' (x86_64)
> Process 90738 stopped
> * thread #1, stop reason = signal SIGILL
> frame #0: 0x004c12ba cabal-bundler`___lldb_unnamed_symbol522 + 90
> cabal-bundler`___lldb_unnamed_symbol522:
> -> 0x4c12ba <+90>: xgetbv

Unless I'm missing something, xgetbv is not available in your CPU.

cpu0: AMD Phenom(tm) II X3 710 Processor, 2611.95 MHz, 10-04-02, patch 01db
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,HWPSTATE,ITSC

XGETBV1 is missing in the above and my cursory reading of
https://en.wikipedia.org/wiki/X86-64#AMD64 supports this conclusion.

Somebody will have to adapt the checking code to be conditional on this
instruction's presence if it's deemed important enough to support this CPU.

Thanks
Greg
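
A minimal sketch of what "conditional on this instruction's presence" could look like, assuming GCC or clang on amd64 with <cpuid.h>. The helper names are hypothetical and this is not the code that went into ports or upstream. It relies on CPUID leaf 1, ECX bit 27 (OSXSAVE), which is the same bit the simdutf code tests before reading XCR0; when that bit is clear, xgetbv is never executed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

/* CPUID.1:ECX bit 27 (OSXSAVE): the OS has enabled XSAVE, so XGETBV is usable. */
#define CPUID1_ECX_OSXSAVE (UINT32_C(1) << 27)

/* Hypothetical helper: true only when executing XGETBV is safe. */
static bool cpu_has_osxsave(void) {
#ifdef HAVE_X86_CPUID
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    return false;
  return (ecx & CPUID1_ECX_OSXSAVE) != 0;
#else
  return false; /* not amd64: nothing to probe */
#endif
}

/* Hypothetical helper: read XCR0, or return 0 when XGETBV is unavailable. */
static uint64_t read_xcr0_or_zero(void) {
#ifdef HAVE_X86_CPUID
  if (!cpu_has_osxsave())
    return 0; /* e.g. the Phenom II above: skip the instruction entirely */
  uint32_t lo, hi;
  /* XGETBV with ECX=0 returns XCR0 in EDX:EAX, so both are listed as outputs. */
  __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
  assert((lo & 1u) != 0); /* bit 0 (x87 state) is always set in XCR0 */
  return ((uint64_t)hi << 32) | lo;
#else
  return 0;
#endif
}

/* Usage sketch: safe to call on any CPU, including ones without XGETBV. */
static bool os_enables_avx512_state(void) {
  return (read_xcr0_or_zero() & UINT64_C(0xE0)) == UINT64_C(0xE0);
}
```

On a machine without OSXSAVE this reports "no AVX-512 state" instead of dying with SIGILL, which is the behaviour the thread is after.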
Re: -current Haskell ports aborting with SIGILL
On Fri, Apr 19, 2024 at 12:04:54AM +, James Cook wrote: > On Sun, Feb 18, 2024 at 08:35:26AM -0800, Evan Silberman wrote: > > Stuart Henderson wrote: > > > On 2024/02/18 09:02, Stuart Henderson wrote: > > > > On 2024/02/17 22:08, Greg Steuck wrote: > > > > > Oh wow, this is becoming eerily similar to the failures aja@ is > > > > > getting. Do > > > > > dig more into this! > > > > > > > > Antoine, can you send a dmesg from one of the exopi VMs, please? > > > > > > - specifically I am wondering if it could be someething to do with AVX, > > > with AVX512 being the most likely to cause problems - Evan's machine > > > does have this - my intel 11th gen doesn't because it has a mix of > > > P+E cores, E cores don't implement it, so they disable it on the P > > > cores too. > > > > > > ghc *does* have some code relating to AVX512. > > > > Breakthrough, ignore the previous reproducer and any association with > > template haskell. I can get a crash in GHCI very simply: > > > > ~ $ ghci > > GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help > > ghci> import qualified Data.Text as T > > ghci> T.take 1 $ T.pack "aa" > > "Illegal instruction (core dumped) > > I'm seeing this on my OpenBSD 7.5 machine with an old AMD cpu (dmesg > follows signature). I have ghc-9.6.4p2 (which I think is after the > fix on this thread). Also pandoc doesn't work. > > $ pkg_info -I ghc pandoc > ghc-9.6.4p2 compiler for the functional language Haskell > pandoc-3.1.12.2 convert between markup and document formats > $ echo | pandoc -o tmp/test.html > Illegal instruction (core dumped) > $ ghci > GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help > ghci> import qualified Data.Text as T > ghci> T.take 1 $ T.pack "aa" > "Illegal instruction (core dumped) > $ > > I also saw ghc die with a SIGILL when I tried to build pandoc from > ports (checked out from cvs with -rOPENBSD_7_5). It happened when > cabal was trying to build unicode-collation-0.1.3.6. 
Here are some results of debugging with lldb. With cabal-bundler and pandoc, it seems to be the xgetbv instruction itself: $ lldb /usr/local/bin/cabal-bundler (lldb) target create "/usr/local/bin/cabal-bundler" Current executable set to '/usr/local/bin/cabal-bundler' (x86_64). (lldb) run Process 90738 launched: '/usr/local/bin/cabal-bundler' (x86_64) Process 90738 stopped * thread #1, stop reason = signal SIGILL frame #0: 0x004c12ba cabal-bundler`___lldb_unnamed_symbol522 + 90 cabal-bundler`___lldb_unnamed_symbol522: -> 0x4c12ba <+90>: xgetbv 0x4c12bd <+93>: notl %eax 0x4c12bf <+95>: testb $-0x20, %al 0x4c12c1 <+97>: leaq 0x58(%rip), %rcx ; ___lldb_unnamed_symbol523 $ lldb pandoc /dev/null (lldb) target create "pandoc" Current executable set to '/usr/local/bin/pandoc' (x86_64). (lldb) settings set -- target.run-args "/dev/null" (lldb) run Process 25189 launched: '/usr/local/bin/pandoc' (x86_64) [WARNING] Could not deduce format from file extension Defaulting to markdown Process 25189 stopped * thread #1, stop reason = signal SIGILL frame #0: 0x057697fa pandoc`___lldb_unnamed_symbol1367 + 90 pandoc`___lldb_unnamed_symbol1367: -> 0x57697fa <+90>: xgetbv 0x57697fd <+93>: notl %eax 0x57697ff <+95>: testb $-0x20, %al 0x5769801 <+97>: leaq 0x58(%rip), %rcx ; ___lldb_unnamed_symbol1368 (lldb) I tried doing the same for Evan's ghci example, but lldb did not automatically print assembly output as it did for pandoc. I don't really know how to use lldb. I tried the "di" command but I don't know if it's doing the right thing: $ lldb /usr/local/lib/ghc-9.6.4/bin/ghc-9.6.4 -- --interactive (lldb) target create "/usr/local/lib/ghc-9.6.4/bin/ghc-9.6.4" Current executable set to '/usr/local/lib/ghc-9.6.4/bin/ghc-9.6.4' (x86_64). (lldb) settings set -- target.run-args "--interactive" (lldb) run Process 14800 launched: '/usr/local/lib/ghc-9.6.4/bin/ghc-9.6.4' (x86_64) GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? 
for help ghci> import qualified Data.Text as T ghci> T.take 1 $ T.pack "aa" "Process 14800 stopped * thread #1, stop reason = signal SIGILL frame #0: 0x0002d2a1242b libc.so.99.0`_thread_sys_futex at -:2 warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited. (lldb) di libc.so.99.0`_thread_sys_futex: 0x2d2a12410 <+0>: endbr64 0x2d2a12414 <+4>: movq 0x849ad(%rip), %r11 ; __retguard__thread_sys_futex 0x2d2a1241b <+11>: xorq (%rsp), %r11 0x2d2a1241f <+15>: pushq %r11 0x2d2a12421 <+17>: movl $0x53, %eax 0x2d2a12426 <+22>: movq %rcx, %r10 0x2d2a12429 <+25>: syscall -> 0x2d2a1242b <+27>: jae0x2d2a1243c ; <+44> 0x2d2a1242d <+29>: movl %eax, %fs:0x20 0x2d2a12435 <+37>: movq $-0x1, %rax 0x2d2a1243c <+44>: popq %r11 0x2d2a1243e <+46>: xorq (%rsp), %r11 0x2d2a12442 <+50>: cmpq
Re: -current Haskell ports aborting with SIGILL
On Sun, Feb 18, 2024 at 08:35:26AM -0800, Evan Silberman wrote: > Stuart Henderson wrote: > > On 2024/02/18 09:02, Stuart Henderson wrote: > > > On 2024/02/17 22:08, Greg Steuck wrote: > > > > Oh wow, this is becoming eerily similar to the failures aja@ is > > > > getting. Do > > > > dig more into this! > > > > > > Antoine, can you send a dmesg from one of the exopi VMs, please? > > > > - specifically I am wondering if it could be someething to do with AVX, > > with AVX512 being the most likely to cause problems - Evan's machine > > does have this - my intel 11th gen doesn't because it has a mix of > > P+E cores, E cores don't implement it, so they disable it on the P > > cores too. > > > > ghc *does* have some code relating to AVX512. > > Breakthrough, ignore the previous reproducer and any association with > template haskell. I can get a crash in GHCI very simply: > > ~ $ ghci > GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help > ghci> import qualified Data.Text as T > ghci> T.take 1 $ T.pack "aa" > "Illegal instruction (core dumped) I'm seeing this on my OpenBSD 7.5 machine with an old AMD cpu (dmesg follows signature). I have ghc-9.6.4p2 (which I think is after the fix on this thread). Also pandoc doesn't work. $ pkg_info -I ghc pandoc ghc-9.6.4p2 compiler for the functional language Haskell pandoc-3.1.12.2 convert between markup and document formats $ echo | pandoc -o tmp/test.html Illegal instruction (core dumped) $ ghci GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help ghci> import qualified Data.Text as T ghci> T.take 1 $ T.pack "aa" "Illegal instruction (core dumped) $ I also saw ghc die with a SIGILL when I tried to build pandoc from ports (checked out from cvs with -rOPENBSD_7_5). It happened when cabal was trying to build unicode-collation-0.1.3.6. I haven't followed the thread closely enough to understand what's going on, but hopefully this is helpful. I'm pretty sure pandoc worked on this machine when it ran OpenBSD 7.4. 
-- James OpenBSD 7.5 (GENERIC.MP) #82: Wed Mar 20 15:48:40 MDT 2024 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 17160474624 (16365MB) avail mem = 16619225088 (15849MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf0100 (59 entries) bios0: vendor Award Software International, Inc. version "F7" date 11/20/2009 bios0: Gigabyte Technology Co., Ltd. GA-MA790XT-UD4P acpi0 at bios0: ACPI 1.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT HPET MCFG TAMG APIC acpi0: wakeup devices PCI0(S5) USB0(S3) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USB5(S3) USB6(S3) SBAZ(S4) P2P_(S5) PCE2(S4) PCE3(S4) PCE4(S4) PCE5(S4) PCE6(S4) PCE7(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 32 bits acpihpet0 at acpi0: 14318180 Hz acpimcfg0 at acpi0 acpimcfg0: addr 0xe000, bus 0-255 acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: AMD Phenom(tm) II X3 710 Processor, 2611.95 MHz, 10-04-02, patch 01db cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,HWPSTATE,ITSC cpu0: 64KB 64b/line 2-way D-cache, 64KB 64b/line 2-way I-cache cpu0: 512KB 64b/line 16-way L2 cache cpu0: smt 0, core 0, package 0 cpu0: AMD erratum 721 detected and fixed mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 200MHz cpu0: mwait min=64, max=64, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: AMD Phenom(tm) II X3 710 Processor, 2611.97 MHz, 10-04-02, patch 01db cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,HWPSTATE,ITSC cpu1: 64KB 64b/line 2-way D-cache, 64KB 64b/line 2-way I-cache cpu1: 512KB 64b/line 16-way L2 cache cpu1: smt 0, core 1, package 0 cpu2 at mainbus0: apid 2 (application processor) cpu2: AMD Phenom(tm) II X3 710 Processor, 2612.03 MHz, 10-04-02, patch 01db cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,3DNOW2,3DNOW,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,HWPSTATE,ITSC cpu2: 64KB 64b/line 2-way D-cache, 64KB 64b/line 2-way I-cache cpu2: 512KB 64b/line 16-way L2 cache cpu2: smt 0, core 2, package 0 ioapic0 at mainbus0: apid 2 pa 0xfec0, version 21, 24 pins, remapped acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus 4 (P2P_) acpiprt2 at acpi0: bus 1 (PCE2) acpiprt3 at acpi0: bus -1 (PCE3) acpiprt4 at acpi0: bus 2 (PCE4) acpiprt5 at acpi0: bus -1 (PCE5)
Re: -current Haskell ports aborting with SIGILL
oops, thanks - fix committed.

On 2024/02/23 11:35, Evan Silberman wrote:
> Stuart Henderson wrote:
> > Index: patches/patch-libraries_text_cbits_measure_off_c
> > ===
> > RCS file: patches/patch-libraries_text_cbits_measure_off_c
> > diff -N patches/patch-libraries_text_cbits_measure_off_c
> > --- /dev/null 1 Jan 1970 00:00:00 -
> > +++ patches/patch-libraries_text_cbits_measure_off_c 21 Feb 2024 12:35:13 -
> > @@ -0,0 +1,23 @@
> > +Don't attempt to use avx512 kernels when the OS doesn't support them
> > +
> > +Index: libraries/text/cbits/measure_off.c
> > +--- libraries/text/cbits/measure_off.c.orig
> > ++++ libraries/text/cbits/measure_off.c
> > +@@ -44,12 +44,16 @@
> > + bool has_avx512_vl_bw() {
> > + #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__)
> > +   uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
> > ++  uint64_t xcr0;
> > +   __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
> > +   // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
> > ++  // __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0));
>
> Whoopsie daisy, looks like this committed with the __asm__ that actually
> does the thing commented out. (Only spotted this because I sent the
> patch upstream, cf https://github.com/haskell/text/pull/566)
>
> Evan
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson wrote:
> Index: patches/patch-libraries_text_cbits_measure_off_c
> ===
> RCS file: patches/patch-libraries_text_cbits_measure_off_c
> diff -N patches/patch-libraries_text_cbits_measure_off_c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ patches/patch-libraries_text_cbits_measure_off_c 21 Feb 2024 12:35:13 -
> @@ -0,0 +1,23 @@
> +Don't attempt to use avx512 kernels when the OS doesn't support them
> +
> +Index: libraries/text/cbits/measure_off.c
> +--- libraries/text/cbits/measure_off.c.orig
> ++++ libraries/text/cbits/measure_off.c
> +@@ -44,12 +44,16 @@
> + bool has_avx512_vl_bw() {
> + #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__)
> +   uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
> ++  uint64_t xcr0;
> +   __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
> +   // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
> ++  // __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0));

Whoopsie daisy, looks like this was committed with the __asm__ that
actually does the thing commented out. (Only spotted this because I sent
the patch upstream, cf https://github.com/haskell/text/pull/566)

Evan
Re: -current Haskell ports aborting with SIGILL
> Evan, if you have abundant time, could you try to run `make test` with
> and without the patches? `make clean` in the middle is probably the
> easiest way to ensure the patches get applied as expected.

Did `make test` in lang/ghc with and without the patches and got the
same three failed tests in both, with no evident relation to the text
package. Further, I ran the tests in the patched `text` library and they
ran fine; had to make a minor modification for the tests to build due to
QuickCheck drift but that's of no significance.

So it all looks good to me. Phew!

Evan
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson writes: > On 2024/02/18 16:56, Evan Silberman wrote: >> Something like this? I'm out of my depth and heavily pattern-matching >> against the fix to simdutf and other references. Genuinely no idea if >> I'm using inline asm correctly, etc. Works on my machine, however. > > That seems right to me. I don't have an AVX512 machine handy though. > Here's a combined diff. (No idea what happened with the distinfo that's > in tree but let's normalise it while there). > > The simdutf update is merged (https://github.com/haskell/text/pull/564), > so we probably want to try a PR for your diff against > https://github.com/haskell/text/blob/master/cbits/measure_off.c > if everyone's here is happy with it. Thanks Evan & Stuart for resolving this. I'm running `make test` on my end, but I imagine it will work fine. So, OK gnezdo@. Evan, if you have abundant time, could you try to run `make test` with and without the patches? `make clean` in the middle is probably the easiest way to ensure the patches get applied as expected. 
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson wrote:
> On 2024/02/18 16:56, Evan Silberman wrote:
> > Something like this? I'm out of my depth and heavily pattern-matching
> > against the fix to simdutf and other references. Genuinely no idea if
> > I'm using inline asm correctly, etc. Works on my machine, however.
>
> That seems right to me. I don't have an AVX512 machine handy though.
> Here's a combined diff. (No idea what happened with the distinfo that's
> in tree but let's normalise it while there).
>
> The simdutf update is merged (https://github.com/haskell/text/pull/564),
> so we probably want to try a PR for your diff against
> https://github.com/haskell/text/blob/master/cbits/measure_off.c
> if everyone here is happy with it.

Hi Stuart,

Thanks for putting the whole patch together. Having rebuilt the GHC
package with a patched text library, I can build a working pandoc and my
own code. So this all looks good to me.

I'll take my diff to measure_off.c upstream when I get a moment. It
won't actually trickle down to us until it gets bundled into a newer GHC
and then we adopt it, so we may carry the patch for a while.

Evan
Re: -current Haskell ports aborting with SIGILL
On 2024/02/18 16:56, Evan Silberman wrote:
> Something like this? I'm out of my depth and heavily pattern-matching
> against the fix to simdutf and other references. Genuinely no idea if
> I'm using inline asm correctly, etc. Works on my machine, however.

That seems right to me. I don't have an AVX512 machine handy though.
Here's a combined diff. (No idea what happened with the distinfo that's
in tree but let's normalise it while there).

The simdutf update is merged (https://github.com/haskell/text/pull/564),
so we probably want to try a PR for your diff against
https://github.com/haskell/text/blob/master/cbits/measure_off.c
if everyone here is happy with it.

Index: Makefile
===
RCS file: /cvs/ports/lang/ghc/Makefile,v
retrieving revision 1.220
diff -u -p -r1.220 Makefile
--- Makefile 5 Feb 2024 01:49:50 - 1.220
+++ Makefile 21 Feb 2024 12:35:13 -
@@ -14,6 +14,7 @@ USE_NOEXECONLY = Yes
 USE_NOBTCFI = Yes

 GHC_VERSION = 9.6.4
+REVISION = 0
 DISTNAME = ghc-${GHC_VERSION}
 CATEGORIES = lang devel
 HOMEPAGE = https://www.haskell.org/ghc/
Index: distinfo
===
RCS file: /cvs/ports/lang/ghc/distinfo,v
retrieving revision 1.73
diff -u -p -r1.73 distinfo
--- distinfo 5 Feb 2024 01:49:28 - 1.73
+++ distinfo 21 Feb 2024 12:35:13 -
@@ -1,10 +1,10 @@
-SHA256 (ghc/ghc-9.6.4.20240111-amd64.tar.xz) = CedJ29vBFZyl1e+DgcUqPfjHMDRKmEOsXP9gH4Wka6E=
-SHA256 (ghc/ghc-9.6.4.20240111-shlibs-amd64.tar.gz) = Nb3trqnIF8H5kfKEkeGLr+sl4rPeFsbW/gfkelRprrY=
 SHA256 (ghc/ghc-9.6.4-src.tar.xz) = EL8luLBxdP3ZhotcDFbBfA7x7ctiR7S4ZL6TNlG/1MA=
 SHA256 (ghc/ghc-9.6.4-testsuite.tar.xz) = bhMoL76//b+gpJiJQ3REyakM/ldgxHlpzUJFhUwzjXM=
+SHA256 (ghc/ghc-9.6.4.20240111-amd64.tar.xz) = CedJ29vBFZyl1e+DgcUqPfjHMDRKmEOsXP9gH4Wka6E=
+SHA256 (ghc/ghc-9.6.4.20240111-shlibs-amd64.tar.gz) = Nb3trqnIF8H5kfKEkeGLr+sl4rPeFsbW/gfkelRprrY=
 SHA256 (ghc/hadrian-sources-9.6.4.20240111.tar.gz) = wMMJfyP7Pr6xjb/tj9Kz5iZugGr6+duMwJ23aGsUWy0=
 SIZE (ghc/ghc-9.6.4-src.tar.xz) = 29451856
 SIZE (ghc/ghc-9.6.4-testsuite.tar.xz) = 7075820
 SIZE (ghc/ghc-9.6.4.20240111-amd64.tar.xz) = 74706384
 SIZE (ghc/ghc-9.6.4.20240111-shlibs-amd64.tar.gz) = 3544885
-SIZE (ghc/hadrian-sources-9.6.4.20240111.tar.gz) = 2125322
\ No newline at end of file
+SIZE (ghc/hadrian-sources-9.6.4.20240111.tar.gz) = 2125322
Index: patches/patch-libraries_text_cbits_measure_off_c
===
RCS file: patches/patch-libraries_text_cbits_measure_off_c
diff -N patches/patch-libraries_text_cbits_measure_off_c
--- /dev/null 1 Jan 1970 00:00:00 -
+++ patches/patch-libraries_text_cbits_measure_off_c 21 Feb 2024 12:35:13 -
@@ -0,0 +1,23 @@
+Don't attempt to use avx512 kernels when the OS doesn't support them
+
+Index: libraries/text/cbits/measure_off.c
+--- libraries/text/cbits/measure_off.c.orig
++++ libraries/text/cbits/measure_off.c
+@@ -44,12 +44,16 @@
+ bool has_avx512_vl_bw() {
+ #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__)
+   uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
++  uint64_t xcr0;
+   __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
+   // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
++  // __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0));
+   const bool has_avx512_bw = ebx & (1 << 30);
+   const bool has_avx512_vl = ebx & (1 << 31);
++  // XCR0 bits 5, 6, and 7
++  const bool avx512_os_enabled = (xcr0 & 0xE0) == 0xE0;
+   // printf("cpuid=%d=cpuid\n", has_avx512_bw && has_avx512_vl);
+-  return has_avx512_bw && has_avx512_vl;
++  return has_avx512_bw && has_avx512_vl && avx512_os_enabled;
+ #else
+   return false;
+ #endif
Index: patches/patch-libraries_text_simdutf_simdutf_h
===
RCS file: patches/patch-libraries_text_simdutf_simdutf_h
diff -N patches/patch-libraries_text_simdutf_simdutf_h
--- /dev/null 1 Jan 1970 00:00:00 -
+++ patches/patch-libraries_text_simdutf_simdutf_h 21 Feb 2024 12:35:13 -
@@ -0,0 +1,78 @@
+https://github.com/simdutf/simdutf/commit/55b107f609f5f63880db650a92861ae84cb10abe
+(haskell/text upstream has now updated to a version past this commit)
+
+Index: libraries/text/simdutf/simdutf.h
+--- libraries/text/simdutf/simdutf.h.orig
++++ libraries/text/simdutf/simdutf.h
+@@ -549,6 +549,7 @@ namespace cpuid_bit {
+     // EAX = 0x01
+     constexpr uint32_t pclmulqdq = uint32_t(1) << 1; ///< @private bit 1 of ECX for EAX=0x1
+     constexpr uint32_t sse42 = uint32_t(1) << 20; ///< @private bit 20 of ECX for EAX=0x1
++    constexpr uint32_t osxsave = (uint32_t(1) << 26) | (uint32_t(1) << 27); ///< @private bits 26+27 of ECX for EAX=0x1
+
+     // EAX = 0x7f (Structured Extended Feature Flags), ECX = 0x00 (Sub-leaf)
+     // See:
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson wrote:
> On 2024/02/18 13:58, Evan Silberman wrote:
> > I _think_ the right patch for lang/ghc looks like this but I haven't
> > tested this exact thing in situ.
> >
> > +-#if !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
> > ++#if !defined(__OpenBSD__) && !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
>
> That's not the right check, it should test whether it works rather than
> what the OS is.

Something like this? I'm out of my depth and heavily pattern-matching
against the fix to simdutf and other references. Genuinely no idea if
I'm using inline asm correctly, etc. Works on my machine, however.

--- /dev/null
+++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c
@@ -0,0 +1,22 @@
+Don't attempt to use avx512 kernels when the OS doesn't support them
+Index: libraries/text/cbits/measure_off.c
+--- libraries/text/cbits/measure_off.c.orig
++++ libraries/text/cbits/measure_off.c
+@@ -44,12 +44,16 @@
+ bool has_avx512_vl_bw() {
+ #if (__GNUC__ >= 7 || __GNUC__ == 6 && __GNUC_MINOR__ >= 3) || defined(__clang_major__)
+   uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
++  uint64_t xcr0;
+   __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
++  __asm__("xgetbv\n\t" : "=a" (xcr0) : "c" (0));
+   // https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features
+   const bool has_avx512_bw = ebx & (1 << 30);
+   const bool has_avx512_vl = ebx & (1 << 31);
++  // XCR0 bits 5, 6, and 7
++  const bool avx512_os_enabled = (xcr0 & 0xE0) == 0xE0;
+   // printf("cpuid=%d=cpuid\n", has_avx512_bw && has_avx512_vl);
+-  return has_avx512_bw && has_avx512_vl;
++  return has_avx512_bw && has_avx512_vl && avx512_os_enabled;
+ #else
+   return false;
+ #endif
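For reference, here is a minimal sketch of the OS-support test this patch
is after. It is not the patch that went into the port. It assumes GCC or
clang on x86, and the helper names (read_xcr0, avx512_vl_bw_usable,
host_avx512_vl_bw_usable) are made up for illustration. XGETBV returns
XCR0 in EDX:EAX, so the sketch names both registers as outputs, and it
only executes XGETBV once CPUID leaf 1 reports OSXSAVE, because the
instruction itself raises an illegal-instruction fault otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* XCR0 bits 5, 6 and 7: opmask, ZMM_Hi256 and Hi16_ZMM state. */
#define XCR0_AVX512_STATE UINT64_C(0xE0)

/*
 * Pure decision over raw register values (hypothetical helper):
 * leaf1_ecx is ECX from CPUID leaf 1, leaf7_ebx is EBX from CPUID
 * leaf 7 sub-leaf 0, xcr0 is the value returned by XGETBV with ECX=0.
 */
bool avx512_vl_bw_usable(uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint64_t xcr0)
{
    const bool osxsave = (leaf1_ecx & (UINT32_C(1) << 27)) != 0;
    const bool has_bw = (leaf7_ebx & (UINT32_C(1) << 30)) != 0;
    const bool has_vl = (leaf7_ebx & (UINT32_C(1) << 31)) != 0;
    const bool os_enabled = (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;

    return osxsave && has_bw && has_vl && os_enabled;
}

#if defined(__x86_64__) || defined(__i386__)
/* Read XCR0. Only call this after OSXSAVE has been confirmed. */
uint64_t read_xcr0(void)
{
    uint32_t lo, hi;

    /* XGETBV writes both EAX and EDX, so both are declared as outputs. */
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

/* Query the running machine and apply the decision above. */
bool host_avx512_vl_bw_usable(void)
{
    uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
    uint32_t leaf1_ecx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    leaf1_ecx = ecx;
    if ((leaf1_ecx & (UINT32_C(1) << 27)) == 0)
        return false; /* OS has not enabled XSAVE; XGETBV would fault */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return avx512_vl_bw_usable(leaf1_ecx, ebx, read_xcr0());
}
#endif
```

Keeping the decision separate from the register reads means the logic
can be exercised with made-up register values on any machine, including
ones without AVX-512.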
Re: -current Haskell ports aborting with SIGILL
On 2024/02/18 13:58, Evan Silberman wrote:
> Stuart Henderson wrote:
> > This is probably worth a try. I've asked if upstream can update it
>
> Hi Stuart,
>
> Unclear if necessary, but not sufficient. Turns out there's one more
> unconditional AVX512 check in text's own C code, cbits/measure_off.c:
>
> #if !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
>     || (defined(__clang_major__) && __clang_major__ <= 6)) && !defined(__STDC_NO_ATOMICS__)
> #define COMPILER_SUPPORTS_AVX512
> #endif
>
> Undefing this gets me good results:
>
> text $ cabal repl
> Build profile: -w ghc-9.2.7 -O1
> In order, the following will be built (use -v for more details):
>  - text-2.0.2 (lib) (first run)
> Preprocessing library for text-2.0.2..
> GHCi, version 9.2.7: https://www.haskell.org/ghc/ :? for help
> [ 1 of 46] Compiling Data.Text.Array ( src/Data/Text/Array.hs, interpreted )
> [ snip ]
> [46 of 46] Compiling Data.Text.Lazy.IO ( src/Data/Text/Lazy/IO.hs, interpreted )
> Ok, 46 modules loaded.
> ghci> take 1 $ pack "AA"
> "A"
>
> whew.
>
> I _think_ the right patch for lang/ghc looks like this but I haven't
> tested this exact thing in situ.
>
> blob - /dev/null
> blob + 282c4467d3eed9211a7b3c505248569d342863b4 (mode 644)
> --- /dev/null
> +++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c
> @@ -0,0 +1,13 @@
> +Disable AVX512 instructions, not supported on OpenBSD
> +Index: libraries/text/cbits/measure_off.c
> +--- libraries/text/cbits/measure_off.c.orig
> ++++ libraries/text/cbits/measure_off.c
> +@@ -34,7 +34,7 @@
> +   Disable AVX-512 instructions as they are most likely not supported
> +   on the hardware running clang-6.
> + */
> +-#if !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
> ++#if !defined(__OpenBSD__) && !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
> +     || (defined(__clang_major__) && __clang_major__ <= 6)) && !defined(__STDC_NO_ATOMICS__)
> + #define COMPILER_SUPPORTS_AVX512
> + #endif

That's not the right check; it should test whether it works rather than
what the OS is.
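To illustrate the distinction: a compile-time test such as
!defined(__OpenBSD__) bakes the answer into the binary, while a runtime
probe lets the same binary choose a code path on each machine it runs
on. Below is a minimal sketch of that kind of dispatch. Everything in it
is hypothetical: count_bytes_scalar and count_bytes_wide stand in for a
portable kernel and a vectorised one, the probe is passed in as a
function pointer, and none of the names come from text or simdutf.

```c
#include <stdbool.h>
#include <stddef.h>

typedef size_t (*count_fn)(const unsigned char *buf, size_t len,
    unsigned char needle);

/* Portable fallback: safe to run on any CPU. */
size_t count_bytes_scalar(const unsigned char *buf, size_t len,
    unsigned char needle)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        n += (buf[i] == needle);
    return n;
}

/*
 * Stand-in for a vectorised kernel. A real one would use wide SIMD
 * instructions; this one only unrolls the loop so the sketch stays
 * runnable everywhere.
 */
size_t count_bytes_wide(const unsigned char *buf, size_t len,
    unsigned char needle)
{
    size_t n = 0, i = 0;

    for (; i + 4 <= len; i += 4)
        n += (buf[i] == needle) + (buf[i + 1] == needle)
            + (buf[i + 2] == needle) + (buf[i + 3] == needle);
    for (; i < len; i++)
        n += (buf[i] == needle);
    return n;
}

/* Stand-in probes; a real probe would query CPUID and XCR0. */
bool probe_never(void) { return false; }
bool probe_always(void) { return true; }

/* Pick the kernel from what the probe reports at runtime. */
count_fn select_count_kernel(bool (*probe)(void))
{
    if (probe != NULL && probe())
        return count_bytes_wide;
    return count_bytes_scalar;
}
```

A caller would run select_count_kernel once at startup and keep the
returned pointer, so the probe cost is paid a single time.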
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson wrote:
> This is probably worth a try. I've asked if upstream can update it

Hi Stuart,

Unclear if necessary, but not sufficient. Turns out there's one more
unconditional AVX512 check in text's own C code, cbits/measure_off.c:

#if !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
    || (defined(__clang_major__) && __clang_major__ <= 6)) && !defined(__STDC_NO_ATOMICS__)
#define COMPILER_SUPPORTS_AVX512
#endif

Undefing this gets me good results:

text $ cabal repl
Build profile: -w ghc-9.2.7 -O1
In order, the following will be built (use -v for more details):
 - text-2.0.2 (lib) (first run)
Preprocessing library for text-2.0.2..
GHCi, version 9.2.7: https://www.haskell.org/ghc/ :? for help
[ 1 of 46] Compiling Data.Text.Array ( src/Data/Text/Array.hs, interpreted )
[ snip ]
[46 of 46] Compiling Data.Text.Lazy.IO ( src/Data/Text/Lazy/IO.hs, interpreted )
Ok, 46 modules loaded.
ghci> take 1 $ pack "AA"
"A"

whew.

I _think_ the right patch for lang/ghc looks like this but I haven't
tested this exact thing in situ.

blob - /dev/null
blob + 282c4467d3eed9211a7b3c505248569d342863b4 (mode 644)
--- /dev/null
+++ lang/ghc/patches/patch-libraries_text_cbits_measure_off_c
@@ -0,0 +1,13 @@
+Disable AVX512 instructions, not supported on OpenBSD
+Index: libraries/text/cbits/measure_off.c
+--- libraries/text/cbits/measure_off.c.orig
++++ libraries/text/cbits/measure_off.c
+@@ -34,7 +34,7 @@
+   Disable AVX-512 instructions as they are most likely not supported
+   on the hardware running clang-6.
+ */
+-#if !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
++#if !defined(__OpenBSD__) && !((defined(__apple_build_version__) && __apple_build_version__ <= 10001145) \
+     || (defined(__clang_major__) && __clang_major__ <= 6)) && !defined(__STDC_NO_ATOMICS__)
+ #define COMPILER_SUPPORTS_AVX512
+ #endif
Re: -current Haskell ports aborting with SIGILL
This is probably worth a try. I've asked if upstream can update it:
https://github.com/haskell/text/issues/563

Index: Makefile
===
RCS file: /cvs/ports/lang/ghc/Makefile,v
retrieving revision 1.220
diff -u -p -r1.220 Makefile
--- Makefile 5 Feb 2024 01:49:50 - 1.220
+++ Makefile 18 Feb 2024 18:31:31 -
@@ -14,6 +14,7 @@ USE_NOEXECONLY = Yes
 USE_NOBTCFI = Yes

 GHC_VERSION = 9.6.4
+REVISION = 0
 DISTNAME = ghc-${GHC_VERSION}
 CATEGORIES = lang devel
 HOMEPAGE = https://www.haskell.org/ghc/
Index: distinfo
===
RCS file: /cvs/ports/lang/ghc/distinfo,v
retrieving revision 1.73
diff -u -p -r1.73 distinfo
--- distinfo 5 Feb 2024 01:49:28 - 1.73
+++ distinfo 18 Feb 2024 18:31:31 -
@@ -1,10 +1,10 @@
-SHA256 (ghc/ghc-9.6.4.20240111-amd64.tar.xz) = CedJ29vBFZyl1e+DgcUqPfjHMDRKmEOsXP9gH4Wka6E=
-SHA256 (ghc/ghc-9.6.4.20240111-shlibs-amd64.tar.gz) = Nb3trqnIF8H5kfKEkeGLr+sl4rPeFsbW/gfkelRprrY=
 SHA256 (ghc/ghc-9.6.4-src.tar.xz) = EL8luLBxdP3ZhotcDFbBfA7x7ctiR7S4ZL6TNlG/1MA=
 SHA256 (ghc/ghc-9.6.4-testsuite.tar.xz) = bhMoL76//b+gpJiJQ3REyakM/ldgxHlpzUJFhUwzjXM=
+SHA256 (ghc/ghc-9.6.4.20240111-amd64.tar.xz) = CedJ29vBFZyl1e+DgcUqPfjHMDRKmEOsXP9gH4Wka6E=
+SHA256 (ghc/ghc-9.6.4.20240111-shlibs-amd64.tar.gz) = Nb3trqnIF8H5kfKEkeGLr+sl4rPeFsbW/gfkelRprrY=
 SHA256 (ghc/hadrian-sources-9.6.4.20240111.tar.gz) = wMMJfyP7Pr6xjb/tj9Kz5iZugGr6+duMwJ23aGsUWy0=
 SIZE (ghc/ghc-9.6.4-src.tar.xz) = 29451856
 SIZE (ghc/ghc-9.6.4-testsuite.tar.xz) = 7075820
 SIZE (ghc/ghc-9.6.4.20240111-amd64.tar.xz) = 74706384
 SIZE (ghc/ghc-9.6.4.20240111-shlibs-amd64.tar.gz) = 3544885
-SIZE (ghc/hadrian-sources-9.6.4.20240111.tar.gz) = 2125322
\ No newline at end of file
+SIZE (ghc/hadrian-sources-9.6.4.20240111.tar.gz) = 2125322
Index: patches/patch-libraries_text_simdutf_simdutf_h
===
RCS file: patches/patch-libraries_text_simdutf_simdutf_h
diff -N patches/patch-libraries_text_simdutf_simdutf_h
--- /dev/null 1 Jan 1970 00:00:00 -
+++ patches/patch-libraries_text_simdutf_simdutf_h 18 Feb 2024 18:31:31 -
@@ -0,0 +1,77 @@
+https://github.com/simdutf/simdutf/commit/55b107f609f5f63880db650a92861ae84cb10abe
+
+Index: libraries/text/simdutf/simdutf.h
+--- libraries/text/simdutf/simdutf.h.orig
++++ libraries/text/simdutf/simdutf.h
+@@ -549,6 +549,7 @@ namespace cpuid_bit {
+     // EAX = 0x01
+     constexpr uint32_t pclmulqdq = uint32_t(1) << 1; ///< @private bit 1 of ECX for EAX=0x1
+     constexpr uint32_t sse42 = uint32_t(1) << 20; ///< @private bit 20 of ECX for EAX=0x1
++    constexpr uint32_t osxsave = (uint32_t(1) << 26) | (uint32_t(1) << 27); ///< @private bits 26+27 of ECX for EAX=0x1
+
+     // EAX = 0x7f (Structured Extended Feature Flags), ECX = 0x00 (Sub-leaf)
+     // See: "Table 3-8. Information Returned by CPUID Instruction"
+@@ -574,6 +575,10 @@ namespace cpuid_bit {
+     namespace edx {
+       constexpr uint32_t avx512vp2intersect = uint32_t(1) << 8;
+     }
++    namespace xcr0_bit {
++      constexpr uint64_t avx256_saved = uint64_t(1) << 2; ///< @private bit 2 = AVX
++      constexpr uint64_t avx512_saved = uint64_t(7) << 5; ///< @private bits 5,6,7 = opmask, ZMM_hi256, hi16_ZMM
++    }
+ }
+ }
+
+@@ -583,7 +588,7 @@ static inline void cpuid(uint32_t *eax, uint32_t *ebx,
+                          uint32_t *edx) {
+ #if defined(_MSC_VER)
+   int cpu_info[4];
+-  __cpuid(cpu_info, *eax);
++  __cpuidex(cpu_info, *eax, *ecx);
+   *eax = cpu_info[0];
+   *ebx = cpu_info[1];
+   *ecx = cpu_info[2];
+@@ -601,6 +606,16 @@ static inline void cpuid(uint32_t *eax, uint32_t *ebx,
+ #endif
+ }
+
++static inline uint64_t xgetbv() {
++#if defined(_MSC_VER)
++  return _xgetbv(0);
++#else
++  uint32_t xcr0_lo, xcr0_hi;
++  asm volatile("xgetbv\n\t" : "=a" (xcr0_lo), "=d" (xcr0_hi) : "c" (0));
++  return xcr0_lo | ((uint64_t)xcr0_hi << 32);
++#endif
++}
++
+ static inline uint32_t detect_supported_architectures() {
+   uint32_t eax;
+   uint32_t ebx = 0;
+@@ -620,6 +635,16 @@ static inline uint32_t detect_supported_architectures(
+     host_isa |= instruction_set::PCLMULQDQ;
+   }
+
++  if ((ecx & cpuid_bit::osxsave) != cpuid_bit::osxsave) {
++    return host_isa;
++  }
++
++  // xgetbv for checking if the OS saves registers
++  uint64_t xcr0 = xgetbv();
++
++  if ((xcr0 & cpuid_bit::xcr0_bit::avx256_saved) == 0) {
++    return host_isa;
++  }
+   // ECX for EAX=0x7
+   eax = 0x7;
+   ecx = 0x0; // Sub-leaf = 0
+@@ -632,6 +657,9 @@ static inline uint32_t detect_supported_architectures(
+   }
+   if (ebx & cpuid_bit::ebx::bmi2) {
+     host_isa |= instruction_set::BMI2;
++  }
++  if (!((xcr0 & cpuid_bit::xcr0_bit::avx512_saved) == cpuid_bit::xcr0_bit::avx512_saved)) {
++    return host_isa;
+   }
+   if (ebx &
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson wrote:
> On 2024/02/18 09:02, Stuart Henderson wrote:
> > On 2024/02/17 22:08, Greg Steuck wrote:
> > > Oh wow, this is becoming eerily similar to the failures aja@ is getting.
> > > Do dig more into this!
> >
> > Antoine, can you send a dmesg from one of the exopi VMs, please?
>
> - specifically I am wondering if it could be something to do with AVX,
> with AVX512 being the most likely to cause problems - Evan's machine
> does have this - my intel 11th gen doesn't because it has a mix of
> P+E cores, E cores don't implement it, so they disable it on the P
> cores too.
>
> ghc *does* have some code relating to AVX512.

Breakthrough, ignore the previous reproducer and any association with
Template Haskell. I can get a crash in GHCi very simply:

~ $ ghci
GHCi, version 9.6.4: https://www.haskell.org/ghc/ :? for help
ghci> import qualified Data.Text as T
ghci> T.take 1 $ T.pack "aa"
"Illegal instruction (core dumped)

The issue seems to be related to the use of AVX512 by the simdutf
project, which is vendored by the ubiquitous text package:
https://github.com/haskell/text/tree/master/simdutf

I had just noticed this dependency when I saw Stuart's message about
"the commit that fixed node" in simdutf upstream:
https://github.com/simdutf/simdutf/commit/55b107f609f5f63880db650a92861ae84cb10abe

GHC 9.6.4 ships with text 2.0.2. GHC 9.2.7 ships with text 1.2.5.0.
simdutf was first vendored in text 2.0 and the vendored copy there was
last updated in 2022. The conclusion is that text's old version of
simdutf uses AVX512 instructions regardless of OS support.

I haven't followed this thread all the way to a proposed patch to the
ghc port, let's call it an exercise for the reader.

Evan
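The order of checks in the upstream simdutf commit linked above can be
sketched as a pure function over raw register values. This is an
illustration in C, not simdutf's code: the function name usable_isa and
the ISA_* flags are hypothetical, and only three feature tiers are
modelled. The bit positions are the architectural ones: SSE4.2 is bit 20
and XSAVE/OSXSAVE are bits 26 and 27 of CPUID leaf 1 ECX; AVX2 is bit 5
and AVX512F is bit 16 of CPUID leaf 7 EBX; XCR0 bit 2 covers AVX state
and bits 5 to 7 cover AVX-512 state.

```c
#include <stdint.h>

/* Hypothetical result flags for this sketch; simdutf's own enum differs. */
enum {
    ISA_SSE42 = 1 << 0,
    ISA_AVX2 = 1 << 1,
    ISA_AVX512 = 1 << 2
};

#define CPUID1_ECX_SSE42   (UINT32_C(1) << 20)
#define CPUID1_ECX_OSXSAVE ((UINT32_C(1) << 26) | (UINT32_C(1) << 27))
#define CPUID7_EBX_AVX2    (UINT32_C(1) << 5)
#define CPUID7_EBX_AVX512F (UINT32_C(1) << 16)
#define XCR0_AVX_SAVED     (UINT64_C(1) << 2)
#define XCR0_AVX512_SAVED  (UINT64_C(7) << 5)

/*
 * Each tier is reported only if the CPU advertises it AND the OS has
 * enabled the matching register state in XCR0. The caller passes
 * xcr0 = 0 when OSXSAVE is clear, since XGETBV must not be executed
 * in that case.
 */
uint32_t usable_isa(uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint64_t xcr0)
{
    uint32_t isa = 0;

    if (leaf1_ecx & CPUID1_ECX_SSE42)
        isa |= ISA_SSE42;

    /* Without XSAVE and OSXSAVE nothing wider than SSE is usable. */
    if ((leaf1_ecx & CPUID1_ECX_OSXSAVE) != CPUID1_ECX_OSXSAVE)
        return isa;

    /* The OS must save the AVX (YMM) state before AVX2 is usable. */
    if ((xcr0 & XCR0_AVX_SAVED) == 0)
        return isa;
    if (leaf7_ebx & CPUID7_EBX_AVX2)
        isa |= ISA_AVX2;

    /* The OS must save opmask and ZMM state before AVX-512 is usable. */
    if ((xcr0 & XCR0_AVX512_SAVED) != XCR0_AVX512_SAVED)
        return isa;
    if (leaf7_ebx & CPUID7_EBX_AVX512F)
        isa |= ISA_AVX512;

    return isa;
}
```

With an XCR0 value of 0x7 (x87, SSE and AVX state only), a CPU that
advertises AVX-512 is still reported as AVX2 at most, which is the
situation a kernel without AVX-512 support leaves userland in.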
Re: -current Haskell ports aborting with SIGILL
On Sun, Feb 18, 2024 at 09:02:16AM +, Stuart Henderson wrote:
> On 2024/02/17 22:08, Greg Steuck wrote:
> > Oh wow, this is becoming eerily similar to the failures aja@ is getting. Do
> > dig more into this!
>
> Antoine, can you send a dmesg from one of the exopi VMs, please?

Sure.

OpenBSD 7.4-current (GENERIC.MP) #1688: Thu Feb 15 10:48:34 MST 2024
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17162940416 (16367MB)
avail mem = 16621662208 (15851MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf5a90 (11 entries)
bios0: vendor SeaBIOS version "1.13.0-1ubuntu1.1" date 04/01/2014
bios0: Exoscale Exoscale Compute Platform
acpi0 at bios0: ACPI 1.0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP APIC HPET WAET
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Xeon Processor (Skylake), 2400.53 MHz, 06-55-04
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,PKU,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 4MB 64b/line 16-way L2 cache, 16MB 64b/line 16-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel Xeon Processor (Skylake), 2400.53 MHz, 06-55-04
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,PKU,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,MELTDOWN
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 4MB 64b/line 16-way L2 cache, 16MB 64b/line 16-way L3 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel Xeon Processor (Skylake), 2400.47 MHz, 06-55-04
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,PKU,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,MELTDOWN
cpu2: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 4MB 64b/line 16-way L2 cache, 16MB 64b/line 16-way L3 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel Xeon Processor (Skylake), 2400.71 MHz, 06-55-04
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,PKU,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,MELTDOWN
cpu3: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 4MB 64b/line 16-way L2 cache, 16MB 64b/line 16-way L3 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 4 (application processor)
cpu4: Intel Xeon Processor (Skylake), 2400.64 MHz, 06-55-04
cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,PKU,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,MELTDOWN
cpu4: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 4MB 64b/line 16-way L2 cache, 16MB 64b/line 16-way L3 cache
cpu4: smt 0, core 0, package 1
cpu5 at mainbus0: apid 5 (application processor)
cpu5: Intel Xeon Processor (Skylake), 2400.61 MHz, 06-55-04
cpu5:
Re: -current Haskell ports aborting with SIGILL
On 2024/02/18 09:02, Stuart Henderson wrote:
> On 2024/02/17 22:08, Greg Steuck wrote:
> > Oh wow, this is becoming eerily similar to the failures aja@ is getting. Do
> > dig more into this!
>
> Antoine, can you send a dmesg from one of the exopi VMs, please?

- specifically I am wondering if it could be something to do with AVX,
with AVX512 being the most likely to cause problems - Evan's machine
does have this - my intel 11th gen doesn't because it has a mix of
P+E cores, E cores don't implement it, so they disable it on the P
cores too.

ghc *does* have some code relating to AVX512.
Re: -current Haskell ports aborting with SIGILL
On 2024/02/17 22:08, Greg Steuck wrote:
> Oh wow, this is becoming eerily similar to the failures aja@ is getting. Do
> dig more into this!

Antoine, can you send a dmesg from one of the exopi VMs, please?
Re: -current Haskell ports aborting with SIGILL
Greg Steuck wrote:
> Oh wow, this is becoming eerily similar to the failures aja@ is getting. Do
> dig more into this!

I did, and now it's bedtime. Attached tarball contains a reasonably
minimized reproduction of the crash on my machine with a short readme.
It builds with GHC 9.2.7 and fails with GHC 9.6.4. The reproduction
involves using Template Haskell to perform a compile-time computation on
data read from a file.

I hope this helps track down what's going on!

Evan

ghc-sigill-reproduction.tgz
Description: GNU Zip compressed data
Re: -current Haskell ports aborting with SIGILL
Oh wow, this is becoming eerily similar to the failures aja@ is getting. Do
dig more into this!

On Sat, Feb 17, 2024, 21:34 Evan Silberman wrote:

> Evan Silberman wrote:
> > Thanks to the amd64 snapshot package archives I was able to bisect to a
> > working pandoc package on my machine. OK on 2024-01-16, NG on
> > 2024-01-17.
>
> I guess what this means is my problem is traceable to the ghc 9.6.4
> bump. I have determined I can elicit a SIGILL reproducibly, outside of
> the ports environment, when using ghc 9.6.4 to build unicode-collation
> 0.1.3.6 (https://hackage.haskell.org/package/unicode-collation), e.g.:
>
> unicode-collation-0.1.3.6 $ cabal build
> Resolving dependencies...
> Build profile: -w ghc-9.6.4 -O1
> In order, the following will be built (use -v for more details):
>  - unicode-collation-0.1.3.6 (lib) (first run)
> Configuring library for unicode-collation-0.1.3.6..
> Preprocessing library for unicode-collation-0.1.3.6..
> Building library for unicode-collation-0.1.3.6..
> [ 1 of 10] Compiling Text.Collate.Lang
> [ 2 of 10] Compiling Text.Collate.Trie
> [ 3 of 10] Compiling Text.Collate.UnicodeData
> [ 4 of 10] Compiling Text.Collate.CanonicalCombiningClass
> Error: cabal: Failed to build unicode-collation-0.1.3.6. The build
> process
> terminated with exit code -4
>
> Same build is fine with ghc 9.2.7p5 from January.
>
> A little more messing around suggests that the issue is related to
> Template Haskell. I might be able to generate a minimal reproducer. Stay
> tuned?
>
> Evan
>
Re: -current Haskell ports aborting with SIGILL
Evan Silberman wrote:
> Thanks to the amd64 snapshot package archives I was able to bisect to a
> working pandoc package on my machine. OK on 2024-01-16, NG on
> 2024-01-17.

I guess what this means is my problem is traceable to the ghc 9.6.4
bump. I have determined I can elicit a SIGILL reproducibly, outside of
the ports environment, when using ghc 9.6.4 to build unicode-collation
0.1.3.6 (https://hackage.haskell.org/package/unicode-collation), e.g.:

unicode-collation-0.1.3.6 $ cabal build
Resolving dependencies...
Build profile: -w ghc-9.6.4 -O1
In order, the following will be built (use -v for more details):
 - unicode-collation-0.1.3.6 (lib) (first run)
Configuring library for unicode-collation-0.1.3.6..
Preprocessing library for unicode-collation-0.1.3.6..
Building library for unicode-collation-0.1.3.6..
[ 1 of 10] Compiling Text.Collate.Lang
[ 2 of 10] Compiling Text.Collate.Trie
[ 3 of 10] Compiling Text.Collate.UnicodeData
[ 4 of 10] Compiling Text.Collate.CanonicalCombiningClass
Error: cabal: Failed to build unicode-collation-0.1.3.6. The build process
terminated with exit code -4

Same build is fine with ghc 9.2.7p5 from January.

A little more messing around suggests that the issue is related to
Template Haskell. I might be able to generate a minimal reproducer. Stay
tuned?

Evan
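A note on the "exit code -4" in the cabal output above: on Unix, Haskell's process library reports a child that was terminated by signal N as exit code -N, and signal 4 is SIGILL. The following is a minimal Python sketch of that decoding, not anything cabal itself runs; `describe_exit` is a hypothetical helper name chosen for illustration.

```python
import signal


def describe_exit(code: int) -> str:
    """Render an exit code, decoding negative values as signal numbers."""
    if code < 0:
        # Assumed convention: -N means "terminated by signal N".
        return f"killed by {signal.Signals(-code).name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit(-4))
```

Read this way, the failed build is the same symptom as the pandoc and cabal-bundler crashes: the compiler process itself received SIGILL.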
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson wrote:
> On 2024/02/16 13:41, Evan Silberman wrote:
> > Greg back to your question about when this worked, I wish I could tell
> > you something meaningful. I think I had a pre-7.4 snap on this laptop
> > vintage September, whenever I last updated the Pandoc port, and then I
> > don't think I really turned it on between then and this week, when I
> > upgraded. Good demonstration of the value of checking in with how things
> > are working more often.
>
> You can try some older versions from ftp.hostserver.de/archive.

Thanks to the amd64 snapshot package archives I was able to bisect to a
working pandoc package on my machine. OK on 2024-01-16, NG on
2024-01-17.

~ $ doas pkg_add http://ftp.hostserver.de/archive/2024-01-16-0105/snapshots/packages/amd64/pandoc-3.1.8.tgz
quirks-7.3 signed on 2024-02-16T12:35:01Z
pandoc-3.1.8: ok
~ $ pandoc a.md
Hello World!
~ $ doas pkg_delete pandoc
pandoc-3.1.8: ok
~ $ doas pkg_add http://ftp.hostserver.de/archive/2024-01-17-0105/snapshots/packages/amd64/pandoc-3.1.8.tgz
quirks-7.3 signed on 2024-02-16T12:35:01Z
pandoc-3.1.8: ok
~ $ pandoc a.md
Illegal instruction (core dumped)
~ $

Let me know if there's anything else I can provide.

Evan
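The bisection described above can be sketched as a plain binary search over snapshot dates. This is only an illustration in Python: the URL layout is copied from the two pkg_add commands, while the function names, the fixed "-0105" stamp, and the stand-in `works` predicate are assumptions (the real check is installing the package and running pandoc by hand).

```python
from datetime import date, timedelta

# URL layout copied from the pkg_add commands above; the "-0105" stamp and
# the package file name are assumed to be the same for every day.
ARCHIVE = "http://ftp.hostserver.de/archive"


def snapshot_url(day, pkg="pandoc-3.1.8.tgz", stamp="0105"):
    """Build the archive URL for one day's amd64 snapshot package."""
    return f"{ARCHIVE}/{day.isoformat()}-{stamp}/snapshots/packages/amd64/{pkg}"


def first_bad(days, works):
    """Binary search for the first failing day.

    Requires days to be sorted, works(days[0]) to be True and
    works(days[-1]) to be False.
    """
    lo, hi = 0, len(days) - 1  # invariant: days[lo] works, days[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(days[mid]):
            lo = mid
        else:
            hi = mid
    return days[hi]


if __name__ == "__main__":
    january = [date(2024, 1, 1) + timedelta(days=n) for n in range(31)]
    # Stand-in for "pkg_add the package from that day and run pandoc a.md".
    works = lambda day: day <= date(2024, 1, 16)
    bad = first_bad(january, works)
    print(bad.isoformat(), snapshot_url(bad))
```

With 31 candidate days this needs about five install-and-run rounds instead of up to thirty.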
Re: -current Haskell ports aborting with SIGILL
On 2024/02/16 20:45, Stuart Henderson wrote:
> It runs ok on ryzen. 11th gen intel + SIGILL - looks like an IBT issue.

I've tried cabal-bundler and pandoc on my IBT machine now, they seem to
run OK there too.

On 2024/02/16 13:41, Evan Silberman wrote:
> Greg back to your question about when this worked, I wish I could tell
> you something meaningful. I think I had a pre-7.4 snap on this laptop
> vintage September, whenever I last updated the Pandoc port, and then I
> don't think I really turned it on between then and this week, when I
> upgraded. Good demonstration of the value of checking in with how things
> are working more often.

You can try some older versions from ftp.hostserver.de/archive.
Re: -current Haskell ports aborting with SIGILL
Greg Steuck wrote:
> Same here. That's intentional as the underlying problem got fixed
> upstream a while ago. We removed WXNEEDED then:
>
> Date: Fri Jun 3 02:48:07 2022 +
>
> Remove USE_WXNEEDED from lang/ghc as it's no longer needed
>
> Thanks
> Greg

OK. I'm out of my depth a bit at this point, I'm kinda just pattern
matching between my working -release vm and my broken -current laptop.
The only other ELF difference I can spot between working-release-pandoc
and broken-current-pandoc is that on -current pandoc has a .plt.sec ELF
section (like assorted other binaries on the system also do) and on
-release it doesn't. Duckduckgo suggests this could be
control-flow-integrity related, which vaguely comports with Stuart's
hunch, but I don't know enough to know if this is a real clue.

I hope I haven't found, like, a GHC bug or an LLVM bug, I dunno how I
could run it to ground if I did. Still seems quite possible that the bug
is I screwed up something with my install but sysupgrade makes it pretty
hard to do.

Evan
Re: -current Haskell ports aborting with SIGILL
Evan Silberman writes:

> Can you repeat for OPENBSD_WXNEED? On my syspatched 7.4-release VPS, pandoc
> from packages has wxneeded:
>
> ~$ uname -rsv
> OpenBSD 7.4 GENERIC#1336
> ~$ readelf -Wl $(which pandoc) | grep OPENBSD_WXNEED
> OPENBSD_WXNEED 0x00 0x 0x 0x00
> 0x00 E 0
>
> Snapshot pandoc package on my laptop doesn't:
> ~ $ readelf -Wl $(which pandoc) | grep OPENBSD_WXNEED
> ~ $
>
> Likewise doing a cabal build without ports infrastructure on my personal
> project produces a binary without OPENBSD_WXNEED.

Same here. That's intentional as the underlying problem got fixed
upstream a while ago. We removed WXNEEDED then:

Date: Fri Jun 3 02:48:07 2022 +

Remove USE_WXNEEDED from lang/ghc as it's no longer needed

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
Greg Steuck wrote:
> Evan Silberman writes:
>
> > I should've figured out how to use readelf before I bothered with this,
> > the NOBTCFI elf segment is already present in the ports in question.
> > Above diff is irrelevant. It does seem significant that the ports that
> > work fine (ghc and cabal-install) are, naturally, the ones that don't
> > use cabal.port.mk to build.
>
> Right, these two programs are built differently. Yet, just like you,
> I don't see any difference in the headers:
>
> % readelf -e $(which pandoc) | grep -A1 OPENBSD_NOBTCF
> OPENBSD_NOBTCF 0x 0x 0x
> 0x 0xE0
> % readelf -e $(which cabal) | grep -A1 OPENBSD_NOBTCF
> OPENBSD_NOBTCF 0x 0x 0x
> 0x 0xE0

Can you repeat for OPENBSD_WXNEED? On my syspatched 7.4-release VPS, pandoc
from packages has wxneeded:

~$ uname -rsv
OpenBSD 7.4 GENERIC#1336
~$ readelf -Wl $(which pandoc) | grep OPENBSD_WXNEED
OPENBSD_WXNEED 0x00 0x 0x 0x00 0x00 E 0

Snapshot pandoc package on my laptop doesn't:
~ $ readelf -Wl $(which pandoc) | grep OPENBSD_WXNEED
~ $

Likewise doing a cabal build without ports infrastructure on my personal
project produces a binary without OPENBSD_WXNEED.

I would like to complete the loop and try building the same code on
7.4-release but my VPS /usr/local doesn't have enough space left to
install ghc, lol.

Evan
Re: -current Haskell ports aborting with SIGILL
Evan Silberman writes:

> I should've figured out how to use readelf before I bothered with this,
> the NOBTCFI elf segment is already present in the ports in question.
> Above diff is irrelevant. It does seem significant that the ports that
> work fine (ghc and cabal-install) are, naturally, the ones that don't
> use cabal.port.mk to build.

Right, these two programs are built differently. Yet, just like you,
I don't see any difference in the headers:

% readelf -e $(which pandoc) | grep -A1 OPENBSD_NOBTCF
OPENBSD_NOBTCF 0x 0x 0x
0x 0xE0
% readelf -e $(which cabal) | grep -A1 OPENBSD_NOBTCF
OPENBSD_NOBTCF 0x 0x 0x
0x 0xE0

> Greg back to your question about when this worked, I wish I could tell
> you something meaningful. I think I had a pre-7.4 snap on this laptop
> vintage September, whenever I last updated the Pandoc port, and then I
> don't think I really turned it on between then and this week, when I
> upgraded. Good demonstration of the value of checking in with how things
> are working more often.

Yeah, if you can figure out when this started happening, it would be
helpful.

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
Evan Silberman wrote:
> I'm trying with this:
>
> diff /usr/ports
> commit - 5fdc9dbcebbd57477b047430dfc2cc4e987537ef
> path + /usr/ports
> blob - a51e834910b7ee582f3be8d92a4e137a0cb46ad1
> file + devel/cabal/cabal.port.mk
> --- devel/cabal/cabal.port.mk
> +++ devel/cabal/cabal.port.mk
> @@ -85,7 +85,7 @@ MODCABAL_post-extract += \
> MODCABAL_post-extract += \
> && echo "package *\n ghc-options: -fdiagnostics-color=never" >>
> ${WRKSRC}/cabal.project.local \
> && echo "package *\n ghc-options: -split-sections\n" >>
> ${WRKSRC}/cabal.project.local \
> - && echo "package ${MODCABAL_STEM}\n ld-options:
> -Wl,--gc-sections,--build-id" >> ${WRKSRC}/cabal.project.local
> + && echo "package ${MODCABAL_STEM}\n ld-options:
> -Wl,--gc-sections,--build-id,-z,nobtcfi" >> ${WRKSRC}/cabal.project.local
>
> # Automatically copies the cabal.project file if any.
> MODCABAL_post-extract += \

I should've figured out how to use readelf before I bothered with this,
the NOBTCFI elf segment is already present in the ports in question.
Above diff is irrelevant. It does seem significant that the ports that
work fine (ghc and cabal-install) are, naturally, the ones that don't
use cabal.port.mk to build.

Greg back to your question about when this worked, I wish I could tell
you something meaningful. I think I had a pre-7.4 snap on this laptop
vintage September, whenever I last updated the Pandoc port, and then I
don't think I really turned it on between then and this week, when I
upgraded. Good demonstration of the value of checking in with how things
are working more often.

Evan
Re: -current Haskell ports aborting with SIGILL
Greg Steuck wrote:
> Stuart Henderson writes:
>
> > It runs ok on ryzen. 11th gen intel + SIGILL - looks like an IBT
> > issue.
>
> Isn't -Wl,-z,nobtcfi supposed to have disabled this?
>
> https://codeberg.org/OpenBSD/ports/src/branch/master/lang/ghc/Makefile#L123
>
> Thanks
> Greg

I'm trying with this:

diff /usr/ports
commit - 5fdc9dbcebbd57477b047430dfc2cc4e987537ef
path + /usr/ports
blob - a51e834910b7ee582f3be8d92a4e137a0cb46ad1
file + devel/cabal/cabal.port.mk
--- devel/cabal/cabal.port.mk
+++ devel/cabal/cabal.port.mk
@@ -85,7 +85,7 @@ MODCABAL_post-extract += \
 MODCABAL_post-extract += \
 	&& echo "package *\n ghc-options: -fdiagnostics-color=never" >> ${WRKSRC}/cabal.project.local \
 	&& echo "package *\n ghc-options: -split-sections\n" >> ${WRKSRC}/cabal.project.local \
-	&& echo "package ${MODCABAL_STEM}\n ld-options: -Wl,--gc-sections,--build-id" >> ${WRKSRC}/cabal.project.local
+	&& echo "package ${MODCABAL_STEM}\n ld-options: -Wl,--gc-sections,--build-id,-z,nobtcfi" >> ${WRKSRC}/cabal.project.local
 
 # Automatically copies the cabal.project file if any.
 MODCABAL_post-extract += \
Re: -current Haskell ports aborting with SIGILL
Stuart Henderson writes:

> It runs ok on ryzen. 11th gen intel + SIGILL - looks like an IBT
> issue.

Isn't -Wl,-z,nobtcfi supposed to have disabled this?

https://codeberg.org/OpenBSD/ports/src/branch/master/lang/ghc/Makefile#L123

Thanks
Greg
Re: -current Haskell ports aborting with SIGILL
Evan Silberman writes:

> Hi ports@ and Greg,
>
> On -current (yesterday's amd64 snap, this morning's amd64 packages),
> ports built with ghc are all getting SIGILL early in execution on my
> laptop.

Interesting, when was the last time it worked for you?

> I'm not totally positive what is helpful to provide here but I
> can provide it on request. I thought I was maybe caught in some base
> snaps vs. packages synchronization issue, but I rebuilt
> devel/cabal-bundler on my laptop and got the same result after a
> successful build.

This indicates that a number of GHC-compiled programs (ghc and cabal)
kept working.

I tried both cabal-bundler and pandoc from whatever snapshot is up there
now. I couldn't match your kernel snapshot exactly, but with #1681 and
#1691 the programs still work for me. This is my cpu:

cpu0: AMD Ryzen 7 5700G with Radeon Graphics, 3800.00 MHz, 19-50-00, patch 0a5f
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,HWPSTATE,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,PKU,IBPB,IBRS,STIBP,STIBP_ALL,IBRS_PREF,IBRS_SM,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES

> lldb transcripts and dmesg below.
>
> -- Evan Silberman
>
> ~ $ lldb cabal-bundler
> (lldb) target create "cabal-bundler"
> Current executable set to '/usr/local/bin/cabal-bundler' (x86_64).
> (lldb) run
> Process 38310 launched: '/usr/local/bin/cabal-bundler' (x86_64)
> Process 38310 stopped
> * thread #1, stop reason = signal SIGILL
> frame #0: 0x004c136b cabal-bundler`___lldb_unnamed_symbol523 + 27
> cabal-bundler`___lldb_unnamed_symbol523:
> -> 0x4c136b <+27>: vmovdqa64 -0x29fa35(%rip), %zmm0

One potentially troublesome aspect of this instruction I can see is it
reads from .code which would be a problem with X-only. But we don't
enforce X-only for Haskell programs. That's what this --no-execute-only
is all about:
https://codeberg.org/OpenBSD/ports/src/branch/master/lang/ghc/Makefile#L122

Another interesting angle is the computed load address doesn't seem to
be properly aligned, namely (0x4c136b - 0x29fa35) mod 32 /= 0. Maybe
that's the cause?

But I really don't understand how this code location is reached.

% lldb /usr/local/bin/cabal-bundler
(lldb) target create "/usr/local/bin/cabal-bundler"
Current executable set to '/usr/local/bin/cabal-bundler' (x86_64).
(lldb) b ___lldb_unnamed_symbol523
Breakpoint 2: where = cabal-bundler`___lldb_unnamed_symbol523, address = 0x004c1350
(lldb) r
Process 14715 launched: '/usr/local/bin/cabal-bundler' (x86_64)
Missing: PKG ...

Maybe requesting a back-trace could be helpful? It usually isn't because
of the GHC execution model.

In summary, I don't have a better suggestion for you than trying to
narrow down when the problem was introduced. Going back to 7.4 should
hopefully get you back to a working system.

> 0x4c1375 <+37>: jmp    0x4c1380 ; <+48>
> 0x4c1377 <+39>: int3
> 0x4c1378 <+40>: int3
>
> ~ $ lldb pandoc -- a.md
> (lldb) target create "pandoc"
> Current executable set to '/usr/local/bin/pandoc' (x86_64).
> (lldb) settings set -- target.run-args "a.md"
> (lldb) run
> Process 51989 launched: '/usr/local/bin/pandoc' (x86_64)
> Process 51989 stopped
> * thread #1, stop reason = signal SIGILL
> frame #0: 0x0558050b pandoc`___lldb_unnamed_symbol1360 + 27
> pandoc`___lldb_unnamed_symbol1360:
> -> 0x558050b <+27>: vmovdqa64 -0x40fb1d5(%rip), %zmm0
> 0x5580515 <+37>: jmp    0x5580520 ; <+48>
> 0x5580517 <+39>: int3
> 0x5580518 <+40>: int3
>
>
> OpenBSD 7.4-current (GENERIC.MP) #1688: Thu Feb 15 10:48:34 MST 2024
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 16936267776 (16151MB)
> avail mem = 16401862656 (15642MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.3 @ 0x439e2000 (51 entries)
> bios0: vendor INSYDE Corp. version "03.07" date 12/14/2021
> bios0: Framework Laptop
> efi0 at bios0: UEFI 2.7
> efi0: INSYDE Corp. rev 0x307
> acpi0 at bios0: ACPI 6.1
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP UEFI SSDT SSDT SSDT SSDT SSDT SSDT TPM2 SSDT
> NHLT SSDT LPIT WSMT SSDT SSDT DBGP DBG2 ECDT HPET APIC MCFG SSDT DMAR
> SSDT FPDT PTDT BGRT
> acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEGP(S4) XHCI(S4)
> XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4)
> PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpiec0 at acpi0
> acpihpet0 at acpi0: 1920 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0:
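On the alignment question raised in the message above: on x86-64 a %rip-relative displacement is applied to the address of the next instruction (0x4c1375 in the cabal-bundler transcript), not to the faulting instruction itself, and an aligned 512-bit load such as vmovdqa64 into %zmm0 needs a 64-byte-aligned address rather than a 32-byte one. A small Python sketch of that arithmetic follows. It uses only addresses from the two lldb transcripts; the helper name and the table are made up for illustration, and the result is only as reliable as the archived addresses.

```python
def rip_relative_target(next_insn_addr: int, displacement: int) -> int:
    """Effective address of a %rip-relative operand.

    The displacement is added to the address of the instruction that
    follows the one being executed.
    """
    return next_insn_addr + displacement


# (address of the following instruction, displacement), both taken from
# the lldb disassembly quoted in the thread.
LOADS = {
    "cabal-bundler": (0x4C1375, -0x29FA35),
    "pandoc": (0x5580515, -0x40FB1D5),
}

if __name__ == "__main__":
    for name, (next_insn, disp) in LOADS.items():
        target = rip_relative_target(next_insn, disp)
        # A zmm-sized aligned load wants target % 64 == 0.
        print(f"{name}: target={target:#x} target%64={target % 64}")
```

With the archived addresses both targets (0x221940 and 0x1485340) come out as multiples of 64, so under these assumptions misalignment looks unlikely to be the trigger.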
Re: -current Haskell ports aborting with SIGILL
It runs ok on ryzen. 11th gen intel + SIGILL - looks like an IBT issue.

On 2024/02/16 11:48, Evan Silberman wrote:
> Hi ports@ and Greg,
>
> On -current (yesterday's amd64 snap, this morning's amd64 packages),
> ports built with ghc are all getting SIGILL early in execution on my
> laptop. I'm not totally positive what is helpful to provide here but I
> can provide it on request. I thought I was maybe caught in some base
> snaps vs. packages synchronization issue, but I rebuilt
> devel/cabal-bundler on my laptop and got the same result after a
> successful build.
>
> lldb transcripts and dmesg below.
>
> -- Evan Silberman
>
> ~ $ lldb cabal-bundler
> (lldb) target create "cabal-bundler"
> Current executable set to '/usr/local/bin/cabal-bundler' (x86_64).
> (lldb) run
> Process 38310 launched: '/usr/local/bin/cabal-bundler' (x86_64)
> Process 38310 stopped
> * thread #1, stop reason = signal SIGILL
> frame #0: 0x004c136b cabal-bundler`___lldb_unnamed_symbol523 + 27
> cabal-bundler`___lldb_unnamed_symbol523:
> -> 0x4c136b <+27>: vmovdqa64 -0x29fa35(%rip), %zmm0
> 0x4c1375 <+37>: jmp    0x4c1380 ; <+48>
> 0x4c1377 <+39>: int3
> 0x4c1378 <+40>: int3
>
> ~ $ lldb pandoc -- a.md
> (lldb) target create "pandoc"
> Current executable set to '/usr/local/bin/pandoc' (x86_64).
> (lldb) settings set -- target.run-args "a.md"
> (lldb) run
> Process 51989 launched: '/usr/local/bin/pandoc' (x86_64)
> Process 51989 stopped
> * thread #1, stop reason = signal SIGILL
> frame #0: 0x0558050b pandoc`___lldb_unnamed_symbol1360 + 27
> pandoc`___lldb_unnamed_symbol1360:
> -> 0x558050b <+27>: vmovdqa64 -0x40fb1d5(%rip), %zmm0
> 0x5580515 <+37>: jmp    0x5580520 ; <+48>
> 0x5580517 <+39>: int3
> 0x5580518 <+40>: int3
>
>
> OpenBSD 7.4-current (GENERIC.MP) #1688: Thu Feb 15 10:48:34 MST 2024
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 16936267776 (16151MB)
> avail mem = 16401862656 (15642MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.3 @ 0x439e2000 (51 entries)
> bios0: vendor INSYDE Corp. version "03.07" date 12/14/2021
> bios0: Framework Laptop
> efi0 at bios0: UEFI 2.7
> efi0: INSYDE Corp. rev 0x307
> acpi0 at bios0: ACPI 6.1
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP UEFI SSDT SSDT SSDT SSDT SSDT SSDT TPM2 SSDT NHLT
> SSDT LPIT WSMT SSDT SSDT DBGP DBG2 ECDT HPET APIC MCFG SSDT DMAR SSDT FPDT
> PTDT BGRT
> acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEGP(S4) XHCI(S4) XDCI(S4)
> HDAS(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4)
> PXSX(S4) RP05(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpiec0 at acpi0
> acpihpet0 at acpi0: 1920 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, 4190.34 MHz, 06-8c-01,
> patch 00b4
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,AVX512IFMA,CLFLUSHOPT,CLWB,PT,AVX512CD,SHA,AVX512BW,AVX512VL,AVX512VBMI,UMIP,PKU,SRBDS_CTRL,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,IBRS_ALL,SKIP_L1DFL,MDS_NO,IF_PSCHANGE,MISC_PKG_CT,ENERGY_FILT,DOITM,FBSDP_NO,GDS_CTRL,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line
> 20-way L2 cache, 8MB 64b/line 8-way L3 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 38MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.0.1.2.1.1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, 4190.35 MHz, 06-8c-01,
> patch 00b4
> cpu1:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,AVX512IFMA,CLFLUSHOPT,CLWB,PT,AVX512CD,SHA,AVX512BW,AVX512VL,AVX512VBMI,UMIP,PKU,SRBDS_CTRL,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,IBRS_ALL,SKIP_L1DFL,MDS_NO,IF_PSCHANGE,MISC_PKG_CT,ENERGY_FILT,DOITM,FBSDP_NO,GDS_CTRL,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line
> 20-way L2 cache, 8MB 64b/line 8-way L3 cache
>
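Besides IBT, one other difference between the working and failing machines can be read straight off the dmesg flag lines in this thread: the faulting instruction loads into %zmm0, a 512-bit register that only exists with AVX-512, and only the Intel CPU advertises AVX512 features. The Python sketch below compares excerpts of the two flag lines (the Ryzen one from Greg's reply, the Intel one from the dmesg quoted above); `avx512_flags` is a hypothetical helper, the excerpts are shortened, and this observation alone does not explain why the instruction traps.

```python
def avx512_flags(flag_line: str) -> list[str]:
    """Return the AVX-512 feature names found in a dmesg cpu flag line."""
    return sorted(f for f in flag_line.split(",") if f.startswith("AVX512"))


# Shortened excerpts of the cpu0 flag lines posted in the thread.
INTEL_I5_1135G7 = (
    "AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,"
    "AVX512IFMA,CLFLUSHOPT,CLWB,PT,AVX512CD,SHA,AVX512BW,AVX512VL,AVX512VBMI"
)
AMD_RYZEN_5700G = "AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA"

if __name__ == "__main__":
    print("intel:", avx512_flags(INTEL_I5_1135G7))
    print("ryzen:", avx512_flags(AMD_RYZEN_5700G))
```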
-current Haskell ports aborting with SIGILL
Hi ports@ and Greg,

On -current (yesterday's amd64 snap, this morning's amd64 packages),
ports built with ghc are all getting SIGILL early in execution on my
laptop. I'm not totally positive what is helpful to provide here but I
can provide it on request. I thought I was maybe caught in some base
snaps vs. packages synchronization issue, but I rebuilt
devel/cabal-bundler on my laptop and got the same result after a
successful build.

lldb transcripts and dmesg below.

-- Evan Silberman

~ $ lldb cabal-bundler
(lldb) target create "cabal-bundler"
Current executable set to '/usr/local/bin/cabal-bundler' (x86_64).
(lldb) run
Process 38310 launched: '/usr/local/bin/cabal-bundler' (x86_64)
Process 38310 stopped
* thread #1, stop reason = signal SIGILL
frame #0: 0x004c136b cabal-bundler`___lldb_unnamed_symbol523 + 27
cabal-bundler`___lldb_unnamed_symbol523:
-> 0x4c136b <+27>: vmovdqa64 -0x29fa35(%rip), %zmm0
0x4c1375 <+37>: jmp    0x4c1380 ; <+48>
0x4c1377 <+39>: int3
0x4c1378 <+40>: int3

~ $ lldb pandoc -- a.md
(lldb) target create "pandoc"
Current executable set to '/usr/local/bin/pandoc' (x86_64).
(lldb) settings set -- target.run-args "a.md"
(lldb) run
Process 51989 launched: '/usr/local/bin/pandoc' (x86_64)
Process 51989 stopped
* thread #1, stop reason = signal SIGILL
frame #0: 0x0558050b pandoc`___lldb_unnamed_symbol1360 + 27
pandoc`___lldb_unnamed_symbol1360:
-> 0x558050b <+27>: vmovdqa64 -0x40fb1d5(%rip), %zmm0
0x5580515 <+37>: jmp    0x5580520 ; <+48>
0x5580517 <+39>: int3
0x5580518 <+40>: int3


OpenBSD 7.4-current (GENERIC.MP) #1688: Thu Feb 15 10:48:34 MST 2024
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 16936267776 (16151MB)
avail mem = 16401862656 (15642MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.3 @ 0x439e2000 (51 entries)
bios0: vendor INSYDE Corp. version "03.07" date 12/14/2021
bios0: Framework Laptop
efi0 at bios0: UEFI 2.7
efi0: INSYDE Corp. rev 0x307
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP UEFI SSDT SSDT SSDT SSDT SSDT SSDT TPM2 SSDT NHLT SSDT LPIT WSMT SSDT SSDT DBGP DBG2 ECDT HPET APIC MCFG SSDT DMAR SSDT FPDT PTDT BGRT
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEGP(S4) PEGP(S4) XHCI(S4) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpihpet0 at acpi0: 1920 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, 4190.34 MHz, 06-8c-01, patch 00b4
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,AVX512IFMA,CLFLUSHOPT,CLWB,PT,AVX512CD,SHA,AVX512BW,AVX512VL,AVX512VBMI,UMIP,PKU,SRBDS_CTRL,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,IBRS_ALL,SKIP_L1DFL,MDS_NO,IF_PSCHANGE,MISC_PKG_CT,ENERGY_FILT,DOITM,FBSDP_NO,GDS_CTRL,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 20-way L2 cache, 8MB 64b/line 8-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 38MHz
cpu0: mwait min=64, max=64, C-substates=0.2.0.1.2.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, 4190.35 MHz, 06-8c-01, patch 00b4
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,AVX512IFMA,CLFLUSHOPT,CLWB,PT,AVX512CD,SHA,AVX512BW,AVX512VL,AVX512VBMI,UMIP,PKU,SRBDS_CTRL,MD_CLEAR,IBT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,IBRS_ALL,SKIP_L1DFL,MDS_NO,IF_PSCHANGE,MISC_PKG_CT,ENERGY_FILT,DOITM,FBSDP_NO,GDS_CTRL,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 48KB 64b/line 12-way D-cache, 32KB 64b/line 8-way I-cache, 1MB 64b/line 20-way L2 cache, 8MB 64b/line 8-way L3 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, 3791.27 MHz, 06-8c-01, patch 00b4
cpu2: