Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 05:55:42PM -0600, Jerry James wrote: > On Fri, Mar 27, 2020 at 4:42 PM Jerry James wrote: > > Good idea. Thank you! > > I got a segfault on s390x on the second build attempt. I'd like to > investigate the ephemeron angle a bit, I think. No crashes here overnight, but I'll leave it running. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
Jerry James writes: > I think Richard addressed most of the points you raised. Yes, and no new questions, either. So I think I'm done here (well, I'll spectate, but the internals of the OCaml compiler are well beyond my skillset :-). Good luck! Steve -- Associate Professor Division of Policy and Planning Science http://turnbull/sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnb...@sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 4:42 PM Jerry James wrote: > Good idea. Thank you! I got a segfault on s390x on the second build attempt. I'd like to investigate the ephemeron angle a bit, I think. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 4:41 PM Richard W.M. Jones wrote: > I'll run it in a loop overnight and see if I can make it crash. Good idea. Thank you! -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 04:20:16PM -0600, Jerry James wrote: > On Fri, Mar 27, 2020 at 2:41 PM Richard W.M. Jones wrote: > > Yup, and the stack trace shows the failure happening when creating an > > ephemeron. > > On the other hand, a scratch build with coq upstream's final patch for > ocaml 4.10.0 succeeded on the first try: > > https://koji.fedoraproject.org/koji/taskinfo?taskID=42808115 > > Where this is an intermittent problem, I'll have to try more builds to > be sure, but this looks promising. I'll run it in a loop overnight and see if I can make it crash. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 2:41 PM Richard W.M. Jones wrote: > Yup, and the stack trace shows the failure happening when creating an > ephemeron. On the other hand, a scratch build with coq upstream's final patch for ocaml 4.10.0 succeeded on the first try: https://koji.fedoraproject.org/koji/taskinfo?taskID=42808115 Where this is an intermittent problem, I'll have to try more builds to be sure, but this looks promising. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 11:54:44AM -0600, Jerry James wrote: > On Thu, Mar 26, 2020 at 10:28 AM Richard W.M. Jones wrote: > > Jerry, I suggest this bug is real, but is also likely to be a bug in > > Coq (most likely) or the OCaml runtime, possibly in the Weak module. > > You might have more luck asking the upstream developers for help. > > Coq issue: https://github.com/coq/coq/issues/11939 > > I looked around in the ocaml issue tracker and found a possibility: > > https://github.com/ocaml/ocaml/issues/9391 > > Coq uses ephemerons. Yup, and the stack trace shows the failure happening when creating an ephemeron. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 11:15:01AM -0600, Jerry James wrote: > On Fri, Mar 27, 2020 at 11:03 AM Jerry James wrote: > > Okay, I will give that a try. > > Except I can't. We're already not using %_smp_mflags. Ugh. I realized that my build machine has MAKEFLAGS=-j24 which is why it's building coq in parallel (and quickly). So it seems like you could be using _smp_mflags. Another thing I tried today was enabling debugging in the GC (see patch below). However it also didn't find any problems, so I'm still at a loss. Rich. diff --git a/coq.spec b/coq.spec index df2a847..1e590cc 100644 --- a/coq.spec +++ b/coq.spec @@ -164,6 +164,10 @@ done %global opt_option -bytecode-compiler yes -byte-only -coqide byte %endif +# Use debug runtime. +sed -i 's/^\(OCAMLC :=.*\)/\1 -runtime-variant d/' Makefile.build +sed -i 's/^\(OCAMLOPT :=.*\)/\1 -runtime-variant d/' Makefile.build + %global coqdocdir %{?_pkgdocdir}%{!?_pkgdocdir:%{_docdir}/coq-%{version}} %global coqdatadir %{_libdir}/coq -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 11:36:38AM -0600, Jerry James wrote: > On Fri, Mar 27, 2020 at 11:30 AM Zbigniew Jędrzejewski-Szmek > wrote: > > We are. Did you trying putting '%global _smp_mflags -j1' somewhere > > near the top of the spec file? > > We are? Where? That is what the packaging guidelines say [1]: > Whenever possible, invocations of make should be done as > %make_build (which is defined as $ rpmbuild --showrc |grep make_build make_build %{__make} %{_make_output_sync} %{?_smp_mflags} %{_make_verbose}). [1] https://docs.fedoraproject.org/en-US/packaging-guidelines/#_parallel_make > The spec file does this to build: > > make world VERBOSE=1 So it seems that this particular Makefile enables parallelization internally. (make defaults -j1 unless told otherwise). But it should be possible to somehow disable it wherever it is enabled. Zbyszek ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Thu, Mar 26, 2020 at 10:28 AM Richard W.M. Jones wrote: > Jerry, I suggest this bug is real, but is also likely to be a bug in > Coq (most likely) or the OCaml runtime, possibly in the Weak module. > You might have more luck asking the upstream developers for help. Coq issue: https://github.com/coq/coq/issues/11939 I looked around in the ocaml issue tracker and found a possibility: https://github.com/ocaml/ocaml/issues/9391 Coq uses ephemerons. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 11:30 AM Zbigniew Jędrzejewski-Szmek wrote: > We are. Did you trying putting '%global _smp_mflags -j1' somewhere > near the top of the spec file? We are? Where? The spec file does this to build: make world VERBOSE=1 -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 11:15:01AM -0600, Jerry James wrote: > On Fri, Mar 27, 2020 at 11:03 AM Jerry James wrote: > > Okay, I will give that a try. > > Except I can't. We're already not using %_smp_mflags. Ugh. We are. Did you trying putting '%global _smp_mflags -j1' somewhere near the top of the spec file? Zbyszek ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Fri, Mar 27, 2020 at 11:03 AM Jerry James wrote: > Okay, I will give that a try. Except I can't. We're already not using %_smp_mflags. Ugh. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Wed, Mar 25, 2020 at 10:21 PM Stephen J. Turnbull wrote: > Hi Jerry! Long time no see. Hi Stephen! It's good to hear from yet another person I've worked with via network for years. I think Richard addressed most of the points you raised. > Where is the segfault? Is it in coqc or is it in the generated code > or is it in the OCaml compiler or is it somewhere else? It's in coqc. Richard's evidence seems to point to garbage collector involvement, with weak references possibly playing a role. > Stay healthy! You, too, and likewise for everybody in the Fedora community. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Thu, Mar 26, 2020 at 10:28 AM Richard W.M. Jones wrote: > I don't know, now I can't even get the ‘fedpkg local’ to reproduce it :-( > > Jerry, I suggest this bug is real, but is also likely to be a bug in > Coq (most likely) or the OCaml runtime, possibly in the Weak module. > You might have more luck asking the upstream developers for help. > > As for what to do about Fedora 32, how about disabling _smp_mflags, on > the basis that it might be load-related? That should at least reduce > the non-determinism. Okay, I will give that a try. I'm worried that even if I get a "good" build, that it's going to explode on the hapless Fedora users who install it. Thank you for all the work you put into this, Richard. I appreciate the investigation. I'll keep poking at it, too. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
I don't know, now I can't even get the ‘fedpkg local’ to reproduce it :-( Jerry, I suggest this bug is real, but is also likely to be a bug in Coq (most likely) or the OCaml runtime, possibly in the Weak module. You might have more luck asking the upstream developers for help. As for what to do about Fedora 32, how about disabling _smp_mflags, on the basis that it might be load-related? That should at least reduce the non-determinism. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Thu, Mar 26, 2020 at 12:00:14PM +, Richard W.M. Jones wrote: > This bug is VERY annoying! Although not very reproducible, I can > usually hit it after about 4-6 hours of looping ‘fedpkg build’. I mean ‘fedpkg local’. Local builds, not building in Koji. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
This bug is VERY annoying! Although not very reproducible, I can usually hit it after about 4-6 hours of looping ‘fedpkg build’. Considering the whole Coq build is quite lengthy you'd think that one of these commands would hit it much more quickly: $ while true ; do gdb -ex 'set args -coqlib . -q -native-compiler yes theories/FSets/FMapFullAVL.v' -ex run bin/coqc -ex quit ; done or $ while bin/coqc -coqlib . -q -native-compiler yes theories/FSets/FMapFullAVL.v ; do date; done But nope ... My latest thing is to run the gdb command above _and_ run fedpkg build (in another directory) at the same time, in the hope that it's caused by system load. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Thu, Mar 26, 2020 at 01:20:58PM +0900, Stephen J. Turnbull wrote: > Where is the segfault? Is it in coqc or is it in the generated code > or is it in the OCaml compiler or is it somewhere else? I managed to reproduce this on my mostly Rawhide x86-64 development machine by running fedpkg build in a loop overnight. This also captured a core dump. Unfortunately the stack trace, even with full debuginfo installed, was not very helpful. I'm now trying to run the unstable bin/coqc command over and over again under gdb, but the bug has not reproduced there so far. One thing I did learn from the stack trace is that it appears to fail in C code (in the garbage collector). However that might be because the OCaml heap is corrupted by an earlier event. Another thing I learned is that it happens when doing something or other with weak references, which can be non-deterministic. > If you can't rule out OCaml: does it produce native code directly, or > does it produce C or similar and then use GCC or similar to produce > the native code? OCaml produces native code directly. GCC is only used to link the final object files together. Rich. Core was generated by `bin/coqc -coqlib . -q -native-compiler yes theories/FSets/FMapFullAVL.v'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x556989736c94 in ?? () (gdb) t a a bt Thread 1 (Thread 0x7f7453e1ec00 (LWP 2710541)): #0 0x556989736c94 in ?? () #1 0x0001 in ?? () #2 0x0001 in ?? () #3 0x7f7453440ec8 in ?? () #4 0xfec0 in ?? () #5 0x5569c42daba8 in ?? () #6 0x0004 in ?? () #7 0x0002 in ?? () #8 0x00042d7f6c00 in ?? () #9 0x0001 in ?? () ^-- probable bad stack frame #10 0x5569c35adc63 in caml_alloc_shr () ^-- garbage collector https://github.com/ocaml/ocaml/blob/0756052841d0d3e6266c8c1ca4a608dc16c11441/runtime/memory.c#L560 #11 0x5569c35c00bc in caml_ephe_set_key () ^-- weak ephemerons https://github.com/ocaml/ocaml/blob/0756052841d0d3e6266c8c1ca4a608dc16c11441/runtime/weak.c#L211 #12 0x5569c35419e8 in camlStdlib__obj__set_key_306 () #13 0x5569c358da66 in camlStdlib__ephemeron__create_1033 () ^-- OCaml functions Stdlib.Obj.set_key & Stdlib.Ephemeron.create #14 0x7ffd45de5508 in ?? () #15 0x7ffd45de5500 in ?? () #16 0x7f744b7cf7b0 in ?? () #17 0x30124a25 in ?? () #18 0x7f7453745e88 in ?? () #19 0x0225 in ?? () #20 0x4917d3992e9b8500 in ?? () #21 0x5569c3a3c858 in camlCsymtable__9 () #22 0x in ?? () -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Coq build issues in F32
Hi Jerry! Long time no see. Jerry James writes: > I built the whole stack for Rawhide with no issues. Zero compiler warnings? No fiddling to suppress compiler warnings *anywhere* in the build chain (whether options or pragmas)? Not suggesting you go looking for pragmas (probably have 10 million lines of code :-( ), but you could check for warnings in the build logs. I didn't see any in the tail, though. > Attempt 1: the x86_64 build segfaulted while running "bin/coqc -coqlib . -q > -native-compiler yes theories/FSets/FMapFullAVL.v", where coqc is a > binary built from the coq sources. Coq is a theorem prover. The coqc > run is essentially checking that a series of proof steps indeed prove > a theorem. The "-native-compiler yes" part means that the OCaml > compiler will be invoked to generate native code for parts of this > process. Where is the segfault? Is it in coqc or is it in the generated code or is it in the OCaml compiler or is it somewhere else? If you can't rule out the OCaml compiler (or other parts of the OCaml toolchain) and the generated code: - what happens if you use -native-compiler=no? - How is the native code used? Compiled and linked as a .so, then dynamically loaded and called? Compiled and linked as an executable, then run as a separate process? If you can't rule out OCaml: does it produce native code directly, or does it produce C or similar and then use GCC or similar to produce the native code? What optimizations to gcc are being used where it's being used? Are other compilers (LLVM) supported in lieu of GCC for any parts of the build chain? > > I was puzzled by the segfault. I decided to see if the segfault is > deterministic, so I launched... > > Attempt 2: the x86_64 build succeeded. The s390x build failed with an > illegal instruction error while running *the same command*. > > Attempt 3: The s390x build again failed with an illegal instruction > error while running the same command, and aarch64 segfaulted while > running the same command. All other arches succeeded. I saved the > URL for this one: > > https://koji.fedoraproject.org/koji/taskinfo?taskID=42768075 From the tail of the S390 build log: rm -f theories/FSets/FMapFullAVL.glob bin/coqc -coqlib . -q -native-compiler yes theories/FSets/FMapFullAVL.v make[1]: *** [Makefile.build:880: theories/FSets/FMapFullAVL.vo] Illegal instruction (core dumped) make[1]: *** [theories/FSets/FMapFullAVL.vo] Deleting file 'theories/FSets/FMapFullAVL.glob' make[1]: Leaving directory '/builddir/build/BUILD/coq-8.11.0' make: *** [Makefile:177: submake] Error 2 That seems pretty unhelpful, as Makefile.build:880 is presumably just the rule that invokes bin/coqc, which evidently is actually a compiler driver that invokes various executables. Do you have access to the core that was allegedly dumped, to at least determine the identity of the executable that segfaulted? Is it possible to reproduce outside of Koji? Stay healthy! -- Associate Professor Division of Policy and Planning Science http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnb...@sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Wed, Mar 25, 2020 at 9:24 PM Jerry James wrote: > You'll want to view this with a fixed-width font. Thank you, gmail, for completely destroying my formatting work. Oh well, you can probably tell what I was trying to do. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
On Wed, Mar 25, 2020 at 4:58 PM Jerry James wrote: > Either I got lucky when I launched the Rawhide build, or something is > fundamentally different between F32 and Rawhide. Clearly something > nondeterministic is at work. I have been unable to reproduce this > failure in mock so far, but will keep trying. I did a scratch build in Rawhide, just to see if this morning's successful build was a fluke. It succeeded on all architectures: https://koji.fedoraproject.org/koji/taskinfo?taskID=42770190 So far, 2 out of 2 Rawhide builds succeeded, and 3 out of 3 F32 builds ended with either a segfault or a bad instruction on at least 1 architecture. Here are the side-by-side differences in the packages in the aarch64 buildroots (since aarch64 was one of the failing architectures on my last attempt), Rawhide on the left, F32 on the right. You'll want to view this with a fixed-width font. adwaita-cursor-theme-3.36.0-1.fc33.noarch adwaita-cursor-theme-3.36.0-1.fc32.noarch adwaita-icon-theme-3.36.0-1.fc33.noarch adwaita-icon-theme-3.36.0-1.fc32.noarch annobin-9.12-3.fc33.aarch64 annobin-9.06-4.fc32.aarch64 at-spi2-atk-2.34.2-1.fc33.aarch64 at-spi2-atk-2.34.2-1.fc32.aarch64 at-spi2-atk-devel-2.34.2-1.fc33.aarch64 at-spi2-atk-devel-2.34.2-1.fc32.aarch64 at-spi2-core-2.36.0-1.fc33.aarch64 at-spi2-core-2.36.0-1.fc32.aarch64 at-spi2-core-devel-2.36.0-1.fc33.aarch64 at-spi2-core-devel-2.36.0-1.fc32.aarch64 audit-libs-3.0-0.19.20191104git1c2f876.fc33.aarch64 audit-libs-3.0-0.18.20191104git1c2f876.fc32.aarch64 avahi-libs-0.8-2.fc33.aarch64 avahi-libs-0.7-23.fc32.aarch64 binutils-2.34-2.fc33.aarch64 binutils-2.34-2.fc32.aarch64 binutils-gold-2.34-2.fc33.aarch64 binutils-gold-2.34-2.fc32.aarch64 cmake-filesystem-3.17.0-1.fc33.aarch64 cmake-filesystem-3.16.4-1.fc32.aarch64 coreutils-8.32-3.fc33.aarch64 coreutils-8.31-10.fc32.aarch64 coreutils-common-8.32-3.fc33.aarch64 coreutils-common-8.31-10.fc32.aarch64 cpp-10.0.1-0.9.fc33.aarch64 cpp-10.0.1-0.9.fc32.aarch64 crypto-policies-20200312-1.git3ae59d2.fc33.noarch crypto-policies-20191128-5.gitcd267a5.fc32.noarch cryptsetup-libs-2.3.1-1.fc33.aarch64 cryptsetup-libs-2.3.0-1.fc32.aarch64 curl-7.69.0-2.fc33.aarch64 curl-7.69.1-1.fc32.aarch64 cyrus-sasl-lib-2.1.27-4.fc33.aarch64 cyrus-sasl-lib-2.1.27-3.fc32.aarch64 dbus-1:1.12.16-5.fc33.aarch64 dbus-1:1.12.16-4.fc32.aarch64 dbus-broker-22-1.fc33.aarch64 dbus-broker-22-1.fc32.aarch64 dbus-common-1:1.12.16-5.fc33.noarch dbus-common-1:1.12.16-4.fc32.noarch dbus-devel-1:1.12.16-5.fc33.aarch64 dbus-devel-1:1.12.16-4.fc32.aarch64 dbus-libs-1:1.12.16-5.fc33.aarch64 dbus-libs-1:1.12.16-4.fc32.aarch64 dejavu-sans-fonts-2.37-7.fc33.noarch dejavu-sans-fonts-2.37-7.fc32.noarch fedora-gpg-keys-33-0.3.noarch fedora-gpg-keys-32-0.7.noarch fedora-release-33-0.3.noarch fedora-release-32-0.8.noarch fedora-release-common-33-0.3.noarch fedora-release-common-32-0.8.noarch fedora-repos-33-0.3.noarch fedora-repos-32-0.7.noarch fedora-repos-rawhide-33-0.3.noarch file-5.38-3.fc33.aarch64 file-5.38-2.fc32.aarch64 file-libs-5.38-3.fc33.aarch64 file-libs-5.38-2.fc32.aarch64 fontconfig-2.13.92-8.fc33.aarch64 fontconfig-2.13.92-8.fc32.aarch64 fontconfig-devel-2.13.92-8.fc33.aarch64 fontconfig-devel-2.13.92-8.fc32.aarch64 fonts-filesystem-2.0.3-1.fc33.noarch fonts-filesystem-2.0.3-1.fc32.noarch fonts-srpm-macros-2.0.3-1.fc33.noarch fonts-srpm-macros-2.0.3-1.fc32.noarch fribidi-1.0.9-1.fc33.aarch64 fribidi-1.0.9-1.fc32.aarch64 fribidi-devel-1.0.9-1.fc33.aarch64 fribidi-devel-1.0.9-1.fc32.aarch64 gcc-10.0.1-0.9.fc33.aarch64 gcc-10.0.1-0.9.fc32.aarch64 gdb-minimal-9.1-4.fc33.aarch64 gdb-minimal-9.1-3.fc32.aarch64 ghostscript-9.50-1.fc33.aarch64 ghostscript-9.50-1.fc32.aarch64 ghostscript-core-9.50-1.fc33.aarch64 ghostscript-core-9.50-1.fc32.aarch64 ghostscript-tools-fonts-9.50-1.fc33.aarch64 ghostscript-tools-fonts-9.50-1.fc32.aarch64 ghostscript-tools-printing-9.50-1.fc33.aarch64 ghostscript-tools-printing-9.50-1.fc32.aarch64 glib-networking-2.64.0-1.fc33.aarch64 glib-networking-2.64.0-1.fc32.aarch64 glib2-2.64.1-1.fc33.aarch64 glib2-2.64.1-1.fc32.aarch64 glib2-devel-2.64.1-1.fc33.aarch64 glib2-devel-2.64.1-1.fc32.aarch64 glibc-2.31.9000-4.fc33.aarch64 glibc-2.31-1.fc32.aarch64 glibc-common-2.31.9000-4.fc33.aarch64 glibc-common-2.31-1.fc32.aarch64 glibc-devel-2.31.9000-4.fc33.aarch64 glibc-devel-2.31-1.fc32.aarch64 glibc-headers-2.31.9000-4.fc33.aarch64 glibc-headers-2.31-1.fc32.aarch64 glibc-minimal-langpack-2.31.9000-4.fc33.aarch64 glibc-minimal-langpack-2.31-1.fc32.aarch64 gmp-1:6.2.0-1.fc33.aarch64 gmp-1:6.1.2-13.fc32.aarch64 google-droid-sans-fonts-20200215-3.fc33.noarch google-droid-sans-fonts-20200215-3.fc32.noarch google-roboto-slab-fonts-1.100263-0.12.20150923git.fc33.noarch google-roboto-slab-fonts-1.100263-0.11.20150923git.fc32.noarch groff-base-1.22.3-24.fc33.aarch64 groff-base-1.22.3-21.fc32.aarch64 gsettings-desktop-schemas-3.36.0-1.fc33.aarch64 gsettings-desktop-schemas-3.36.0-1.fc32.aarch64 gtk-update-icon-cache-3.24.14-1.fc33.aarch64
Re: Coq build issues in F32
On Wed, Mar 25, 2020 at 8:57 PM Paul Dufresne via devel wrote: > ... yeah, I know, might have no link at all with your Coq build... I > don't know. Maybe ... but that sure is suspicious. I recall hearing that OpenJDK is also suffering from weird segfaults. H. Thanks for the input, Paul. I'll keep investigating. -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Re: Coq build issues in F32
Le Wed, 25 Mar 2020 16:58:33 -0600, Jerry James a écrit : > Either I got lucky when I launched the Rawhide build, or something is > fundamentally different between F32 and Rawhide. Clearly something > nondeterministic is at work. I have been unable to reproduce this > failure in mock so far, but will keep trying. > > If anybody has any ideas, I am all ears. Thank you, Well... It ring a bell for me, because I kind of seeing what seems to be weird bugs in F32... I am kind of suspecting a gcc bug giving some weird things... but I am not a developer, know a bit of assembler but not enough to make the link between C code and assembler. Anyway, one hour a go, I got a segment violation error while doing updates with dnfdragora... and it seems that the code just jump to a low address outside of the program: rh bug #1817269 The other bug that make me suspect something might be wrong with gcc was a bug that make a segment violation (seems to access a bit lower address than a .png font file) in a game package that have not been updated in many years as far as I know: rh bug #1816471 but in that case, frankly the C code seems a bit hard for me to take time to try to understand it. What puzzle me in this case, is why a very old code would have stop to work... that's why I came to the idea that maybe this is the generated code itself that changed. ... yeah, I know, might have no link at all with your Coq build... I don't know. ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Coq build issues in F32
I've been talking to Richard Jones about this privately, and he suggested that a message to fedora-devel could be helpful. I've been talking for weeks about updating the entire coq stack, badgering people into swapping reviews with me, etc. Today I finally had all the bits in place to do the updates. I built the whole stack for Rawhide with no issues. The same builds need to happen for F32, where parts of the stack currently have broken deps. But I can't get coq to build successfully, even though it did in Rawhide. It went like this: Attempt 1: the x86_64 build segfaulted while running "bin/coqc -coqlib . -q -native-compiler yes theories/FSets/FMapFullAVL.v", where coqc is a binary built from the coq sources. Coq is a theorem prover. The coqc run is essentially checking that a series of proof steps indeed prove a theorem. The "-native-compiler yes" part means that the OCaml compiler will be invoked to generate native code for parts of this process. I was puzzled by the segfault. I decided to see if the segfault is deterministic, so I launched... Attempt 2: the x86_64 build succeeded. The s390x build failed with an illegal instruction error while running *the same command*. Attempt 3: The s390x build again failed with an illegal instruction error while running the same command, and aarch64 segfaulted while running the same command. All other arches succeeded. I saved the URL for this one: https://koji.fedoraproject.org/koji/taskinfo?taskID=42768075 Either I got lucky when I launched the Rawhide build, or something is fundamentally different between F32 and Rawhide. Clearly something nondeterministic is at work. I have been unable to reproduce this failure in mock so far, but will keep trying. If anybody has any ideas, I am all ears. Thank you, -- Jerry James http://www.jamezone.org/ ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org