[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |law at gcc dot gnu.org

--- Comment #14 from Jeffrey A. Law ---
I'd think the right thing to do is close this one and track in the newer bug.
It's not clear they're actually the same underlying problem, even though they
have the same failure signature.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956 --- Comment #13 from Richard Biener --- While this issue seems fixed(?), there's now a new one with the same symptom, not sure if we should dup and keep this one open?
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #12 from CVS Commits ---
The releases/gcc-13 branch has been updated by Jeff Law:

https://gcc.gnu.org/g:ab8fed849ab345974e5b83472749ac1393878f71

commit r13-7709-gab8fed849ab345974e5b83472749ac1393878f71
Author: Thomas Neumann
Date:   Fri Aug 11 09:20:27 2023 -0600

    preserve base pointer for __deregister_frame [PR110956]

    Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956
    Rainer Orth successfully tested the patch on Solaris with a full bootstrap.

    Some uncommon unwinding table encodings need to access the base pointer
    for address computations. We do not have that information in calls to
    __deregister_frame_info_bases, and previously simply used nullptr as
    base pointer. That is usually fine, but for some Solaris i386 shared
    libraries that results in wrong address computations.

    To fix this problem we now associate the unwinding object with
    the table pointer itself, which is always known, in addition to
    the PC range. When deregistering a frame, we first locate the object
    using the table pointer, and then use the base pointer stored within
    the object to compute the PC range.

    libgcc/ChangeLog:
            PR libgcc/110956
            * unwind-dw2-fde.c: Associate object with address of unwinding
            table.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #11 from CVS Commits ---
The master branch has been updated by Jeff Law:

https://gcc.gnu.org/g:c46bded78f3733ad1312d141ebf1ae541032a48b

commit r14-3154-gc46bded78f3733ad1312d141ebf1ae541032a48b
Author: Thomas Neumann
Date:   Fri Aug 11 09:20:27 2023 -0600

    preserve base pointer for __deregister_frame [PR110956]

    Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956
    Rainer Orth successfully tested the patch on Solaris with a full bootstrap.

    Some uncommon unwinding table encodings need to access the base pointer
    for address computations. We do not have that information in calls to
    __deregister_frame_info_bases, and previously simply used nullptr as
    base pointer. That is usually fine, but for some Solaris i386 shared
    libraries that results in wrong address computations.

    To fix this problem we now associate the unwinding object with
    the table pointer itself, which is always known, in addition to
    the PC range. When deregistering a frame, we first locate the object
    using the table pointer, and then use the base pointer stored within
    the object to compute the PC range.

    libgcc/ChangeLog:
            PR libgcc/110956
            * unwind-dw2-fde.c: Associate object with address of unwinding
            table.
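
For readers following the thread, the lookup scheme described in the commit message can be sketched roughly as below. This is only an illustration, not the libgcc code: `struct unwind_object`, `register_table`, `deregister_table`, and the fixed-size linear array are all hypothetical stand-ins (the actual patch is not reproduced in this thread), and the PC range is modelled as a single base-relative offset plus a length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an unwinding object; not libgcc's layout. */
struct unwind_object {
    const void *table;  /* address of the unwinding table, always known */
    uintptr_t dbase;    /* base pointer captured at registration time */
    int32_t start_off;  /* start of the PC range, stored relative to dbase */
    uint32_t length;    /* length of the PC range in bytes */
    int in_use;
};

#define MAX_OBJECTS 8
static struct unwind_object registry[MAX_OBJECTS];

/* Registration: the caller knows the base pointer, so store it. */
static int register_table(const void *table, uintptr_t dbase,
                          int32_t start_off, uint32_t length)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!registry[i].in_use) {
            registry[i].table = table;
            registry[i].dbase = dbase;
            registry[i].start_off = start_off;
            registry[i].length = length;
            registry[i].in_use = 1;
            return 0;
        }
    }
    return -1; /* registry full */
}

/* Deregistration: the caller passes only the table pointer. Locate the
   object by that pointer, then recompute the PC range with the stored
   base instead of a null base. Returns -1 if the table is unknown. */
static int deregister_table(const void *table,
                            uintptr_t *pc_start, uintptr_t *pc_end)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (registry[i].in_use && registry[i].table == table) {
            uintptr_t start = registry[i].dbase
                + (uintptr_t)(intptr_t)registry[i].start_off;
            *pc_start = start;
            *pc_end = start + registry[i].length;
            registry[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}

/* Small self-check using made-up addresses. */
static void self_check(void)
{
    static const char table[4] = {0};
    uintptr_t s = 0, e = 0;
    assert(register_table(table, 0x20000, 0x100, 0x10) == 0);
    assert(deregister_table(table, &s, &e) == 0);
    assert(s == 0x20100 && e == 0x20110);
}
```

The point of the sketch is only that the table pointer is enough to find the object again, so the base pointer no longer has to be guessed at deregistration time. A second deregistration of the same table returns -1 here, which corresponds to the situation the gcc_assert in the bug title guards against.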
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #10 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #9 from ro at CeBiTec dot Uni-Bielefeld.DE ---
[...]
> I'm currently running a full i386-pc-solaris2.11 bootstrap.

... which just completed without regressions.

Thanks again.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #9 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #8 from Thomas Neumann ---
> Created attachment 55715
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55715&action=edit
> patch to use the correct base pointer
>
> The attached patch fixes the test case by using the correct base pointer
> during frame deregistration.

Amazing: thanks for the analysis and the patch. I'm currently running a
full i386-pc-solaris2.11 bootstrap.

FWIW, the problematic library was apparently built with the original
CodeSourcery GCC 3.4.3 port to support Solaris 10/amd64.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #8 from Thomas Neumann ---
Created attachment 55715
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55715&action=edit
patch to use the correct base pointer

The attached patch fixes the test case by using the correct base pointer during
frame deregistration.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #7 from Thomas Neumann ---
Thanks for the pointer, I could reproduce the problem in a VM now. That shared
library uses an unusual table encoding that has to reference the original base
pointer within get_pc_range. But when deregistering a frame we simply set the
base pointer to nullptr, which does not work here. I will write a patch that
makes sure we always have the correct base pointer available.
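
To make the failure mode in this comment concrete, here is a minimal sketch of why a null base pointer breaks a base-relative encoding. It is a simplified model only: `decode_base_relative` and `same_address` are hypothetical helpers, not functions from unwind-dw2-fde.c, and the addresses are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical model of a base-relative pointer encoding:
   the table stores an offset, and the real address is base + offset. */
static uintptr_t decode_base_relative(uintptr_t base, int32_t stored_offset)
{
    return base + (uintptr_t)(intptr_t)stored_offset;
}

/* Returns 1 when registration and deregistration decode the same
   address from the same stored offset, 0 otherwise. */
static int same_address(uintptr_t base_at_register,
                        uintptr_t base_at_deregister,
                        int32_t stored_offset)
{
    return decode_base_relative(base_at_register, stored_offset)
        == decode_base_relative(base_at_deregister, stored_offset);
}

/* Small self-check using made-up values. */
static void self_check(void)
{
    /* With the real base on both sides the addresses agree. */
    assert(same_address(0x10000, 0x10000, 0x240));
    /* With a null base at deregistration they do not. */
    assert(!same_address(0x10000, 0, 0x240));
}
```

For encodings that do not depend on a base (absolute or PC-relative values, for example), the base never enters the computation, which would explain why passing nullptr went unnoticed on most targets.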
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #6 from ro at manam dot mail-host-address-is-not-set ---
> --- Comment #3 from Rainer Orth ---
> (In reply to Thomas Neumann from comment #1)
>> The assert says that the code tries to de-register a frame that it did not
>> register before or that was deregistered before. If you see that failing you
>> might want to add some print statements to
>> __register_frame_info_{tables_|}bases and __deregister_frame_info_bases.
>> But the example that you included in the report does not seem to trigger the
>> assert but crashes during unwinding instead? This has to be looked at in a
>> debugger.
>
> Given my nightmarish experiences debugging unwinder issues, I guess I'll
> rather start with a reghunt to identify when this started.

That reghunt just has completed and identified this patch as the
culprit:

commit 6e80a1d164d1f996ad08a512c25a7c2ca893
Author: Thomas Neumann
Date:   Tue Mar 1 21:57:35 2022 +0100

    eliminate mutex in fast path of __register_frame

>> I do not have access to Solaris. If you can give me remote access to a
>> suitable machine I am willing to debug this, otherwise you must check
>> yourself why the crash happens.
>
> There's still no Solaris/x86 box in the cfarm, unfortunately, only
> Solaris/SPARC.

If you feel like it, there are two options:

* A Solaris 11.4/x86 VirtualBox template:

  https://www.oracle.com/solaris/solaris11/downloads/solaris11-vm-templates-downloads.html

  This is Solaris 11.4 FCS, unfortunately, thus almost 5 years old by
  now.

* Then there's the Solaris 11.4 SRU 42 download. It's way newer (last
  year), but provides only installer images which you'd have to use
  yourself.

I'd expect no one to go this far to debug an issue on an unfamiliar
platform, though.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #5 from Thomas Neumann ---
The assert itself is old, it was just updated due to code changes. And
asserting there makes sense: if we keep an old frame around, we might see a
crash later during unwinding if the unwinder tries to access code that no
longer exists due to dlclose.

This does not apply if the bug is in the program itself, of course, but I think
it is more probable that it is a bug in gcc. If somebody has a VM image or a
remote machine I can use I would be happy to debug the problem myself, but I do
not have access to Solaris.
[Bug libgcc/110956] [13/14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

--- Comment #4 from Richard Biener ---
(In reply to Thomas Neumann from comment #1)
> The assert says that the code tries to de-register a frame that it did not
> register before or that was deregistered before.

Did we assert for these cases before? Can we "safely" continue, doing nothing?
[Bug libgcc/110956] [13, 14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

Rainer Orth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Rainer Orth ---
(In reply to Thomas Neumann from comment #1)
> The assert says that the code tries to de-register a frame that it did not
> register before or that was deregistered before. If you see that failing you
> might want to add some print statements to
> __register_frame_info_{tables_|}bases and __deregister_frame_info_bases.
> But the example that you included in the report does not seem to trigger the
> assert but crashes during unwinding instead? This has to be looked at in a
> debugger.

Given my nightmarish experiences debugging unwinder issues, I guess I'll rather
start with a reghunt to identify when this started.

> I do not have access to Solaris. If you can give me remote access to a
> suitable machine I am willing to debug this, otherwise you must check
> yourself why the crash happens.

There's still no Solaris/x86 box in the cfarm, unfortunately, only
Solaris/SPARC.
[Bug libgcc/110956] [13, 14 regression] gcc_assert is hit at gcc-13.2.0/libgcc/unwind-dw2-fde.c#L291 with some special library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956

Rainer Orth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.3
            Summary|gcc_assert is hit at        |[13, 14 regression]
                   |gcc-13.2.0/libgcc/unwind-dw |gcc_assert is hit at
                   |2-fde.c#L291 with some      |gcc-13.2.0/libgcc/unwind-dw
                   |special library             |2-fde.c#L291 with some
                   |                            |special library
      Known to fail|                            |13.2.1, 14.0
              Build|                            |i386-pc-solaris2.11
   Last reconfirmed|                            |2023-8-9
               Host|                            |i386-pc-solaris2.11
      Known to work|                            |12.3.1
                 CC|                            |ro at gcc dot gnu.org
             Target|                            |i386-pc-solaris2.11