[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #26 from Thomas Neumann  ---
(In reply to Florian Weimer from comment #23)
> 
> u is the original read pointer as far as I can see. So it looks like it
> should look like this:
> 
> diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c
> index 6223f5f18a2..5a6352227cc 100644
> --- a/libgcc/unwind-dw2-fde-dip.c
> +++ b/libgcc/unwind-dw2-fde-dip.c
> @@ -403,8 +403,8 @@ find_fde_tail (_Unwind_Ptr pc,
>          BFD ld generates.  */
>        signed value __attribute__ ((mode (SI)));
>        memcpy (&value, p, sizeof (value));
> +      eh_frame = p + value
>        p += sizeof (value);
> -      dbase = value; /* No adjustment because pcrel has base 0.  */
>      }
>    else
>      p = read_encoded_value_with_base (hdr->eh_frame_ptr_enc,


that seems to be correct, the test case succeeds with that patch.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #25 from Carlos Galvez  ---
Perhaps this is a stupid comment, but isn't "ob.s.b.encoding" uninitialized?

  /* inside find_fde_tail */
  struct object ob;

  ...

  ob.pc_begin = NULL;
  ob.tbase = NULL;
  ob.dbase = (void *) dbase;
  ob.u.single = (fde *) eh_frame;
  ob.s.i = 0;
  ob.s.b.mixed_encoding = 1;  /* Need to assume worst case.  */
  const fde *entry = linear_search_fdes (&ob, (fde *) eh_frame, (void *) pc);

Above, only "ob.s.b.mixed_encoding" is set, not "ob.s.b.encoding".

After that, "linear_search_fdes" expects that it's set:

static const fde *
linear_search_fdes (struct object *ob, const fde *this_fde, void *pc)
{
  const struct dwarf_cie *last_cie = 0;
  int encoding = ob->s.b.encoding;
  _Unwind_Ptr base = base_from_object (ob->s.b.encoding, ob);
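
Whether "encoding" is really left uninitialized depends on how the flag word is declared. The sketch below assumes a layout in which "s.i" is a union member that overlays the bit-fields in "s.b", which is how the flags appear to be declared in unwind-dw2-fde.h. The struct name object_flags and the helper init_like_find_fde_tail are hypothetical stand-ins, not libgcc code. If that assumption holds, the assignment "ob.s.i = 0" already clears "encoding", so it reads as 0 (DW_EH_PE_absptr).

```c
#include <stddef.h>

/* Simplified stand-in for the flag word of libgcc's `struct object`.
   ASSUMPTION: `i` overlays the bit-fields in `b`; the real struct has
   more members than shown here.  */
struct object_flags
{
  union
  {
    struct
    {
      unsigned long sorted : 1;
      unsigned long from_array : 1;
      unsigned long mixed_encoding : 1;
      unsigned long encoding : 8;
      unsigned long count : 21;
    } b;
    size_t i;
  } s;
};

/* Mirrors the two assignments made in find_fde_tail.  */
static struct object_flags
init_like_find_fde_tail (void)
{
  struct object_flags ob;
  ob.s.i = 0;                 /* zeroes the storage shared with s.b */
  ob.s.b.mixed_encoding = 1;  /* then sets a single flag */
  return ob;
}
```

Under this layout, reading ob.s.b.encoding after init_like_find_fde_tail() gives 0, so an uninitialised read reported in this code path would have to come from a different variable.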

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #24 from Florian Weimer  ---
(With the missing ; added, of course.)

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #23 from Florian Weimer  ---
(In reply to Thomas Neumann from comment #21)
> It must be something more complex. value is small here (more precisely: 1888
> in the crashes later), which is not a valid pointer address. We probably
> have to add this to some base pointer? But it is not obvious to me to which
> one.

read_encoded_value_with_base has this:

  result += ((encoding & 0x70) == DW_EH_PE_pcrel
             ? (_Unwind_Internal_Ptr) u : base);

u is the original read pointer as far as I can see. So it looks like it should
look like this:

diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c
index 6223f5f18a2..5a6352227cc 100644
--- a/libgcc/unwind-dw2-fde-dip.c
+++ b/libgcc/unwind-dw2-fde-dip.c
@@ -403,8 +403,8 @@ find_fde_tail (_Unwind_Ptr pc,
         BFD ld generates.  */
       signed value __attribute__ ((mode (SI)));
       memcpy (&value, p, sizeof (value));
+      eh_frame = p + value
       p += sizeof (value);
-      dbase = value;   /* No adjustment because pcrel has base 0.  */
     }
   else
     p = read_encoded_value_with_base (hdr->eh_frame_ptr_enc,
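
The pcrel rule quoted from read_encoded_value_with_base can be shown in isolation. This is a minimal sketch, not the libgcc implementation: the function name decode_pcrel_sdata4 is made up for illustration, and it handles only the DW_EH_PE_pcrel | DW_EH_PE_sdata4 case. The stored 32-bit value is an offset from the address of the field itself, which is why a small value such as 1888 is not a usable pointer until the read pointer is added.

```c
#include <stdint.h>
#include <string.h>

/* Decode a field encoded as DW_EH_PE_pcrel | DW_EH_PE_sdata4.
   `p` points at the 4-byte field.  The result is the address of the
   field plus the signed offset stored in it.
   (Hypothetical helper, named here for illustration only.)  */
static uintptr_t
decode_pcrel_sdata4 (const unsigned char *p)
{
  int32_t value;
  memcpy (&value, p, sizeof (value));   /* safe for unaligned fields */
  return (uintptr_t) p + (intptr_t) value;
}
```

Note that the base is the address of the field before the read pointer is advanced, which matches the order of the two statements in the patch above.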

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #22 from Carlos Galvez  ---
Indeed it's an uninitialized read according to valgrind:

==15475== Use of uninitialised value of size 8
==15475==    at 0x1E81C2E9: base_from_object (unwind-dw2-fde.c:319)
==15475==    by 0x1E81C2E9: linear_search_fdes (unwind-dw2-fde.c:975)
==15475==    by 0x1E81CE50: find_fde_tail (unwind-dw2-fde-dip.c:519)
==15475==    by 0x1E81CE50: _Unwind_Find_FDE (unwind-dw2-fde-dip.c:573)
==15475==    by 0x1E8184A9: uw_frame_state_for (unwind-dw2.c:1005)
==15475==    by 0x1E819EFC: _Unwind_RaiseException (unwind.inc:104)
==15475==    by 0x1E2B8089: __cxa_throw (in /path/to/gcc/usr/lib64/libstdc++.so.6.0.32)

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #21 from Thomas Neumann  ---
It must be something more complex. value is small here (more precisely: 1888 in
the crashes later), which is not a valid pointer address. We probably have to
add this to some base pointer? But it is not obvious to me to which one.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Florian Weimer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fw at gcc dot gnu.org

--- Comment #20 from Florian Weimer  ---
Thanks for looking into this, Thomas.  I suspect it's a simple typo, which
happens not to matter in many cases because both dbase and eh_frame are unused:

diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c
index 6223f5f18a2..b7b09d584c8 100644
--- a/libgcc/unwind-dw2-fde-dip.c
+++ b/libgcc/unwind-dw2-fde-dip.c
@@ -404,7 +404,7 @@ find_fde_tail (_Unwind_Ptr pc,
       signed value __attribute__ ((mode (SI)));
       memcpy (&value, p, sizeof (value));
       p += sizeof (value);
-      dbase = value;   /* No adjustment because pcrel has base 0.  */
+      eh_frame = value;/* No adjustment because pcrel has base 0.  */
     }
   else
     p = read_encoded_value_with_base (hdr->eh_frame_ptr_enc,

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #19 from Thomas Neumann  ---
Hm, then I don't know how we end up with the non-regular table content. The
code checks for hdr->fde_count_enc != DW_EH_PE_omit, and that is false in the
executable that you provided.

But regardless of why the table is strange, the bug is definitely caused by
the uninitialized eh_frame variable in find_fde_tail. It must be set in the
fast path, too.
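
For context, the header test mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the libgcc source: the struct name eh_frame_hdr_prefix and the function has_search_table are hypothetical, the field order follows the documented .eh_frame_hdr layout, and the table_enc comparison is how I understand the lookup-table check to work. When the test fails, the unwinder has to fall back to the linear search, which is the path that needs a valid eh_frame.

```c
#include <stdint.h>

/* DWARF exception-header pointer-encoding constants.  */
#define DW_EH_PE_omit    0xff
#define DW_EH_PE_sdata4  0x0b
#define DW_EH_PE_pcrel   0x10
#define DW_EH_PE_datarel 0x30

/* Fixed four-byte prefix of .eh_frame_hdr (hypothetical struct name).  */
struct eh_frame_hdr_prefix
{
  uint8_t version;
  uint8_t eh_frame_ptr_enc;
  uint8_t fde_count_enc;
  uint8_t table_enc;
};

/* Nonzero when a sorted binary-search table follows the header.
   ASSUMPTION: only the datarel|sdata4 table format is accepted.  */
static int
has_search_table (const struct eh_frame_hdr_prefix *hdr)
{
  return hdr->fde_count_enc != DW_EH_PE_omit
         && hdr->table_enc == (DW_EH_PE_datarel | DW_EH_PE_sdata4);
}
```

A header with fde_count_enc set to DW_EH_PE_omit makes has_search_table() return 0, which corresponds to the case described for the provided executable.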

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #18 from Carlos Galvez  ---
Thanks for the investigation! To clarify: my last reproducible example does not
use gold, instead it uses the default GNU ld version 2.38.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #17 from Thomas Neumann  ---
The bug was introduced by gcc commit e724b04. It avoids calls to
read_encoded_value_with_base for performance reasons, but unfortunately this
causes the variable eh_frame to be uninitialized if the fast path is taken in
find_fde_tail (unwind-dw2-fde-dip.c).

This is only visible with the gold linker because gold does not provide a
conveniently organized unwind table, which causes the code to fall back to the
slow linear_search_fdes, which uses the (uninitialized) eh_frame value.

Florian, can you fix that? For me it is not obvious how to compute the correct
eh_frame value without calling read_encoded_value_with_base, but you probably
know how to do that.
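
To make the fallback concrete, here is a toy version of a linear search over length-prefixed records. It is a sketch only: the record layout (length, pc_begin, pc_range as three 32-bit words) is hypothetical and much simpler than a real FDE, and the name toy_linear_search is made up. What it shares with the slow path is the shape of the loop: start at a given pointer, stop at a zero length, and step by the stored length. The NULL check is added here for illustration; the backtraces in this report show this_fde=0x0 reaching the real loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walk records laid out as { uint32 length; uint32 pc_begin;
   uint32 pc_range; } where `length` counts the bytes after the length
   field.  A record with length 0 terminates the list.
   HYPOTHETICAL layout, not the real FDE format.  */
static const unsigned char *
toy_linear_search (const unsigned char *start, uint32_t pc)
{
  if (start == NULL)            /* guard added for this sketch only */
    return NULL;

  for (const unsigned char *p = start;;)
    {
      uint32_t length, pc_begin, pc_range;

      memcpy (&length, p, sizeof length);
      if (length == 0)          /* terminator reached, nothing found */
        return NULL;

      memcpy (&pc_begin, p + 4, sizeof pc_begin);
      memcpy (&pc_range, p + 8, sizeof pc_range);

      /* Unsigned subtraction covers both bounds in one comparison.  */
      if (pc - pc_begin < pc_range)
        return p;

      p += sizeof length + length;   /* advance to the next record */
    }
}
```

The walk is only as good as its starting pointer: if the caller passes garbage instead of the start of the record list, the first length read is already invalid.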

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Thomas Neumann  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com

--- Comment #16 from Thomas Neumann  ---
e724b04

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #15 from Carlos Galvez  ---
Created attachment 55261
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55261&action=edit
Reproducible example nvinfer

Attaching (hopefully) reproducible example as a tarball, containing:

- download.sh: script to download and unpack the Nvidia dependencies.
- test.sh: script to build and test the application. It expects a GCC_BASE
environment variable pointing to the base of the GCC trunk
installation/build.
- a.out: the compiled binary.

This has been tested on Ubuntu 22.04.

Thank you for your time!

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread tneumann at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #14 from Thomas Neumann  ---
I cannot reproduce the problem, but admittedly I used a newer Ubuntu version. I
tried compiling it with gcc 7.5.0, linking it with gold 1.16, and using the gcc
version you specified (07c52d1eec9) for the shared library without problems.

Can you attach the a.out that you generate? This will hopefully allow me to
reproduce the problem.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tneumann at users dot sourceforge.net
         Resolution|INVALID                     |---
             Status|RESOLVED                    |REOPENED

--- Comment #13 from Richard Biener  ---
Let's re-open this as it seems it needs more analysis.  Thomas might also know
of relevant bugs that have been fixed meanwhile and maybe has better
understanding of how things could break here.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-04 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #12 from Carlos Galvez  ---
I just tested latest and greatest trunk (git commit
2415024e0f81f8c09bf08f947c790b43de9d0bbc) and the problem persists. Slightly
different line numbers but essentially same backtrace:

#0  linear_search_fdes (ob=0x7fffd1d0, this_fde=0x0, pc=0x7fffdf4b6624) at
../../../gcc/libgcc/unwind-dw2-fde.c:977
#1  0x7fffdde1ce51 in find_fde_tail (dbase=2424076, bases=0x7fffd428,
hdr=0x7690ca70, pc=140736939648548) at
../../../gcc/libgcc/unwind-dw2-fde-dip.c:519
#2  _Unwind_Find_FDE (pc=<optimized out>, bases=bases@entry=0x7fffd428) at
../../../gcc/libgcc/unwind-dw2-fde-dip.c:573
#3  0x7fffdde184aa in uw_frame_state_for (context=0x7fffd380,
fs=0x7fffd470) at ../../../gcc/libgcc/unwind-dw2.c:1005
#4  0x7fffdde19efd in _Unwind_RaiseException (exc=0x1d994fb0) at
../../../gcc/libgcc/unwind.inc:104
#5  0x7fffde2b808a in __cxa_throw () from
/path/to/gcc/usr/lib64/libstdc++.so.6
#6  0x7fffdf4b6625 in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#7  0x7fffdf0f5df3 in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#8  0x7fffe1bff20b in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#9  0x7fffdf428c3f in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#10 0x7fffdf4297db in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#11 0x7fffdf429ef7 in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#12 0x7fffdf42a17d in createInferBuilder_INTERNAL () from
/path/to/nvinfer/lib/libnvinfer.so.8
#13 0x00401163 in nvinfer1::(anonymous namespace)::createInferBuilder
(logger=...) at nvinfer/include/NvInfer.h:9093
#14 0x00401182 in main () at main.cpp:13

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #11 from Andrew Pinski  ---
(In reply to Carlos Galvez from comment #10)
> So the library was compiled with GCC 7 and has a dependency on
> libstdc++.so.6. Via LD_LIBRARY_PATH, I run my executable using GCC trunk
> (14)'s libstdc++.so.6.

Was the trunk before or after r14-1515-g38e88d41f50d844f1404 ? Can you try
building the trunk again?

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-06-03 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #10 from Carlos Galvez  ---
Hi!

I've continued to look into this and am seeing a slightly different but
essentially identical error with yet another Nvidia library. This time it is a
pure shared library, "libnvinfer.so", which was compiled with GCC 7. Most
likely the library is statically linked against libcudart_static.a. Switching
to the bfd or lld linker does not solve the issue.

My program simply links against that library, which internally throws an
exception. I get a very similar backtrace:

#0  linear_search_fdes (ob=0x7fffd350, this_fde=0x0, pc=0x7fffdf4b6a69) at
../../../gcc/libgcc/unwind-dw2-fde.c:973
#1  0x7fffdde1cde1 in find_fde_tail (dbase=2424076, bases=0x7fffd5a8,
hdr=0x7690ca70, pc=140736939649641) at
../../../gcc/libgcc/unwind-dw2-fde-dip.c:519
#2  _Unwind_Find_FDE (pc=<optimized out>, bases=bases@entry=0x7fffd5a8) at
../../../gcc/libgcc/unwind-dw2-fde-dip.c:573
#3  0x7fffdde1847a in uw_frame_state_for (context=0x7fffd500,
fs=0x7fffd5f0) at ../../../gcc/libgcc/unwind-dw2.c:1005
#4  0x7fffdde19ecd in _Unwind_RaiseException (exc=0x904320) at
../../../gcc/libgcc/unwind.inc:104
#5  0x7fffde2b7e6a in __cxa_throw () from /path/to/usr/lib64/libstdc++.so.6
#6  0x7fffdf4b6a6a in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#7  0x7fffdf4c21b5 in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#8  0x7fffdfbddf02 in ?? () from /path/to/nvinfer/lib/libnvinfer.so.8
#9  0x7fffdf42a118 in createInferBuilder_INTERNAL () from
/path/to/nvinfer/lib/libnvinfer.so.8
#10 0x00401163 in nvinfer1::(anonymous namespace)::createInferBuilder
(logger=...) at nvinfer/include/NvInfer.h:9093
#11 0x00401182 in main () at main.cpp:13


So the library was compiled with GCC 7 and has a dependency on libstdc++.so.6.
Via LD_LIBRARY_PATH, I run my executable using GCC trunk (14)'s libstdc++.so.6.

Now, I try to see if "libnvinfer_static.a" uses any symbol from "libgcc_eh.a",
by doing:

- Run "nm libgcc_eh.a" and store a list of all "T" or "t" symbols.
- Run "nm libnvinfer_static.a" and store a list of all "U" symbols.
- Compute the intersection between those two lists.

The result is that "libnvinfer_static.a" uses only one symbol from
libgcc_eh.a: _Unwind_Resume.
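
The intersection step of that procedure amounts to the following, sketched in C. It assumes the symbol names have already been extracted from the nm output into two arrays; the function name intersect_symbols and any sample lists passed to it are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Collect every name from `undefined` (the "U" symbols of the archive
   under test) that also appears in `defined` (the "T"/"t" symbols of
   libgcc_eh.a).  Up to `out_cap` matches are stored in `out`; the
   return value is the total number of matches found.  */
static size_t
intersect_symbols (const char *const *defined, size_t n_defined,
                   const char *const *undefined, size_t n_undefined,
                   const char **out, size_t out_cap)
{
  size_t n = 0;

  for (size_t i = 0; i < n_undefined; i++)
    for (size_t j = 0; j < n_defined; j++)
      if (strcmp (undefined[i], defined[j]) == 0)
        {
          if (n < out_cap)
            out[n] = undefined[i];
          n++;
          break;                /* count each undefined name once */
        }

  return n;
}
```

A quadratic scan is fine for lists of this size; sorting both lists first would be the usual choice for much larger symbol tables.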

Is the above test procedure correct to determine the symbols used from
libgcc_eh.a?

How come linking a pure shared library such as libnvinfer.so would lead to
mixing types from different versions of libgcc_eh.a, i.e. how could those
internal changes leak outside the shared library boundaries? 

After all this comes from __cxa_throw() from libstdc++.so.6, which is a
versioned symbol. Shouldn't that function get a new symbol version if there's
an ABI incompatible change?

Thank you for your time and help, really appreciated!

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #9 from Richard Biener  ---
Yes, using a newer libgcc_s.so.1 or libstdc++.so.6 should work fine - again,
unless we end up with mixing static/dynamic parts of the unwinder of different
versions.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-15 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #8 from Carlos Galvez  ---
Upon closer inspection, it turns out we were building with GCC 7, but then
using libgcc_s.so.1 and libstdc++.so.6 from GCC trunk at runtime (via
LD_LIBRARY_PATH). Building with GCC trunk instead solves the segfault I
described above.

In particular it seems the problem is libgcc_s.so.1 - if I use the system-wide 
one (older) instead of the one from GCC trunk, the problem goes away.

Is this expected though? My understanding was that libgcc_s and libstdc++ are
backwards compatible, i.e. I can always keep the latest one installed on my
system and I should be able to run applications linked against older libraries
(which is what is happening here). There's also symbol versioning so old
symbols are kept.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #7 from Richard Biener  ---
(In reply to Carlos Galvez from comment #6)
> Hi again!
> 
> I realized there is still one more problem remaining, so I suspect the linker
> was not the only culprit. It does not segfault, but it gets stuck in an
> infinite loop, once again when mixing exceptions and libcudart_static.a.
> 
> @Richard you mentioned:
> 
> > Does libcudart_static.a by chance contain any symbols from the libgcc 
> > runtime (of an old toolchain)?
> 
> Do you know how I could verify this? I'm pretty new when it comes to
> troubleshooting these things.
> 
> My understanding is that libstdc++.so and libgcc_s.so are always backwards
> compatible so using "the latest" ensures you can use the newest features and
> also run older built code. Is there a flaw/pitfall in that reasoning?

There were changes to the internal data structures of the unwinder so I
wondered if you somehow managed to mix unwinder parts of different versions.

You probably have a libgcc_eh.a file as part of your GCC install, you could
look for symbols this library provides in the NVIDIA static archives.

> Thanks!

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-13 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #6 from Carlos Galvez  ---
Hi again!

I realized there is still one more problem remaining, so I suspect the linker was
not the only culprit. It does not segfault, but it gets stuck in an infinite
loop, once again when mixing exceptions and libcudart_static.a.

@Richard you mentioned:

> Does libcudart_static.a by chance contain any symbols from the libgcc runtime 
> (of an old toolchain)?

Do you know how I could verify this? I'm pretty new when it comes to
troubleshooting these things.

My understanding is that libstdc++.so and libgcc_s.so are always backwards
compatible so using "the latest" ensures you can use the newest features and
also run older built code. Is there a flaw/pitfall in that reasoning?

Thanks!

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-04 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Carlos Galvez  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Carlos Galvez  ---
Works with LLD as well, so it seems likely to be a Gold bug. I wasn't aware it
was no longer well maintained, good to know! Thanks again for your help :)

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-04 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #4 from Carlos Galvez  ---
> Does libcudart_static.a by chance contain any symbols from the libgcc runtime

I'm not sure. Do you know how I could check that? (I'm pretty n00b on these
things :)) What I know is that libcudart.so does not depend on either
libstdc++.so or libgcc_s.so, only on libc, libdl, libpthread, and librt.

> This could also point to a bug with the GOLD linker. 

Indeed switching to BFD also solves the problem. I will try with LLD as well!

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2023-05-03

--- Comment #3 from Andrew Pinski  ---
>find . -iname "*.deb" -exec dpkg-deb -x {} cuda \;

This won't work on non-Debian-based targets. IIRC debs are ar archives that
contain two tar files, so you can use ar followed by tar to extract the files
instead of using dpkg-deb.

[Bug libgcc/109712] Segmentation fault in linear_search_fdes

2023-05-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c++                         |libgcc

--- Comment #2 from Andrew Pinski  ---
>* The problem happens only when using the Gold linker.

This could also point to a bug with the GOLD linker. I am not sure if the gold
linker is even maintained these days or well enough tested.