On 08/05/2023 17:35, Mark Wielaard wrote:
Hi Florian, Hi Luke,

On Tue, 2023-05-02 at 09:57 +0200, Florian Weimer via Elfutils-devel wrote:
* Luke Diamand via Elfutils-devel:

I've got a few cores where report_r_debug() in link_map.c fails to
find all of the modules - for example I had libc.so missing. This
obviously meant that elfutils could not backtrace my core.

It seems to be related to this code:

   /* There can't be more elements in the link_map list than there are
      segments.  DWFL->lookup_elts is probably twice that number, so it
      is certainly above the upper bound.  If we iterate too many times,
      there must be a loop in the pointers due to link_map clobberation.  */
   size_t iterations = 0;

   while (next != 0 && ++iterations < dwfl->lookup_elts)

I've changed this to just keep going until it reaches
dwfl->lookup_elts*5, which seems to "fix" it, but I feel there must be
a better fix!

The most recent core I saw with this had lookup_elts=36, and hit 109
iterations of the loop and then backtraced just fine.

It's probably another fallout from -z separate-code, which tends to
create four LOAD segments.  The magic number 5 sounds about right, as
gold also has -z text-unlikely-segment, which might result in creating
that number of load segments (but I haven't tried).

Wow, that had never occurred to me. Thanks.

Luke, do the binary/libraries from which your core file was generated
contain multiple PT_LOAD segments?

We could add something like:

diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c
index 06d85eb6..76f23354 100644
--- a/libdwfl/link_map.c
+++ b/libdwfl/link_map.c
@@ -331,11 +331,17 @@ report_r_debug (uint_fast8_t elfclass, uint_fast8_t elfdata,
    int result = 0;
    /* There can't be more elements in the link_map list than there are
-     segments.  DWFL->lookup_elts is probably twice that number, so it
-     is certainly above the upper bound.  If we iterate too many times,
-     there must be a loop in the pointers due to link_map clobberation.  */
+     segments.  A segment is created for each PT_LOAD and there can be
+     up to 5 per module (-z separate-code, tends to create four LOAD
+     segments, gold has -z text-unlikely-segment, which might result
+     in creating that number of load segments) DWFL->lookup_elts is
+     probably twice the number of modules, so that multiplied by max
+     PT_LOADs is certainly above the upper bound.  If we iterate too
+     many times, there must be a loop in the pointers due to link_map
+     clobberation.  */
+#define MAX_PT_LOAD 5
    size_t iterations = 0;
-  while (next != 0 && ++iterations < dwfl->lookup_elts)
+  while (next != 0 && ++iterations < dwfl->lookup_elts * MAX_PT_LOAD)
      {
        if (read_addrs (&memory_closure, elfclass, elfdata,
                       &buffer, &buffer_available, next, &read_vaddr,

Does that sound reasonable?
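
To make the effect of the larger bound concrete, here is a minimal standalone sketch of the same loop condition. It is not elfutils code: struct node, walk_bounded, demo_walk and MAX_ENTRIES are hypothetical names used only for illustration, and it assumes the "109 iterations" reported above correspond to 109 list entries. Only MAX_PT_LOAD and the shape of the while condition come from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PT_LOAD 5    /* multiplier from the proposed patch */
#define MAX_ENTRIES 256  /* hypothetical cap, for this demo only */

/* Hypothetical stand-in for a link_map entry; not the libdwfl type.  */
struct node { struct node *next; };

/* Walk the list using the same condition as the patched loop.
   Returns the number of entries visited and sets *TRUNCATED to 1
   if the iteration cap stopped the walk while entries remained.  */
static size_t
walk_bounded (const struct node *next, size_t limit, int *truncated)
{
  size_t iterations = 0;
  size_t visited = 0;

  while (next != NULL && ++iterations < limit)
    {
      visited++;
      next = next->next;
    }

  *truncated = (next != NULL);
  return visited;
}

/* Build a list of ENTRIES nodes, optionally linked back to its head
   to imitate a clobbered link_map, and walk it with LIMIT.  */
static size_t
demo_walk (size_t entries, size_t limit, int make_loop, int *truncated)
{
  static struct node nodes[MAX_ENTRIES];

  assert (entries > 0 && entries <= MAX_ENTRIES);
  for (size_t i = 0; i + 1 < entries; i++)
    nodes[i].next = &nodes[i + 1];
  nodes[entries - 1].next = make_loop ? &nodes[0] : NULL;

  return walk_bounded (nodes, limit, truncated);
}
```

Under those assumptions, demo_walk (109, 36, 0, &t) stops after 35 entries with the walk truncated, which matches the missing modules, while demo_walk (109, 36 * MAX_PT_LOAD, 0, &t) visits all 109 because the bound becomes 180. A list that loops back on itself still terminates, after 179 entries, so the clobbered-list protection is kept.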

Sorry - I did not see this until just after sending in my patch!

Let me try it with this change and I will re-roll it.

Luke
