TL;DR: We're making some changes to how inlined functions are handled
in our crash reports on non-Windows platforms in bug 524410.  This
change should mostly result in more understandable crash stacks for
code that uses lots of inlining, and shouldn't make things any worse.
Some crash signatures may change as a result.  If you have concerns,
or you happen to have crash stacks that you're curious about whether
they'd change under this new policy, please comment in bug 524410 or
email me.

For the grotty details, read on.

Modern C++/Rust code relies on inlining for efficiency, and modern
compilers have gotten very good at accommodating such code: it's not
unusual for code to feature double-digit levels of inlining (A inlined
into B inlined into C...inlined into J).  A simple Rust function that
looks like:

  slice.into_iter().map(|...| { ... })

and you think of as spanning addresses BEGIN to END, might actually
feature small ranges of instructions from a dozen different functions,
and the compiler will (mostly) faithfully tell you a precise location
for each range.  (Instruction A comes from some iterator code,
instruction B comes from a bit of your closure, instruction C comes
from some inlined function three levels deep inside of your
closure...) Unfortunately, this faithfulness means that in the event
of a crash, the crashing instruction might get attributed to some code
in libstd with no indication of how that relates to the original code.

Bug 524410, supporting inline functions in the symbol dumper, has been
open for a decade now.  The idea is that compilers (on Unix platforms,
not entirely sure this is true on Windows) will not only give you
precise information about what function particular instruction ranges
come from, they will also give you information about the chain of
inlining that resulted in those particular instructions.  The symbol
dumper ought to be able to emit enough information to reconstruct
"frames" from inlined functions.  That is, if you have:

```
addr0 ---+
...      | A
addr1 ---+
addr2 ---+   ---+         ------+
...      | B    |               |
addr3 ---+      | operator+     |
addr4 ---+      |               |
...      | C    |               |
addr5 ---+   ---+               | DoTheThing
addr6 ---+   ---+               |
...      | D    |               |
addr7 ---+      | operator[]    |
addr8 ---+      |               |
...      | E    |               |
addr9 ---+   ---+         ------+
...
```

(apologies if the ASCII art doesn't come through), and you're crashing
at `addr6`, the status quo is that you know you crashed in `D`, and
the next frame is whoever called `DoTheThing`.  The ideal state of
affairs is that you're told that you crashed in `D`, called from
`operator[]`, called from `DoTheThing`, and so forth.

We're not there yet.  Changing the symbol dumper to emit this
information requires changing the symbol file format, which requires
some coordinated updates to several pieces of infrastructure (and
probably others that I don't know about).  It also requires discussion
with upstream Breakpad (and therefore breaking many more consumers
than just Mozilla-internal ones) and/or forking Breakpad completely
and/or rewriting all our tools (which is a special case of forking).
We want to get to that end state, but it's a fair bit of work.

The patches on the bug implement a truncated version of the above that
doesn't require dumping the entire inlining hierarchy.  The idea is
that, as much as possible, for addresses in some function, you want to
attribute those addresses to source code lines for said function.  So
instead of recording the most precise lines possible (A, B, C, D, E,
etc. in the above), you want to simply attribute the entire range of
B-E to `DoTheThing`.  This transformation loses information, but it
tends to produce stack traces that make more sense to humans.  You can
see some examples of the changes in:

https://bugzilla.mozilla.org/show_bug.cgi?id=524410#c22
https://bugzilla.mozilla.org/show_bug.cgi?id=524410#c29

which result in more sensible crash stacks, even if they don't
immediately point out what's going wrong.

This work has not landed yet, but should land sometime next week.  If
you have concerns, or you happen to have crash stacks that you're
curious about whether they'd change under this new policy, please
comment in bug 524410 or email me.

Thanks,
-Nathan
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

Reply via email to