Hi!

[Long mail!  Save for when you have coffee :)]

I got stuck a week ago with a really strange crash that I couldn't
understand.  Maybe this rarely happens to you all :)  To me it happens
sometimes but my experience with GDB has always been frustrating.  With
JITted code you can't even get a backtrace that makes any sense.

For example in my most recent crash:

(gdb) bt
#0  0x00002bc321065087 in  ()
#1  0x0000000002404940 in  ()
#2  0x00001c7853f5d779 in  ()
#3  0x00003bdb38cf3e69 in  ()
#4  0x00001c7853fa5c71 in  ()
#5  0x00003bdb38cf3e09 in  ()
#6  0x00007fffffffd5b0 in  ()
#7  0x00002bc32106a135 in  ()
#8  0x00002d7154f397d1 in  ()
#9  0x0000000000000000 in  ()

which is totally bogus.  Now I know there are functions that you can
call to print a backtrace, but sometimes you wonder if you'd be
perturbing the state of the program, and often when you get into this
situation the functions you call will end up giving you a DCHECK
failure so you end up having to force-return from frames.

I finally broke down and decided I have been working on V8 too long to
put up with this.  Perhaps I made the wrong decision as I've spent a
week on it already, but maybe it's the right thing, who knows.

One thing I wrote was a simple backtrace printer, building on my
previous Scheme pretty-printers for V8 objects.  I attach the current
version, which is getting pretty ugly.  Anyway the backtrace printer
will basically do a SafeStackFrameIterator, but completely from the GDB
side, without calling anything in the inferior except a routine to get
the current isolate, as getting thread-locals without calling into the
runtime is hell.  Like the previous pretty-printer work, it gets all it
needs from the DWARF constants, and doesn't hard-code anything at all
[*].
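
As a rough illustration of the kind of walk involved (this is not the
attached Scheme code), here is a minimal sketch in C.  It assumes the
usual x64 frame-pointer layout: the caller's fp is saved at [fp], the
return address at [fp + 8], and the caller's sp is fp + 16, which is
consistent with the sp/fp values in the trace below.  All names here
(walk_frames, read_word_fn, fake_memory) are hypothetical, and the
sketch does not classify frame types or look up function names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: read one 64-bit word of inferior memory.
   Returns 0 on success, nonzero on failure. */
typedef int (*read_word_fn)(void *ctx, uint64_t addr, uint64_t *out);

struct frame_info { uint64_t sp, fp, pc; };

/* Stand-in for the inferior's stack so the sketch runs without GDB. */
struct fake_memory { uint64_t base; const uint64_t *words; size_t count; };

int fake_read_word(void *ctx, uint64_t addr, uint64_t *out)
{
  const struct fake_memory *mem = ctx;
  if (addr < mem->base || (addr - mem->base) % 8 != 0)
    return -1;
  size_t index = (size_t) ((addr - mem->base) / 8);
  if (index >= mem->count)
    return -1;
  *out = mem->words[index];
  return 0;
}

/* Follow the saved-fp chain starting at `start`, recording at most
   max_frames frames in `out`.  Returns the number of frames recorded.
   Assumed layout: [fp] = caller fp, [fp + 8] = return address. */
size_t walk_frames(read_word_fn read_word, void *ctx,
                   struct frame_info start,
                   struct frame_info *out, size_t max_frames)
{
  assert(read_word != NULL && out != NULL);
  size_t n = 0;
  struct frame_info cur = start;
  while (n < max_frames) {
    out[n++] = cur;
    uint64_t caller_fp, caller_pc;
    if (read_word(ctx, cur.fp, &caller_fp) != 0)
      break;
    if (read_word(ctx, cur.fp + 8, &caller_pc) != 0)
      break;
    /* The stack grows down, so a caller's fp must be at a higher
       address; anything else means the chain ended or is corrupt. */
    if (caller_fp <= cur.fp)
      break;
    cur.sp = cur.fp + 16;
    cur.fp = caller_fp;
    cur.pc = caller_pc;
  }
  return n;
}
```

The read callback is the only point of contact with the inferior, which
is why the same walk can work on a live process or a core file.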

So from this situation above I can get the stack trace:

  type: exit
    sp: 0x7fffffffd2a0
    fp: 0x7fffffffd2b0
    pc: 0x2bc321007f9b
  type: javascript
    sp: 0x7fffffffd2c0
    fp: 0x7fffffffd328
    pc: 0x2bc321061aaa
    name: ScriptBreakPoint.set at native debug.js:291
  type: javascript
    sp: 0x7fffffffd338
    fp: 0x7fffffffd378
    pc: 0x2bc321071677
    name: Debug.setScriptBreakPoint at native debug.js:560
  type: javascript
    sp: 0x7fffffffd388
    fp: 0x7fffffffd3e0
    pc: 0x2bc32107138a
    name: Debug.setScriptBreakPointById at native debug.js:569
  type: javascript
    sp: 0x7fffffffd3f0
    fp: 0x7fffffffd440
    pc: 0x2bc321006395
    name: Debug.setScriptBreakPointById at native debug.js:<unknown>
  type: javascript
    sp: 0x7fffffffd450
    fp: 0x7fffffffd4c8
    pc: 0x2bc32106db57
    name: Debug.setBreakPoint at native debug.js:449
  type: javascript
    sp: 0x7fffffffd4d8
    fp: 0x7fffffffd518
    pc: 0x2bc321006395
    name: Debug.setBreakPoint at native debug.js:<unknown>
  type: javascript
    sp: 0x7fffffffd528
    fp: 0x7fffffffd560
    pc: 0x2bc32106c4de
    name: TestCase at /hack/v8/test/mjsunit/debug-step-4-in-frame.js:97
  type: javascript
    sp: 0x7fffffffd570
    fp: 0x7fffffffd5b0
    pc: 0x2bc32106a135
    name:  at /hack/v8/test/mjsunit/debug-step-4-in-frame.js:117
  type: internal
    sp: 0x7fffffffd5c0
    fp: 0x7fffffffd5e8
    pc: 0x2bc32102c620
  type: entry
    sp: 0x7fffffffd5f8
    fp: 0x7fffffffd670
    pc: 0x2bc321014d31

which as you will understand is muuuuuuch better.  Yes, I built a
RelocInfoIterator entirely in Scheme as a GDB extension.  Yes, I iterate
over heap pages.  I don't know whether to be proud or ashamed so I guess
both is OK!  It works on core files too!

OK, so what about GDB's backtrace?  Why don't I see any frames below the
entry frame, and indeed why are the frames all wrong in GDB's backtrace?
I first looked to the "frame filter" interface in GDB:

  http://sourceware.org/gdb/current/onlinedocs/gdb/Frame-Filter-API.html#Frame-Filter-API

I added support for frame filters to the Guile extensions:

  http://thread.gmane.org/gmane.comp.gdb.patches/104823

and then I wrote some filters for Guile itself, and it seems they work:

  http://article.gmane.org/gmane.lisp.guile.user/11762

But since Guile is just a bytecode interpreter, there is a proper C
backtrace to start with, so interleaving Scheme and C frames is no big
deal.  In V8 we don't even have a proper backtrace to begin with,
because GDB doesn't know how to unwind the frames and indeed does it
wrong.  Fortunately there is *another* GDB interface, the dynamic JIT
reader:

  http://sourceware.org/gdb/current/onlinedocs/gdb/Custom-Debug-Info.html#Custom-Debug-Info

This allows a loadable module to unwind frames for GDB.  Writing such a
thing and adding it to the V8 build sounds a bit excessive though, and
you wouldn't want to be on the hook for testing this and making sure it
builds, but it would be nice.  Now I thought, you know I _have_ all this
information already in Guile.  Why not add some GDB API to allow Guile
to implement JIT readers?  Which is what I did today, and
then... blah... turns out you have to have _already_ implemented parts
of the static GDB JIT interface to use the dynamic interface!  (You only
get dynamic JIT reader callbacks on memory regions you register via the
static GDB JIT interface.)

Which brings us to the question you have probably had in your head all
this time: what about the gdbjit interface that V8 already has?  Well,
first of all we must say that it is a ridiculous proposition to create
native object files in memory just so that GDB can know what's going
on.  You probably know this, but V8 has a mini ELF writer and a mini
Mach-O writer inside it, and one of them gets compiled when you enable
the GDBJIT interface.  But really it's an irritating thing to have to
do -- it takes up memory and time and it's never on when you need it,
because who compiles V8 that way?  Not even me, and it was me who
suggested that Sanjoy write that code when he was interning at Igalia.

However!!!  (What if you had a kid and named them However?  Disregard
that, I am getting loopy.  Yak fumes.)  So you might be thinking, why
would GDB want you to create ELF objects, only to have to read them
using a loadable shared object with the dynamic JIT reader interface?
Well actually (yep I said it) with the dynamic GDB JIT interface you
don't have to make an ELF.  You just add a node to the linked list of
code descriptors, and the dynamic JIT reader gets a first crack at it.
If it succeeds in providing symbol and line info for the region, then
GDB doesn't go further.  Otherwise GDB tries to parse the file as a
native object file (e.g. ELF), and if that fails it gives up (silently).

Sooooooooooooo, I have a proposal.  It is that we add a mode for V8's
gdbjit integration to do "minimal" gdbjit, where we just register code
blocks (Code* objects) and nothing else -- no names, no line numbers,
nothing but the addresses.  We turn this mode on by default when
building in debug mode, as it has little impact on spacetime.  Perhaps
eventually we remove the "full" gdbjit integration, so as to remove the
ELF writer from our source tree.  Instead we expect people interested in
GDB debugging to load extensions to read the debuginfo, and to unwind
the stack.
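
To make the "minimal" mode concrete, here is a hedged sketch in C of
registering nothing but an address range.  The jit_code_entry and
jit_descriptor declarations and the __jit_debug_register_code and
__jit_debug_descriptor symbols follow the JIT interface described in
the GDB manual.  The code_region payload and the register_code_region
and unregister_code_region helpers are hypothetical and are not V8's
code; a reader on the GDB side would have to agree on that payload
format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag;               /* a jit_actions_t value */
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

/* GDB puts a breakpoint in this function, so it must not be inlined
   or optimized away (GCC/Clang syntax). */
void __attribute__((noinline)) __jit_debug_register_code(void)
{
  __asm__ volatile("");
}

struct jit_descriptor __jit_debug_descriptor = { 1, JIT_NOACTION, NULL, NULL };

/* Hypothetical payload: only the addresses, no names or line info. */
struct code_region { uint64_t start; uint64_t size; };

struct jit_code_entry *register_code_region(uint64_t start, uint64_t size)
{
  struct code_region *region = malloc(sizeof *region);
  struct jit_code_entry *entry = malloc(sizeof *entry);
  if (region == NULL || entry == NULL) {
    free(region);
    free(entry);
    return NULL;
  }
  region->start = start;
  region->size = size;
  entry->symfile_addr = (const char *) region;
  entry->symfile_size = sizeof *region;

  /* Push the entry onto the front of the doubly linked list. */
  entry->prev_entry = NULL;
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry != NULL)
    entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;

  /* Tell GDB which entry changed and how, then hit its breakpoint. */
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
  return entry;
}

void unregister_code_region(struct jit_code_entry *entry)
{
  assert(entry != NULL);
  if (entry->prev_entry != NULL)
    entry->prev_entry->next_entry = entry->next_entry;
  else
    __jit_debug_descriptor.first_entry = entry->next_entry;
  if (entry->next_entry != NULL)
    entry->next_entry->prev_entry = entry->prev_entry;

  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_UNREGISTER_FN;
  __jit_debug_register_code();

  /* Only free once GDB has had its chance to look at the entry. */
  __jit_debug_descriptor.relevant_entry = NULL;
  free((void *) entry->symfile_addr);
  free(entry);
}
```

The per-region cost in this sketch is two small allocations and a list
update, which is the sense in which a mode like this would be cheap
compared with building an object file in memory.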

My plan has some drawbacks:

  1. It relies on patches that are not yet in GDB

  2. It is implemented in Guile, which ain't a googley language

  3. It is prone to breaking as V8 changes, as it's not tightly coupled

  4. It relies on complete debugging information.  Surprisingly this
     just works, with only the exceptions noted below.

However, building a shared object would also be pretty bad, and writing
this in Python would still have all the drawbacks except (2), plus the
extra one that there is no one to implement it, heh heh :)

But if this all comes together you'll be able to get full backtraces
from GDB in all normal circumstances, which is much better than what's
the case today.

What do people think?  Like I say it's hard for me to see clearly
through the yak but maybe you all are far enough away to have some
perspective.

Happy hacking,

Andy


[*] Two exceptions, which are related.  G++ has a bug where classes that
    are never instantiated are never included in the debug info of a
    program, even if their static members are used:

      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65108

    There are few classes like this in V8.  The various
    StackFrameConstants are some of them, though.  In my local d8 I have
    changed them to namespaces, which hacks around this issue.  I'll
    submit a patch though it's a hacky thing to be mindlessly changing
    code just so we work around compiler deficiencies.  (I haven't
    checked with LLVM yet.)

    There is another instance though, which is the various BitField
    templated classes: they are never instantiated, but the previous
    hack doesn't work as namespaces can't be the results of template
    expansion (as far as I know anyway).  So in this second case I
    hard-code defaults if the lookup fails.
