Re: [lldb-dev] eh_frame or debug_frame

Jason Molenda Wed, 15 Oct 2014 18:19:00 -0700

That one doesn't seem to repro the problem.  The unwinder always comes back with


(lldb) bt
* thread #1: tid = 0x4ff734, 0x000000000000201a test`main.foo(x=1) + 26 at 
test.go:4, stop reason = breakpoint 1.1
  * #0: 0x000000000000201a test`main.foo(x=1) + 26 at test.go:4
    #1: 0x0000000000002111 test`main.main + 49 at test.go:15
    #2: 0x000000000000d463 test`runtime.main + 243 at proc.go:63
    #3: 0x00000000000259f0 test`runtime.gosched_m + 192 at proc.c:1641
(lldb)

I can si and backtrace again and I get the same backtrace -- lldb sees that the 
saved pc for frame 4 would be in non-executable memory and stops the stack walk:

    th1/fr4 pc = 0x00000002080b7f98
   th1/fr3 supplying caller's saved reg 6's location using x86_64 default 
unwind plan UnwindPlan
   th1/fr3 supplying caller's register 6 from the stack, saved at CFA plus 
offset -16
    th1/fr4 fp = 0x0000000000000000
   th1/fr3 supplying caller's stack pointer (7) value, computed from CFA
    th1/fr4 sp = 0x00000002080c0010
    th1/fr4 using architectural default unwind method
    th1/fr4 pc is in a non-executable section of memory and this isn't the 2nd 
frame in the stack walk.
    Frame 4 invalid RegisterContext for this frame, stopping stack walk
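
The check reported in the last two log lines can be sketched in a few lines of Python. This is only a toy model of the rule as the log words it, not lldb's implementation: the function names and the executable range are made up for illustration, and the log does not say why the 2nd frame is exempt, so the sketch does not try to explain it either.

```python
# Toy model of the stop condition in the log above; NOT lldb's implementation.
# The range below is a made-up placeholder for the test binary's text segment.
EXECUTABLE_RANGES = [(0x2000, 0x40000)]  # hypothetical [start, end) pairs


def pc_is_executable(pc, ranges=EXECUTABLE_RANGES):
    """Return True if pc falls inside any known executable range."""
    return any(start <= pc < end for start, end in ranges)


def accept_frame(frame_index, pc, ranges=EXECUTABLE_RANGES):
    """Decide whether the stack walk may continue into this frame.

    A pc in executable memory is always accepted.  A pc outside it is only
    accepted for frame 1 (the 2nd frame), mirroring the wording of the log.
    """
    if pc_is_executable(pc, ranges):
        return True
    return frame_index == 1


# Frame 3 has a pc inside the text range; frame 4 has the pc from the log.
print(accept_frame(3, 0x259F0))
print(accept_frame(4, 0x00000002080B7F98))
```

With these placeholder ranges the first call accepts the frame and the second rejects it, which is the point where the walk above stops.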


I needed the patch you appended to http://llvm.org/bugs/show_bug.cgi?id=21118 / 
http://reviews.llvm.org/D5735 to run the program.  Do you have llvm commit 
access?  I'll commit the patch if you don't.

J


> On Oct 15, 2014, at 5:33 PM, Ryan Brown <rib...@google.com> wrote:
> 
> Yes, I'm using os x. You could try using the binary I uploaded in 
> http://llvm.org/bugs/show_bug.cgi?id=21118
> That will only have one go thread though.
> 
> -- Ryan Brown
> 
> On Wed, Oct 15, 2014 at 5:22 PM, Jason Molenda <jmole...@apple.com> wrote:
> Yeah, I was afraid of that.
> 
> What I'm trying to do with this code is say "unwind using your super-super 
> smart techniques ... but if you hit a wall, try the simplistic unwind method 
> and see if you can get further."
> 
> The problem here is that lldb is doing the full stack walk as far as it can 
> be walked ... but it thinks maybe switching to the architecture default 
> unwind plan might get it further (which it does not).  The switch to the arch 
> default unwind plan is destructive - it replaces the assembly profile unwind 
> instructions for that function - and is remembered for future stack walks.  
> That's why your threads get progressively fewer backtraces.
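
A toy model may make the "progressively fewer backtraces" effect easier to see. Everything in it is hypothetical (the function names f0..f4, the plan cache, the validity rule); it is a sketch of the behaviour described above, not lldb's code. The key assumption is that a frame past frame 0 is only usable while its function still has the assembly plan, because this Go code keeps no rbp frame chain for the default plan to follow.

```python
# Toy model of the destructive fallback described above; NOT lldb's code.
# All names here (FUNCS, plan_cache, walk) are hypothetical.
ASSEMBLY = "assembly insn profiling"
ARCH_DEFAULT = "x86_64 default unwind plan"

# Five nested functions, innermost first; every thread is stopped in the
# same place, so every thread should show the same five frames.
FUNCS = ["f0", "f1", "f2", "f3", "f4"]


def walk(plan_cache, destructive=True):
    """Return the number of frames found for one thread."""
    frames = 1  # frame 0 comes straight from the live registers
    for func in FUNCS[1:]:
        if plan_cache[func] != ASSEMBLY:
            break  # assumed: the default plan cannot describe this frame
        frames += 1
    # The walk has hit a wall (a bad frame or the real end of the stack).
    # Try the fallback on the last good frame's function.  It does not get
    # any further, but the swap is remembered in the shared cache.
    if destructive:
        plan_cache[FUNCS[frames - 1]] = ARCH_DEFAULT
    return frames


def backtrace_all(num_threads, destructive=True):
    """Walk every thread in turn with one plan cache shared between them."""
    plan_cache = {func: ASSEMBLY for func in FUNCS}
    return [walk(plan_cache, destructive) for _ in range(num_threads)]


print(backtrace_all(10))                     # [5, 4, 3, 2, 1, 1, 1, 1, 1, 1]
print(backtrace_all(10, destructive=False))  # [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
```

Under these assumptions the first run matches the frame counts reported earlier in the thread (5, then 4, 3, 2, and 1 for the rest), and the second shows that the loss disappears once the fallback is not remembered.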
> 
> I'll need to look into this and come up with a fix.  I don't suppose your go 
> binary runs on mac os x, does it?  It would be great if I had a failing test 
> program in front of me while I try to come up with a fix.
> 
> 
> > On Oct 15, 2014, at 5:13 PM, Ryan Brown <rib...@google.com> wrote:
> >
> > So adding "return false" to the top of TryFallbackUnwindPlan() fixes the 
> > problem.
> > The call at UnwindLLDB:177, when !reg_ctx_sp->IsValid() seems to be the 
> > only one I'm hitting.
> >
> > -- Ryan Brown
> >
> > On Wed, Oct 15, 2014 at 3:20 PM, Jason Molenda <jmole...@apple.com> wrote:
> > urgh, sorry, I wasn't paying attention to the svn log output when I copy & 
> > pasted the rev.  It's this change I wanted to mention - r219247.  It's 
> > going to be someone calling TryFallbackUnwindPlan(), I just added some new 
> > cases where that could be called.  It may not be my most recent change 
> > (219247) but it's going to be that method which is causing the problem.
> >
> >
> > > On Oct 15, 2014, at 2:59 PM, Ryan Brown <rib...@google.com> wrote:
> > >
> > > rolling back r219772 (Be more consistent about null checks for the 
> > > Process and ABI in GetFullUnwindPlanForFrame) doesn't seem to have any 
> > > effect.
> > >
> > > -- Ryan Brown
> > >
> > > On Wed, Oct 15, 2014 at 2:31 PM, Jason Molenda <jmole...@apple.com> wrote:
> > >
> > > > On Oct 15, 2014, at 1:43 PM, Ryan Brown <rib...@google.com> wrote:
> > > >
> > > > Go doesn't have exception handlers, so it doesn't write .eh_frame. 
> > > > Wouldn't it make sense to use .debug_frame if .eh_frame is missing?
> > >
> > >
> > > We could do that.  I'm surprised if go is emitting x86_64 code without 
> > > eh_frame.  As Joerg points out, debug_frame is great but it may not be 
> > > available when an analysis tool is examining a binary.  eh_frame has the 
> > > benefit of always being in the binary.
> > >
> > > >
> > > > With my custom RegisterContext I got backtraces to work for my memory 
> > > > threads. But something strange is going on. I have 10 threads that 
> > > > should have identical traces, but the first has 5 frames, then 4, 3, 2, 
> > > > and the rest only have 1 frame.
> > >
> > >
> > > It's easiest to isolate one thread backtrace in a situation like this.  
> > > For instance, looking at thread 7 in your program.  (the unwind 
> > > algorithms have no cross-thread information passing):
> > >
> > >
> > > th7/fr0 initialized frame current pc is 0xdaef cfa is 0x20809feb8 using 
> > > assembly insn profiling UnwindPlan
> > >
> > > lldb is using the assembly unwind inspection for frame 0.  You said that 
> > > all ten threads should have the same backtrace but thread #2 is at 
> > > 0x2fe8c, #3 is at 0x209a, threads 4-15 are at 0xdaef.  You meant threads 
> > > 4-15 should all be the same.
> > >
> > >
> > >      th7/fr5 pc = 0x0000000000002078
> > >      th7/fr5 fp = 0xffffffffffffffff
> > >     th7/fr4 supplying caller's stack pointer (7) value, computed from CFA
> > >      th7/fr5 sp = 0x000000020809ffc8
> > >      th7/fr5 active row: 0x0000000000002050: CFA=rbp+16 => rbp=[rbp] 
> > > rsp=rbp+16 rip=[rbp+8]
> > >
> > > > That's the architectural default unwind plan for x86_64 ABIs.  Over in 
> > > > thread 6, it looks like lldb failed to unwind past frame 5 with the 
> > > > assembly unwind, figured the assembly unwind was incorrect, and tried 
> > > > switching over to using the architectural default unwind plan:
> > >
> > > th6/fr0 supplying caller's saved reg 6's location, cached
> > >      th6/fr5 full unwind plan 'assembly insn profiling' has been replaced 
> > > by architecture default unwind plan 'x86_64 default unwind plan' for this 
> > > function from now on.
> > >      th6/fr5 supplying caller's saved reg 16's location using x86_64 
> > > default unwind plan UnwindPlan
> > >      th6/fr5 supplying caller's register 16 from the stack, saved at CFA 
> > > plus offset -8
> > >       th6/fr6 could not get pc value
> > >       Frame 6 invalid RegisterContext for this frame, stopping stack walk
> > > th6 Unwind of this thread is complete.
> > >
> > > From this point forward main.okread() will use the arch default unwind 
> > > plan which isn't going to work.
> > >
> > > Can you try rolling back r219772 and seeing if that helps?  I suspect 
> > > lldb may be slowly stripping off the last frame of the unwind for each 
> > > thread as it progresses.
> > >
> > > J
> > >
> > > PS- "bt all" works just as well as "thread backtrace all".
> > >
> > >
> > > >
> > > > There's a log here, thread 6 is the one with the complete backtrace. 
> > > > https://gist.github.com/ribrdb/386fb0e555e82483d21d
> > > >
> > > > Comparing thread 7 with thread 6, things seem fine up to line 627:
> > > >     th7/fr4 supplying caller's stack pointer (7) value, computed from 
> > > > CFA
> > > >      th7/fr5 sp = 0x000000020809ffc8
> > > >      th7/fr5 active row: 0x0000000000002050: CFA=rbp+16 => rbp=[rbp] 
> > > > rsp=rbp+16 rip=[rbp+8]
> > > >
> > > > While thread 6 has:
> > > >      th6/fr4 supplying caller's stack pointer (7) value, computed from 
> > > > CFA
> > > >      th6/fr5 sp = 0x000000020809f7c8
> > > >      th6/fr5 active row: 0x000000000000206a: CFA=rsp+16 => rsp=rsp+16 
> > > > rip=[rsp+8]
> > > >
> > > > I don't know where rbp came from, it's not in the function at all:
> > > > 0x2050 <main.okread>: movq   %gs:0x8a0, %rcx
> > > > 0x2059 <main.okread+9>: cmpq   0x10(%rcx), %rsp
> > > > 0x205d <main.okread+13>: ja     0x2066                    ; main.okread 
> > > > + 22 at test.go:9
> > > > 0x205f <main.okread+15>: callq  0x2d510                   ; 
> > > > runtime.morestack_noctxt at asm_amd64.s:330
> > > > 0x2064 <main.okread+20>: jmp    0x2050                    ; main.okread 
> > > > at test.go:9
> > > > 0x2066 <main.okread+22>: subq   $0x8, %rsp
> > > > 0x206a <main.okread+26>: movq   0x10(%rsp), %rbx
> > > > 0x206f <main.okread+31>: movq   %rbx, (%rsp)
> > > > 0x2073 <main.okread+35>: callq  0x2000                    ; main.doread 
> > > > at test.go:5
> > > > 0x2078 <main.okread+40>: addq   $0x8, %rsp
> > > > 0x207c <main.okread+44>: retq
> > > > 0x207d <main.okread+45>: addb   %al, (%rax)
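
For reference, the rsp-based row can be reproduced from this listing with a very small instruction scan. The sketch below is a hedged approximation of what "assembly insn profiling" has to work out, not lldb's UnwindAssembly code: it walks the instructions linearly, ignores branches, and only understands subq/addq on %rsp. The function name and the tuple layout are assumptions.

```python
# Minimal sketch of profiling the listing above; NOT lldb's UnwindAssembly.
# Linear scan only: branches are ignored and only subq/addq on %rsp count.
import re

# (address, instruction text) copied from the main.okread disassembly.
OKREAD = [
    (0x2050, "movq   %gs:0x8a0, %rcx"),
    (0x2059, "cmpq   0x10(%rcx), %rsp"),
    (0x205D, "ja     0x2066"),
    (0x205F, "callq  0x2d510"),
    (0x2064, "jmp    0x2050"),
    (0x2066, "subq   $0x8, %rsp"),
    (0x206A, "movq   0x10(%rsp), %rbx"),
    (0x206F, "movq   %rbx, (%rsp)"),
    (0x2073, "callq  0x2000"),
    (0x2078, "addq   $0x8, %rsp"),
    (0x207C, "retq"),
]

RSP_ADJUST = re.compile(r"^(subq|addq)\s+\$(0x[0-9a-f]+|\d+), %rsp$")


def cfa_offsets(insns):
    """Map each instruction address to N in 'CFA = rsp + N' on entry to it."""
    offset = 8  # at function entry only the return address is on the stack
    rows = {}
    for addr, text in insns:
        rows[addr] = offset  # state before this instruction executes
        match = RSP_ADJUST.match(text)
        if match:
            delta = int(match.group(2), 0)
            offset += delta if match.group(1) == "subq" else -delta
    return rows


rows = cfa_offsets(OKREAD)
for addr in (0x2050, 0x206A, 0x207C):
    print(f"{addr:#06x}: CFA=rsp+{rows[addr]}")
```

The scan gives CFA=rsp+8 at 0x2050, CFA=rsp+16 at 0x206a (the row thread 6 shows) and CFA=rsp+8 at 0x207c. No rbp-based row can come out of this listing, which fits the rbp row having a different source than the disassembly.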
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > -- Ryan Brown
> > > >
> > > > On Wed, Oct 15, 2014 at 11:48 AM, Ryan Brown <rib...@google.com> wrote:
> > > > Yes, I'm writing a class to do that now. It's just not supported by any 
> > > > of the existing register contexts.
> > > >
> > > > -- Ryan Brown
> > > >
> > > > On Wed, Oct 15, 2014 at 11:37 AM, Jason Molenda <ja...@molenda.com> 
> > > > wrote:
> > > > Can't your OS plugin for the goroutines use the same sp and ip register 
> > > > numbers as x86_64 (instead of 0 and 1 like you might be using right 
> > > > now) when it reports them to lldb, and return all the other registers 
> > > > as "unavailable" if they're requested?
> > > >
> > > > The tricky bit about living on eh_frame / debug_frame is that lldb 
> > > > doesn't know what kind of unwind info it is being given.  Is it just 
> > > > for exception handling locations?  Does it contain prologue setup?  
> > > > epilogue?  Is it fully asynchronous - giving unwind details at all 
> > > > locations?  There aren't any flags in eh_frame/debug_frame that could 
> > > > give us a hint about what we're working with.
> > > >
> > > >
> > > >
> > > > On Oct 15, 2014, at 11:24 AM, Ryan Brown <rib...@google.com> wrote:
> > > >
> > > > > I'm actually struggling with this right now. I'm trying to implement 
> > > > > an OS plugin so goroutines show up as threads.
> > > > > > The go compiler puts instruction-accurate unwind info into 
> > > > > > .debug_frame, I'm not sure what (if anything) goes into eh_frame.
> > > > > > However, lldb uses the disassembly instead of the DWARF info. The x86 
> > > > > unwinder assumes that all threads have the same LLDB register 
> > > > > numbers, but other parts of the code require that the LLDB register 
> > > > > number is < (number of registers). Goroutines only store sp and ip, 
> > > > > so it seems I'm going to have to create a custom RegisterContext 
> > > > > subclass to get the existing unwinder to work for goroutines.
> > > > >
> > > > > On Tue, Oct 14, 2014 at 5:51 PM, Jason Molenda <jmole...@apple.com> 
> > > > > wrote:
> > > > > > > On Oct 13, 2014, at 9:55 AM, Greg Clayton <gclayton at apple.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> On Oct 10, 2014, at 1:58 PM, Francois Pichet <pichet2000 at
> > > > > > >> gmail.com> wrote:
> > > > > > >>
> > > > > > >> On Fri, Oct 10, 2014 at 4:20 PM, Greg Clayton <gclayton at
> > > > > > >> apple.com> wrote:
> > > > > > >>
> > > > > > >>> On Oct 10, 2014, at 1:05 PM, Philippe Lavoie <philippe.lavoie at
> > > > > > >>> octasic.com> wrote:
> > > > > > >>>
> > > > > > >>> Hi,
> > > > > > >>>
> > > > > > >>> I noticed that by default lldb does not read the .debug_frame
> > > > > > >>> section to unwind frames but relies instead on .eh_frame.
> > > > > > >>>
> > > > > > >>> Is there a way to fall back to reading .debug_frame?
> > > > > > >>
> > > > > > >> Not currently. Most compilers (gcc _and_ clang) put the same old
> > > > > > >> stuff in .debug_frame as they do in .eh_frame, so we haven't had to
> > > > > > >> use .debug_frame over .eh_frame yet. What compiler are you using
> > > > > > >> that is putting different (more complete) info in .debug_frame vs
> > > > > > >> .eh_frame?
> > > > > > >>
> > > > > > >> What about C or C++ programs compiled with -fno-exceptions?
> > > > > > >>
> > > > > > >> They will fall back to the UnwindAssembly way even if the
> > > > > > >> .debug_frame is present, right?
> > > > > > >
> > > > > > > If no EH frame exists for a frame, then we will always fall back to
> > > > > > > UnwindAssembly. We always use UnwindAssembly for the first frame and
> > > > > > > for any frame that is past an async interrupt (sigtramp). We use the
> > > > > > > EH frame/.debug_frame for any non-zero frames, but will always use
> > > > > > > UnwindAssembly if there is no such info.
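
The selection rule in that last reply is small enough to write down as a decision function. This is a sketch of the rule as stated there, not lldb's actual plan selection; the function and parameter names are hypothetical.

```python
# Sketch of the selection rule as described in the reply above.
# Hypothetical names; this is not lldb's plan-selection code.
def choose_unwind_source(frame_index, past_async_interrupt, has_eh_frame):
    """Pick where the unwind instructions for one frame come from."""
    # Frame 0 and any frame interrupted asynchronously (e.g. the frame
    # above sigtramp) can be stopped at any instruction, so they get the
    # instruction-level analysis.
    if frame_index == 0 or past_async_interrupt:
        return "UnwindAssembly"
    # Other frames are stopped at a call site, where eh_frame is valid.
    if has_eh_frame:
        return "eh_frame"
    # No eh_frame entry for this function: fall back to the analysis.
    return "UnwindAssembly"


print(choose_unwind_source(0, False, True))   # UnwindAssembly
print(choose_unwind_source(2, False, True))   # eh_frame
print(choose_unwind_source(2, True, True))    # UnwindAssembly
print(choose_unwind_source(2, False, False))  # UnwindAssembly
```

A Go binary without eh_frame takes the last branch for every frame, which is why the disassembly-based plan ends up being used throughout.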
> > > > >
> > > > >
> > > > >
> > > > > I want to expand on what Greg said earlier about eh_frame versus 
> > > > > debug_frame.
> > > > >
> > > > > Ideally, eh_frame will be the minimal unwind instructions necessary 
> > > > > to unwind the stack when exceptions are thrown/caught.  eh_frame will 
> > > > > not include unwind instructions for the prologue instructions or 
> > > > > epilogue instructions -- because we can't throw an exception there, 
> > > > > or have an exception thrown from a called function "below" us on the 
> > > > > stack.  We call these unwind instructions "synchronous" because they 
> > > > > only describe the unwind state from a small set of locations.
> > > > >
> > > > > debug_frame would describe how to unwind the stack at every 
> > > > > instruction location.  Every instruction of the prologue and 
> > > > > epilogue.  If the code is built without a frame pointer, then it 
> > > > > would have unwind instructions at every place where the stack pointer 
> > > > > is modified.  We describe these unwind instructions as "asynchronous" 
> > > > > because they describe the unwind state at every instruction location.
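
The difference between the two kinds of table can be shown with a lookup over invented rows. This is a sketch only: the addresses, the stack contents, and the helper names are made up, and the "synchronous" table is deliberately reduced to a single row that is only right at call sites.

```python
# Sketch only: tables, addresses, and stack contents are invented.
from bisect import bisect_right

# Each row: (first address it applies to, N in "CFA = rsp + N").
ASYNC_ROWS = [(0x2050, 8), (0x206A, 16), (0x207C, 8)]  # every rsp change
SYNC_ROWS = [(0x2050, 16)]                             # right at call sites only


def cfa_offset(rows, pc):
    """Return the CFA offset of the last row starting at or before pc."""
    starts = [start for start, _ in rows]
    index = bisect_right(starts, pc) - 1
    if index < 0:
        raise LookupError(f"no unwind row covers {pc:#x}")
    return rows[index][1]


def caller_pc(rows, pc, rsp, memory):
    """Unwind one frame: on x86_64 the return address sits at CFA - 8."""
    cfa = rsp + cfa_offset(rows, pc)
    return memory[cfa - 8]


# Hypothetical stack: the real return address 0x2111 lives at 0x1000.
memory = {0x1000: 0x2111, 0x1008: 0xDEAD}

print(hex(caller_pc(ASYNC_ROWS, 0x2073, 0xFF8, memory)))   # at a call site
print(hex(caller_pc(SYNC_ROWS, 0x2073, 0xFF8, memory)))    # same answer
print(hex(caller_pc(ASYNC_ROWS, 0x2050, 0x1000, memory)))  # first instruction
print(hex(caller_pc(SYNC_ROWS, 0x2050, 0x1000, memory)))   # wrong stack slot
```

Both tables agree at the call site (0x2111). On the first instruction only the asynchronous table finds the return address; the synchronous one reads the neighbouring slot (0xdead here). A debugger stopped at an arbitrary instruction is in exactly that position.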
> > > > >
> > > > >
> > > > > Instead what we have with gcc and clang is eh_frame instructions that 
> > > > > > describe the prologue (and, in some versions of gcc, the epilogue) plus 
> > > > > the unwind state at synchronous unwind locations (where an exception 
> > > > > can be thrown).  We have a half-way blend of asynchronous and 
> > > > > synchronous ... it's "pretty good" but not "guaranteed" from a 
> > > > > debugger's perspective.  It would be great if eh_frame was genuinely 
> > > > > only the unwind instructions for exception handling and debug_frame 
> > > > > had the full unwind state at every instruction and we could depend on 
> > > > > debug_frame.  But in reality, the same unwind instructions are put in 
> > > > > both eh_frame and debug_frame -- so there's little point in ever 
> > > > > reading debug_frame.  lldb does not read debug_frame today, although 
> > > > > it would be easy to do so.
> > > > >
> > > > >
> > > > > As an experiment starting late August (r216406), lldb is now trying 
> > > > > to use eh_frame for the currently-executing frame.  Even though it 
> > > > > isn't *guaranteed* to be accurate at all instructions, in practice 
> > > > > it's pretty good -- good enough that gdb seems to be able to live on 
> > > > > it.  Tong Shen's patch in r216406 does augment the eh_frame unwind 
> > > > > instructions with the epilogue unwind... newer gcc's apparently 
> > > > > describe the epilogue in eh_frame but few other compilers do.
> > > > >
> > > > > It's an open question how well living off eh_frame unwind 
> > > > > instructions will work with a non-gcc/non-clang compiler.  That's why 
> > > > > I say this is an "experiment" - we may have to revert to lldb's 
> > > > > UnwindAssembly profiling code for the currently-executing function if 
> > > > > this breaks with other compilers.
> > > > >
> > > > > J
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > lldb-dev mailing list
> > > > > lldb-dev@cs.uiuc.edu
> > > > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > lldb-dev mailing list
> > > > lldb-dev@cs.uiuc.edu
> > > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> > >
> > >
> >
> >
> 
> 


_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
