Ping!

FYI Jason, I verified that the original patch (attached again) continues to 
apply cleanly and resolve the failure in functionalities/inferior-assert with 
SVN trunk.

- Ashok

-----Original Message-----
From: [email protected] [mailto:[email protected]] On 
Behalf Of Thirumurthi, Ashok
Sent: Wednesday, June 12, 2013 12:34 PM
To: Jason Molenda
Cc: [email protected]
Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated after 
assertion failure in inferior

> Hi Ashok, thanks for working on this -- I know the unwinder code can 
> be hard to modify, RegisterContextLLDB.cpp is a little complex in 
> places. :/
For sure, Jason, thanks for the sophisticated unwinder.


> A recent change to ObjectFileMachO is that it also gets the function start 
> addresses from the eh_frame information if LC_FUNCTION_STARTS doesn't exist:
Nice, I see how that's an advantage in spite of the performance hit.  I'll 
certainly look at reworking ObjectFileELF to add function symbols from the 
eh_frame information where the real symbols have been stripped.
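
As a rough illustration of that rework (a sketch only, not ObjectFileELF code): 
assuming the eh_frame FDEs have already been parsed into (start, size) pairs, 
any range without a symbol can be given a placeholder name in the style of the 
"__lldb_unnamed_function3491$$Dock" names used on Mac OS X.  The function name, 
tuple layout and numbering scheme below are hypothetical.

```python
def add_unnamed_function_symbols(symbols, fde_ranges, module_name):
    """Return symbols plus placeholder entries for FDE ranges that lack one.

    symbols:    list of (start, end, name) tuples from the symbol table
    fde_ranges: list of (start, size) pairs taken from eh_frame FDEs
    """
    named_starts = {start for start, _end, _name in symbols}
    merged = list(symbols)
    counter = 0
    for start, size in sorted(fde_ranges):
        if start in named_starts:
            continue  # a real symbol already covers this function start
        counter += 1
        # Placeholder name; the numbering here is an assumption of the sketch.
        name = f"__lldb_unnamed_function{counter}$${module_name}"
        merged.append((start, start + size, name))
    merged.sort()
    return merged
```

With one real symbol and two FDEs, only the uncovered FDE gets a placeholder, 
so a later lookup by address has a function start to work from.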


> Let me know what you think.
Perhaps the best approach is to do both.  Having my suggested new code path in 
the unwinder isn't fundamentally wrong or a performance concern.  On the 
contrary, it unblocks Linux core file support and fixes a high-profile bug for 
a common use case.  I think it also widens the applicability of the unwinder 
while we look for improvements in other object-file formats (e.g. 
ObjectFilePECOFF).

If you like the idea, I'm happy to commit & improve,

- Ashok


On Jun 7, 2013, at 11:46 AM, "Thirumurthi, Ashok" <[email protected]> 
wrote:

> Hi Jason,
> 
>>  Frame 2 did not get a valid CFA for this frame, stopping stack walk
> So, the attached patch allows the unwinder to get past frame 2 using eh_frame 
> information that is dug up based on the pc rather than the start address of 
> the function (i.e. to handle the case where the function symbol is 
> unavailable).
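
To make the pc-based lookup concrete, here is a minimal sketch (not the patch 
itself).  It assumes the eh_frame FDEs are available as sorted, non-overlapping 
(start, end, plan) tuples; that layout, the function name and the address range 
in the example are hypothetical.

```python
import bisect


def find_fde_for_pc(fdes, pc):
    """Return the (start, end, plan) entry whose range contains pc, or None.

    fdes must be sorted by start address; end is exclusive.
    """
    starts = [start for start, _end, _plan in fdes]
    # Index of the last FDE whose start address is <= pc.
    i = bisect.bisect_right(starts, pc) - 1
    if i >= 0 and fdes[i][0] <= pc < fdes[i][1]:
        return fdes[i]
    return None


# Hypothetical table: the middle range has no symbol, only an FDE.
fdes = [
    (0x7ffba7b1c000, 0x7ffba7b1c0a0, "plan A"),
    (0x7ffba7b1c0a0, 0x7ffba7b1c150, "plan B"),
    (0x7ffba7b1c150, 0x7ffba7b1c1f0, "plan C"),
]
print(find_fde_for_pc(fdes, 0x7ffba7b1c0ee))
```

The point is that nothing here needs a function start address or a 
SymbolContext; the pc alone selects the unwind information.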
> 
> This fix is coupled with GetFullUnwindPlanForFrame rather than lowered to 
> UnwindTable and FuncUnwinders.  Alternately, I could add or modify routines 
> like GetFuncUnwindersContainingAddress to avoid the requirement for a 
> SymbolContext.  Similarly, I could add or modify routines like 
> GetUnwindPlanAtCallSite to allow the caller to specify a pc.
> 
> The attached patch also slides m_current_pc in the case where a Symbol is 
> found at pc - 1.  Note that the log while adding frame 2 indicates a bogus fp:
>   th1/fr2 supplying caller's register 6 from the stack, saved at CFA plus offset
>   th1/fr3 fp = 0x00000000004006db
> 
> The slide keeps me out of the weeds while adding frame 3 (see the attached 
> log).  The combined result is a healthy stack:
> 
> (lldb) bt
> * thread #1: tid = 0x2987, 0x00007ffba7b23425 libc.so.6`raise + 53, stop reason = signal SIGABRT
>     frame #0: 0x00007ffba7b23425 libc.so.6`raise + 53
>     frame #1: 0x00007ffba7b26b8b libc.so.6`abort + 379
>     frame #2: 0x00007ffba7b1c0ee libc.so.6
>     frame #3: 0x00007ffba7b1c192 libc.so.6`__assert_fail + 66
>     frame #4: 0x00000000004005c0 a.out`main(argc=1, argv=0x00007fff1ccfbd68) + 112 at main.c:18
>     frame #5: 0x00007ffba7b0e76d libc.so.6`__libc_start_main + 237
>     frame #6: 0x0000000000400489 a.out`_start + 41
> 
> Perhaps it would be helpful to provide a slightly different entry for frame 
> #2 like:
>     frame #2: 0x00007ffba7b1c0ee libc.so.6`??? + offset
> 
> For now, I set eSkipFrame which is documented as a frame state that indicates 
> that the unwinder found issues and is hoping to recover. Perhaps a new value 
> would better document the fact that the frame goes with a function with no 
> known symbol.  
> 
> I'll commit this patch by next Monday since this is an important use 
> case for lldb 3.3 (and I assume that WWDC is all encompassing for a 
> bit), but do fire away with any feedback.  Cheers,
> 
> - Ashok
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Thirumurthi, Ashok
> Sent: Tuesday, May 28, 2013 10:52 AM
> To: [email protected]
> Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated 
> after assertion failure in inferior
> 
> FYI, gdb can identify the frame addresses for/relative to mystery frame 2 
> while at the assert site:
> 
> (gdb) f 2
> #2  0x00007ffff7a4a0ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> 
> (gdb) info frame
> Stack level 2, frame at 0x7fffffffdee0:
>  rip = 0x7ffff7a4a0ee; saved rip 0x7ffff7a4a192
>  called by frame at 0x7fffffffdf10, caller of frame at 0x7fffffffde80
>  Arglist at 0x7fffffffde78, args:
>  Locals at 0x7fffffffde78, Previous frame's sp is 0x7fffffffdee0
>  Saved registers:
>   rbx at 0x7fffffffdec0, rbp at 0x7fffffffdec8, r12 at 0x7fffffffded0,
>   rip at 0x7fffffffded8
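
Reading the saved-register addresses relative to the frame address gives the 
fixed offsets an unwind row would encode.  A small sketch of that arithmetic, 
using only the values from the gdb output and assuming that gdb's "frame at" 
address is the CFA (it matches "Previous frame's sp"); the variable names are 
mine.

```python
# Values copied from the "info frame" output for frame 2.
cfa = 0x7fffffffdee0
saved = {
    "rbx": 0x7fffffffdec0,
    "rbp": 0x7fffffffdec8,
    "r12": 0x7fffffffded0,
    "rip": 0x7fffffffded8,
}

# Offset of each save slot from the CFA, as a CFI row would express it.
offsets = {reg: addr - cfa for reg, addr in saved.items()}
for reg, off in offsets.items():
    print(f"{reg} saved at CFA{off:+d}")
```

This gives rbx at CFA-32, rbp at CFA-24, r12 at CFA-16 and rip at CFA-8, i.e. 
the return address sits in the slot just below the CFA as usual on x86-64.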
> 
> - Ashok
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Thirumurthi, Ashok
> Sent: Monday, May 27, 2013 5:09 PM
> To: [email protected]
> Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated 
> after assertion failure in inferior
> 
> Hi Jason,
> 
> So, this thread is still relevant and reproducible using 
> functionalities/inferior-assert on platforms where libc.so is compiled 
> with -fomit-frame-pointer.
> 
>>>> The only solution I can think of here is if abort()'s eh_frame does 
>>>> provide a saved location for rbp but lldb failed to read it correctly.  
>>>> Else, I have no idea how gdb managed to unwind out of this one.
> 
> FYI, the routine RegisterContextLLDB::InitializeNonZerothFrame calls 
> ReadGPRValue for active_row->GetCFARegister(), which allows m_cfa to be set 
> for frame 1 'abort'.  When this routine runs for the mystery frame 2, 
> m_sym_ctx.GetAddressRange comes up empty handed (consistent with gdb's 
> backtrace), so addr_range.GetBaseAddress() is not valid.  As a result, 
> m_current_offset is -1, and this routine returns before m_cfa is read, 
> resulting in an invalid frame.
> 
> 
>> But in this particular backtrace we've got -fomit-frame-pointer frames using 
>> eh_frame, then one function that doesn't have any symbol name or eh_frame 
>> entry, and I honestly have no idea how gdb found its way out of that one.  
> 
> Even if the function for frame 2 doesn't have a symbol name, is it possible 
> that it has an eh_frame entry that we can use?
> 
> 
>>>> The only reasonable approach here would be to assume that this frame used 
>>>> a frame pointer (rbp), grab the saved rbp value and try to find the 
>>>> caller's pc based on that -- but that failed.
> 
> So, I see the code that executes to handle the case where a function ends 
> with a call instruction, which backs up the PC by one byte. However, 
> ResolveSymbolContextForAddress fails, and SymbolContext::GetAddressRange 
> comes up empty handed because the member function is 0, so addr_range is not 
> set by this code.
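
The reason for backing up the pc can be sketched with a hypothetical 
two-function symbol table.  When a function's last instruction is a call to 
something that never returns (abort, __assert_fail), the return address is the 
first byte after that function, so a plain lookup lands in its neighbour.  The 
names and addresses are made up, and the sketch assumes frame 0's pc is exact.

```python
def find_symbol(symbols, addr):
    """Return the name of the (start, end, name) range containing addr."""
    for start, end, name in symbols:
        if start <= addr < end:
            return name
    return None


def symbol_for_frame(symbols, pc, frame_index):
    """Look up the function for a frame's pc.

    For frames above 0 the pc is a return address, which points just past the
    call instruction, so the lookup uses pc - 1 to stay inside the caller.
    """
    lookup_pc = pc if frame_index == 0 else pc - 1
    return find_symbol(symbols, lookup_pc)


# Hypothetical layout: "caller" ends with a call, "neighbour" follows it.
symbols = [(0x100, 0x120, "caller"), (0x120, 0x140, "neighbour")]
print(find_symbol(symbols, 0x120))          # plain lookup of the return address
print(symbol_for_frame(symbols, 0x120, 1))  # lookup with the pc backed up
```

The plain lookup reports "neighbour" while the backed-up lookup reports 
"caller".  Neither helps when the range has no symbol at all, which is the 
frame 2 case.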
> 
> Without a function symbol, is there a way to set m_current_offset so 
> that ReadGPRValue can read the saved rbp for frame 2?  Thanks,
> 
> - Ashok
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Langmuir, Ben
> Sent: Monday, April 08, 2013 10:12 AM
> To: Luddy Harrison; Jason Molenda
> Cc: [email protected]
> Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated 
> after assertion failure in inferior
> 
> I've updated bugzilla with the output of image show-unwind -n abort.  I 
> couldn't attach the output of readelf -wf libc.so.6 (too big) - is there a 
> way to only show info about the abort function?  The name 'abort' doesn't 
> appear in the output.
> 
> Ben
> 
> -----Original Message-----
> From: Luddy Harrison [mailto:[email protected]]
> Sent: Monday, April 08, 2013 6:18 AM
> To: Jason Molenda
> Cc: Langmuir, Ben; [email protected]
> Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated 
> after assertion failure in inferior
> 
> hi, just to clarify, I regularly write asm with no eh frames or function 
> bounds, no .cfi.   gdb unwinds my leaf functions fine.  it is my impression 
> that gdb will in the absence of frame info assume that the topmost item on 
> the stack at a trap is a return pc (even though the trapped pc cannot be 
> identified and has invalid rbp, so disasm of the leaf itself is not possible).
> 
> put differently if one can't figure out the leaf one can grope for the return 
> pc on the stack and try again at the caller.  if the return pc points just 
> after a plausible-looking call insn then you're good.   hope that makes 
> sense...
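
A minimal sketch of the heuristic described above, with no claim that gdb 
works this way.  It assumes x86-64, takes the 8 bytes at the stack pointer as 
a candidate return address, and accepts the candidate only if the 5 bytes 
before it start with 0xE8 (a direct "call rel32").  Indirect calls are ignored 
here, and the function names and byte layout are hypothetical.

```python
import struct


def looks_like_return_address(code, code_base, addr):
    """True if addr directly follows a 5-byte 'call rel32' (opcode 0xE8)."""
    off = addr - code_base
    return 5 <= off <= len(code) and code[off - 5] == 0xE8


def guess_caller_pc(stack, sp_offset, code, code_base):
    """Treat the top stack slot as a return pc; return it if plausible."""
    (candidate,) = struct.unpack_from("<Q", stack, sp_offset)
    if looks_like_return_address(code, code_base, candidate):
        return candidate
    return None


# Hypothetical code bytes: three nops, call rel32, ret.
code = b"\x90\x90\x90" + b"\xE8\x00\x00\x00\x00" + b"\xC3"
base = 0x400000
stack = struct.pack("<Q", base + 8)  # address of the byte after the call
print(hex(guess_caller_pc(stack, 0, code, base)))
```

If the candidate passes, the unwinder can retry at the caller with that pc; 
if not, it has to give up or try another fallback.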
> 
> Sent from my iPhone
> 
> On 8 Apr, 2013, at 17:43, Jason Molenda <[email protected]> wrote:
> 
>> Yeah, lldb uses similar tricks.  If you have eh_frame instructions, 
>> unwinding from -fomit-frame-pointer code is easy.  And if you have accurate 
>> function bounds for all the frames, lldb can usually manage to unwind an 
>> -fomit-frame-pointer stack without eh_frame (because it inspects the actual 
>> assembly instructions in the prologue to understand the stack setup).  But 
>> in this particular backtrace we've got -fomit-frame-pointer frames using 
>> eh_frame, then one function that doesn't have any symbol name or eh_frame 
>> entry, and I honestly have no idea how gdb found its way out of that one.  
>> The only reasonable approach here would be to assume that this frame used a 
>> frame pointer (rbp), grab the saved rbp value and try to find the caller's 
>> pc based on that -- but that failed.
>> 
>> Well, maybe the additional information from Ben (the eh_frame instructions 
>> for abort() most importantly) will provide a hint.  The only thing I can 
>> think is that maybe lldb misinterpreted that function's eh_frame 
>> instructions.
>> 
>> J
>> 
>> 
>> On Apr 8, 2013, at 1:20 AM, Luddy Harrison wrote:
>> 
>>> having done lots of asm debugging with gdb, I can offer a guess.  gdb seems 
>>> to be able to unwind frameless leaf functions with no unwind info.   so 
>>> perhaps as a final fallback it pops the top entry on the stack and treats 
>>> it as the return pc.  if it can unwind the caller using that pc, then it is 
>>> good.
>>> 
>>> just a guess...
>>> 
>>> -Luddy
>>> 
>>> Sent from my iPhone
>>> 
>>> On 8 Apr, 2013, at 6:01, Jason Molenda <[email protected]> wrote:
>>> 
>>>> I see what's going on here.
>>>> 
>>>> /lib/x86_64-linux-gnu/libc.so.6 was built -fomit-frame-pointer, and 
>>>> it includes eh_frame instructions on how to unwind the frames.  But 
>>>> when lldb gets to
>>>> 
>>>> #2  0x00007ffff7a4a0ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>>>> 
>>>> it doesn't have any eh_frame instructions.  lldb can figure out the stack 
>>>> pointer value (from frame 1) which tells us the "bottom" of this stack 
>>>> frame but it can't find the "top" without eh_frame unwind instructions or 
>>>> knowing what function it is in so it can do an assembly instruction scan 
>>>> to understand how the stack frame was set up.  lldb tries to get a saved 
>>>> frame pointer (rbp) which would give us the "top" of the stack frame but 
>>>> the saved rbp value it gets (0x40067e0) is obviously invalid.
>>>> 
>>>> It might be interesting to see the output of
>>>> 
>>>> image show-unwind -n abort
>>>> 
>>>> to see exactly what the eh_frame instructions read (this is lldb's 
>>>> interpretation of the eh_frame instructions, of course; it might be 
>>>> useful to include the output of readelf -wf libc.so.6 or readelf -wF 
>>>> libc.so.6 for the abort() function, going by a web page for readelf 
>>>> I found.)  The log output included this,
>>>> 
>>>> th1/fr0 supplying caller's saved reg 16's location, cached
>>>> th1/fr1 requested caller's saved PC but this UnwindPlan uses a RA reg; getting reg 16 instead
>>>> th1/fr1 supplying caller's saved reg 16's location using eh_frame CFI UnwindPlan
>>>> th1/fr1 supplying caller's register 16 from the stack, saved at CFA plus offset
>>>> th1/fr2 pc = 0x00007f216e4850ee
>>>> 
>>>> That bit about "this UnwindPlan uses a RA reg" is novel for x86 code; it's 
>>>> normally something you see in arm code, where the caller's saved pc value 
>>>> is in the link register on a function call.  But as you'd guess from the name 
>>>> abort(), this may have the caller's register context saved in an unusual 
>>>> way so this may be fine.
>>>> 
>>>> I'm surprised gdb can unwind this successfully.
>>>> 
>>>> As I alluded to above, lldb can profile the assembly language instructions 
>>>> of a function to understand the prologue setup (where registers are saved, 
>>>> how the stack is set up, etc.) -- but to do this, it needs to know the 
>>>> start address of the function.  This "#2  0x00007ffff7a4a0ee in ?? ()" 
>>>> frame clearly doesn't have any symbolic information with its address range 
>>>> so lldb can't do its assembly scan.  And it doesn't have eh_frame 
>>>> instructions to help either.
>>>> 
>>>> On Mac OS X we're often working with binaries that have had most of their 
>>>> symbols stripped.  Because it is so valuable to lldb to have accurate 
>>>> function ranges, we supplement the symbol table with two sources:  The 
>>>> LC_FUNCTION_STARTS section, and barring that (this is new), the eh_frame 
>>>> section.  LC_FUNCTION_STARTS is an array of LEB128 encoded offsets of all 
>>>> the start addresses of the functions in the file.  The first function is 
>>>> at offset 0, etc.  It's real compact, typically a few bytes per function.  
>>>> The eh_frame section is another great source of function bounds 
>>>> information but it tends to be larger and slower to parse through. lldb 
>>>> adds fake symbol names for these function ranges that it adds, e.g. a fake 
>>>> symbol added to the program Dock might be 
>>>> "__lldb_unnamed_function3491$$Dock".
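
For readers unfamiliar with the encoding, a minimal ULEB128 decoder.  It 
assumes each value is a delta from the previous function start (the first one 
relative to a base address) and that a zero delta after the first entry ends 
the list.  Those details and the function names are assumptions of this 
sketch, not taken from ObjectFileMachO.

```python
def decode_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:              # high bit clear: last byte
            return result, pos
        shift += 7


def decode_function_starts(data, base_address):
    """Turn a ULEB128 delta list into absolute function start addresses."""
    starts = []
    addr = base_address
    pos = 0
    while pos < len(data):
        delta, pos = decode_uleb128(data, pos)
        if delta == 0 and starts:
            break  # assumed terminator or padding
        addr += delta
        starts.append(addr)
    return starts


# Deltas 0, 0x10 and 0x80 (two bytes), followed by a terminating zero.
payload = bytes([0x00, 0x10, 0x80, 0x01, 0x00])
print([hex(a) for a in decode_function_starts(payload, 0x1000)])
```

Small deltas fit in a single byte, which is why the table stays at a few bytes 
per function.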
>>>> 
>>>> Of course, given that lldb couldn't find eh_frame instructions for "#2  
>>>> 0x00007ffff7a4a0ee in ?? ()", maybe even that wouldn't have helped.
>>>> 
>>>> 
>>>> The only solution I can think of here is if abort()'s eh_frame does 
>>>> provide a saved location for rbp but lldb failed to read it correctly.  
>>>> Else, I have no idea how gdb managed to unwind out of this one.
>>>> 
>>>> 
>>>> On Apr 7, 2013, at 5:46 AM, Langmuir, Ben wrote:
>>>> 
>>>>> Done.
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Jason Molenda [mailto:[email protected]]
>>>>> Sent: Sunday, April 07, 2013 5:50 AM
>>>>> To: Langmuir, Ben
>>>>> Subject: regarding [Bug 15671] New: backtrace truncated after 
>>>>> assertion failure in inferior
>>>>> 
>>>>> I don't know if I have a bugzilla account on llvm.org (I should 
>>>>> but I don't know what password it might have) but I wanted to ask 
>>>>> you to do
>>>>> 
>>>>> (lldb) log enable lldb unwind
>>>>> (lldb) run
>>>>> (lldb) bt
>>>>> 
>>>>> 
>>>>> and attach that output to
>>>>> http://llvm.org/bugs/show_bug.cgi?id=15671
>>>>> 
>>>>> lldb should use a DefaultUnwindPlan for frame 2 ("?? ()" in gdb's 
>>>>> backtrace) to continue the unwind.  I don't have linux installed on any 
>>>>> devices so I haven't looked but the output will probably be a good clue 
>>>>> as to why the unwind stopped early.
>>>>> 
>>>>> 
>>>>> 
>>>>> J
>>>> 
>>>> 
>>>> _______________________________________________
>>>> lldb-dev mailing list
>>>> [email protected]
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>> 
> 
> <pr15671.patch><unwind-full.txt>


_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Attachment: pr15671.patch
Description: pr15671.patch
