That will only happen on i386, and only if the code is built -fomit-frame-pointer. And on Mac OS X with the current generation tools, we don't emit eh_frame instructions for i386/x86_64 any more.
I'd like to try living off eh_frame and see how it goes. For i386/x86_64 code, if the code uses ebp as the frame pointer (instead of using it as a scratch reg), I think it will be fine - my main concern is that eh_frame is not guaranteed to describe the prologue or epilogue. There was enough register pressure on i386 that using ebp as a scratch was tempting, but on x86_64 there's little reason to bother.

> On Aug 19, 2014, at 4:29 PM, Greg Clayton <gclay...@apple.com> wrote:
>
> The EH frame doesn't track the PIC bump stuff and that can/will hose up
> stepping.
>
>> On Aug 19, 2014, at 4:22 PM, Jason Molenda <jmole...@apple.com> wrote:
>>
>> Hi Tong, my message was a little rambling. Let's be specific.
>>
>> We are changing lldb to trust eh_frame instructions on the
>> currently-executing aka 0th frame.
>>
>> In practice, gcc and clang eh_frame both describe the prologue, so this is
>> OK.
>>
>> Old gcc and clang eh_frame do not describe the epilogue. So we need to add
>> a pass for i386/x86_64 (at least) to augment the eh_frame-sourced unwind
>> instructions. I don't know if it would be best to augment eh_frame
>> UnwindPlans when we create them in DWARFCallFrameInfo or if it would be
>> better to do it lazily when we are actually using the unwind instructions in
>> RegisterContextLLDB (probably RegisterContextLLDB like you were doing). We
>> should only do it once for a given function, of course.
>>
>> I think it would be cleanest if the augmentation function lived in the
>> UnwindAssembly class. But I haven't looked how easy it is to get an
>> UnwindAssembly object where we need it.
>>
>> Thanks for taking this on. It will be interesting to try living entirely
>> off eh_frame and see how that works for all the architectures/environments
>> lldb supports.
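(A rough sketch of what such an augmentation pass might look like, for x86_64 only. None of this is lldb code: `Row`, `AugmentWithEpilogueRow` and the register constants are made up for illustration. It assumes the function ends in the two bytes 0x5d 0xc3, which are the one-byte encodings of `pop %rbp` and `ret`, and that the rows are sorted by offset. It adds one row saying the CFA is rsp+8 at the `ret`, and does nothing if the last row is already rsp-based, so running it twice on the same function does not double the rows.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified unwind row: at `offset` bytes into the function,
// CFA = value of DWARF register `cfa_reg` + `cfa_offset`.
struct Row {
    uint32_t offset;
    int cfa_reg;
    int32_t cfa_offset;
};

// DWARF register numbers on x86_64.
constexpr int kRegRBP = 6;
constexpr int kRegRSP = 7;

// If the function body ends in `pop %rbp; ret` and the last row still
// defines the CFA in terms of rbp, append a row for the `ret` instruction
// where CFA = rsp + 8. Returns true if a row was added.
inline bool AugmentWithEpilogueRow(const std::vector<uint8_t> &code,
                                   std::vector<Row> &rows) {
    const std::size_t n = code.size();
    if (n < 2 || rows.empty())
        return false;
    if (code[n - 2] != 0x5d || code[n - 1] != 0xc3)
        return false; // not the simple epilogue this sketch understands

    const uint32_t ret_offset = static_cast<uint32_t>(n - 1);
    const Row &last = rows.back();
    if (last.cfa_reg != kRegRBP)
        return false; // CFA is not rbp-based at the end: leave the plan alone
    if (last.offset >= ret_offset)
        return false; // unexpected layout, do not guess

    rows.push_back(Row{ret_offset, kRegRSP, 8});
    return true;
}
```

For rows like the clang ones (rsp+8, rsp+16, rbp+16) and a body ending in `pop %rbp; ret`, this appends a single row at the offset of the `ret`; a second call is a no-op. Mid-function epilogues and the trailing `call __stack_chk_fail` case are deliberately not handled.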
>>
>> I worry a little that we're depending on the generous eh_frame from
>> clang/gcc and if we try to run on icc (Intel's compiler) or something like
>> that, we may have no prologue instructions and stepping will work very
>> poorly. But we'll cross that bridge when we get to it.
>>
>>> On Aug 15, 2014, at 8:07 PM, Jason Molenda <jmole...@apple.com> wrote:
>>>
>>> Hi Tong, sorry for the delay in replying.
>>>
>>> I have a couple thoughts about the patch. First, the change in
>>> RegisterContextLLDB::GetFullUnwindPlanForFrame() forces the use of eh_frame
>>> unwind instructions ("UnwindPlanAtCallSite" - which normally means the
>>> eh_frame unwind instructions) for the currently-executing aka zeroth frame.
>>> We've talked about this before, but it's worth noting that this patch
>>> includes that change.
>>>
>>> There's still the problem of detecting how *asynchronous* those eh_frame
>>> unwind instructions are. For instance, what do you get for an i386 program
>>> that does
>>>
>>> #include <stdio.h>
>>> int main()
>>> {
>>>   puts ("HI");
>>> }
>>>
>>> Most codegen will use a sequence like
>>>
>>>   call LNextInstruction
>>> .LNextInstruction
>>>   pop ebx
>>>
>>> this call & pop sequence is establishing the "pic base"; the program
>>> will then use that address to find the "HI" constant data. If you compile
>>> this -fomit-frame-pointer, so we have to use the stack pointer to find the
>>> CFA, do the eh_frame instructions describe this?
>>>
>>> It's a bit of an extreme example but it's one of those tricky cases where
>>> asynchronous ("accurate at every instruction") unwind instructions and
>>> synchronous ("accurate at places where we can throw an exception, or a
>>> callee can throw an exception") unwind instructions are different.
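(To make the pic-base case concrete, here is a small C++ sketch of the arithmetic. It is not lldb code; `CfaRule`, `compute_cfa`, `make_demo` and the stack addresses are made up for illustration. With -fomit-frame-pointer the CFA is esp plus an offset. The `call` pushes a 4-byte return address, so between the `call` and the `pop` a rule that was not updated computes a CFA that is 4 bytes too low.)

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one unwind row's CFA rule: CFA = esp + sp_offset.
struct CfaRule {
    int32_t sp_offset;
};

// Apply a rule to the current stack pointer value.
inline uint32_t compute_cfa(uint32_t sp, CfaRule rule) {
    return sp + static_cast<uint32_t>(rule.sp_offset);
}

// Stack pointer values around "call next; next: pop %ebx" in an i386
// function built with -fomit-frame-pointer.
struct PicBaseDemo {
    uint32_t true_cfa;       // fixed for the lifetime of the frame
    uint32_t sp_before_call; // esp just before the call
    uint32_t sp_after_call;  // esp once the call pushed the return address
    uint32_t sp_after_pop;   // esp once the pop removed it again
};

inline PicBaseDemo make_demo(uint32_t entry_sp, uint32_t locals_size) {
    assert(entry_sp >= locals_size + 4); // keep the toy addresses from wrapping
    PicBaseDemo d{};
    d.true_cfa = entry_sp + 4;                  // at entry on i386, CFA = esp + 4
    d.sp_before_call = entry_sp - locals_size;  // after the prologue's sub
    d.sp_after_call = d.sp_before_call - 4;     // call pushes 4 bytes
    d.sp_after_pop = d.sp_after_call + 4;       // pop restores esp
    return d;
}
```

With `make_demo(0x1000, 12)` and a rule of esp+16, the rule is right before the `call` and after the `pop`, but yields `true_cfa - 4` if we stop on the `pop` itself. Bumping the offset by 4 for that one instruction (the `cfa_offset+=4` adjustment suggested elsewhere in this thread) gives the right answer.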
>>>
>>> I would use behaves_like_zeroth_frame instead of if (IsFrameZero()) because
>>> you can have a frame in the middle of the stack which was the zeroth frame
>>> when an asynchronous signal came in -- in which case, the "callee" stack
>>> frame will be sigtramp.
>>>
>>> You'd want to update the UnwindLogMsgVerbose() text, of course.
>>>
>>> What your DWARFCallFrameInfo::PatchUnwindPlanForX86() function is doing is
>>> assuming that the unwind plan fails to include an epilogue description,
>>> and stepping through all the instructions in the function looking for the
>>> epilogue.
>>>
>>> DWARFCallFrameInfo doesn't seem like the right place for this. There's an
>>> assumption that the instructions came from eh_frame and that they are
>>> incomplete. It seems like it would more naturally live in the
>>> UnwindAssembly plugin and it would have a name like
>>> AugmentIncompleteUnwindPlanWithEpilogue or something like that.
>>>
>>> What if the CFI already does describe the epilogue? I imagine we'll just
>>> end up with a doubling of UnwindPlan Rows that describe the epilogue
>>> instructions.
>>>
>>> What if we have a mid-function epilogue? I've never seen gcc/clang
>>> generate these for x86, but it's possible. It's a common code sequence on
>>> arm/arm64. You can see a messy bit of code in
>>> UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly which
>>> handles these -- saving the UnwindPlan's unwind instructions when we see
>>> the beginning of an epilogue, and once the epilogue is complete, restoring
>>> the unwind instructions.
>>>
>>> I'm not opposed to the patch - but it does make the assumption that we're
>>> going to use eh_frame for the currently executing function and that the
>>> eh_frame instructions do not include a description of the epilogue (and
>>> that there is only one epilogue in the function). Mostly I want to call
>>> all of those aspects out so we're clear what we're talking about here.
>>> Let's clean it up a bit, put it in and see how it goes.
>>>
>>> J
>>>
>>>> On Aug 14, 2014, at 6:31 PM, Tong Shen <endlessr...@google.com> wrote:
>>>>
>>>> Hi Jason,
>>>>
>>>> Turns out we still need CFI for frame 0 in certain situations...
>>>>
>>>> A possible approach is to disassemble machine code, and manually adjust
>>>> CFI for frame 0. For example, if we see "pop ebp; => ret", we set cfa to
>>>> [esp]; if we see "call next-insn; => pop %ebp", we set cfa_offset+=4.
>>>>
>>>> Patch attached, now it just implements adjustment for "pop ebp; ret".
>>>>
>>>> If you think this approach is OK, I will go ahead and add other
>>>> tricks (i386 pc relative addressing, more styles of epilogue, etc).
>>>>
>>>> Thank you for your time!
>>>>
>>>> On Thu, Jul 31, 2014 at 12:50 PM, Tong Shen <endlessr...@google.com> wrote:
>>>> I think gdb's rationale for using CFI for leaf functions is:
>>>> - gcc always generates CFI for the prologue, so at function entry, we know
>>>> the correct CFA;
>>>> - any stack pointer altering operation after that (mid-function &
>>>> epilogue), we can recognize and handle them.
>>>> So basically, it assumes 2, hacks its way through 3 & 4, and pretends we
>>>> are at 5.
>>>> Number of hacks we need seems to be small in x86 world, so this tradition
>>>> is still here.
>>>>
>>>> Here's what gdb does for epilogue: normally when you run 'n', it will run
>>>> one instruction at a time till the next line/different stack id. But when
>>>> it sees "pop %rbp; ret", it won't step into these instructions. Instead it
>>>> will execute past them directly.
>>>> I didn't experiment with x86 pc-relative addressing; but I guess it will
>>>> also recognize and execute past this pattern directly.
>>>>
>>>> So for compiler generated functions, what we do now with the assembly
>>>> parser can be done with CFI + those gdb hacks.
>>>> And for hand-written assembly, I think CFI is almost always precise at
>>>> instruction level.
>>>> In this case, utilizing CFI instead of assembly parser
>>>> will be a big help.
>>>>
>>>> So maybe we can apply those hacks, and trust CFI only for x86 & x86_64
>>>> targets?
>>>>
>>>> On Thu, Jul 31, 2014 at 12:02 AM, Jason Molenda <jmole...@apple.com> wrote:
>>>> I think we could think of five levels of eh_frame information:
>>>>
>>>> 1 unwind instructions at exception throw locations & locations where a
>>>> callee may throw an exception
>>>>
>>>> 2 unwind instructions that describe the prologue
>>>>
>>>> 3 unwind instructions that describe the epilogue at the end of the function
>>>>
>>>> 4 unwind instructions that describe mid-function epilogues (I see these on
>>>> arm all the time, don't see them on x86 with compiler generated code - but
>>>> we don't use eh_frame on arm at Apple, I'm just mentioning it for
>>>> completeness)
>>>>
>>>> 5 unwind instructions that describe any changes mid-function needed to
>>>> unwind at all instructions ("asynchronous unwind information")
>>>>
>>>> The eh_frame section only guarantees #1. gcc and clang always do #1 and
>>>> #2. Modern gcc's do #3. I don't know if gcc would do #4 on arm but it's
>>>> not important, I just mention it for completeness. And no one does #5 (as
>>>> far as I know), even in the DWARF debug_frame section.
>>>>
>>>> I think it may be possible to detect if an eh_frame entry fulfills #3 by
>>>> looking if the CFA definition on the last row is the same as the initial
>>>> CFA definition. But I'm not sure how a debugger could use heuristics to
>>>> determine much else.
>>>>
>>>> In fact, detecting #3 may be the easiest thing to detect. I'm not sure if
>>>> the debugger could really detect #2 except maybe if the function had a
>>>> standard prologue (push rbp, mov rsp rbp) and the eh_frame didn't describe
>>>> the effects of these instructions, the debugger could know that the
>>>> eh_frame does not describe the prologue.
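(The "last row matches the initial row" check for #3 could be sketched like this. This is an illustration only, not lldb's UnwindPlan API: `Row` and `LooksLikeEpilogueIsDescribed` are invented names, and a row is reduced to just its CFA rule. Register numbers are the x86_64 DWARF ones that dwarfdump prints, r6 = rbp and r7 = rsp.)

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified unwind row: at `offset` bytes into the function,
// CFA = value of DWARF register `cfa_reg` + `cfa_offset`.
struct Row {
    uint32_t offset;
    int cfa_reg;      // on x86_64: 6 = rbp, 7 = rsp
    int32_t cfa_offset;
};

// Heuristic: the epilogue is probably described if the last row's CFA rule
// is the same as the rule in effect at function entry. A single-row plan
// passes trivially because the CFA rule never changes.
inline bool LooksLikeEpilogueIsDescribed(const std::vector<Row> &rows) {
    if (rows.empty())
        return false; // nothing to inspect
    const Row &first = rows.front();
    const Row &last = rows.back();
    return first.cfa_reg == last.cfa_reg &&
           first.cfa_offset == last.cfa_offset;
}
```

A plan whose rows go rsp+8, rsp+16, rbp+16 fails the check; one that ends with a fourth row back at rsp+8 passes. A "no" here only means "not shown": a function that never returns would fail the check even with complete CFI.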
>>>>
>>>>> On Jul 30, 2014, at 6:58 PM, Tong Shen <endlessr...@google.com> wrote:
>>>>>
>>>>> Ah I understand now.
>>>>>
>>>>> Now prologue seems always included in CFI for gcc & clang; and newer gcc
>>>>> includes epilogue as well.
>>>>> Maybe we can detect and use them when they are available?
>>>>>
>>>>> On Wed, Jul 30, 2014 at 6:44 PM, Jason Molenda <jmole...@apple.com> wrote:
>>>>> Ah, it looks like gcc changed since I last looked at its eh_frame output.
>>>>>
>>>>> It's not a bug -- the eh_frame unwind instructions only need to be
>>>>> accurate at instructions where an exception can be thrown, or where a
>>>>> callee function can throw an exception. There's no requirement to
>>>>> include prologue or epilogue instructions in the eh_frame.
>>>>>
>>>>> And unfortunately from lldb's perspective, when we see eh_frame we'll
>>>>> never know how descriptive it is. If it's old-gcc or clang, it won't
>>>>> include epilogue instructions. If it's from another compiler, it may not
>>>>> include any prologue/epilogue instructions at all.
>>>>>
>>>>> Maybe we could look over the UnwindPlan rows and see if the CFA
>>>>> definition of the last row matches the initial row's CFA definition.
>>>>> That would show that the epilogue is described. Unless it is a tail-call
>>>>> (aka noreturn) function - in which case the stack is never restored.
>>>>>
>>>>>> On Jul 30, 2014, at 6:32 PM, Tong Shen <endlessr...@google.com> wrote:
>>>>>>
>>>>>> GCC seems to generate a row for epilogue.
>>>>>> Do you think this is a clang bug, or at least a discrepancy between
>>>>>> clang & gcc?
>>>>>>
>>>>>> Source:
>>>>>> int f() {
>>>>>>   puts("HI\n");
>>>>>>   return 5;
>>>>>> }
>>>>>>
>>>>>> Compile option: only -g
>>>>>>
>>>>>> gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
>>>>>> clang version 3.5.0 (213114)
>>>>>>
>>>>>> Env: Ubuntu 14.04, x86_64
>>>>>>
>>>>>> dwarfdump -F of clang binary:
>>>>>> < 2><0x00400530:0x00400559><f><fde offset 0x00000088 length: 0x0000001c><eh aug data len 0x0>
>>>>>> 0x00400530: <off cfa=08(r7) > <off r16=-8(cfa) >
>>>>>> 0x00400531: <off cfa=16(r7) > <off r6=-16(cfa) > <off r16=-8(cfa) >
>>>>>> 0x00400534: <off cfa=16(r6) > <off r6=-16(cfa) > <off r16=-8(cfa) >
>>>>>>
>>>>>> dwarfdump -F of gcc binary:
>>>>>> < 1><0x0040052d:0x00400542><f><fde offset 0x00000070 length: 0x0000001c><eh aug data len 0x0>
>>>>>> 0x0040052d: <off cfa=08(r7) > <off r16=-8(cfa) >
>>>>>> 0x0040052e: <off cfa=16(r7) > <off r6=-16(cfa) > <off r16=-8(cfa) >
>>>>>> 0x00400531: <off cfa=16(r6) > <off r6=-16(cfa) > <off r16=-8(cfa) >
>>>>>> 0x00400541: <off cfa=08(r7) > <off r6=-16(cfa) > <off r16=-8(cfa) >
>>>>>>
>>>>>> On Wed, Jul 30, 2014 at 5:43 PM, Jason Molenda <jmole...@apple.com>
>>>>>> wrote:
>>>>>> I'm open to trying to trust eh_frame at frame 0 for x86_64. The lack of
>>>>>> epilogue descriptions in eh_frame is the biggest problem here.
>>>>>>
>>>>>> When you "step" or "next" in the debugger, the debugger instruction
>>>>>> steps across the source line until it gets to the next source line.
>>>>>> Every time it stops after an instruction step, it confirms that it is
>>>>>> (1) between the start and end pc values for the source line, and (2)
>>>>>> that the "stack id" (start address of the function + CFA address) is the
>>>>>> same. If it stops and the stack id has changed, for a "next" command,
>>>>>> it will backtrace one stack frame to see if it stepped into a function.
>>>>>> If so, it sets a breakpoint on the return address and continues.
>>>>>>
>>>>>> If you switch lldb to prefer eh_frame instructions for x86_64, e.g.
>>>>>>
>>>>>> Index: source/Plugins/Process/Utility/RegisterContextLLDB.cpp
>>>>>> ===================================================================
>>>>>> --- source/Plugins/Process/Utility/RegisterContextLLDB.cpp (revision 214344)
>>>>>> +++ source/Plugins/Process/Utility/RegisterContextLLDB.cpp (working copy)
>>>>>> @@ -791,6 +791,22 @@
>>>>>>          }
>>>>>>      }
>>>>>>
>>>>>> +    // For x86_64 debugging, let's try using the eh_frame instructions even if this is the currently
>>>>>> +    // executing function (frame zero).
>>>>>> +    Target *target = exe_ctx.GetTargetPtr();
>>>>>> +    if (target
>>>>>> +        && (target->GetArchitecture().GetCore() == ArchSpec::eCore_x86_64_x86_64h
>>>>>> +            || target->GetArchitecture().GetCore() == ArchSpec::eCore_x86_64_x86_64))
>>>>>> +    {
>>>>>> +        unwind_plan_sp = func_unwinders_sp->GetUnwindPlanAtCallSite (m_current_offset_backed_up_one);
>>>>>> +        int valid_offset = -1;
>>>>>> +        if (IsUnwindPlanValidForCurrentPC(unwind_plan_sp, valid_offset))
>>>>>> +        {
>>>>>> +            UnwindLogMsgVerbose ("frame uses %s for full UnwindPlan, preferred over assembly profiling on x86_64", unwind_plan_sp->GetSourceName().GetCString());
>>>>>> +            return unwind_plan_sp;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>>      // Typically the NonCallSite UnwindPlan is the unwind created by inspecting the assembly language instructions
>>>>>>      if (behaves_like_zeroth_frame)
>>>>>>      {
>>>>>>
>>>>>> you'll find that you have to "next" twice to step out of a function.
>>>>>> Why?
>>>>>> With a simple function like:
>>>>>>
>>>>>> * thread #1: tid = 0xaf31e, 0x0000000100000eb9 a.out`foo + 25 at a.c:5, queue = 'com.apple.main-thread', stop reason = step over
>>>>>>     #0: 0x0000000100000eb9 a.out`foo + 25 at a.c:5
>>>>>>    2    int foo ()
>>>>>>    3    {
>>>>>>    4        puts("HI");
>>>>>> -> 5        return 5;
>>>>>>    6    }
>>>>>>    7
>>>>>>    8    int bar ()
>>>>>> (lldb) disass
>>>>>> a.out`foo at a.c:3:
>>>>>>    0x100000ea0: pushq %rbp
>>>>>>    0x100000ea1: movq %rsp, %rbp
>>>>>>    0x100000ea4: subq $0x10, %rsp
>>>>>>    0x100000ea8: leaq 0x6b(%rip), %rdi ; "HI"
>>>>>>    0x100000eaf: callq 0x100000efa ; symbol stub for: puts
>>>>>>    0x100000eb4: movl $0x5, %ecx
>>>>>> -> 0x100000eb9: movl %eax, -0x4(%rbp)
>>>>>>    0x100000ebc: movl %ecx, %eax
>>>>>>    0x100000ebe: addq $0x10, %rsp
>>>>>>    0x100000ec2: popq %rbp
>>>>>>    0x100000ec3: retq
>>>>>>
>>>>>> if you do "next" lldb will instruction step, comparing the stack ID at
>>>>>> every stop, until it gets to 0x100000ec3 at which point the stack ID
>>>>>> will change. The CFA address (which the eh_frame tells us is rbp+16)
>>>>>> just changed to the caller's CFA address because we're about to return.
>>>>>> The eh_frame instructions really need to tell us that the CFA is now
>>>>>> rsp+8 at 0x100000ec3.
>>>>>>
>>>>>> The end result is that you need to "next" twice to step out of a
>>>>>> function.
>>>>>>
>>>>>> AssemblyParse_x86 has a special bit where it looks for the 'ret'
>>>>>> instruction sequence at the end of the function -
>>>>>>
>>>>>> // Now look at the byte at the end of the AddressRange for a limited attempt at describing the
>>>>>> // epilogue.
>>>>>> // We're looking for the sequence
>>>>>>
>>>>>> // [ 0x5d ] mov %rbp, %rsp
>>>>>> // [ 0xc3 ] ret
>>>>>> // [ 0xe8 xx xx xx xx ] call __stack_chk_fail (this is sometimes the final insn in the function)
>>>>>>
>>>>>> // We want to add a Row describing how to unwind when we're stopped on the 'ret' instruction where the
>>>>>> // CFA is no longer defined in terms of rbp, but is now defined in terms of rsp like on function entry.
>>>>>>
>>>>>> and adds an extra row of unwind details for that instruction.
>>>>>>
>>>>>> I mention x86_64 as being a possible good test case here because I worry
>>>>>> about the i386 picbase sequence (call next-instruction; pop %ebx) which
>>>>>> occurs a lot. But for x86_64, my main concern is the epilogues.
>>>>>>
>>>>>>> On Jul 30, 2014, at 2:52 PM, Tong Shen <endlessr...@google.com> wrote:
>>>>>>>
>>>>>>> Thanks Jason! That's a very informative post, clarifies things a lot :-)
>>>>>>>
>>>>>>> Well I have to admit that my patch is specifically for a certain kind
>>>>>>> of function, and now I see that's not the general case.
>>>>>>>
>>>>>>> I did some experiments with gdb. gdb uses CFI for frame 0, either x86 or
>>>>>>> x86_64. It looks for the FDE of frame 0, and does CFA calculations
>>>>>>> according to that.
>>>>>>>
>>>>>>> - For compiler generated functions: I think there are 2 usage scenarios
>>>>>>> for frame 0: breakpoint and signal.
>>>>>>>   - Breakpoints are usually at source line boundary instead of
>>>>>>>   instruction boundary, and generally we won't be caught at stack pointer
>>>>>>>   changing locations, so CFI is still valid.
>>>>>>>   - For signal, synchronous unwind table may not be sufficient here. But
>>>>>>>   only stack changing instructions will cause incorrect CFA calculation,
>>>>>>>   so it's not always the case.
>>>>>>> - For hand written assembly functions: from what I've seen, most of the
>>>>>>> time CFI is present and actually asynchronous.
>>>>>>> So it seems that in most cases, even with only synchronous unwind
>>>>>>> table, CFI is still correct.
>>>>>>>
>>>>>>> I believe we can trust eh_frame for frame 0 and use assembly profiling
>>>>>>> as fallback. If both fail, maybe the code owner should use
>>>>>>> -fasynchronous-unwind-tables :-)
>>>>>>>
>>>>>>> On Tue, Jul 29, 2014 at 4:59 PM, Jason Molenda <jmole...@apple.com>
>>>>>>> wrote:
>>>>>>> It was a tricky one and got lost in the shuffle of a busy week. I was
>>>>>>> always reluctant to try profiling all the instructions in a function.
>>>>>>> On x86, compiler generated code (gcc/clang anyway) is very simplistic
>>>>>>> about setting up the stack frame at the start and only having one
>>>>>>> epilogue - so anything fancier risked making mistakes and could
>>>>>>> possibly have a performance impact as we run functions through the
>>>>>>> disassembler.
>>>>>>>
>>>>>>> For hand-written assembly functions (which can be very creative with
>>>>>>> their prologue/epilogue and where it is placed), my position is that
>>>>>>> they should write eh_frame instructions in their assembly source to
>>>>>>> tell lldb where to find things. There are one or two libraries on Mac
>>>>>>> OS X where we break the "ignore eh_frame for the currently executing
>>>>>>> function" rule because there are many hand-written assembly functions
>>>>>>> in there and the eh_frame is going to beat our own analysis.
>>>>>>>
>>>>>>> After I wrote the x86 unwinder, Greg and Caroline implemented the arm
>>>>>>> unwinder where it emulates every instruction in the function looking
>>>>>>> for prologue/epilogue instructions. We haven't seen it having a
>>>>>>> particularly bad impact performance-wise (lldb only does this
>>>>>>> disassembly for functions that it finds on stacks during an execution
>>>>>>> run, and it saves the result so it won't re-compute it for a given
>>>>>>> function).
>>>>>>> The clang armv7 codegen often has mid-function epilogues
>>>>>>> (early returns) which definitely complicated things and made it
>>>>>>> necessary to step through the entire function body. There's a bunch
>>>>>>> of code I added to support these mid-function epilogues - I have to
>>>>>>> save the register save state when I see an instruction which looks like
>>>>>>> an epilogue, and when I see the final ret instruction (aka restoring
>>>>>>> the saved lr contents into pc), I re-install the register save state
>>>>>>> from before the epilogue started.
>>>>>>>
>>>>>>> These things always make me a little nervous because the instruction
>>>>>>> analyzer obviously is doing a static analysis so it knows nothing about
>>>>>>> flow control. Tong's patch stops when it sees the first CALL
>>>>>>> instruction - but that's not right, that's just solving the problem for
>>>>>>> his particular function which doesn't have any CALL instructions before
>>>>>>> his prologue. :) You could imagine a function which saves a couple of
>>>>>>> registers, calls another function, then saves a couple more because it
>>>>>>> needs more scratch registers.
>>>>>>>
>>>>>>> If we're going to change to profiling deep into the function -- and I'm
>>>>>>> not opposed to doing that, it's been fine on arm -- we should just do
>>>>>>> the entire function I think.
>>>>>>>
>>>>>>> Another alternative would be to trust eh_frame on x86_64 at frame 0.
>>>>>>> This is one of those things where there's not a great solution. The
>>>>>>> unwind instructions in eh_frame are only guaranteed to be accurate for
>>>>>>> synchronous unwinds -- that is, they are only guaranteed to be accurate
>>>>>>> at places where an exception could be thrown - at call sites. So for
>>>>>>> instance, there's no reason why the compiler has to describe the
>>>>>>> function prologue instructions at all. There's no requirement that the
>>>>>>> eh_frame instructions describe the epilogue instructions.
>>>>>>> The information about spilled registers only needs to be emitted where we
>>>>>>> could throw an exception, or where a callee could throw an exception.
>>>>>>>
>>>>>>> clang/gcc both emit detailed instructions for the prologue setup. But
>>>>>>> for i386 codegen if the compiler needs to access some pc-relative data,
>>>>>>> it will do a "call next-instruction; pop %eax" to get the current pc
>>>>>>> value. (x86_64 has rip-relative addressing so this isn't needed.) If
>>>>>>> you're debugging -fomit-frame-pointer code, that means your CFA is
>>>>>>> expressed in terms of the stack pointer and the stack pointer just
>>>>>>> changed mid-function -- and eh_frame instructions don't describe this.
>>>>>>>
>>>>>>> The end result: If you want accurate unwinds 100% of the time, you
>>>>>>> can't rely on the unwind instructions from eh_frame. But they'll get
>>>>>>> you accurate unwinds 99.9% of the time ... also, last I checked,
>>>>>>> neither clang nor gcc describes the epilogue instructions.
>>>>>>>
>>>>>>> In *theory* the unwind instructions from the DWARF debug_frame section
>>>>>>> should be asynchronous -- they should describe how to find the CFA
>>>>>>> address for every instruction in the function. Which makes sense - you
>>>>>>> want eh_frame to be compact because it's bundled into the executable,
>>>>>>> so it should only have the information necessary for exception handling
>>>>>>> and you can put the verbose stuff in debug_frame DWARF for debuggers.
>>>>>>> But instead (again, last time I checked), the compilers put the exact
>>>>>>> same thing in debug_frame even if you use the
>>>>>>> -fasynchronous-unwind-tables (or whatever that switch was) option.
>>>>>>>
>>>>>>> So I don't know, maybe we should just start trusting eh_frame at frame
>>>>>>> 0 and write off those .1% cases where it isn't correct instead of
>>>>>>> trying to get too fancy with the assembly analysis code.
>>>>>>>
>>>>>>>> On Jul 29, 2014, at 4:17 PM, Todd Fiala <tfi...@google.com> wrote:
>>>>>>>>
>>>>>>>> Hey Jason,
>>>>>>>>
>>>>>>>> Do you have any feedback on this?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Fri, Jul 25, 2014 at 1:42 PM, Tong Shen <endlessr...@google.com>
>>>>>>>> wrote:
>>>>>>>> Sorry, wrong version of patch...
>>>>>>>>
>>>>>>>> On Fri, Jul 25, 2014 at 1:41 PM, Tong Shen <endlessr...@google.com>
>>>>>>>> wrote:
>>>>>>>> Hi Molenda, lldb-commits,
>>>>>>>>
>>>>>>>> For now, x86 assembly profiler will stop after 10 "non-prologue"
>>>>>>>> instructions. In practice it may not be sufficient. For example, we
>>>>>>>> have a hand-written assembly function, which has hundreds of
>>>>>>>> instructions before the actual (stack-adjusting) prologue instructions.
>>>>>>>>
>>>>>>>> One way is to change the limit to 1000; but there will always be
>>>>>>>> functions that break the limit :-) I believe the right thing to do
>>>>>>>> here is parsing all instructions before "ret"/"call" as prologue
>>>>>>>> instructions.
>>>>>>>>
>>>>>>>> Here's what I changed:
>>>>>>>> - For "push %rbx" and "mov %rbx, -8(%rbp)": only add first row for
>>>>>>>> that register. They may appear multiple times in function body. But as
>>>>>>>> long as one of them appears, first appearance should be in prologue (if
>>>>>>>> it's not in prologue, this function will not use %rbx, so these 2
>>>>>>>> instructions should not appear at all).
>>>>>>>> - Also monitor "add %rsp 0x20".
>>>>>>>> - Remove non prologue instruction count.
>>>>>>>> - Add "call" instruction detection, and stop parsing after it.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards, Tong Shen
>>>>
>>>> <adjust_cfi_for_frame_zero.patch>

_______________________________________________
lldb-commits mailing list
lldb-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits