Re: [Lldb-commits] [PATCH] D94846: Allow breakpoints to be set on C++11 inline initializers

Jim Ingham via lldb-commits Tue, 19 Jan 2021 15:58:24 -0800


> On Jan 19, 2021, at 3:34 PM, David Blaikie <dblai...@gmail.com> wrote:
> 
> On Tue, Jan 19, 2021 at 2:55 PM Jim Ingham <jing...@apple.com> wrote:
>> 
>> 
>> 
>>> On Jan 19, 2021, at 11:40 AM, David Blaikie <dblai...@gmail.com> wrote:
>>> 
>>> On Tue, Jan 19, 2021 at 10:17 AM Jim Ingham <jing...@apple.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Jan 17, 2021, at 10:47 AM, David Blaikie <dblai...@gmail.com> wrote:
>>>>> 
>>>>> On Fri, Jan 15, 2021 at 6:22 PM Jim Ingham <jing...@apple.com> wrote:
>>>>>> 
>>>>>> If you set a breakpoint on source lines with no line table entries, lldb 
>>>>>> slides the actual match forward to the nearest line table entry to the 
>>>>>> given line number.  That's necessary for instance because if you lay out 
>>>>>> your function definitions over multiple lines they won't all get line 
>>>>>> table entries, but we don't want people to have to guess which of them 
>>>>>> was...  Same is true of more complicated compound statements.
>>>>> 
>>>>> Ah, sure - gdb seems to do that too, totally makes sense - as you say,
>>>>> it'd be pretty hard to know exactly which tokens end up with
>>>>> corresponding entries in the line table and which don't (especially
>>>>> under optimizations and complex compound statements).
>>>> 
>>>> Yeah, I think you either have to have this heuristic, or have a debugger 
>>>> that can display "valid breakpoint locations".  The old Metrowerks 
>>>> debugger used to do it that way.  They only put up breakpoint affordances 
>>>> in the UI for the lines that had line table entries and there was no 
>>>> free-form way to set breakpoints, so you couldn't do an ambiguous thing.  
>>>> But I don't think we have that option in lldb.
>>> 
>>> Could be nice to have in source printing on the command line and could
>>> be done more fully in IDE integrations like XCode - but yeah, still
>>> would want all this sliding stuff for raw command line usage even if
>>> we had the rest.
>> 
>> If we just cared about lines and not columns, it would be pretty easy to 
>> annotate the source line output, maybe just a • before each breakable line.  
>> Marking the column locations as well, while keeping the whole thing 
>> readable, is a bit trickier.
>> 
>> The problem with Xcode is that they don't do full compiles as you are 
>> editing.  They do run the front end for code completion, brace matching, 
>> etc...  But that doesn't generate line tables.  So you would have to support 
>> breakpoints in files where you do and don't know the valid line table 
>> entries.  If you didn't design for using the compiled code for UI in the 
>> IDE, it's not easy to add it in after the fact.
>> 
>>> 
>>>>>> So people like that, but they really hate it when they have:
>>>>>> 
>>>>>> 
>>>>>> #ifdef SOMETHING_NOT_DEFINED
>>>>>> 
>>>>>> int foo() {
>>>>>> 
>>>>>> }
>>>>>> 
>>>>>> #else
>>>>>> int bar() {
>>>>>> 
>>>>>> }
>>>>>> #end
>>>>>> 
>>>>>> but with lots more junk so you can't see the ifdef's and then they set a 
>>>>>> breakpoint in the "int foo" part and the breakpoint gets moved to bar 
>>>>>> instead and then doesn't get hit.
>>>>> 
>>>>> Ah, indeed - that is a curious case that could be surprising to a
>>>>> user. Thanks for explaining it - though it seems gdb doesn't special
>>>>> case this & does produce the not-quite-what-the-user-intended
>>>>> behavior.
>>>>> 
>>>>>> You might try to argue that they should have checked where the 
>>>>>> breakpoint actually landed before coming into your office to yell at 
>>>>>> you, but it's likely to leave you with fewer friends...
>>>>> 
>>>>> FWIW: I don't have coworkers or friends come into my office to yell at
>>>>> me about anything like this, and I don't really think anyone should -
>>>>> that's not appropriate behavior for a workplace.
>>>>> 
>>>>> Having a nuanced discussion about the tradeoff of features - sure.
>>>> 
>>>> Yeah, IME people get cheesed off at debuggers in ways they don't with 
>>>> compilers.  I have some theories, but more of the after work quality than 
>>>> the public mailing list quality.
>>>> 
>>>>>> So I was trying to detect this case and not move the breakpoint if 
>>>>>> sliding it crossed over the function start boundary.  That way you'd see 
>>>>>> that the breakpoint didn't work, and go figure out why.
>>>>> 
>>>>> Neat idea!
>>>>> 
>>>>>> The thinko in the original version was that we were still doing this 
>>>>>> when we DIDN'T have to slide the breakpoint, when we got an exact match 
>>>>>> in the line table.  In that case, we shouldn't try to second guess the 
>>>>>> line table at all.  That's the patch in this fix.
>>>>> 
>>>>> Might this still leave some confusing behavior. Now it'll slide
>>>>> forward into a function, but it won't slide forward into an
>>>>> initializer? (so now the user has to know exactly which lines of the
>>>>> initializer are attributed to the line table, making that somewhat
>>>>> difficult to use?)
>>>> 
>>>> 
>>>> Yes I knew this was a tradeoff of the implementation.  If I did a more in 
>>>> depth search in order to slide this over to function initializer, it would 
>>>> IMO add way more complexity to an already overly fiddly part of the 
>>>> debugger, and the trade-off didn't seem worth it.  Most initializers have 
>>>> a fairly simple structure, as opposed say to the before the body part of a 
>>>> function definition (that's got to have a name, but I don't know it off 
>>>> the top of my head.)  So I think the payoff would be a lot less in this 
>>>> area.  I asked around here and the folks I polled agreed that it wasn't 
>>>> worth the complexity.
>>>> 
>>>>> 
>>>>> Testing some things out:
>>>>> 
>>>>> $ cat break.cpp
>>>>> __attribute__((optnone)) int f1() { return 1; }
>>>>> struct t1 {
>>>>> int     // 3
>>>>>    i   // 4
>>>>>    =   // 5
>>>>>    f1  // 6
>>>>>    ();
>>>>> t1() {
>>>>> }
>>>>> };
>>>>> 
>>>>> t1 v1;
>>>>> int main() {
>>>>> }
>>>>> 
>>>>> The line table:
>>>>> 
>>>>> Address            Line   Column File   ISA Discriminator Flags
>>>>> ------------------ ------ ------ ------ --- ------------- -------------
>>>>> 0x0000000000401140      1      0      1   0             0  is_stmt
>>>>> 0x0000000000401144      1     37      1   0             0  is_stmt 
>>>>> prologue_end
>>>>> 0x0000000000401150     13      0      1   0             0  is_stmt
>>>>> 0x0000000000401154     14      1      1   0             0  is_stmt 
>>>>> prologue_end
>>>>> 0x0000000000401158     14      1      1   0             0  is_stmt 
>>>>> end_sequence
>>>>> 0x0000000000401020      0      0      1   0             0  is_stmt
>>>>> 0x0000000000401024     12      4      1   0             0  is_stmt 
>>>>> prologue_end
>>>>> 0x0000000000401040      0      0      1   0             0  is_stmt
>>>>> 0x000000000040104b      0      0      1   0             0  is_stmt 
>>>>> end_sequence
>>>>> 0x0000000000401160      8      0      1   0             0  is_stmt
>>>>> 0x0000000000401174      6      7      1   0             0  is_stmt 
>>>>> prologue_end
>>>>> 0x000000000040117f      4      7      1   0             0  is_stmt
>>>>> 0x0000000000401181      9      3      1   0             0  is_stmt
>>>>> 0x0000000000401187      9      3      1   0             0  is_stmt 
>>>>> end_sequence
>>>>> 
>>>>> Has entries for line 4 and 6 from the initializer (and 8 and 9 from
>>>>> the ctor), but gdb doesn't seem to want to break on any of them. So
>>>>> I'm not sure what gdb's doing, but it seems pretty unhelpful
>>>>> (apparently gdb breaks in to the ctor too late to even let you step
>>>>> into the initializers... guess that's because the initializers are
>>>>> called in the prologue according to the debug info):
>>>>> (gdb) b 1
>>>>> Breakpoint 1 at 0x401144: file break.cpp, line 1.
>>>>> (gdb) b 2
>>>>> Breakpoint 2 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 3
>>>>> Note: breakpoint 2 also set at pc 0x401181.
>>>>> Breakpoint 3 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 4
>>>>> Note: breakpoints 2 and 3 also set at pc 0x401181.
>>>>> Breakpoint 4 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 5
>>>>> Note: breakpoints 2, 3 and 4 also set at pc 0x401181.
>>>>> Breakpoint 5 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 6
>>>>> Note: breakpoints 2, 3, 4 and 5 also set at pc 0x401181.
>>>>> Breakpoint 6 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 7
>>>>> Note: breakpoints 2, 3, 4, 5 and 6 also set at pc 0x401181.
>>>>> Breakpoint 7 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 8
>>>>> Note: breakpoints 2, 3, 4, 5, 6 and 7 also set at pc 0x401181.
>>>>> Breakpoint 8 at 0x401181: file break.cpp, line 9.
>>>>> (gdb) b 9
>>>>> Note: breakpoints 2, 3, 4, 5, 6, 7 and 8 also set at pc 0x401181.
>>>>> Breakpoint 9 at 0x401181: file break.cpp, line 9.
>>>>> 
>>>>> lldb does already seem to let me break on the initializer without this
>>>>> patch - so any idea what part of this already works compared to what's
>>>>> being fixed here? (though lldb breaks on line 6 even when I asked to
>>>>> break on line 7 or 8, but it doesn't break on 6 when I ask to break on
>>>>> 6... )
>>>> 
>>>> 
>>>> Yes.  This is because the line table entry for the "first line" of a 
>>>> function is also imprecise.  For instance, if you do:
>>>> 
>>>> int
>>>> foo()
>>>> {
>>>> 
>>>> }
>>>> 
>>>> the "int" line in the function definition pretty much never gets a line 
>>>> table entry, yet that's a perfectly reasonable place to set a breakpoint 
>>>> on the function.  So the "am I sliding across function definitions" check 
>>>> has a little slop in the comparison.  Just put a handful of blank lines 
>>>> between the "();" of the struct initialization and the definition of the 
>>>> constructor, and then the breakpoint won't take.
>>> 
>>> Ah, right right - I was still a bit confused by why the line was
>>> showing as line 6, but I get it - it's because that's where the first
>>> instruction for function is. Hmm, actually that doesn't quite make
>>> sense to me still - if I set a breakpoint at line 9, why wouldn't I
>>> get a breakpoint on that line? I guess maybe lldb detects that that is
>>> the line where a function starts according to the DW_AT_decl_line in
>>> the DW_TAG_subprogram, and instead of breaking on line 9 per the line
>>> table, it breaks on the start of the function which is line 6? Though
>>> when the program... huh, it does stop at 6, even though I said break 9
>>> (well, I said break 13 in my new example where I've added the
>>> suggested handful of blank lines)
>>> 
>>> Seems a bit surprising/not something I can readily explain:
>>> 
>>> $ cat -n test.cpp
>>>    1  __attribute__((optnone)) int f1() { return 1; }
>>>    2  struct t1 {
>>>    3    int     // 3
>>>    4        i   // 4
>>>    5        =   // 5
>>>    6        f1  // 6
>>>    7        ();
>>>    8
>>>    9
>>>   10
>>>   11
>>>   12
>>>   13    t1() {}
>>>   14  };
>>>   15  t1 v1;
>>>   16  int main() {}
>>> blaikie@blaikie-linux2:~/dev/scratch$ lldb ./a.out
>>> (lldb) target create "./a.out"
>>> Current executable set to
>>> '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64).
>>> (lldb) b 13
>>> Breakpoint 1: where = a.out`t1::t1() + 20 at test.cpp:6:7, address =
>>> 0x0000000000401174
>>> (lldb) r
>>> Process 554103 launched:
>>> '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64)
>>> Process 554103 stopped
>>> * thread #1, name = 'a.out', stop reason = breakpoint 1.1
>>>   frame #0: 0x0000000000401174 a.out`t1::t1(this=0x000000000040402c)
>>> at test.cpp:6:7
>>>  3      int     // 3
>>>  4          i   // 4
>>>  5          =   // 5
>>> -> 6          f1  // 6
>>>  7          ();
>>>  8
>>>  9
>>> (lldb)
>> 
>> There's another "fiddly" bit of the breakpoint setting that likely explains 
>> what you are seeing.  If you set breakpoint (either by name or by file & 
>> line) that happens to fall inside the prologue of a function, lldb moves the 
>> breakpoint to the first instruction after the prologue.  If we don't do this 
>> then when you stop in the function at this breakpoint, argument values are 
>> likely to be incorrect, since the debug information generally only gives 
>> correct locations after the prologue.  It also makes getting the backtrace 
>> right trickier since you then have to understand where in the prologue you 
>> are.
>> 
>> Since not very many people actually need to debug the prologue sequence of a 
>> function, this avoids a lot of confusion at the cost of a just a smidge more 
>> mystery.  gdb does this as well (or at least it did the last time I played 
>> with it.)  There is a setting to not skip the prologue, which you need to do 
>> for instance if you want to ensure that you stop while the arguments are 
>> still all in their ABI specified locations.  But the default is to do the 
>> slide.
>> 
>> So if the first thing that happens after the prologue in t1() is the 
>> initialization of i, then the sliding of the breakpoint past the prologue 
>> would move it from line 13 to line 6, as you saw.  And we don't let the 
>> prologue slide cross function boundaries so we don't do a "cross function 
>> boundaries" check in that move.
> 
> Hmm, yep, that seems to be it. Which just makes me more confused about
> what gdb's doing then (when I set a breakpoint at the function, it
> seems to be breaking after all the initializers have run - well beyond
> the prologue - perhaps that's what they've done to try to get the user
> to where the member initializers have finished running).


Interesting.  I haven't worked on gdb in decades so I don't have any insight 
there.

I wouldn't personally move the breakpoint of a constructor past the 
initializers.  Seems like getting back into the initializers if you wanted to 
would be tedious.  And if you were counting on it on the first hit you would 
have already missed the event you were looking for.  But I could see somebody 
getting annoyed with stepping through all those boring initializers and doing 
it the other way.

> 
> Thanks for talking me through it.
> 

Thanks for giving this a thorough look!

Jim


> - Dave
> 
>> 
>> If somebody has some spare time on their hands it might be fun to try to 
>> build up a little report describing these moves, and then store that in the 
>> location and print it out with "break list -v".  That would help when you 
>> are puzzling over how your breakpoint got where it did...
>> 
>> Jim
>> 
>> 
>> 
>>> 
>>>>> (lldb) target create "./a.out"
>>>>> Current executable set to
>>>>> '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64).
>>>>> (lldb) b 1
>>>>> Breakpoint 1: where = a.out`f1() + 4 at break.cpp:1:37, address =
>>>>> 0x0000000000401144
>>>>> (lldb) b 2
>>>>> Breakpoint 2: no locations (pending).
>>>>> WARNING:  Unable to resolve breakpoint to any actual locations.
>>>>> (lldb) b 3
>>>>> Breakpoint 3: no locations (pending).
>>>>> WARNING:  Unable to resolve breakpoint to any actual locations.
>>>>> (lldb) b 4
>>>>> Breakpoint 4: no locations (pending).
>>>>> WARNING:  Unable to resolve breakpoint to any actual locations.
>>>>> (lldb) b 5
>>>>> Breakpoint 5: no locations (pending).
>>>>> WARNING:  Unable to resolve breakpoint to any actual locations.
>>>>> (lldb) b 6
>>>>> Breakpoint 6: no locations (pending).
>>>>> WARNING:  Unable to resolve breakpoint to any actual locations.
>>>>> (lldb) b 7
>>>>> Breakpoint 7: where = a.out`t1::t1() + 20 at break.cpp:6:7, address =
>>>>> 0x0000000000401174
>>>>> (lldb) b 8
>>>>> Breakpoint 8: where = a.out`t1::t1() + 20 at break.cpp:6:7, address =
>>>>> 0x0000000000401174
>>>>> (lldb) b 9
>>>>> Breakpoint 9: where = a.out`t1::t1() + 33 at break.cpp:9:3, address =
>>>>> 0x0000000000401181
>>>>> (lldb) r
>>>>> 
>>>>>> BTW, the check wouldn't have affected code from .defs files because I 
>>>>>> only do it if the original breakpoint specification and the "function 
>>>>>> start" are from the same source file.
>>>>> 
>>>>> Sure enough - no doubt someone out there has some file that #includes
>>>>> itself in some entertaining way, but likely quite rare.
>>>> 
>>>> Given that the penalty you have to pay for such a deed is having to be a 
>>>> little more careful how you set breakpoints, I don't feel the need for 
>>>> super-human efforts in this area.
>>>> 
>>>> Jim
>>>> 
>>>> 
>>>>> 
>>>>>> And we know about inlined blocks so inlining isn't going to fool us 
>>>>>> either.
>>>>>> 
>>>>>> Jim
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Jan 15, 2021, at 5:09 PM, David Blaikie <dblai...@gmail.com> wrote:
>>>>>>> 
>>>>>>> "But because their source lines are outside the function source range"
>>>>>>> 
>>>>>>> Not sure I understand that - the DWARF doesn't describe a function
>>>>>>> source range, right? Only the line a function starts on. And a
>>>>>>> function can have code from source lines in many files/offsets that
>>>>>>> are unrelated to the function start line (LLVM in several places
>>>>>>> #includes .def files into functions to stamp out tables, switches,
>>>>>>> arrays, etc, for instance)
>>>>>>> 
>>>>>>> On Fri, Jan 15, 2021 at 4:47 PM Jim Ingham via Phabricator via
>>>>>>> lldb-commits <lldb-commits@lists.llvm.org> wrote:
>>>>>>>> 
>>>>>>>> jingham created this revision.
>>>>>>>> jingham requested review of this revision.
>>>>>>>> Herald added a project: LLDB.
>>>>>>>> Herald added a subscriber: lldb-commits.
>>>>>>>> 
>>>>>>>> The inline initializers contribute code to the constructor(s).  You 
>>>>>>>> will step onto them in the source view as you step through the 
>>>>>>>> constructor, for instance.  But because their source lines are outside 
>>>>>>>> the function source range, lldb thought a breakpoint on the 
>>>>>>>> initializer line was crossing from one function to another, which file 
>>>>>>>> & line breakpoints don't allow.  That meant if you tried to set a 
>>>>>>>> breakpoint on one of these lines it doesn't create any locations.
>>>>>>>> 
>>>>>>>> This patch fixes that by asserting that if the LineEntry in one of the 
>>>>>>>> SymbolContexts that the search produced exactly matches the file & 
>>>>>>>> line specifications in the breakpoint, it has to be a valid place to 
>>>>>>>> set the breakpoint, and we should just set it.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Repository:
>>>>>>>> rG LLVM Github Monorepo
>>>>>>>> 
>>>>>>>> https://reviews.llvm.org/D94846
>>>>>>>> 
>>>>>>>> Files:
>>>>>>>> lldb/source/Breakpoint/BreakpointResolverFileLine.cpp
>>>>>>>> lldb/test/API/lang/cpp/break-on-initializers/Makefile
>>>>>>>> lldb/test/API/lang/cpp/break-on-initializers/TestBreakOnCPP11Initializers.py
>>>>>>>> lldb/test/API/lang/cpp/break-on-initializers/main.cpp
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> lldb-commits mailing list
>>>>>>>> lldb-commits@lists.llvm.org
>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Re: [Lldb-commits] [PATCH] D94846: Allow breakpoints to be set on C++11 inline initializers

Reply via email to