Re: [Lldb-commits] [PATCH] D94846: Allow breakpoints to be set on C++11 inline initializers

David Blaikie via lldb-commits Tue, 19 Jan 2021 15:34:27 -0800

On Tue, Jan 19, 2021 at 2:55 PM Jim Ingham <jing...@apple.com> wrote:
>
>
>
> > On Jan 19, 2021, at 11:40 AM, David Blaikie <dblai...@gmail.com> wrote:
> >
> > On Tue, Jan 19, 2021 at 10:17 AM Jim Ingham <jing...@apple.com> wrote:
> >>
> >>
> >>
> >>> On Jan 17, 2021, at 10:47 AM, David Blaikie <dblai...@gmail.com> wrote:
> >>>
> >>> On Fri, Jan 15, 2021 at 6:22 PM Jim Ingham <jing...@apple.com> wrote:
> >>>>
> >>>> If you set a breakpoint on source lines with no line table entries, lldb 
> >>>> slides the actual match forward to the nearest line table entry to the 
> >>>> given line number.  That's necessary for instance because if you lay out 
> >>>> your function definitions over multiple lines they won't all get line 
> >>>> table entries, but we don't want people to have to guess which of them 
> >>>> was...  Same is true of more complicated compound statements.
> >>>
> >>> Ah, sure - gdb seems to do that too, totally makes sense - as you say,
> >>> it'd be pretty hard to know exactly which tokens end up with
> >>> corresponding entries in the line table and which don't (especially
> >>> under optimizations and complex compound statements).
> >>
> >> Yeah, I think you either have to have this heuristic, or have a debugger 
> >> that can display "valid breakpoint locations".  The old Metrowerks 
> >> debugger used to do it that way.  They only put up breakpoint affordances 
> >> in the UI for the lines that had line table entries and there was no 
> >> free-form way to set breakpoints, so you couldn't do an ambiguous thing.  
> >> But I don't think we have that option in lldb.
> >
> > Could be nice to have in source printing on the command line and could
> > be done more fully in IDE integrations like XCode - but yeah, still
> > would want all this sliding stuff for raw command line usage even if
> > we had the rest.
>
> If we just cared about lines and not columns, it would be pretty easy to 
> annotate the source line output, maybe just a • before each breakable line.  
> Marking the column locations as well, while keeping the whole thing readable, 
> is a bit trickier.
>
> The problem with Xcode is that they don't do full compiles as you are 
> editing.  They do run the front end for code completion, brace matching, 
> etc...  But that doesn't generate line tables.  So you would have to support 
> breakpoints in files where you do and don't know the valid line table 
> entries.  If you didn't design for using the compiled code for UI in the IDE, 
> it's not easy to add it in after the fact.
>
> >
> >>>> So people like that, but they really hate it when they have:
> >>>>
> >>>>
> >>>> #ifdef SOMETHING_NOT_DEFINED
> >>>>
> >>>> int foo() {
> >>>>
> >>>> }
> >>>>
> >>>> #else
> >>>> int bar() {
> >>>>
> >>>> }
> >>>> #end
> >>>>
> >>>> but with lots more junk so you can't see the ifdef's and then they set a 
> >>>> breakpoint in the "int foo" part and the breakpoint gets moved to bar 
> >>>> instead and then doesn't get hit.
> >>>
> >>> Ah, indeed - that is a curious case that could be surprising to a
> >>> user. Thanks for explaining it - though it seems gdb doesn't special
> >>> case this & does produce the not-quite-what-the-user-intended
> >>> behavior.
> >>>
> >>>> You might try to argue that they should have checked where the 
> >>>> breakpoint actually landed before coming into your office to yell at 
> >>>> you, but it's likely to leave you with fewer friends...
> >>>
> >>> FWIW: I don't have coworkers or friends come into my office to yell at
> >>> me about anything like this, and I don't really think anyone should -
> >>> that's not appropriate behavior for a workplace.
> >>>
> >>> Having a nuanced discussion about the tradeoff of features - sure.
> >>
> >> Yeah, IME people get cheesed off at debuggers in ways they don't with 
> >> compilers.  I have some theories, but more of the after work quality than 
> >> the public mailing list quality.
> >>
> >>>> So I was trying to detect this case and not move the breakpoint if 
> >>>> sliding it crossed over the function start boundary.  That way you'd see 
> >>>> that the breakpoint didn't work, and go figure out why.
> >>>
> >>> Neat idea!
> >>>
> >>>> The thinko in the original version was that we were still doing this 
> >>>> when we DIDN'T have to slide the breakpoint, when we got an exact match 
> >>>> in the line table.  In that case, we shouldn't try to second guess the 
> >>>> line table at all.  That's the patch in this fix.
> >>>
> >>> Might this still leave some confusing behavior. Now it'll slide
> >>> forward into a function, but it won't slide forward into an
> >>> initializer? (so now the user has to know exactly which lines of the
> >>> initializer are attributed to the line table, making that somewhat
> >>> difficult to use?)
> >>
> >>
> >> Yes I knew this was a tradeoff of the implementation.  If I did a more in 
> >> depth search in order to slide this over to function initializer, it would 
> >> IMO add way more complexity to an already overly fiddly part of the 
> >> debugger, and the trade-off didn't seem worth it.  Most initializers have 
> >> a fairly simple structure, as opposed say to the before the body part of a 
> >> function definition (that's got to have a name, but I don't know it off 
> >> the top of my head.)  So I think the payoff would be a lot less in this 
> >> area.  I asked around here and the folks I polled agreed that it wasn't 
> >> worth the complexity.
> >>
> >>>
> >>> Testing some things out:
> >>>
> >>> $ cat break.cpp
> >>> __attribute__((optnone)) int f1() { return 1; }
> >>> struct t1 {
> >>> int     // 3
> >>>     i   // 4
> >>>     =   // 5
> >>>     f1  // 6
> >>>     ();
> >>> t1() {
> >>> }
> >>> };
> >>>
> >>> t1 v1;
> >>> int main() {
> >>> }
> >>>
> >>> The line table:
> >>>
> >>> Address            Line   Column File   ISA Discriminator Flags
> >>> ------------------ ------ ------ ------ --- ------------- -------------
> >>> 0x0000000000401140      1      0      1   0             0  is_stmt
> >>> 0x0000000000401144      1     37      1   0             0  is_stmt 
> >>> prologue_end
> >>> 0x0000000000401150     13      0      1   0             0  is_stmt
> >>> 0x0000000000401154     14      1      1   0             0  is_stmt 
> >>> prologue_end
> >>> 0x0000000000401158     14      1      1   0             0  is_stmt 
> >>> end_sequence
> >>> 0x0000000000401020      0      0      1   0             0  is_stmt
> >>> 0x0000000000401024     12      4      1   0             0  is_stmt 
> >>> prologue_end
> >>> 0x0000000000401040      0      0      1   0             0  is_stmt
> >>> 0x000000000040104b      0      0      1   0             0  is_stmt 
> >>> end_sequence
> >>> 0x0000000000401160      8      0      1   0             0  is_stmt
> >>> 0x0000000000401174      6      7      1   0             0  is_stmt 
> >>> prologue_end
> >>> 0x000000000040117f      4      7      1   0             0  is_stmt
> >>> 0x0000000000401181      9      3      1   0             0  is_stmt
> >>> 0x0000000000401187      9      3      1   0             0  is_stmt 
> >>> end_sequence
> >>>
> >>> Has entries for line 4 and 6 from the initializer (and 8 and 9 from
> >>> the ctor), but gdb doesn't seem to want to break on any of them. So
> >>> I'm not sure what gdb's doing, but it seems pretty unhelpful
> >>> (apparently gdb breaks in to the ctor too late to even let you step
> >>> into the initializers... guess that's because the initializers are
> >>> called in the prologue according to the debug info):
> >>> (gdb) b 1
> >>> Breakpoint 1 at 0x401144: file break.cpp, line 1.
> >>> (gdb) b 2
> >>> Breakpoint 2 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 3
> >>> Note: breakpoint 2 also set at pc 0x401181.
> >>> Breakpoint 3 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 4
> >>> Note: breakpoints 2 and 3 also set at pc 0x401181.
> >>> Breakpoint 4 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 5
> >>> Note: breakpoints 2, 3 and 4 also set at pc 0x401181.
> >>> Breakpoint 5 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 6
> >>> Note: breakpoints 2, 3, 4 and 5 also set at pc 0x401181.
> >>> Breakpoint 6 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 7
> >>> Note: breakpoints 2, 3, 4, 5 and 6 also set at pc 0x401181.
> >>> Breakpoint 7 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 8
> >>> Note: breakpoints 2, 3, 4, 5, 6 and 7 also set at pc 0x401181.
> >>> Breakpoint 8 at 0x401181: file break.cpp, line 9.
> >>> (gdb) b 9
> >>> Note: breakpoints 2, 3, 4, 5, 6, 7 and 8 also set at pc 0x401181.
> >>> Breakpoint 9 at 0x401181: file break.cpp, line 9.
> >>>
> >>> lldb does already seem to let me break on the initializer without this
> >>> patch - so any idea what part of this already works compared to what's
> >>> being fixed here? (though lldb breaks on line 6 even when I asked to
> >>> break on line 7 or 8, but it doesn't break on 6 when I ask to break on
> >>> 6... )
> >>
> >>
> >> Yes.  This is because the line table entry for the "first line" of a 
> >> function is also imprecise.  For instance, if you do:
> >>
> >> int
> >> foo()
> >> {
> >>
> >> }
> >>
> >> the "int" line in the function definition pretty much never gets a line 
> >> table entry, yet that's a perfectly reasonable place to set a breakpoint 
> >> on the function.  So the "am I sliding across function definitions" check 
> >> has a little slop in the comparison.  Just put a handful of blank lines 
> >> between the "();" of the struct initialization and the definition of the 
> >> constructor, and then the breakpoint won't take.
> >
> > Ah, right right - I was still a bit confused by why the line was
> > showing as line 6, but I get it - it's because that's where the first
> > instruction for function is. Hmm, actually that doesn't quite make
> > sense to me still - if I set a breakpoint at line 9, why wouldn't I
> > get a breakpoint on that line? I guess maybe lldb detects that that is
> > the line where a function starts according to the DW_AT_decl_line in
> > the DW_TAG_subprogram, and instead of breaking on line 9 per the line
> > table, it breaks on the start of the function which is line 6? Though
> > when the program... huh, it does stop at 6, even though I said break 9
> > (well, I said break 13 in my new example where I've added the
> > suggested handful of blank lines)
> >
> > Seems a bit surprising/not something I can readily explain:
> >
> > $ cat -n test.cpp
> >     1  __attribute__((optnone)) int f1() { return 1; }
> >     2  struct t1 {
> >     3    int     // 3
> >     4        i   // 4
> >     5        =   // 5
> >     6        f1  // 6
> >     7        ();
> >     8
> >     9
> >    10
> >    11
> >    12
> >    13    t1() {}
> >    14  };
> >    15  t1 v1;
> >    16  int main() {}
> > blaikie@blaikie-linux2:~/dev/scratch$ lldb ./a.out
> > (lldb) target create "./a.out"
> > Current executable set to
> > '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64).
> > (lldb) b 13
> > Breakpoint 1: where = a.out`t1::t1() + 20 at test.cpp:6:7, address =
> > 0x0000000000401174
> > (lldb) r
> > Process 554103 launched:
> > '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64)
> > Process 554103 stopped
> > * thread #1, name = 'a.out', stop reason = breakpoint 1.1
> >    frame #0: 0x0000000000401174 a.out`t1::t1(this=0x000000000040402c)
> > at test.cpp:6:7
> >   3      int     // 3
> >   4          i   // 4
> >   5          =   // 5
> > -> 6          f1  // 6
> >   7          ();
> >   8
> >   9
> > (lldb)
>
> There's another "fiddly" bit of the breakpoint setting that likely explains 
> what you are seeing.  If you set breakpoint (either by name or by file & 
> line) that happens to fall inside the prologue of a function, lldb moves the 
> breakpoint to the first instruction after the prologue.  If we don't do this 
> then when you stop in the function at this breakpoint, argument values are 
> likely to be incorrect, since the debug information generally only gives 
> correct locations after the prologue.  It also makes getting the backtrace 
> right trickier since you then have to understand where in the prologue you 
> are.
>
> Since not very many people actually need to debug the prologue sequence of a 
> function, this avoids a lot of confusion at the cost of a just a smidge more 
> mystery.  gdb does this as well (or at least it did the last time I played 
> with it.)  There is a setting to not skip the prologue, which you need to do 
> for instance if you want to ensure that you stop while the arguments are 
> still all in their ABI specified locations.  But the default is to do the 
> slide.
>
> So if the first thing that happens after the prologue in t1() is the 
> initialization of i, then the sliding of the breakpoint past the prologue 
> would move it from line 13 to line 6, as you saw.  And we don't let the 
> prologue slide cross function boundaries so we don't do a "cross function 
> boundaries" check in that move.


Hmm, yep, that seems to be it. Which just makes me more confused about
what gdb's doing then (when I set a breakpoint at the function, it
seems to be breaking after all the initializers have run - well beyond
the prologue - perhaps that's what they've done to try to get the user
to where the member initializers have finished running).

Thanks for talking me through it.

- Dave

>
> If somebody has some spare time on their hands it might be fun to try to 
> build up a little report describing these moves, and then store that in the 
> location and print it out with "break list -v".  That would help when you are 
> puzzling over how your breakpoint got where it did...
>
> Jim
>
>
>
> >
> >>> (lldb) target create "./a.out"
> >>> Current executable set to
> >>> '/usr/local/google/home/blaikie/dev/scratch/a.out' (x86_64).
> >>> (lldb) b 1
> >>> Breakpoint 1: where = a.out`f1() + 4 at break.cpp:1:37, address =
> >>> 0x0000000000401144
> >>> (lldb) b 2
> >>> Breakpoint 2: no locations (pending).
> >>> WARNING:  Unable to resolve breakpoint to any actual locations.
> >>> (lldb) b 3
> >>> Breakpoint 3: no locations (pending).
> >>> WARNING:  Unable to resolve breakpoint to any actual locations.
> >>> (lldb) b 4
> >>> Breakpoint 4: no locations (pending).
> >>> WARNING:  Unable to resolve breakpoint to any actual locations.
> >>> (lldb) b 5
> >>> Breakpoint 5: no locations (pending).
> >>> WARNING:  Unable to resolve breakpoint to any actual locations.
> >>> (lldb) b 6
> >>> Breakpoint 6: no locations (pending).
> >>> WARNING:  Unable to resolve breakpoint to any actual locations.
> >>> (lldb) b 7
> >>> Breakpoint 7: where = a.out`t1::t1() + 20 at break.cpp:6:7, address =
> >>> 0x0000000000401174
> >>> (lldb) b 8
> >>> Breakpoint 8: where = a.out`t1::t1() + 20 at break.cpp:6:7, address =
> >>> 0x0000000000401174
> >>> (lldb) b 9
> >>> Breakpoint 9: where = a.out`t1::t1() + 33 at break.cpp:9:3, address =
> >>> 0x0000000000401181
> >>> (lldb) r
> >>>
> >>>> BTW, the check wouldn't have affected code from .defs files because I 
> >>>> only do it if the original breakpoint specification and the "function 
> >>>> start" are from the same source file.
> >>>
> >>> Sure enough - no doubt someone out there has some file that #includes
> >>> itself in some entertaining way, but likely quite rare.
> >>
> >> Given that the penalty you have to pay for such a deed is having to be a 
> >> little more careful how you set breakpoints, I don't feel the need for 
> >> super-human efforts in this area.
> >>
> >> Jim
> >>
> >>
> >>>
> >>>> And we know about inlined blocks so inlining isn't going to fool us 
> >>>> either.
> >>>>
> >>>> Jim
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> On Jan 15, 2021, at 5:09 PM, David Blaikie <dblai...@gmail.com> wrote:
> >>>>>
> >>>>> "But because their source lines are outside the function source range"
> >>>>>
> >>>>> Not sure I understand that - the DWARF doesn't describe a function
> >>>>> source range, right? Only the line a function starts on. And a
> >>>>> function can have code from source lines in many files/offsets that
> >>>>> are unrelated to the function start line (LLVM in several places
> >>>>> #includes .def files into functions to stamp out tables, switches,
> >>>>> arrays, etc, for instance)
> >>>>>
> >>>>> On Fri, Jan 15, 2021 at 4:47 PM Jim Ingham via Phabricator via
> >>>>> lldb-commits <lldb-commits@lists.llvm.org> wrote:
> >>>>>>
> >>>>>> jingham created this revision.
> >>>>>> jingham requested review of this revision.
> >>>>>> Herald added a project: LLDB.
> >>>>>> Herald added a subscriber: lldb-commits.
> >>>>>>
> >>>>>> The inline initializers contribute code to the constructor(s).  You 
> >>>>>> will step onto them in the source view as you step through the 
> >>>>>> constructor, for instance.  But because their source lines are outside 
> >>>>>> the function source range, lldb thought a breakpoint on the 
> >>>>>> initializer line was crossing from one function to another, which file 
> >>>>>> & line breakpoints don't allow.  That meant if you tried to set a 
> >>>>>> breakpoint on one of these lines it doesn't create any locations.
> >>>>>>
> >>>>>> This patch fixes that by asserting that if the LineEntry in one of the 
> >>>>>> SymbolContexts that the search produced exactly matches the file & 
> >>>>>> line specifications in the breakpoint, it has to be a valid place to 
> >>>>>> set the breakpoint, and we should just set it.
> >>>>>>
> >>>>>>
> >>>>>> Repository:
> >>>>>> rG LLVM Github Monorepo
> >>>>>>
> >>>>>> https://reviews.llvm.org/D94846
> >>>>>>
> >>>>>> Files:
> >>>>>> lldb/source/Breakpoint/BreakpointResolverFileLine.cpp
> >>>>>> lldb/test/API/lang/cpp/break-on-initializers/Makefile
> >>>>>> lldb/test/API/lang/cpp/break-on-initializers/TestBreakOnCPP11Initializers.py
> >>>>>> lldb/test/API/lang/cpp/break-on-initializers/main.cpp
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> lldb-commits mailing list
> >>>>>> lldb-commits@lists.llvm.org
> >>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
>
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Re: [Lldb-commits] [PATCH] D94846: Allow breakpoints to be set on C++11 inline initializers

Reply via email to