Re: [Dwarf-Discuss] variable locations - safe use as lvalues
On 1/20/20 6:21 PM, Frank Ch. Eigler via Dwarf-Discuss wrote:

> > Complication 2: The compiler reuses variable locations at the same PC.
> > This seems to be a compiler bug.
>
> (Actually, this could be a valid optimization, e.g.:
>
>     int a = expression;
>     int b = expression;
>     /* use/modify a */
>     /* use/modify b */
>
> at that point, if $expression is a pure function, a compiler could
> evaluate it once and reuse the value. It could do this by temporarily
> storing both a and b in the same register, and only separating the
> variables afterwards.)

Yes, where b is a copy of a, they can occupy the same location. That does raise the question about what happens when the user instructs the debugger to change b, expecting that a will not be affected.

> > Complication 3a: That the value of a variable has been fetched for a
> > computation before the debugger modifies it. This is more complicated.
> > The live range of the variable is accurate, but its value has been used
> > before the current PC. DWARF does not include descriptions of data flow
> > or indicate where variables are fetched, so there is no information
> > that a debugger can use to assure that a modified value is actually
> > used.
>
> Yeah, and in the absence of dataflow metadata, tools like dyninst must
> try to reverse-engineer the dataflow in order to perform binary
> rewriting. Is this something way out of foreseeable dwarf scope?

I think that describing the data flow would be large. Essentially, copy most of the IR data into DWARF. If you have ideas about how to represent a compressed data flow graph, let us know. There might only be a need for a limited subset.

> > There are a lot of issues with a debugger modifying a program while it
> > is running. A debugger can make essentially unbounded changes to the
> > program state. Some of these may work as expected, some may not, and it
> > is unclear how a debugger would be able to know the difference.
>
> This is the key question: how can a tool know what is safe? While the
> trivial case of assuming every write is unsafe is not helpful :-), it
> could be okay to have fairly conservative heuristics, known-partial
> information, and rely on only clear signals to enable write operations.

I can't think of any heuristic that would work, unless the debugger does an analysis of the generated code to find where variables are actively being used. It also seems most likely that a user might want to modify variables exactly when they are being used, not where they are quiescent.

> Anyway, it sounds like the next step is on us to analyze & prototype in
> a compiler (gcc?). I'd also appreciate authors of other dwarf consumer
> and producer tools to mention whether they have considered this area,
> so as to collect a census.

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
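The shared-location concern raised in this message (the user changes b and expects a to be unaffected) can be illustrated with a toy model. This is only a sketch: the register name `r1`, the `locations_at_pc` table, and the helper functions are hypothetical and do not correspond to any real debugger or DWARF API.

```python
# Toy model: two source variables whose location descriptions both
# resolve to the same register at the current PC.

registers = {"r1": 42}

# Hypothetical per-PC location table: variable name -> register name.
locations_at_pc = {"a": "r1", "b": "r1"}


def read_var(name):
    """Read a variable through its current location."""
    return registers[locations_at_pc[name]]


def write_var(name, value):
    """Write a variable through its current location, as a debugger would."""
    registers[locations_at_pc[name]] = value


def aliases_of(name):
    """Other variables that share this variable's location at the current PC."""
    loc = locations_at_pc[name]
    return sorted(v for v, l in locations_at_pc.items() if l == loc and v != name)


before_a = read_var("a")
write_var("b", 7)        # the user only asked to change b ...
after_a = read_var("a")  # ... but a is read back through the same register
```

A debugger with this information could at least warn that writing b also changes everything listed by `aliases_of("b")`; whether it should refuse the write is the open question in the thread.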
Re: [Dwarf-Discuss] variable locations - safe use as lvalues
Hi -

> > - ... and undoubtedly other complications exist!
>
> Interesting question.

Thanks! We have been thinking in similar directions generally as y'all.

> Complication 2: The compiler reuses variable locations at the same PC. This
> seems to be a compiler bug.

(Actually, this could be a valid optimization, e.g.:

    int a = expression;
    int b = expression;
    /* use/modify a */
    /* use/modify b */

at that point, if $expression is a pure function, a compiler could evaluate it once and reuse the value. It could do this by temporarily storing both a and b in the same register, and only separating the variables afterwards.)

> [...]
> Presumably, a debugger could check that location lists do not
> overlap.

This could nevertheless be a valid heuristic to detect the case.

> Complication 3a: That the value of a variable has been fetched for a
> computation before the debugger modifies it. This is more complicated. The
> live range of the variable is accurate, but its value has been used before
> the current PC. DWARF does not include descriptions of data flow or
> indicate where variables are fetched, so there is no information that a
> debugger can use to assure that a modified value is actually used.

Yeah, and in the absence of dataflow metadata, tools like dyninst must try to reverse-engineer the dataflow in order to perform binary rewriting. Is this something way out of foreseeable dwarf scope?

> There are a lot of issues with a debugger modifying a program while it is
> running. A debugger can make essentially unbounded changes to the program
> state. Some of these may work as expected, some may not, and it is unclear
> how a debugger would be able to know the difference.

This is the key question: how can a tool know what is safe? While the trivial case of assuming every write is unsafe is not helpful :-), it could be okay to have fairly conservative heuristics, known-partial information, and rely on only clear signals to enable write operations.
Anyway, it sounds like the next step is on us to analyze & prototype in a compiler (gcc?). I'd also appreciate authors of other dwarf consumer and producer tools to mention whether they have considered this area, so as to collect a census.

- FChE
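The overlap heuristic mentioned in this message (checking whether two variables' location lists name the same location over intersecting PC ranges) might look roughly like the following. This is a sketch under simplifying assumptions: each location list entry is reduced to a half-open PC range plus an opaque location string, and the `Entry` type, `find_overlaps` function, and sample addresses are all made up for illustration. Real location descriptions can be composite or computed, which this ignores.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    low_pc: int    # inclusive
    high_pc: int   # exclusive
    location: str  # simplified: e.g. "reg1"; real DWARF expressions are richer


def find_overlaps(loclists):
    """Return (var_a, var_b, location, low, high) for each pair of variables
    whose entries name the same location over intersecting PC ranges."""
    hits = []
    names = sorted(loclists)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for ea in loclists[a]:
                for eb in loclists[b]:
                    if ea.location != eb.location:
                        continue
                    low = max(ea.low_pc, eb.low_pc)
                    high = min(ea.high_pc, eb.high_pc)
                    if low < high:  # non-empty intersection
                        hits.append((a, b, ea.location, low, high))
    return hits


# Hypothetical location lists for three variables.
loclists = {
    "a": [Entry(0x100, 0x120, "reg1"), Entry(0x120, 0x140, "reg2")],
    "b": [Entry(0x110, 0x130, "reg1")],
    "c": [Entry(0x130, 0x150, "reg1")],
}
overlaps = find_overlaps(loclists)
```

Here only a and b are reported, for the range 0x110 to 0x120. As noted in the thread, an overlap does not by itself prove a compiler bug (b may legitimately be a copy of a), so a tool would more plausibly treat a hit as "not safe to write" than as an error.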
Re: [Dwarf-Discuss] variable locations - safe use as lvalues
On 1/20/20 2:20 PM, Frank Ch. Eigler via Dwarf-Discuss wrote:

> Hi -
>
> I have a question about variable location lists, but not their
> encoding, the use they are suitable for. The basic debugging scenario
> is just reading variable values, for which this is fine, especially
> when high-quality compilers emit exquisitely detailed data for their
> optimized code. But what about writes - as though one could edit the
> program to insert an assignment, and resume? A whole slew of
> complications come up:
>
> - trying to modify a variable permanently, but the compiler only
>   emitted -some- of its locations
> - trying to modify a variable (and only one), but the compiler put
>   two variables into the same location at that PC
> - expressions using that value as input might have already started
>   to be computed, so it may be too late to change it at the PC in
>   question
> - ... and undoubtedly other complications exist!

Interesting question.

Complication 1: That the compiler only emitted partial descriptions for the variable: this seems to be a quality of implementation issue. There is nothing that a debugger can do if the compiler generates incomplete or misleading descriptions. There is also no way that a debugger can ascertain that the compiler has generated a complete or accurate description. Remedy: Fix the compiler.

Complication 2: The compiler reuses variable locations at the same PC. This seems to be a compiler bug. While a location (e.g., a register) can be the location for multiple variables, the live ranges for these variables should not overlap. The location lists for all variables should be disjoint. Presumably, a debugger could check that location lists do not overlap.

For complication 3, an example might be

           load   r1, =1
           add    r1, var
    PC ==> store  r1, var

There might be arbitrary additional instructions for multiple source statements interspersed. This has two variants:

Complication 3a: That the value of a variable has been fetched for a computation before the debugger modifies it. This is more complicated. The live range of the variable is accurate, but its value has been used before the current PC. DWARF does not include descriptions of data flow or indicate where variables are fetched, so there is no information that a debugger can use to assure that a modified value is actually used.

Complication 3b: That a variable's value may be modified after the debugger changes it at PC. This is essentially a race condition. Both the program and the debugger are updating the variable. Last one wins.

> A debugger cannot currently be told that any particular variable
> location expression is safe to use as an lvalue, something roughly
> "exclusive, exhaustive, -O0-equivalent". I believe most debuggers
> don't even handle the multiple-locations case for writes at all. I
> don't know why - assume complications are rare? or we have kind of
> papered over the problem?

There are a lot of issues with a debugger modifying a program while it is running. A debugger can make essentially unbounded changes to the program state. Some of these may work as expected, some may not, and it is unclear how a debugger would be able to know the difference.

There might be fewer problems modifying variables which are marked volatile. But var in the example above could be volatile and the same issues would occur.

Are these complications rare? Unclear. I think that the great majority of debugger use is in displaying the value of variables, not in modifying them.

Have we papered over the problem? Probably. Debugging optimized code is difficult, even without trying to change the program state.

> As a DWARF standard level matter, descriptive rather than prescriptive
> as it is, what might be the minimum extension required to communicate
> lvalue-safety to a debugger? A DW_OP_lvalue_safe assertive wrapper
> around a real expression?
Conceivably, location lists could be extended to include a flag to say that a variable is quiescent for a particular range of PC values and that a modification at that time would be persistent. (A clear definition of quiescent would be needed.)

As Cary notes, a default or bounded location description might be used, but I don't believe that either implies that a variable is quiescent (or not) over the specified range.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306
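The load/add/store example for complication 3 in this message can be walked through with a small simulation. This is a sketch only: the `run` function, the initial value 10, and the debugger's value 100 are made up for illustration, and the three pseudo-instructions are modeled as plain Python statements.

```python
def run(debugger_write=None):
    """Execute the three-instruction sequence; optionally let a debugger
    overwrite `var` while stopped just before the store."""
    mem = {"var": 10}
    r1 = 1                 # load  r1, =1
    r1 = r1 + mem["var"]   # add   r1, var   (var is fetched here: 3a)
    # PC ==> the debugger is stopped here, before the store
    if debugger_write is not None:
        mem["var"] = debugger_write
    mem["var"] = r1        # store r1, var   (program write wins: 3b)
    return mem["var"]


untouched = run()
modified = run(debugger_write=100)
```

Both runs end with var equal to 11. The debugger's write is accepted and then silently lost, even though var's location description (a plain memory location) is accurate at that PC.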
Re: [Dwarf-Discuss] variable locations - safe use as lvalues
> A debugger cannot currently be told that any particular variable
> location expression is safe to use as an lvalue, something roughly
> "exclusive, exhaustive, -O0-equivalent". I believe most debuggers
> don't even handle the multiple-locations case for writes at all. I
> don't know why - assume complications are rare? or we have kind of
> papered over the problem?
>
> As a DWARF standard level matter, descriptive rather than prescriptive
> as it is, what might be the minimum extension required to communicate
> lvalue-safety to a debugger? A DW_OP_lvalue_safe assertive wrapper
> around a real expression?

In DWARF 5, we added the concepts of bounded and default location descriptions. I think it would generally be safe to modify a value under the following conditions:

(a) The variable's location is covered by a single location description, or by a location list with a default location description but where no bounded location description matches the current pc.

(b) Such location description is a simple memory or register location.

(c) You're stopped at a suggested breakpoint location.

I'm sure there are circumstances where you could still get into trouble (e.g., your third example), but I think it would be unusual, and there we'd probably be talking about a quality of implementation issue. The courteous thing for a compiler to do would be to at least try to make this ok at suggested breakpoint locations, though I wouldn't expect a compiler at -O2 to deliberately avoid an optimization solely for the sake of being able to modify a variable while stopped at a breakpoint.

In your third example, once the compiler has loaded a copy of the variable into a register to use in a computation, it could (should?) create a bounded location description, thus marking it unsafe under the rules above. So, rather than mark lvalue-safe regions, a compiler could instead use a bounded location description to mark a region unsafe that wouldn't already be deemed unsafe.
-cary
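Conditions (a) through (c) in this message could be expressed as a single predicate, roughly as follows. This is a sketch under stated assumptions: `LocDesc`, its `kind` strings, and `lvalue_probably_safe` are hypothetical names, a description with no PC bounds stands in for both "a single location description" and "a default location description", and the caller is assumed to already know whether the current PC is a suggested breakpoint location.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocDesc:
    kind: str                      # "memory", "register", or e.g. "composite"
    low_pc: Optional[int] = None   # None means unbounded (single or default)
    high_pc: Optional[int] = None  # exclusive upper bound when bounded

    def is_bounded(self) -> bool:
        return self.low_pc is not None and self.high_pc is not None

    def matches(self, pc: int) -> bool:
        return self.is_bounded() and self.low_pc <= pc < self.high_pc


SIMPLE_KINDS = {"memory", "register"}


def lvalue_probably_safe(descs, pc, at_suggested_breakpoint):
    """Apply conditions (a)-(c); True means 'generally safe', not guaranteed."""
    # (c) only consider writes while stopped at a suggested breakpoint.
    if not at_suggested_breakpoint:
        return False
    # (a) no bounded description may match the current pc ...
    if any(d.matches(pc) for d in descs):
        return False
    # ... and exactly one single/default description must cover the variable.
    unbounded = [d for d in descs if not d.is_bounded()]
    if len(unbounded) != 1:
        return False
    # (b) that description must be a simple memory or register location.
    return unbounded[0].kind in SIMPLE_KINDS


single = [LocDesc("memory")]
with_copy = [LocDesc("memory"), LocDesc("register", 0x100, 0x110)]
```

With `with_copy`, a write is rejected for any pc in 0x100 up to (not including) 0x110 and allowed elsewhere, which mirrors the suggestion that a compiler can mark a region unsafe by emitting a bounded location description for it.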
[Dwarf-Discuss] variable locations - safe use as lvalues
Hi -

I have a question about variable location lists, but not their encoding, the use they are suitable for. The basic debugging scenario is just reading variable values, for which this is fine, especially when high-quality compilers emit exquisitely detailed data for their optimized code. But what about writes - as though one could edit the program to insert an assignment, and resume? A whole slew of complications come up:

- trying to modify a variable permanently, but the compiler only emitted -some- of its locations
- trying to modify a variable (and only one), but the compiler put two variables into the same location at that PC
- expressions using that value as input might have already started to be computed, so it may be too late to change it at the PC in question
- ... and undoubtedly other complications exist!

A debugger cannot currently be told that any particular variable location expression is safe to use as an lvalue, something roughly "exclusive, exhaustive, -O0-equivalent". I believe most debuggers don't even handle the multiple-locations case for writes at all. I don't know why - assume complications are rare? or we have kind of papered over the problem?

As a DWARF standard level matter, descriptive rather than prescriptive as it is, what might be the minimum extension required to communicate lvalue-safety to a debugger? A DW_OP_lvalue_safe assertive wrapper around a real expression?

- FChE