Re: [Dwarf-Discuss] variable locations - safe use as lvalues

2020-01-20 Thread Michael Eager via Dwarf-Discuss

On 1/20/20 6:21 PM, Frank Ch. Eigler via Dwarf-Discuss wrote:

> > Complication 2: The compiler reuses variable locations at the same PC. This
> > seems to be a compiler bug.
>
> (Actually, this could be a valid optimization, e.g.:
>
>    int a = expression;
>    int b = expression;
>   >
>    /* use/modify a */
>    /* use/modify b */
>
> at that point, if $expression is a pure function, a compiler
> could evaluate it once and reuse the value.  It could do this
> by temporarily storing both a and b in the same register, and
> only separating the variables afterwards.)


Yes, where b is a copy of a, they can occupy the same location.  That 
does raise the question about what happens when the user instructs the 
debugger to change b, expecting that a will not be affected.
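
A minimal sketch of that concern in C, assuming a toy model that is not taken from any real debugger: the names `machine`, `variable`, `debugger_read`, `debugger_write` and `value_of_a_after_setting_b` are all hypothetical. It only illustrates that a debugger writes a location rather than a variable, so two variables described as living in the same register change together.

```c
#include <assert.h>

/* Hypothetical model: a tiny register file and source variables whose
   location descriptions name a register at the current PC. */
enum { NUM_REGS = 4 };

struct machine {
    int regs[NUM_REGS];
};

struct variable {
    const char *name;
    int reg;            /* register holding the variable at this PC */
};

/* What a debugger would do to print a variable: read its location. */
int debugger_read(const struct machine *m, const struct variable *v)
{
    assert(v->reg >= 0 && v->reg < NUM_REGS);
    return m->regs[v->reg];
}

/* What a debugger would do to set a variable: write its location.
   Every variable that shares the location changes as well. */
void debugger_write(struct machine *m, const struct variable *v, int value)
{
    assert(v->reg >= 0 && v->reg < NUM_REGS);
    m->regs[v->reg] = value;
}

/* Returns what the user sees for a after setting b, when the compiler
   has placed both in register 1 because b is a copy of a. */
int value_of_a_after_setting_b(int initial, int new_b)
{
    struct machine m = { { 0 } };
    struct variable a = { "a", 1 };
    struct variable b = { "b", 1 };   /* same register as a */

    m.regs[1] = initial;              /* a = b = expression */
    debugger_write(&m, &b, new_b);
    return debugger_read(&m, &a);
}
```

In this model, `value_of_a_after_setting_b(7, 42)` yields 42: the user asked to change only b, yet a changed too.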



> > Complication 3a: That the value of a variable has been fetched for a
> > computation before the debugger modifies it.  This is more complicated. The
> > live range of the variable is accurate, but its value has been used before
> > the current PC.  DWARF does not include descriptions of data flow or
> > indicate where variables are fetched, so there is no information that a
> > debugger can use to assure that a modified value is actually used.
>
> Yeah, and in the absence of dataflow metadata, tools like dyninst must
> try to reverse-engineer the dataflow in order to perform binary
> rewriting.  Is this something way out of foreseeable dwarf scope?


I think that a description of the data flow would be large: essentially, 
it would copy most of the IR data into DWARF.  If you have ideas about 
how to represent a compressed data flow graph, let us know.  There might 
only be a need for a limited subset.



> > There are a lot of issues with a debugger modifying a program while it is
> > running.  A debugger can make essentially unbounded changes to the program
> > state.  Some of these may work as expected, some may not, and it is unclear
> > how a debugger would be able to know the difference.
>
> This is the key question: how can a tool know what is safe.  While the
> trivial case of assuming every write is unsafe is not helpful :-), it
> could be okay to have fairly conservative heuristics, known-partial
> information, and rely on only clear signals to enable write
> operations.


I can't think of any heuristic that would work, unless the debugger does 
an analysis of the generated code to find where variables are actively 
being used.  It also seems most likely that a user might want to modify 
variables exactly when they are being used, not when they are quiescent.



> Anyway, it sounds like the next step is on us to analyze & prototype
> in a compiler (gcc?).  I'd also appreciate authors of other dwarf
> consumer and producer tools to mention whether they have considered
> this area, so as to collect a census.



--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] variable locations - safe use as lvalues

2020-01-20 Thread Frank Ch. Eigler via Dwarf-Discuss
Hi -

> > - ... and undoubtedly other complications exist!
> 
> Interesting question.

Thanks!

We have been thinking in similar directions generally as y'all.


> Complication 2: The compiler reuses variable locations at the same PC. This
> seems to be a compiler bug.  

(Actually, this could be a valid optimization, e.g.:

   int a = expression;
   int b = expression;
  >
   /* use/modify a */
   /* use/modify b */

at that point, if $expression is a pure function, a compiler
could evaluate it once and reuse the value.  It could do this
by temporarily storing both a and b in the same register, and
only separating the variables afterwards.)

> [...]
> Presumably, a debugger could check that location lists do not
> overlap.

This could nevertheless be a valid heuristic to detect the case.


> Complication 3a: That the value of a variable has been fetched for a
> computation before the debugger modifies it.  This is more complicated. The
> live range of the variable is accurate, but its value has been used before
> the current PC.  DWARF does not include descriptions of data flow or
> indicate where variables are fetched, so there is no information that a
> debugger can use to assure that a modified value is actually used.

Yeah, and in the absence of dataflow metadata, tools like dyninst must
try to reverse-engineer the dataflow in order to perform binary
rewriting.  Is this something way out of foreseeable dwarf scope?


> There are a lot of issues with a debugger modifying a program while it is
> running.  A debugger can make essentially unbounded changes to the program
> state.  Some of these may work as expected, some may not, and it is unclear
> how a debugger would be able to know the difference.

This is the key question: how can a tool know what is safe.  While the
trivial case of assuming every write is unsafe is not helpful :-), it
could be okay to have fairly conservative heuristics, known-partial
information, and rely on only clear signals to enable write
operations.


Anyway, it sounds like the next step is on us to analyze & prototype
in a compiler (gcc?).  I'd also appreciate authors of other dwarf
consumer and producer tools to mention whether they have considered
this area, so as to collect a census.


- FChE



Re: [Dwarf-Discuss] variable locations - safe use as lvalues

2020-01-20 Thread Michael Eager via Dwarf-Discuss

On 1/20/20 2:20 PM, Frank Ch. Eigler via Dwarf-Discuss wrote:

> Hi -
>
> I have a question about variable location lists: not their
> encoding, but the uses they are suitable for.  The basic debugging scenario
> is just reading variable values, for which this is fine, especially
> when high-quality compilers emit exquisitely detailed data for their
> optimized code.
>
> But what about writes - as though one could edit the program to insert
> an assignment, and resume?  A whole slew of complications come up:
>
> - trying to modify a variable permanently, but the compiler only
>   emitted -some- of its locations
>
> - trying to modify a variable (and only one), but the compiler put two
>   variables into the same location at that PC
>
> - expressions using that value as input might have already started
>   to be computed, so it may be too late to change it at the PC
>   in question
>
> - ... and undoubtedly other complications exist!


Interesting question.

Complication 1: That the compiler only emitted partial descriptions for 
the variable:  this seems to be a quality of implementation issue. 
There is nothing that a debugger can do if the compiler generates 
incomplete or misleading descriptions.  There is also no way that a 
debugger can ascertain that the compiler has generated a complete or 
accurate description.  Remedy: Fix the compiler.


Complication 2: The compiler reuses variable locations at the same PC. 
This seems to be a compiler bug.  While a location (e.g., a register) 
can be the location for multiple variables, the live ranges for these 
variables should not overlap.  The location lists for all variables 
should be disjoint.  Presumably, a debugger could check that location 
lists do not overlap.
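
A minimal sketch of such a check in C, under simplifying assumptions: each location list entry is reduced to a half-open PC range plus a register number, and the names `loc_entry`, `entries_conflict` and `loclists_overlap` are hypothetical. Real location descriptions are arbitrary DWARF expressions, so comparing locations is harder than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified location list entry: the variable lives in
   register `reg` for PCs in the half-open range [lo, hi). */
struct loc_entry {
    uint64_t lo;
    uint64_t hi;
    int reg;
};

/* Two entries conflict when they name the same register and their
   PC ranges intersect.  Empty ranges never conflict. */
static bool entries_conflict(const struct loc_entry *x,
                             const struct loc_entry *y)
{
    assert(x->lo <= x->hi && y->lo <= y->hi);
    if (x->lo == x->hi || y->lo == y->hi)
        return false;
    return x->reg == y->reg && x->lo < y->hi && y->lo < x->hi;
}

/* Returns true if any entry of variable a shares a register with any
   entry of variable b at some PC. */
bool loclists_overlap(const struct loc_entry *a, size_t na,
                      const struct loc_entry *b, size_t nb)
{
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (entries_conflict(&a[i], &b[j]))
                return true;
    return false;
}
```

A debugger could run such a check for the variable being written against the other variables in scope and refuse, or warn about, the write when an overlap is found.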


For complication 3, an example might be:

        load  r1, =1
        add   r1, var
PC ==>  store r1, var

There might be arbitrary additional instructions for multiple source 
statements interspersed.


This has two variants:

Complication 3a: That the value of a variable has been fetched for a 
computation before the debugger modifies it.  This is more complicated. 
The live range of the variable is accurate, but its value has been used 
before the current PC.  DWARF does not include descriptions of data flow 
or indicate where variables are fetched, so there is no information that 
a debugger can use to assure that a modified value is actually used.


Complication 3b: That a variable's value may be modified after the 
debugger changes it at PC.  This is essentially a race condition.  Both 
the program and the debugger are updating the variable.  Last one wins.
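
The three-instruction example above can be simulated with a short C sketch. This is a toy model, not generated code: the function `run_increment` and its `stop_before` parameter are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Hypothetical model of the three-instruction example:
       0: load  r1, =1
       1: add   r1, var
       2: store r1, var
   `stop_before` is the index (0..2) of the instruction about to execute
   when the debugger writes `new_value` into var; pass 3 for no write.
   Returns the final value of var. */
int run_increment(int var, int stop_before, int new_value)
{
    int r1 = 0;

    assert(stop_before >= 0 && stop_before <= 3);
    for (int pc = 0; pc < 3; pc++) {
        if (pc == stop_before)
            var = new_value;          /* debugger writes var's memory */
        switch (pc) {
        case 0: r1 = 1;        break; /* load  r1, =1  */
        case 1: r1 = r1 + var; break; /* add   r1, var */
        case 2: var = r1;      break; /* store r1, var */
        }
    }
    return var;
}
```

With var initially 10, a debugger write of 100 made while stopped at the store (`stop_before` of 2) is lost and var ends up as 11. The same write made before the add is picked up and var ends up as 101.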


> A debugger cannot currently be told that any particular variable
> location expression is safe to use as an lvalue, something roughly
> "exclusive, exhaustive, -O0-equivalent".  I believe most debuggers
> don't even handle the multiple-locations case for writes at all.  I
> don't know why - assume complications are rare?  or we have kind of
> papered over the problem?

There are a lot of issues with a debugger modifying a program while it 
is running.  A debugger can make essentially unbounded changes to the 
program state.  Some of these may work as expected, some may not, and it 
is unclear how a debugger would be able to know the difference.


There might be fewer problems modifying variables which are marked 
volatile.  But var in the example above could be volatile and the same 
issues would occur.


Are these complications rare?  Unclear.  I think that the great majority 
of debugger use is in displaying the value of variables, not in 
modifying them.


Have we papered over the problem?  Probably.  Debugging optimized code 
is difficult, even without trying to change the program state.


> As a DWARF standard level matter, descriptive rather than prescriptive
> as it is, what might be the minimum extension required to communicate
> lvalue-safety to a debugger?  A DW_OP_lvalue_safe assertive wrapper
> around a real expression?

Conceivably, location lists could be extended to include a flag to say 
that a variable is quiescent for a particular range of PC values and 
that a modification at that time would be persistent.  (A clear 
definition of quiescent would be needed.)
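
A sketch of how a consumer might use such a flag, written in C. The flag is purely hypothetical and is not part of any DWARF version; `flagged_entry` and `write_would_persist` are names invented here, and the location itself is left out to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extension: each location list entry carries a flag
   saying the variable is quiescent over the half-open range [lo, hi),
   i.e. a modification made there would be persistent. */
struct flagged_entry {
    uint64_t lo;
    uint64_t hi;
    bool quiescent;
};

/* Permit a write only when the first entry covering pc is flagged
   quiescent.  A pc covered by no entry is treated as unsafe. */
bool write_would_persist(const struct flagged_entry *list, size_t n,
                         uint64_t pc)
{
    for (size_t i = 0; i < n; i++) {
        assert(list[i].lo <= list[i].hi);
        if (pc >= list[i].lo && pc < list[i].hi)
            return list[i].quiescent;
    }
    return false;
}
```

Treating an uncovered PC as unsafe is a design choice in this sketch: absence of information is read as "do not write".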


As Cary notes, a default or bounded location description might be used, 
but I don't believe that either implies that a variable is quiescent (or 
not) over the specified range.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306


Re: [Dwarf-Discuss] variable locations - safe use as lvalues

2020-01-20 Thread Cary Coutant via Dwarf-Discuss
> A debugger cannot currently be told that any particular variable
> location expression is safe to use as an lvalue, something roughly
> "exclusive, exhaustive, -O0-equivalent".  I believe most debuggers
> don't even handle the multiple-locations case for writes at all.  I
> don't know why - assume complications are rare?  or we have kind of
> papered over the problem?
>
> As a DWARF standard level matter, descriptive rather than prescriptive
> as it is, what might be the minimum extension required to communicate
> lvalue-safety to a debugger?  A DW_OP_lvalue_safe assertive wrapper
> around a real expression?

In DWARF 5, we added the concepts of bounded and default location
descriptions. I think it would generally be safe to modify a value
under the following conditions:

(a) The variable's location is covered by a single location
description, or by a location list with a default location description
but where no bounded location description matches the current pc.

(b) Such location description is a simple memory or register location.

(c) You're stopped at a suggested breakpoint location.

I'm sure there are circumstances where you could still get into
trouble (e.g., your third example), but I think it would be unusual,
and there we'd probably be talking about a quality of implementation
issue. The courteous thing for a compiler to do would be to at least
try to make this ok at suggested breakpoint locations, though I
wouldn't expect a compiler at -O2 to deliberately avoid an
optimization solely for the sake of being able to modify a variable
while stopped at a breakpoint.

In your third example, once the compiler has loaded a copy of the
variable into a register to use in a computation, it could (should?)
create a bounded location description, thus marking it unsafe under
the rules above. So, rather than mark lvalue-safe regions, a compiler
could instead use bounded location descriptions to mark a region
unsafe that wouldn't already be deemed unsafe.
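
Conditions (a) through (c) above can be sketched as a single predicate in C. The data model is a deliberate simplification and every name in it (`loc_kind`, `bounded_desc`, `var_location`, `write_looks_safe`) is hypothetical. How the caller decides that it is stopped at a suggested breakpoint location is left open and passed in as a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum loc_kind { LOC_MEMORY, LOC_REGISTER, LOC_COMPOSITE, LOC_IMPLICIT };

/* Hypothetical view of one bounded location description. */
struct bounded_desc {
    uint64_t lo, hi;            /* half-open PC range [lo, hi) */
    enum loc_kind kind;
};

/* Hypothetical view of a variable's location information. */
struct var_location {
    bool is_list;               /* false: single location description */
    bool has_default;           /* list has a default description */
    enum loc_kind kind;         /* kind of the single or default one */
    const struct bounded_desc *bounded;
    size_t n_bounded;
};

bool write_looks_safe(const struct var_location *v, uint64_t pc,
                      bool at_suggested_breakpoint)
{
    /* (c) stopped at a suggested breakpoint location */
    if (!at_suggested_breakpoint)
        return false;

    /* (a) single description, or a list whose default description
       applies because no bounded description matches pc */
    if (v->is_list) {
        if (!v->has_default)
            return false;
        for (size_t i = 0; i < v->n_bounded; i++) {
            assert(v->bounded[i].lo <= v->bounded[i].hi);
            if (pc >= v->bounded[i].lo && pc < v->bounded[i].hi)
                return false;
        }
    }

    /* (b) simple memory or register location */
    return v->kind == LOC_MEMORY || v->kind == LOC_REGISTER;
}
```

Under this rule a compiler marks a region unsafe simply by emitting a bounded location description that covers it, as the last paragraph suggests.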

-cary


[Dwarf-Discuss] variable locations - safe use as lvalues

2020-01-20 Thread Frank Ch. Eigler via Dwarf-Discuss
Hi -

I have a question about variable location lists: not their
encoding, but the uses they are suitable for.  The basic debugging scenario
is just reading variable values, for which this is fine, especially
when high-quality compilers emit exquisitely detailed data for their
optimized code.

But what about writes - as though one could edit the program to insert
an assignment, and resume?  A whole slew of complications come up:

- trying to modify a variable permanently, but the compiler only
  emitted -some- of its locations

- trying to modify a variable (and only one), but the compiler put two
  variables into the same location at that PC

- expressions using that value as input might have already started
  to be computed, so it may be too late to change it at the PC
  in question

- ... and undoubtedly other complications exist!

A debugger cannot currently be told that any particular variable
location expression is safe to use as an lvalue, something roughly
"exclusive, exhaustive, -O0-equivalent".  I believe most debuggers
don't even handle the multiple-locations case for writes at all.  I
don't know why - assume complications are rare?  or we have kind of
papered over the problem?

As a DWARF standard level matter, descriptive rather than prescriptive
as it is, what might be the minimum extension required to communicate
lvalue-safety to a debugger?  A DW_OP_lvalue_safe assertive wrapper
around a real expression?


- FChE
