Re: Statement Frontier Notes, Location Views, and Inlined Entry Point Markers

Alexandre Oliva Fri, 25 Aug 2017 07:26:53 -0700

On Aug 23, 2017, Richard Biener <richard.guent...@gmail.com> wrote:

>>> if they are not a problem up until here why care now?


>> IIRC we do have a limit for VTA notes too, but there's a C++ testcase
>> (g++.dg/tree-ssa/pr14703.C) that expands and inlines fibonacci template
>> functions so deep, more than doubling the number of statements at all
>> but the base recursion levels, so we'd end up with over 2^{85+} debug
>> stmts if we didn't cut them off somehow.

> Yeah, but I meant we've kept them throughout GIMPLE (for all functions!)
> but are dropping them here at RTL expansion (which we'll have only a
> single live RTL function at a time).  That looks odd ;)

Aah, yeah, the point is, if we find we exceeded the limit, we don't
bother to clean up the gimple, we just refrain from wasting further time
with it, which we would if we converted them to RTL (and then threw them
away), or copied them all when inlining into some other function.  We
could clean up at some point, just as we could stop emitting further
markers once the limit is reached, but it didn't seem important enough
to do so.  Should it prove to be, I guess it wouldn't be too hard to add
it to gimple verification passes that walk over all stmts.

> You're already dropping them at inlining as well so the RTL expansion
> check should be superfluous IMHO (yeah, unrolling might push it over
> the edge for example but all real issues should come from inlining).

The RTL expansion check is indeed not essential, but if we're over the
limit, we'll throw it all away anyway, so why bother expanding it and
carrying it through all RTL passes just to throw it away at the end?  Or
should we not throw it away in this case, and make the limit apply only
to inlining?  But then, what if we inline lots of very large functions
into a single one, do we still want to use markers for that function?
That's not how I designed it, but I guess it might work that way too.


> Hmm, yeah.  I guess we'd have to have a multi-DEBUG_STMT that covers
> not only multiple markers but also multiple binds.  High GIMPLE has
> nested stmts so it might be tempting to wrap adjacent debug-stmts into
> a single one (basically make the IL walking overhead with debug stmts 
> smaller).
> Costs extra memory instead of less when compared to my idea of course.

Yeah.  I guess that's doable and it won't make gimple passes much
trickier: in most cases all that matters are the SSA uses in bind value
expressions, so as long as the update function can efficiently pick the
SSA uses from the op array, it could be a significant win.  

We may need some way to reset one specific bind given a use that is no
longer valid, which I don't immediately see how to implement efficiently
in a multi-debug pack.

Now, I spent some time trying to think of how to pack multiple debug
stmts in a way that made them also save memory.

For each packed stmt, we need at least one bit to indicate whether it's
a bind or just a marker.  Markers then need a locus, and another bit
indicating whether it's a begin stmt marker or an inline entry point
marker.  Debug bind stmts need one bit to tell src binds from
regular ones, and two trees (no locus).
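
As a rough illustration of the bit accounting above, the sketch below
folds the two bits each packed stmt needs (bind vs. marker, plus the
sub-kind) into a 2-bit descriptor and stores sixteen of them in one
32-bit word.  All names here (packed_kind, pk_set, pk_get, and so on)
are hypothetical and made up for this example; none of them are GCC
identifiers, and the layout is only one of several that would fit the
description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit descriptor for one packed debug stmt.
   Bit 1 tells binds from markers; bit 0 selects the sub-kind.
   These names are illustrative only, not GCC identifiers.  */
enum packed_kind {
  PK_BEGIN_STMT_MARKER = 0,   /* marker: begin stmt */
  PK_INLINE_ENTRY_MARKER = 1, /* marker: inline entry point */
  PK_BIND = 2,                /* regular debug bind */
  PK_SOURCE_BIND = 3          /* debug source bind */
};

#define PK_BITS 2u
#define PK_PER_WORD (32u / PK_BITS) /* 16 descriptors per 32-bit word */

/* Return WORD with the descriptor at index IDX replaced by K.  */
static uint32_t pk_set(uint32_t word, unsigned idx, enum packed_kind k)
{
  assert(idx < PK_PER_WORD);
  unsigned shift = idx * PK_BITS;
  return (word & ~(UINT32_C(3) << shift)) | ((uint32_t)k << shift);
}

/* Extract the descriptor at index IDX from WORD.  */
static enum packed_kind pk_get(uint32_t word, unsigned idx)
{
  assert(idx < PK_PER_WORD);
  return (enum packed_kind)((word >> (idx * PK_BITS)) & 3u);
}

/* Nonzero if K describes a bind rather than a marker.  */
static int pk_is_bind(enum packed_kind k)
{
  return (k & 2) != 0;
}

/* Number of tree operands the stmt needs: two for a bind,
   none for a marker (which needs a locus instead).  */
static unsigned pk_tree_ops(enum packed_kind k)
{
  return pk_is_bind(k) ? 2u : 0u;
}
```

With this layout an all-zero word describes sixteen begin stmt markers,
and a walker can compute how many tree operands to skip for each entry
from the descriptor alone.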

It is unlikely that it would make sense to allocate extra memory, be it
trees holding integral values, be it other arrays to hold them.  I'm
thinking we'd be better off storing some of these bits in an analogue
of the trailing op VLA, that would be present in gdebug but that would
deal with GGC and ssa updates in its own way.

For packs with few stmts, we could use bits from the subcode to indicate
the count and the kinds.  We could use the gimple locus for the first
marker, and then perhaps pack pairs of loci in tree pointer operands (if
their sizes are 1:2, as in lp64).
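
To make the 1:2 case concrete, here is a minimal sketch of storing two
loci in one pointer-sized operand slot.  It assumes a 32-bit locus and
a 64-bit slot, as on lp64 hosts; locus_t, pack_loci and the accessors
are hypothetical names for this example, not GCC's.  Such a slot holds
no valid tree pointer, so whatever walks the operands (GGC, SSA
updates) would have to know to skip it.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 32-bit source location; an assumption of this
   sketch, not GCC's own typedef.  */
typedef uint32_t locus_t;

/* Nonzero when a pointer-sized slot is exactly twice the locus
   width, i.e. the 1:2 ratio that makes the packing worthwhile.  */
static int loci_pair_fits_pointer(void)
{
  return sizeof(void *) == 2 * sizeof(locus_t);
}

/* Pack two loci into one 64-bit slot: FIRST goes in the low half,
   SECOND in the high half.  */
static uint64_t pack_loci(locus_t first, locus_t second)
{
  return ((uint64_t)second << 32) | (uint64_t)first;
}

/* Recover the locus stored in the low half of SLOT.  */
static locus_t first_locus(uint64_t slot)
{
  return (locus_t)(slot & UINT64_C(0xffffffff));
}

/* Recover the locus stored in the high half of SLOT.  */
static locus_t second_locus(uint64_t slot)
{
  return (locus_t)(slot >> 32);
}
```

On hosts where loci_pair_fits_pointer() is false, a pair would not fit
in one operand slot and each locus would need a slot of its own.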

When packing more than a few stmts, we could then define a format for a
32-bit word to hold the bits for an additional set of stmts, possibly
packed in the same word as a locus or another such bit pack.  Ideally,
should we need more than one of these, we should indicate upfront how
many of these there are, or at least how many ops are used.

I was thinking it would be ideal if combining two many-stmts debug stmts
could require little more than allocating a gimple with a larger ops
array and copying (most of) the original op arrays to the right places.

But...  this all feels far too hackish and not very maintainable or
forward-looking.  E.g., if we add more kinds of debug stmts, the bit
counting suddenly no longer applies, and needs to be reworked.

So I guess that's also doable, and would save some memory indeed,
but...  do you think it's worth it?


>>> Btw, just asking as I helped to get the GIMPLE FE in, did you
>>> consider adding GIMPLE FE support for the various debug stmts
>>> we then have?  First thing would be arriving at a syntax I guess.
>>> __DEBUG x = ...; for binds, __STMT; __INLINE; for the other two?
>>> Not sure how to express they encode some location though...
>>> (binds have no location, right?)
>> 
>> I confess I hadn't considered any of that.  I'll give it some thought.
>> Binds don't refer to a (program) location, yeah, whereas that's all the
>> new markers do.  Inline markers actually reference the lexical block
>> containing the entire inlined copy of the function, and there's code
>> that depends on it, so we'd have to figure out some way to express that
>> sort of thing, in case the GIMPLE FE can't.  Once we have that, the
>> syntax for debug stmts is just a breeze.

> Ah, lexical blocks.  Yeah, we'd have to add a syntactic way to
> name them and refer to them.  At least BINDS look easily doable ;)
> For blocks there's the additional issue that the GIMPLE FE doesn't
> really have them as GIMPLE doesn't really care unless you start
> introducing locations and debug info ;)  So with the GIMPLE FE
> there's just the function's outermost scope/BLOCK.  It shouldn't be
> too hard to add though if we think it's useful.

I've given this some more thought too.  I'm a bit confused as to the
role of debug info in this FE.  If it is to be regarded as a source
language, then you'll want at least begin stmt and inline entry markers
to be introduced in the normal way, namely, by the front end, while
parsing statements, and by the inliner, when performing the inlining.
This would address the problem of representation of lexical blocks, too.

It would not, however, deal with the more traditional debug stmts, the
bind ones.  Those are introduced during gimplification, when going into
SSA.  Presumably you don't go through that with GIMPLE FE input, and at
most run some SSA verification and update pass.  If that is so, then it
makes a lot of sense to have explicit debug bind stmts, even as a means
of associating SSA names with variables (how do you do that otherwise?)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
