On Mon, May 13, 2019 at 3:38 PM Marc Glisse <marc.gli...@inria.fr> wrote:
>
> On Mon, 13 May 2019, Richard Biener wrote:
>
> > On Sun, May 12, 2019 at 2:51 PM Marc Glisse <marc.gli...@inria.fr> wrote:
> >>
> >> On Sun, 12 May 2019, Richard Sandiford wrote:
> >>
> >>> Marc Glisse <marc.gli...@inria.fr> writes:
> >>>> Hello,
> >>>>
> >>>> this patch lets gcc know that if a pointer existed before the call to
> >>>> malloc, the result of malloc cannot alias it. This is a bit ad hoc and
> >>>> fragile. A small improvement would be, when the 2 statements are in the
> >>>> same bb but in the wrong order, to check if there is any statement in
> >>>> between that might prevent from reordering them. But that's more
> >>>> complicated, and the patch as it is already does help.
> >>>>
> >>>> I expect people may not like the call to a function from
> >>>> tree-ssa-loop-niter too much, but it is convenient. And if someone
> >>>> improves it, they will likely have to rewrite something not quite
> >>>> equivalent to stmt_dominates_stmt_p.
> >>>
> >>> It has linear complexity for statements in the same block though.
> >>
> >> Ah, true. I should anyway test that the second statement is malloc
> >> (cheaper) before checking for domination, but even then that could be used
> >> to create a quadratic behavior :-(
> >
> > I also think the oracle queries shouldn't encompany such expensive pieces...
>
> Well, it is only expensive because finding the order of statements in a
> basic block is expensive. Would it be better if it only did this check if
> we are in one of a very limited set of passes where statements have
> reliable UIDs? Maybe that's not enough passes, loop-distribution uses UIDs
> but I didn't check if it uses them in a way compatible with this use...
>
> What if I only check basic-block domination and punt when the statements
> are in the same basic block? That seems cheap enough, and would already
> work for the vector testcase.
>
> >>> (reassoc_stmt_dominates_stmt_p avoids that, but relies on uids
> >>> being up-to-date.)
> >>
> >> This function is called from all over the place. Unless there is a simple
> >> test to check if uids are safe to use (reassoc_in_progress?), that's going
> >> to be a problem. I find it surprising that this information is so hard to
> >> get to... Stopping stmt_dominates_stmt_p after walking 128 statements
> >> doesn't feel like a great solution. But without controlling the pass where
> >> this happens, I don't see any good solution. It would be great if that
> >> non-aliasing property could be recorded in the points-to information, then
> >> I could set it from a pass I control, but I somehow got the impression
> >> that it wouldn't work. Maybe I should try to understand PTA better to make
> >> sure.
> >
> > In princple PTA should know the aliasing cannot happen but possibly
> > the info is either lost or the IL is too obfuscated at the point it gets
> > computed.  (hello C++...)
>
> We don't need much obfuscation for this, a simple C program
>
> int g;
> int*f(int**p){
>    int*q=*p;
>    int*r=__builtin_malloc(4);
>    *q=1;
>    *r=2;
>    g=*q;
>    return r;
> }
>
> gives
>
>    q_4 = *p_3(D);
>    r_6 = __builtin_malloc (4);
>    *q_4 = 1;
>    *r_6 = 2;
>    _1 = *q_4;
>    g = _1;
>    return r_6;
>
> where we clearly don't manage to prove that q and r don't alias.

We can probably improve this one in general from inside PTA
by treating escapes through return specially.  I wasn't looking
too closely at the C++ testcase but I simply assumed the
addition to ESCAPED happens through storing the malloc
result to memory that PTA cannot prove local.

> > So at ldist time I see
> >
> >  # PT = null { D.47989 } (escaped, escaped heap)
> >  # ALIGN = 8, MISALIGN = 0
> >  # USE = nonlocal null { D.47989 } (escaped, escaped heap)
> >  # CLB = nonlocal null { D.47989 } (escaped, escaped heap)
> >  _27 = malloc (8192);
> >
> > good.  The interesting loop is the following where the allocation PTA
> > is good and _4 intersect because they include ESCAPED.  This is
> > because _27 itself escapes and PTA not being flow-sensitive.  I don't
> > see an easy way to improve things here but dislike the stmt walking
> > very much.  That is, the proper solution is to re-write PTA in some way.
>
> I am trying to imagine what that might look like. I imagine that instead
> of having a single "escaped" set, we would have multiple escaped sets at
> different points in the function (at most one per VDEF?). Then _4 would
> only contain escaped3 while heap5 (*_27) only appears in escaped9 and
> later? It may be tricky to keep a linear-ish complexity with anything more
> powerful than the current PTA. I don't know what LLVM is doing, but they
> seem to manage.

IIRC LLVM uses Steensgard analysis which is flow-sensitive but generally
less powerful.  We are using Andersen constraint-based analysis because
Steensgard was patented back in time (that patent has now expired).

One approach to make PTA "more" flow-sensitive is to partition the function
into regions (basically represent the function as a series of function calls
to its parts).

For the issue above there's the long-standing issue of splitting escapes
from function return from escapes to functions called to properly represent
local variable lifetime.

That said, your simpler patch still relies on up-to-date dominators, sth
not guaranteed or required previously.

We might consider implementing a separate (region-based) analysis
adjusting/pruning points-to information with flow-sensitive knowlege.
This might even work when done from inside PTA as post-processing.

Richard.

>
> --
> Marc Glisse

Reply via email to