longjmp

Richard Biener Tue, 06 Mar 2018 06:43:45 -0800

On Tue, Mar 6, 2018 at 3:17 PM, Michael Matz <m...@suse.de> wrote:
> Hi,
>
> On Mon, 5 Mar 2018, Jeff Law wrote:
>
>> >>> Actually, without further conditions I don't see how it would be safe
>> >>> for the successor to have multiple preds.  We might have this
>> >>> situation:
>> >>>
>> >>> bb1: ret = setjmp
>> >>> bb2: x0 = phi <x1 (bb1), foo(bbX)>
>> >> No.  Can't happen -- we're still building the initial CFG.  There are no
>> >> PHI nodes, there are no SSA_NAMEs.
>> >
>> > While that is currently true I think it's short-sighted.  Thinking about
>> > the situation in terms of SSA names and PHI nodes clears up the mind.  In
>> > addition there is already code which builds (sub-)cfgs when SSA form
>> > exists (gimple_find_sub_bbs).  Currently that code can't ever generate
>> > setjmp calls, so it's not an issue.
>>
>> It's not clearing up anything for me.  Clearly you're onto something
>> that I'm missing, but still trying to figure out.
>
> I'm saying that if there is SSA form then having the second-return edge to
> the successor block if that has multiple predecessors, then this would be
> incorrect.  Do you agree?  I'm also saying that if it's incorrect when in
> SSA form, then it can't be correct (perhaps only very subtly so) without
> SSA form either; I guess that's where we don't agree.
>
> I'm also saying that we currently don't have SSA form while dealing with
> setjmp by more luck and giving up than thought: inlining and va-arg
> expansion do construct sub-cfgs while in SSA.  inlining setjmp calls is
> simply disabled, va-arg expansion doesn't emit setjmp calls into the
> sequence.  But there is nothing inherently magic about SSA form and
> setjmp, so if we're going to fiddle with the CFG form created by setjmp to
> make it more precise, then I think we ought to do it in a way that is
> correct in all intermediate forms, especially at this devel stage, because
> that'd give confidence that it's also correct in the constrained way we're
> needing it.
>
>> Certainly we have to be careful WRT the implicit set of the return value
>> of the setjmp call that occurs on the longjmp path.  That's worth
>> investigating.  I suspect that works today more by accident of having an
>> incorrect CFG than by design.
>
> No, our current imprecise CFG makes it so that the setting of the return
> value is reached by the abnormal edge from the dispatcher, so in a
> data-flow sense it's executed on the first- and second-return case and all
> is well, and so the IL subsumes all the effects that happen in reality
> (and more, which is the reason for the missed optimizations).  You want to
> change the CFG such that it doesn't subsume effects that don't happen in
> reality, and that makes sense, but by that you're making it not reflect
> actions which _do_ happen.
>
>> But it does or at least it should.  It's implicitly set on the longjmp
>> side.  If we get this wrong I'd expect we'll see uninit uses in the PHI.
>>  That's easy enough to instrument and check for.
>
> Not only uninit uses but conceivably misoptimizations as well.  Let's
> contrieve an example which uses gotos to demonstrate the abnormal edges
> and let's have the second-return edge to land after the setjmp call (and
> setting of return value):
>
>   ret = setjmp(buf);
> second_return:
>   if (ret == 0) {             // first return
> inside:                       // see below
>     call foo(buf)             // might call longjmp
>     maybe-goto dispatch       // therefore we have an edge to dispatch
>   }
>   return;
> dispatch:                     // our abnormal dispatcher
>   goto second_return;         // go to second-return target
>
> Now, as far as data-flow is concerned, as soon as the ret==0 block is
> entered once, ret will never change from 0 again (there simply is no
> visible set which could change it).  So an optimizer would be correct in
> threading the sequence maybe-goto-dispatch->second_return to inside (bear
> with me):
>
>   ret = setjmp();
>   if (ret == 0) {             // first return
> inside:                       // see below
>     call foo()                // might call longjmp
>     maybe-goto inside         // if it does we loop, ret must still be zero ?!
>   }
>   return;
>
> This currently would not happen: first we don't thread through abnormal
> edges, and propagation of knowledge over abnormal edges is artificially
> limited as well.  But from a pure data-flow perspective the above
> transformation would be correct, meaning that initial IL can't have been
> correct.  You might say in practice even if the above would happen it
> doesn't matter, because at runtime, with a real longjmp, the "maybe-goto
> inside" would in reality land at the setjmp return again, hence setting
> ret, then the test comes along and all is well.  That would probably be
> true currently, but assume foo always calls longjmp, and GCC can figure
> out it's using the same buffer as the above setjmp, then a further
> optimization might do away with the longjmp and just jump directly to the
> inside label: misoptimization achieved.
>
> None of the above events can happen when the second-return target is
> imprecise and before the setjmp.
>
> Now that all might sound like theoretical games, but I think it highlights
> why we should think long and hard about setjmp CFGs and IL before we relax
> it.
>
> FWIW: I think none of the issues I'm thinking about matter when you check
> two things: 1) that the setjmp call has no LHS, and 2) that the target BB
> has a single predecessor.  I was only triggered by the discussion between
> you and Richi of why (2) might not be important.  But you certainly miss
> checking (1) as well.  IIRC the testcase you were trying to optimize had
> no LHS, right?
>
> It'd be nice if we try to make the setjmp IL correct For Real (tm) in the
> next devel phase:
> 1) make setjmp have two outgoing edges, always
> 2) create a setjmp_receive internal function that has a return value
> For 'ret = setjmp(buf)', create this CFG:
>
>   bb1
>   ret = setjmp(buf)
>    |       \              bb-recv
>    |        ----------------\
>    |                ret = setjmp_receiver
>    |                        /
>   normal   /---------------/
>   path    /
>    |     /
>   bb-succ
>
> None of these edges would be abnormal.  bb-recv would be the target for
> edges from all calls that might call longjmp(buf).  Those edges might need
> to be abnormal.  As the above CFG reflects all runtime effects precisely,
> but not which instructions are used to achieve them the expansion to RTL
> will be special.


Why do you still have the edge from setjmp to the setjmp receiver?
In your scheme ret is set twice on the longjmp return path, no?  That
is, you have the same issue left as we have with EH returns from a
stmt with a LHS.

We currently have two outgoing edges from setjmp, one which feeds back
to right before the setjmp call via the abnormal dispatcher (so it looks like
a loop).  Jeffs change will make it two outgoing edges to the same
single successor,
one dispatched through the abnormal dispatcher (that also nicely gets
around the limitation of only having a single edge between two blocks...)

Richard.

>
> Ciao,
> Michael.

Re: [RFA][PATCH][PR middle-end/61118] Improve tree CFG accuracy for setjmp/longjmp

Reply via email to