Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2020-03-25 Thread Jeff Law via Gcc-patches
On Wed, 2020-01-29 at 18:32 +0300, Alexander Monakov wrote:
> On Tue, 28 Jan 2020, Jeff Law wrote:
> 
> > On Tue, 2019-10-08 at 18:04 +0300, Alexander Monakov wrote:
> > > On Thu, 3 Oct 2019, Jeff Law wrote:
> > > 
> > > > You may want to review the 2018 discussion:
> > > > 
> > > > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html
> > > > 
> > > > THe 2018 discussion was primarily concerned with fixing the CFG
> > > > inaccuracies which would in turn fix 61118.  But would not fix 21161.
> > > 
> > > Well, PR 61118 was incorrectly triaged to begin with.
> > > 
> > > Let me show starting from a simplified testcase:
> > > 
> > > void bar(jmp_buf);
> > > void foo(int n)
> > > {
> > > jmp_buf jb;
> > > for (; n; n--)
> > > if (setjmp(jb))
> > > bar(jb);
> > > }
> > > 
> > > Here we warn that "parameter 'n' might be clobbered..." despite that
> > > "obviously" there is no way 'n' might be modified between setjmp and
> > > bar... right?
> > Not necessarily. You have to look at what objects are live.  N is
> > clearly going to be live throughout the entire loop and thus it will be
> > subject to setjmp/longjmp warnings.
> 
> That N is live is not enough: we want to warn if N is also modified
> before an abnormal return (otherwise we'd attempt to warn for live
> readonly variables). But if you're agreeing with me that warning here
> is *not a false positive*, then I'm happy.
We have traditionally only bothered to analyze variables that are live at the
setjmp point.   Objects which are not live at the setjmp point need not be
analyzed.  That's all I was saying.


> > 
> > > Now suppose 'bar' looks like
> > [ ... ]
> > It doesn't really matter what bar looks like.
> 
> Well it did for the purpose of demonstration, but if you're in agreement
> anyway then we can say it's moot.
In general we're not going to necessarily have visibility into calls where the
longjmp may occur.   Any analysis we do has to not depend on looking inside the
called routine.


> 
> > > Still, I can imagine one can argue that the warning in the PR is bogus, as
> > > the loop assigns invariant values to problematic variables:
> > I'm not sure which PR you're referring to (21161, 65041 or some other?)
> 
> I'm arguing that PR 61118 was incorrectly triaged (first sentence of my 
> email).
Who's triage are you referring to? :-)

It's been a couple years since I dug into 61118, but my recollection was that 
the
core problem was the multiple assignments to the pseudo registers holding
cancel_arg and cancel_routine.  If there was a single assignment to those 
pseudos
then the existing RTL Wclobbered analysis would not issue a warning.

While the assignments use the same source, the mere existence of multiple
assignments causes real headaches for the existing -Wclobbered analysis.  The
multiple assignments occur due to the actions of CCP1 and FRE1 propagating into 
a
PHI node, which they arguably shouldn't do for an abnormal PHI.

> 
> > > void foo(struct ctx *arg)
> > > {
> > > while (arg->cond) {
> > > void *var1 = >field;
> > > void (*var2)(void*) = globalfn;
> > > jmp_buf jb;
> > > if (!setjmp(jb))
> > > var2(var1);
> > > }
> > > etc(arg);
> > > }
> > > 
> > > and since "obviously" >field is invariant, there's no way setjmp will
> > > overwrite 'var1' with a different value, right?.. Except:
> > > 
> > > If the while loop is not entered, a longjmp from inside 'etc' would 
> > > restore
> > > default (uninitialized) values; note the compiler doesn't pay attention to
> > > the
> > > lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return
> > > abnormally (this is correct in case of other returns_twice functions such
> > > as
> > > vfork).
> > Whether or not we should warn for restoring an uninitialized state I
> > think is somethign we haven't really considered.  Depending on how many
> > assignments to the pseudo there are and the exact control flow we may
> > or may not warn.
> 
> That restored values are uninitialized is not important; the problem remains
> if var1 is declared before the 'while' and initialized to 0.
Sorry, I don't follow your statement. From the fragment above var1 is assigned
only once and inside the loop.  I don't see where it would be initialized to
zero.

> 
> > > Now regarding claims that we emit "incorrect" CFG that is causing extra 
> > > phi
> > > nodes, and that "fixing CFG" would magically eliminate those and help with
> > > false positives.
> > > 
> > > While in principle it is perhaps nicer to have IR that more closely models
> > > the abnormal flow, I don't see how it would reduce phi nodes; we'd go from
> > It certainly does move PHI nodes and it can reduce the number of false
> > positives.  It's not as effective as it could be because of lameness on
> > the RTL side (again see PR21161).
> 
> There's some mixup here because 

Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2020-01-29 Thread Alexander Monakov
On Tue, 28 Jan 2020, Jeff Law wrote:

> On Tue, 2019-10-08 at 18:04 +0300, Alexander Monakov wrote:
> > On Thu, 3 Oct 2019, Jeff Law wrote:
> > 
> > > You may want to review the 2018 discussion:
> > > 
> > > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html
> > > 
> > > THe 2018 discussion was primarily concerned with fixing the CFG
> > > inaccuracies which would in turn fix 61118.  But would not fix 21161.
> > 
> > Well, PR 61118 was incorrectly triaged to begin with.
> > 
> > Let me show starting from a simplified testcase:
> > 
> > void bar(jmp_buf);
> > void foo(int n)
> > {
> > jmp_buf jb;
> > for (; n; n--)
> > if (setjmp(jb))
> > bar(jb);
> > }
> > 
> > Here we warn that "parameter 'n' might be clobbered..." despite that
> > "obviously" there is no way 'n' might be modified between setjmp and
> > bar... right?
> Not necessarily. You have to look at what objects are live.  N is
> clearly going to be live throughout the entire loop and thus it will be
> subject to setjmp/longjmp warnings.

That N is live is not enough: we want to warn if N is also modified
before an abnormal return (otherwise we'd attempt to warn for live
readonly variables). But if you're agreeing with me that warning here
is *not a false positive*, then I'm happy.

> The inaccuracies in our CFG at the setjmp call cause objects which are
> not actually live across the call to appear to be live.  See PR 21161.

For the example shown in comment #14, agreed.

> > Now suppose 'bar' looks like
> [ ... ]
> It doesn't really matter what bar looks like.

Well it did for the purpose of demonstration, but if you're in agreement
anyway then we can say it's moot.

> > Still, I can imagine one can argue that the warning in the PR is bogus, as
> > the loop assigns invariant values to problematic variables:
> I'm not sure which PR you're referring to (21161, 65041 or some other?)

I'm arguing that PR 61118 was incorrectly triaged (first sentence of my email).

> > void foo(struct ctx *arg)
> > {
> > while (arg->cond) {
> > void *var1 = >field;
> > void (*var2)(void*) = globalfn;
> > jmp_buf jb;
> > if (!setjmp(jb))
> > var2(var1);
> > }
> > etc(arg);
> > }
> > 
> > and since "obviously" >field is invariant, there's no way setjmp will
> > overwrite 'var1' with a different value, right?.. Except:
> > 
> > If the while loop is not entered, a longjmp from inside 'etc' would restore
> > default (uninitialized) values; note the compiler doesn't pay attention to 
> > the
> > lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return
> > abnormally (this is correct in case of other returns_twice functions such as
> > vfork).
> Whether or not we should warn for restoring an uninitialized state I
> think is somethign we haven't really considered.  Depending on how many
> assignments to the pseudo there are and the exact control flow we may
> or may not warn.

That restored values are uninitialized is not important; the problem remains
if var1 is declared before the 'while' and initialized to 0.

> > Now regarding claims that we emit "incorrect" CFG that is causing extra phi
> > nodes, and that "fixing CFG" would magically eliminate those and help with
> > false positives.
> > 
> > While in principle it is perhaps nicer to have IR that more closely models
> > the abnormal flow, I don't see how it would reduce phi nodes; we'd go from
> It certainly does move PHI nodes and it can reduce the number of false
> positives.  It's not as effective as it could be because of lameness on
> the RTL side (again see PR21161).

There's some mixup here because overall this was discussing the complete
reimplementation of the warning on GIMPLE, so RTL should not be relevant.

Currently there's no concrete proposal on the table for making GIMPLE CFG
somehow more accurate for setjmps. I can see that RTL CFG could be more
accurate indeed.

The best proposal for GIMPLE so far is moving the outgoing edge from the
abnormal dispatcher to just after the setjmp:

BB1
  r1 = setjmp(env);
 BB2
  |  r2 = ABNORMAL_DISPATCHER()
  |  +---
  V  V
BB3
  r3 = phi(r1, r2)
  if (r3) goto BB4 else goto BB5

where for setjmp in particular the BB1->BB2 edge is not necessary, but
should be there for unknown returns_twice functions.

But I can't see how this proposal substantially aids analysis on GIMPLE.
In particular, we cannot easily make some kind of threading and separate
BB1-BB3-BB5 and BB2-BB3-BB4 paths because then we wouldn't be able to
easily lower it to RTL.

Alexander


Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2020-01-28 Thread Jeff Law
On Thu, 2019-10-17 at 16:14 +0300, Alexander Monakov wrote:
> On Tue, 8 Oct 2019, Alexander Monakov wrote:
> 
> [massive snip]
> 
> > So in my opinion our CFG is good enough, the real issues with -Wclobbered 
> > false
> > positives are not due to phi nodes but other effects.
> > 
> > If you agree: what would be the next steps?
> 
> Hello,
> 
> may I ping this discussion?  I apologize for letting my cranky mood that day 
> leak
> too badly into the message.
No worries at all.  Just too much going on.  Happy to try and kick
things moving.  Particularly in the context of 21161 and 61118 which
are regressions.

jeff




Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2020-01-28 Thread Jeff Law
On Tue, 2019-10-08 at 18:04 +0300, Alexander Monakov wrote:
> On Thu, 3 Oct 2019, Jeff Law wrote:
> 
> > You may want to review the 2018 discussion:
> > 
> > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html
> > 
> > THe 2018 discussion was primarily concerned with fixing the CFG
> > inaccuracies which would in turn fix 61118.  But would not fix 21161.
> 
> Well, PR 61118 was incorrectly triaged to begin with.
> 
> Let me show starting from a simplified testcase:
> 
> void bar(jmp_buf);
> void foo(int n)
> {
> jmp_buf jb;
> for (; n; n--)
> if (setjmp(jb))
> bar(jb);
> }
> 
> Here we warn that "parameter 'n' might be clobbered..." despite that
> "obviously" there is no way 'n' might be modified between setjmp and
> bar... right?
Not necessarily. You have to look at what objects are live.  N is
clearly going to be live throughout the entire loop and thus it will be
subject to setjmp/longjmp warnings.

The inaccuracies in our CFG at the setjmp call cause objects which are
not actually live across the call to appear to be live.  See PR 21161.
> 
> Now suppose 'bar' looks like
[ ... ]
It doesn't really matter what bar looks like.

> 
> 
> Still, I can imagine one can argue that the warning in the PR is bogus, as
> the loop assigns invariant values to problematic variables:
I'm not sure which PR you're referring to (21161, 65041 or some other?)


> 
> void foo(struct ctx *arg)
> {
> while (arg->cond) {
> void *var1 = >field;
> void (*var2)(void*) = globalfn;
> jmp_buf jb;
> if (!setjmp(jb))
> var2(var1);
> }
> etc(arg);
> }
> 
> and since "obviously" >field is invariant, there's no way setjmp will
> overwrite 'var1' with a different value, right?.. Except:
> 
> If the while loop is not entered, a longjmp from inside 'etc' would restore
> default (uninitialized) values; note the compiler doesn't pay attention to the
> lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return
> abnormally (this is correct in case of other returns_twice functions such as
> vfork).
Whether or not we should warn for restoring an uninitialized state I
think is somethign we haven't really considered.  Depending on how many
assignments to the pseudo there are and the exact control flow we may
or may not warn.

> 
> 
> So the bottom line here that setjmps in loops are extra tricky, warnings for
> them that superficially appear to be false positives may indicate a real 
> issue,
> and we can do a better job documenting that and recommending safer patterns.
I don't think setjmps in loops add any real additional complexity.


> 
> 
> Now regarding claims that we emit "incorrect" CFG that is causing extra phi
> nodes, and that "fixing CFG" would magically eliminate those and help with
> false positives.
> 
> While in principle it is perhaps nicer to have IR that more closely models
> the abnormal flow, I don't see how it would reduce phi nodes; we'd go from
It certainly does move PHI nodes and it can reduce the number of false
positives.  It's not as effective as it could be because of lameness on
the RTL side (again see PR21161).

Jeff



Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2019-10-17 Thread Alexander Monakov
On Tue, 8 Oct 2019, Alexander Monakov wrote:

[massive snip]

> So in my opinion our CFG is good enough, the real issues with -Wclobbered 
> false
> positives are not due to phi nodes but other effects.
> 
> If you agree: what would be the next steps?

Hello,

may I ping this discussion?  I apologize for letting my cranky mood that day 
leak
too badly into the message.

Alexander


Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2019-10-08 Thread Alexander Monakov
On Thu, 3 Oct 2019, Jeff Law wrote:

> You may want to review the 2018 discussion:
> 
> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html
> 
> THe 2018 discussion was primarily concerned with fixing the CFG
> inaccuracies which would in turn fix 61118.  But would not fix 21161.

Well, PR 61118 was incorrectly triaged to begin with.

Let me show starting from a simplified testcase:

void bar(jmp_buf);
void foo(int n)
{
jmp_buf jb;
for (; n; n--)
if (setjmp(jb))
bar(jb);
}

Here we warn that "parameter 'n' might be clobbered..." despite that
"obviously" there is no way 'n' might be modified between setjmp and
bar... right?

Now suppose 'bar' looks like

void bar(jmp_buf jb)
{
static int cnt;
static jmp_buf stash;
if ((cnt ^= -1))
memcpy(stash, jb, sizeof stash);
else
longjmp(stash, 1);
}

So every other iteration of the loop, 'bar' will longjmp to the state
stashed on the previous iteration. Since 'n' is modified between iterations,
this is exactly the situation the warning aims to catch: if 'n' is restored
from jmp_buf, the last write to 'n' is lost.

Still, I can imagine one can argue that the warning in the PR is bogus, as
the loop assigns invariant values to problematic variables:

void foo(struct ctx *arg)
{
while (arg->cond) {
void *var1 = >field;
void (*var2)(void*) = globalfn;
jmp_buf jb;
if (!setjmp(jb))
var2(var1);
}
etc(arg);
}

and since "obviously" >field is invariant, there's no way setjmp will
overwrite 'var1' with a different value, right?.. Except:

If the while loop is not entered, a longjmp from inside 'etc' would restore
default (uninitialized) values; note the compiler doesn't pay attention to the
lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return
abnormally (this is correct in case of other returns_twice functions such as
vfork).

(granted, if the loop is not entered, setjmp is not called either and cannot
possibly return normally, let alone abnormally, but our IR does not model that).

So the bottom line here that setjmps in loops are extra tricky, warnings for
them that superficially appear to be false positives may indicate a real issue,
and we can do a better job documenting that and recommending safer patterns.


Now regarding claims that we emit "incorrect" CFG that is causing extra phi
nodes, and that "fixing CFG" would magically eliminate those and help with
false positives.

While in principle it is perhaps nicer to have IR that more closely models
the abnormal flow, I don't see how it would reduce phi nodes; we'd go from

BB2:   BB99:
|  ABNORMAL_DISPATCHER()
v  v--/
BB3:
x_1 = phi(x_2(BB2), x_3(BB99)
ret_1 = setjmp()

to
   BB2:   BB99:
   |  ABNORMAL_DISPATCHER()
   v  |
   BB3:   |
   ret_1 = setjmp()   |
   |\ v
   | >BB4:
   |  ret_2 = __builtin_setjmp_receiver()
   v v---/
   BB4:
   x_1 = phi(x_2(BB2), x_3(BB4)
   ret_3 = phi(ret_1(BB2), ret_2(BB4)

so the phi node for 'x' simply goes to the new join BB, which also needs a 
new phi for setjmp's return value 'ret'.

Plus, we'd need to somehow ensure that BB4 stays empty apart from the receiver,
and we have no infrastructure to ensure that.

So as far as I can tell it's not readily implementable and in fact brings no
benefit for setjmp and vfork. I can see a theoretical benefit for other
functions that have 'returns_twice' attribute and receive arguments: this
would eliminate bogus phi nodes for arguments. But for vfork and setjmp this
doesn't buy anything.

You even said yourself that phi nodes would still be there, just in a different
block:

> You might instead be able to look at the PHI nodes of the successor block 


So in my opinion our CFG is good enough, the real issues with -Wclobbered false
positives are not due to phi nodes but other effects.

If you agree: what would be the next steps?

Alexander


Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2019-10-03 Thread Jeff Law
On 10/3/19 6:01 AM, Vladislav Ivanishin wrote:
> What it does
> 
> 
> A fundamental limitation of the new approach is that it requires phi
> nodes for variables to perform the analysis it needs to issue the
> warning for them.  No phis - no warning.  In particular, it doesn't deal
> with register asm variables (see gcc.target/i386/attr-returns_twice-1.c
> which I xfailed) because of how they are currently implemented in GCC.
> Also, in theory there can be struct members that SRA doesn't scalarize,
> but some other compiler might.
> 
> Another "downgrade" from the old implementation is that currently, the
> new pass is placed in such a position in the pipeline that it requires
> optimize > 0 in order to be run.  But there's nothing fundamental about
> it; that is something I just haven't experimented with.
> 
> I placed the new pass in tree-ssa-uninit.c, because I used static
> functions defined there.  The pass ended up using only struct pred_info
> and is_pred_expr_subset_of().
I think we document that these warnings are only possible when the
optimizer is enabled.  So I think the restriction that we have SSA form
is reasonable.


> 
> The main PR tracking the problems of the old implementation is 21161.
> I also went through the "See also" list of PRs for it.  This new
> implementation addresses them all, and the only failing test I saw for
> x86_64-pc-linux-gnu is the already mentioned register asm test.
Not exactly.  There's multiple distinct, but slightly different issues
in play here.  But I would agree that an implementation of the warning
in gimple would be a significant improvement from a design standpoint.


> 
> The existing RTL implementation doesn't check return values of
> setjmp/vfork and thus does not distinguish the paths only realizable
> upon a non-abnormal [*] return (this is what Wclobbered-pr21161-2.c is
> about).  The new implementation does deal with it.
Correct.  I've actually got a patch here which does analysis of return
values and their affects on the CFG for RTL for 21161.  But it's
painful, amazingly painful, because of all the target dependencies you
have to make sure to handle.

You may want to review the 2018 discussion:

https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html

THe 2018 discussion was primarily concerned with fixing the CFG
inaccuracies which would in turn fix 61118.  But would not fix 21161.

Moving the warning into gimple allows us to robustify the
implementation.  So from design standpoint I absolutely like getting the
warning out of RTL and into gimple.

Where I have concerns is your implementation leaves the CFG as-is
(inaccurate and pessimizing).  In fact, I think your implementation
relies on the inaccuracy which results in undesirable PHI nodes at the
start of the setjmp block (because the long jump incorrectly is modeled
as returning to the point just before the setjmp which creates a loop
and in turn the PHI nodes you're relying on).

I would rather see us attack this as two distinct problems.  First would
be fixing the CFG.  I think the approach Matz sketched out should
address the CFG issues and fix 61118 as a side effect.

https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185639.html

The key piece of his approach in my mind is moving to a setjmp receiver
for each setjmp call and representing the assignment to the setjmp
return value in the receiver.

Then we'd want to move the warning out of RTL and into gimple.  I fear
that we may have to rethink some of your current approach because if the
CFG is fixed, you won't have the PHI nodes at the start of the block
with the setjmp call.   You might instead be able to look at the PHI
nodes of the successor block or perhaps wire it into the out-of-ssa code
which does life analysis and analyze the objects that are live across
the call point.

Jeff