[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-21 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #32 from Jakub Jelinek  ---
For debug stmts when DCE isn't involved, we already seem to do the right thing,
consider -O2 -g:
__attribute__((noinline)) static void
foo (int a)
{
  int b = 2 * a;
  int c = 3 * a;
  a = a + 4;
  asm volatile ("" : : : "memory");
}

void
bar (void)
{
  foo (1);
  foo (7);
  foo (16);
}

The clone materializes into:
   [local count: 1073741824]:
  # DEBUG D#5 s=> a
  # DEBUG a => D#5

   [local count: 1073741824]:
  # DEBUG D#1 s=> a
  # DEBUG BEGIN_STMT
  # DEBUG b => D#1 * 2
  # DEBUG BEGIN_STMT
  # DEBUG c => D#1 * 3
  # DEBUG BEGIN_STMT
  # DEBUG a => D#1 + 4
  # DEBUG BEGIN_STMT
  __asm__ __volatile__("" :  :  : "memory");
  return;
I thought only the first two debug stmts would appear, would need to look up
where the a_1(D) is replaced with the D#1 copy and where the corresponding
debug source bind is created.
Bet we want similar testcase also for when this DCE is used, and no, debug
stmts shouldn't be marked as something that should be thrown away, but instead
might be reset if we can't do anything better with them.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-21 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

Richard Biener  changed:

   What|Removed |Added

   Priority|P1  |P2

--- Comment #31 from Richard Biener  ---
Let's go for a GCC 11 fix and then eventually backport.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-20 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #30 from Martin Jambor  ---
Created attachment 48320
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48320=edit
Todays WIP patch

This is my todays (still very much) WIP patch.

- It marks statements which should not be copied before copying them
  and then skipping them.
- It does map SSA_NAMEs which should not survive to error_mark_node.
- Processing of calls is however still necessary, we cannot leave
  error_mark_nodes in the IL (until call redirection deals with it
  based on callee info).

But:

- It ICEs on gcc.dg/torture/pr48063.c.  I understand the problem,
  IPA-CP attempts to replace a floating-point parameter with an
  integer constant and fails but this fools the new DCE thingy into
  thinking some analysis declared the parameter unused even though it
  is used.  I'll have to make ipa_param_body_adjustments aware of
  tree_map.  (The original idea was to make it part of tree_map but
  for some reason I gave up on that.)

- There are three libgomp C++ ICEs that I know about which I have not
  even looked at.  I have not attempted any bootstrap yet.  I have not
  yet tested anything other than C/C++/Fortran.

- The new hash maps, or at least the one for statements, might be
  better placed in copy_body_data, the current place is just more
  convenient for the moment.  I do not care too much.

- Information currently stored in m_dead_ssas might be obtainable from
  decl_map in copy_body_data.

- I have not thought about debug statements yet and just ignored them
  for now.  I do want to handle them after other things work.

Any feedback welcome.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #29 from Jakub Jelinek  ---
(In reply to rguent...@suse.de from comment #28)
> On Fri, 17 Apr 2020, jakub at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385
> > 
> > --- Comment #26 from Jakub Jelinek  ---
> > For debug stmts, it would be best if we could use those
> >  DEBUG D#Y s=> parm
> >  DEBUG var => D#Y
> > added in if (param_body_adjs && MAY_HAVE_DEBUG_BIND_STMTS).
> > Though, if we remove already during the copy_bb by not actually creating the
> > stmt, I'm afraid that will mean the debug info is lost (debug stmts will be
> > reset), unless we for the lhs of to be dced stmts manually create debug
> > temporaries and debug stmts when we remap_gimple_stmt those stmts (for
> > SSA_NAMEs that are directly (or indirectly!) used in debug stmts).
> > If we do what Martin was proposing instead, i.e. copy the stmts and then DCE
> > them afterwards, it might work properly (perhaps only if we DCE them in the
> > right order).
> 
> It's true that copying everything and then DCEing is easier for debug
> stmt generation.  I didn't consider this.  That also argues for not
> remapping anything to error_mark_node.  Of course this leaves us with
> no automagic verification if we really DCEd everything required
> (esp. his handling of call arguments looks expensive and odd to me).

We could remap those to error_mark_node for -g0 and for -g to the
DEBUG_EXPR_DECLs we'd create and then just during verification make sure we
diagnose DEBUG_EXPR_DECLs in non-debug stmts (and error_mark_node).
Though, I think even right now the debug side of things even without the DCE is
not perfect, we do create a bind to cover the optimized out parameters at the
start of the function, but don't do this remapping of the SSA_NAME to that
DEBUG_EXPR_DECL, which means  that e.g. debug stmts that used to use that
SSA_NAME in some expressions are reset.
So, I think if Martin comes up with something that just doesn't handle the
debug stmts for GCC10, I can try to improve the debug side later.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-20 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #28 from rguenther at suse dot de  ---
On Fri, 17 Apr 2020, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385
> 
> --- Comment #26 from Jakub Jelinek  ---
> For debug stmts, it would be best if we could use those
>  DEBUG D#Y s=> parm
>  DEBUG var => D#Y
> added in if (param_body_adjs && MAY_HAVE_DEBUG_BIND_STMTS).
> Though, if we remove already during the copy_bb by not actually creating the
> stmt, I'm afraid that will mean the debug info is lost (debug stmts will be
> reset), unless we for the lhs of to be dced stmts manually create debug
> temporaries and debug stmts when we remap_gimple_stmt those stmts (for
> SSA_NAMEs that are directly (or indirectly!) used in debug stmts).
> If we do what Martin was proposing instead, i.e. copy the stmts and then DCE
> them afterwards, it might work properly (perhaps only if we DCE them in the
> right order).

It's true that copying everything and then DCEing is easier for debug
stmt generation.  I didn't consider this.  That also argues for not
remapping anything to error_mark_node.  Of course this leaves us with
no automagic verification if we really DCEd everything required
(esp. his handling of call arguments looks expensive and odd to me).

I'm also not sure how the machinery is helping the __builtin_constant_p
inline predication issue.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #27 from Jakub Jelinek  ---
Now, perhaps the analysis code could also detect which lhs are directly or
indirectly needed by debug stmts and when doing this return NULL in
remap_gimple_stmt, we could do something like (much simplified)
insert_debug_temp_for_var_def in there.  Though unsure if we must always handle
first the SSA_NAME definitions before uses, if not, then we'd need to figure
out the SSA_NAME to DEBUG_EXPR_DECL mapping early and then just let
remap_gimple_stmt add the debug stmt for it.
Perhaps ignore debug stmts for now as long as it doesn't ICE on them and I'll
try to do something for those?

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #26 from Jakub Jelinek  ---
For debug stmts, it would be best if we could use those
 DEBUG D#Y s=> parm
 DEBUG var => D#Y
added in if (param_body_adjs && MAY_HAVE_DEBUG_BIND_STMTS).
Though, if we remove already during the copy_bb by not actually creating the
stmt, I'm afraid that will mean the debug info is lost (debug stmts will be
reset), unless we for the lhs of to be dced stmts manually create debug
temporaries and debug stmts when we remap_gimple_stmt those stmts (for
SSA_NAMEs that are directly (or indirectly!) used in debug stmts).
If we do what Martin was proposing instead, i.e. copy the stmts and then DCE
them afterwards, it might work properly (perhaps only if we DCE them in the
right order).

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #24 from rguenther at suse dot de  ---
On April 17, 2020 3:53:07 PM GMT+02:00, "jakub at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385
>
>--- Comment #23 from Jakub Jelinek  ---
>Instead of #c11 I meant:
>-   else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops
>(stmt))
>-|| gimple_code (stmt) == GIMPLE_PHI)
>+   else if (flag_tree_dce
>+&& ((is_gimple_assign (stmt) &&
>!gimple_has_volatile_ops
>(stmt))
>+|| gimple_code (stmt) == GIMPLE_PHI))
>because the res = -1 handling is in the else after this already.
>
>For what Richi is proposing, I guess one needs to run the analysis
>somewhere
>(early tree_function_versioning or elsewhere?, note in a hash_set stmts
>that
>shouldn't be copied, add pointer to that hash_set to copy_body_data and
>perhaps
>in remap_gimple_stmt if the hash_set pointer is non-NULL, check early
>if stmt
>is in it and in that case return NULL (like we already return NULL for
>debug
>stmts if we want to drop them).
>Most likely the lhs of such to be removed statements need to be also
>added to
>id.killed_new_ssa_names (and verify if it works properly).

I'd definitely make sure to map supposed to be DCEd defs to error_mark_node. 

For the marking one could add id.plf_to_dce and clear/set the plf on the stmts.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #25 from Martin Jambor  ---
(In reply to rguent...@suse.de from comment #21)
> Btw, I'd much prefer to not first copy the stmts and then remove them.
> Instead the DCE "analysis" can be done on the original IL and stmts
> be "marked" to be elided during copying.  That saves generating
> SSA names and gimple stmts rather than needing to remove them after the
> fact.

It is of course easy to change the patch to do the analysis on the
original and just create a hash_set of statements/SSA_NAMES to not
copy.  I'll do that.

As far as remapping the removed values to ERROR_MARK, I'm not sure.
We'd need to remap some SSA_NAMES of the same DECL differently than
other names (e.g. default-definition of the removed PARM_DECL would
get remapped to ERROR_MARK but not other SSA_NAMES and similarly for
other SSA_NAMES derived from those default-defs) ...and ATM I do not
know to what extent that is a problem.  But I can try.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #23 from Jakub Jelinek  ---
Instead of #c11 I meant:
-   else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
-|| gimple_code (stmt) == GIMPLE_PHI)
+   else if (flag_tree_dce
+&& ((is_gimple_assign (stmt) && !gimple_has_volatile_ops
(stmt))
+|| gimple_code (stmt) == GIMPLE_PHI))
because the res = -1 handling is in the else after this already.

For what Richi is proposing, I guess one needs to run the analysis somewhere
(early tree_function_versioning or elsewhere?, note in a hash_set stmts that
shouldn't be copied, add pointer to that hash_set to copy_body_data and perhaps
in remap_gimple_stmt if the hash_set pointer is non-NULL, check early if stmt
is in it and in that case return NULL (like we already return NULL for debug
stmts if we want to drop them).
Most likely the lhs of such to be removed statements need to be also added to
id.killed_new_ssa_names (and verify if it works properly).

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #22 from Martin Jambor  ---
(In reply to Jakub Jelinek from comment #18)
> Comment on attachment 48302 [details]
> Untested fix
> 
> + /* IPA-SRA does not analyze other types of statements.  */
> + gcc_unreachable ();
> Won't this ICE on any is_gimple_debug stmt?  Those should be just ignored
> and normal SSA_NAME handling should DTRT for those.

Yeah, it most probably will, I wrote it was only very mildly tested
(i.e. I only ran IPA testcases on it) - I wanted to post what I had
before I had to stop working on this for a few hours.

> As for PHIs, can you just gsi_remove them?
> Looking at tree-ssa-dce.c, it uses remove_phi_node rather than
> gsi_remove for PHIs.  And for non-PHIs, it calls release_defs after
> gsi_remove.

You are again most probably right, I keep forgetting about this.

> 
> Plus, I think in isra_track_scalar_value_uses for non-is_gimple_{debug,call}
> we should punt if !flag_tree_dce, i.e. when user asked not to perform dead
> code elimination.  Though, guess that hunk should be added only after this
> is tested (and perhaps the testcase or its copy should use
> -fdisable-tree-dce or whatever other way to avoid doing DCE even when
> flag_tree_dce is non-zero.

OK, that makes sense.  I'd slightly prefer the patch in comment #11
for this so that direct passes of a parameter to another function
without any modification is still not considered as doing DCE - but I
also do not really care too much.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #21 from rguenther at suse dot de  ---
On Fri, 17 Apr 2020, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385
> 
> --- Comment #20 from Jakub Jelinek  ---
> Looking at tree-ssa-dce.c, it uses remove_phi_node rather than gsi_remove for
> PHIs.  And for non-PHIs, it calls release_defs after gsi_remove.

Btw, I'd much prefer to not first copy the stmts and then remove them.
Instead the DCE "analysis" can be done on the original IL and stmts
be "marked" to be elided during copying.  That saves generating
SSA names and gimple stmts rather than needing to remove them after the
fact.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #20 from Jakub Jelinek  ---
Looking at tree-ssa-dce.c, it uses remove_phi_node rather than gsi_remove for
PHIs.  And for non-PHIs, it calls release_defs after gsi_remove.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #19 from Richard Biener  ---
(In reply to Martin Jambor from comment #17)
> Created attachment 48302 [details]
> Untested fix
> 
> I'm playing with this - only very mildly tested - fix.

Ugh.

I was thinking of altering the parameter setup for cloning/inlining
so that it maps a removed parameter [SSA_NAME] not to a new
uninitialized variable/default-def but to error_mark_node and then
either scanning the function for those, propagating & DCEing on the
fly, or, during cloning/inlining seed a worklist whenever we map
something to error_mark_node.

The advantage is that we're leaving the IL in a clearly invalid state
in case one of those slips through and that the call case you handle
should handle itself by followup IPA transforms.

Another option is to track a set of removed SSA names in the original
and whenever we encounter an original stmt using that during the copying
operation elide the copy or transform it.  IIRC there's already a stmt
transform hook for IPA that could eventually be used for this approach\
(the modify_gimple_stmt hook), but it alreayd gets the copied stmts so
all defs have been cloned and the safety the error_mark_node approach
would bring is lost.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #18 from Jakub Jelinek  ---
Comment on attachment 48302
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48302
Untested fix

+   /* IPA-SRA does not analyze other types of statements.  */
+   gcc_unreachable ();
Won't this ICE on any is_gimple_debug stmt?  Those should be just ignored and
normal SSA_NAME handling should DTRT for those.
As for PHIs, can you just gsi_remove them?

Plus, I think in isra_track_scalar_value_uses for non-is_gimple_{debug,call} we
should punt if !flag_tree_dce, i.e. when user asked not to perform dead code
elimination.  Though, guess that hunk should be added only after this is tested
(and perhaps the testcase or its copy should use -fdisable-tree-dce or whatever
other way to avoid doing DCE even when flag_tree_dce is non-zero.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-17 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #17 from Martin Jambor  ---
Created attachment 48302
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48302=edit
Untested fix

I'm playing with this - only very mildly tested - fix.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-04-09 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

Richard Biener  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #16 from Richard Biener  ---
(In reply to Martin Jambor from comment #14)
> Another option, which does not create an inter-pass dependency and
> does not clutter tree-inline any more, but which pessimizes IPA-SRA
> (put perhaps just alittle bit?), is making sure that the statements
> which might be left behind are harmless:
> 
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index 31de527d111..df54b98759c 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -859,7 +859,8 @@ isra_track_scalar_value_uses (cgraph_node *node, tree
> name, int parm_num,
> }
>   res += all_uses;
> }
> -  else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
> +  else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt)
> +   && !gimple_could_trap_p (stmt))
>|| gimple_code (stmt) == GIMPLE_PHI)
> {
>   tree lhs;
> 
> I'll see what a tree-inline.c solution would look like and then decide
> which of these I'll propose.

Any progress here?  I think using the above would be best.  Another
option would be, during inlining, to make sure that removed parameters
get initialized (that also silences the warning), just the constant to use
isn't 100% obvious.  I'd settle for a 1 which is safe for all cases I
can think of besides array indexing or pointer adjustment and for
signed operations where anything but 0 might cause some undefined overflow.
Note the overflow thing also means that the gimple_could_trap_p check
is not sufficient.

But in the end I always wonder whether those "premature" optimizations
during IPA analysis are worth it and why we not simply consider all uses
as "uses"?

I also think that the user should not be able to disable all of DCE
(or enable IPA SRA ontop of -O0).  But of course that only makes the
attack surface smaller - I can very well imagine "bad" circumstances
with us arriving at a similarly problematic IL before IPA SRA
analysis with plain -O2.

Oh, instead of initializing dropped parameters the inliner could also
make sure to map it to a NaT, dropping stmts that use that and turn
their defs into NaTs as well.  Of course then IPA SRAs "DCE" decision
becomes a correctness thing.

That said, a proper solution _will_ need to get rid of the dead stmts
at IPA transform time.

Not sure if we really need an "improper" solution right now (given
the testcase needs to disable DCE).  The situation here is similar
to that with inline predicates and __builtin_constant_p.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-03-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #15 from Jakub Jelinek  ---
*** Bug 94178 has been marked as a duplicate of this bug. ***

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-23 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #14 from Martin Jambor  ---
Another option, which does not create an inter-pass dependency and
does not clutter tree-inline any more, but which pessimizes IPA-SRA
(put perhaps just alittle bit?), is making sure that the statements
which might be left behind are harmless:

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index 31de527d111..df54b98759c 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -859,7 +859,8 @@ isra_track_scalar_value_uses (cgraph_node *node, tree name,
int parm_num,
}
  res += all_uses;
}
-  else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
+  else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt)
+   && !gimple_could_trap_p (stmt))
   || gimple_code (stmt) == GIMPLE_PHI)
{
  tree lhs;

I'll see what a tree-inline.c solution would look like and then decide
which of these I'll propose.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #13 from Martin Jambor  ---
The assumption is that DCE will eventually remove all gimple
assignments with LHS which has no uses if gimple_has_volatile_ops
returns false for them.  I have had only a brief look at DCE today and
it seems to use gimple_has_side_effects instead, which for assignments
looks equivalent but perhaps IPA-SRA should use that for the sake of
consistency.  I thought this assumption was conservative enough, but
perhaps I am too optimistic.

Scheduling a DCE before IPA-SRA summary building would not help in the
general case because IPA-SRA is optimistic about call argument uses.
If all calculations end up being used only as a call argument and the
pass finds out in the interprocedural phase that these arguments are
going to be removed, it would then also remove the parameter in
question - and without subsequent DCE the calculations would not be
removed again.

Adding this kind of DCE to the modification phase of IPA-SRA is of
course an option - we'll have to plug it into tree-inline.c somewhere,
making it a tiny bit more complicated again though.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #12 from Jakub Jelinek  ---
Is that safe though?
I mean, does ipa-sra use exactly the same logic as dce does?  Say statements
that could throw with non-call-exceptions and all that kinds of details.
Wouldn't it be safer e.g. to perform another dce right before ipa-sra?
Or let ipa-sra to DCE what it thinks is dead (of course, in that case having
-fno-tree-dce disable that is desirable).

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #11 from Martin Jambor  ---
I'm going to test a patch that basically does the following (plus moving
push_cfun a bit):

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index 31de527d111..e18dc6958dc 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -862,6 +862,14 @@ isra_track_scalar_value_uses (cgraph_node *node, tree
name, int parm_num,
   else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
   || gimple_code (stmt) == GIMPLE_PHI)
{
+ if (!flag_tree_dce)
+   {
+ /* We depend on DCE to clean up statements that we discover to be
+unnecessary.  */
+ res = -1;
+ BREAK_FROM_IMM_USE_STMT (imm_iter);
+   }
+ 
  tree lhs;
  if (gimple_code (stmt) == GIMPLE_PHI)
lhs = gimple_phi_result (stmt);

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #10 from Martin Jambor  ---
Ah, -fno-tree-dce is specfied on the command line...

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #9 from Jakub Jelinek  ---
The options include -fno-tree-dce.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

Martin Jambor  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jamborm at gcc dot 
gnu.org

--- Comment #8 from Martin Jambor  ---
Yes, IPA-SRA removes the parameter because we do bunch of computations
on them but never store the result anywhere nor pass it to another
function.  The problem is that the modulo operation survives the
compilation even though its result is not stored anywhere (which
somewhat surprises me, I must say).

I'll figure out why tree-ssa-dce refuses to remove it and see if I can
employ a similar logic in IPA-SRA.

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #7 from Jakub Jelinek  ---
And with additiona -fno-early-inlining started with
r10-3311-gff6686d2e5f797d6c6a36ad14a7084bc1dc350e4

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
Started with r10-3542-g0b92cf305dcf34387a8e2564e55ca8948df3b47a

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

--- Comment #5 from Richard Biener  ---
Hmm, no, IPA SRA thinks the parameter is unused (huh).  Possibly -fno-ipa-cp
isn't implemented "fully".

[Bug ipa/93385] [10 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2020-01-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
 CC||jamborm at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org
  Component|tree-optimization   |ipa

--- Comment #4 from Richard Biener  ---
-fno-ipa-sra fixes it.  I suspect it falls foul of needing wide-ints for the
constant, punting at transform time only half-way.