[Bug tree-optimization/101641] Bogus redundant store removal

2021-09-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

--- Comment #7 from Richard Biener  ---
Wow, and this time it's even combine coming into play!

(insn 10 9 11 2 (set (reg/v:DI 82 [ xy ])
(mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->y+0 S8 A64])) "t.i":12:8
76 {*movdi_internal}
 (nil))
(insn 11 10 12 2 (set (mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->x+0 S8 A64])
(reg/v:DI 82 [ xy ])) "t.i":13:9 76 {*movdi_internal}
 (expr_list:REG_DEAD (reg/v/f:DI 86 [ pu ])
(expr_list:REG_DEAD (reg/v:DI 82 [ xy ])
(nil

Trying 10 -> 11:
   10: r82:DI=[r86:DI]
   11: [r86:DI]=r82:DI
  REG_DEAD r86:DI
  REG_DEAD r82:DI
Failed to match this instruction:
(set (mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->x+0 S8 A64])
(mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->y+0 S8 A64]))
allowing combination of insns 10 and 11
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 10.
modifying insn i311: [r86:DI]=[r86:DI]
  REG_DEAD r86:DI
deferring rescan insn with uid = 11.

Trying 4 -> 11:
4: r86:DI=r91:DI
  REG_DEAD r91:DI
   11: [r86:DI]=[r86:DI]
  REG_DEAD r86:DI
Failed to match this instruction:
(set (mem/j:DI (reg:DI 91) [2 pu_6(D)->x+0 S8 A64])
(mem/j:DI (reg:DI 91) [2 pu_6(D)->y+0 S8 A64]))
allowing combination of insns 4 and 11
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 4.
modifying insn i311: [r91:DI]=[r91:DI]
  REG_DEAD r91:DI
deferring rescan insn with uid = 11.
deleting noop move 11


somewhere inside combine we'd have to realize that this isn't a noop move
and then maybe not allow the combination in the first place since it
isn't recognizable?  That is, somehow we must anticipate the removal,
I suppose it is via

  /* Recognize all noop sets, these will be killed by followup pass.  */
  if (insn_code_number < 0 && GET_CODE (pat) == SET && set_noop_p (pat))
insn_code_number = NOOP_MOVE_INSN_CODE, num_clobbers_to_add = 0;

where set_noop_p for two MEMs simply dispatches to
rtx_equal_p && !side_effects_p.

Note on RTL we see that we cannot rely on MEM_ALIAS_SET but have to
use MEM_EXPR to conservatively assess that the access is _not_ through
a union ... (or as said we could annotate the alias set entry as to
belonging to a union).


In the end how we handle TBAA and unions might not be the very best way
(but I can't offer something better yet).

[Bug tree-optimization/101641] Bogus redundant store removal

2021-09-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

Richard Biener  changed:

   What|Removed |Added

 CC||davmac at davmac dot org

--- Comment #6 from Richard Biener  ---
*** Bug 102268 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/101641] Bogus redundant store removal

2021-07-29 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

--- Comment #5 from Martin Uecker  ---
(In reply to Richard Biener from comment #3)
>
> What we know from the union read is that the dynamic type is either the
> union type (so the read can pun) or the dynamic type is the union member
> type (then the read cannot pun).  6.5/7 allows accesses using
> "an aggregate or union type that includes [a type compatible with the
> effective
> type of the object]".  Though the 'aggregate' case appears quite odd to me,
> does it really allow 'int i; struct { float f; int i; } *p =  float x =
> p->f;'?  

I do not think this is meant to work as the
read is using an lvalue of type float and
that is not covered by the rules.

I believe that the aggregate case is meant for
the case where sub objects are modified as part of
struct/union assignment (i.e. when the lvalue used
for the access is of struct/union type).

But the rules could be interpreted to allow:

int i; struct { float f; int i; } *p = 
int j = p->i;

which I think should not be allowed.

[Bug tree-optimization/101641] Bogus redundant store removal

2021-07-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

--- Comment #4 from Richard Biener  ---
Just some more brain-dumps from thinking about a fix.

- we can annotate the alias_set_entry with a flag whether it was created for
  a union type and use that to improve the logic

- we can introduce some dynamic-type change IL elements that would allow us
  to elide all those redundant stores but preserve their effect.  They'd
  be modeled as stores but would generate no code.  The stored value would
  be implicit so we can readily remove the load (or constant).  An
  internal function call like we have for masked stores would be a possibility
  but those would be quite disruptive to passes compared to the load/store
  sequence, so preserving the original store but with a special RHS seems
  most logical (but we cannot use RHS == LHS as that's only valid GIMPLE
  for non-register-types).  We'd also have to be careful to not treat those
  "stores" as kills which leans towards the IFN again.

A fix along the first idea looks least intrusive for backporting.  A fix
along the second idea looks best for recovering lost redundant store removal.

[Bug tree-optimization/101641] Bogus redundant store removal

2021-07-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

--- Comment #3 from Richard Biener  ---
With union punning we've basically opened up a way to change the dynamic type
of storage in a way that expects us from generating no code from a "virtual"
read-write cycle involving the previous and the next dynamic type of the
storage.

Now the checks we're currently performing try to assess whether such a
change takes place but given we don't really know the current dynamic type
we're guessing based on the "read" which constrains that dynamic type.
And we're willing to elide changes to a type that is strictly more
restrictive as to what followon accesses it allows.

What goes wrong here is our guessing of the current dynamic type based on
the observed read, that's because the read picks up *py but the reads
alias-set is not equal to or a subset of the *py stores alias-set (but
it conflicts since the subset relation is the other way around).

What we know from the union read is that the dynamic type is either the
union type (so the read can pun) or the dynamic type is the union member
type (then the read cannot pun).  6.5/7 allows accesses using
"an aggregate or union type that includes [a type compatible with the effective
type of the object]".  Though the 'aggregate' case appears quite odd to me,
does it really allow 'int i; struct { float f; int i; } *p =  float x =
p->f;'?  GCC doesn't allow this already based on the fact that sizeof (*p)
is larger than 'i', thus would even disallow it in case the 'f' member was
'int' as well.

[Bug tree-optimization/101641] Bogus redundant store removal

2021-07-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

--- Comment #2 from Richard Biener  ---
And the DOM copy has been split out to refs_same_for_tbaa_p, now also used
by RTL CSE and DSE (which do the same transform), see the fix for PR93946.

[Bug tree-optimization/101641] Bogus redundant store removal

2021-07-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot 
gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2021-07-27
 Status|UNCONFIRMED |ASSIGNED
   Keywords||wrong-code

--- Comment #1 from Richard Biener  ---
We're already going great lengths trying to preserve stores that change the
dynamic type in a way relevant for downstream users (which we actually do not
see).  In particular we've settled to

  alias_set_type set = ao_ref_alias_set (_ref);
  alias_set_type base_set = ao_ref_base_alias_set (_ref);
  if ((vnresult->set != set
   && ! alias_set_subset_of (set, vnresult->set))
  || (vnresult->base_set != base_set
  && ! alias_set_subset_of (base_set, vnresult->base_set)))
resultsame = false;

but this case is so degenerate that all the alias sets and base alias sets
of both pu->y (vnresult) and pu->x (lhs_ref) are 2 since we use the alias
set of the union type for all accesses that have the union in their path.

The only remaining chance we have here is to look at the actual types
and reject punning attempts.  We are really missing a way to preserve
the TBAA effect of a store but not the store itself.