[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #7 from Richard Biener --- Wow, and this time it's even combine coming into play! (insn 10 9 11 2 (set (reg/v:DI 82 [ xy ]) (mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->y+0 S8 A64])) "t.i":12:8 76 {*movdi_internal} (nil)) (insn 11 10 12 2 (set (mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->x+0 S8 A64]) (reg/v:DI 82 [ xy ])) "t.i":13:9 76 {*movdi_internal} (expr_list:REG_DEAD (reg/v/f:DI 86 [ pu ]) (expr_list:REG_DEAD (reg/v:DI 82 [ xy ]) (nil Trying 10 -> 11: 10: r82:DI=[r86:DI] 11: [r86:DI]=r82:DI REG_DEAD r86:DI REG_DEAD r82:DI Failed to match this instruction: (set (mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->x+0 S8 A64]) (mem/j:DI (reg/v/f:DI 86 [ pu ]) [2 pu_6(D)->y+0 S8 A64])) allowing combination of insns 10 and 11 original costs 4 + 4 = 8 replacement cost 4 deferring deletion of insn with uid = 10. modifying insn i311: [r86:DI]=[r86:DI] REG_DEAD r86:DI deferring rescan insn with uid = 11. Trying 4 -> 11: 4: r86:DI=r91:DI REG_DEAD r91:DI 11: [r86:DI]=[r86:DI] REG_DEAD r86:DI Failed to match this instruction: (set (mem/j:DI (reg:DI 91) [2 pu_6(D)->x+0 S8 A64]) (mem/j:DI (reg:DI 91) [2 pu_6(D)->y+0 S8 A64])) allowing combination of insns 4 and 11 original costs 4 + 4 = 8 replacement cost 4 deferring deletion of insn with uid = 4. modifying insn i311: [r91:DI]=[r91:DI] REG_DEAD r91:DI deferring rescan insn with uid = 11. deleting noop move 11 somewhere inside combine we'd have to realize that this isn't a noop move and then maybe not allow the combination in the first place since it isn't recognizable? That is, somehow we must anticipate the removal, I suppose it is via /* Recognize all noop sets, these will be killed by followup pass. */ if (insn_code_number < 0 && GET_CODE (pat) == SET && set_noop_p (pat)) insn_code_number = NOOP_MOVE_INSN_CODE, num_clobbers_to_add = 0; where set_noop_p for two MEMs simply dispatches to rtx_equal_p && !side_effects_p. Note on RTL we see that we cannot rely on MEM_ALIAS_SET but have to use MEM_EXPR to conservatively assess that the access is _not_ through a union ... (or as said we could annotate the alias set entry as to belonging to a union). In the end how we handle TBAA and unions might not be the very best way (but I can't offer something better yet).
[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 Richard Biener changed: What|Removed |Added CC||davmac at davmac dot org --- Comment #6 from Richard Biener --- *** Bug 102268 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #5 from Martin Uecker --- (In reply to Richard Biener from comment #3) > > What we know from the union read is that the dynamic type is either the > union type (so the read can pun) or the dynamic type is the union member > type (then the read cannot pun). 6.5/7 allows accesses using > "an aggregate or union type that includes [a type compatible with the > effective > type of the object]". Though the 'aggregate' case appears quite odd to me, > does it really allow 'int i; struct { float f; int i; } *p = float x = > p->f;'? I do not think this is meant to work as the read is using an lvalue of type float and that is not covered by the rules. I believe that the aggregate case is meant for the case where sub objects are modified as part of struct/union assignment (i.e. when the lvalue used for the access is of struct/union type). But the rules could be interpreted to allow: int i; struct { float f; int i; } *p = int j = p->i; which I think should not be allowed.
[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #4 from Richard Biener --- Just some more brain-dumps from thinking about a fix. - we can annotate the alias_set_entry with a flag whether it was created for a union type and use that to improve the logic - we can introduce some dynamic-type change IL elements that would allow us to elide all those redundant stores but preserve their effect. They'd be modeled as stores but would generate no code. The stored value would be implicit so we can readily remove the load (or constant). An internal function call like we have for masked stores would be a possibility but those would be quite disruptive to passes compared to the load/store sequence, so preserving the original store but with a special RHS seems most logical (but we cannot use RHS == LHS as that's only valid GIMPLE for non-register-types). We'd also have to be careful to not treat those "stores" as kills which leans towards the IFN again. A fix along the first idea looks least intrusive for backporting. A fix along the second idea looks best for recovering lost redundant store removal.
[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #3 from Richard Biener --- With union punning we've basically opened up a way to change the dynamic type of storage in a way that expects us from generating no code from a "virtual" read-write cycle involving the previous and the next dynamic type of the storage. Now the checks we're currently performing try to assess whether such a change takes place but given we don't really know the current dynamic type we're guessing based on the "read" which constrains that dynamic type. And we're willing to elide changes to a type that is strictly more restrictive as to what followon accesses it allows. What goes wrong here is our guessing of the current dynamic type based on the observed read, that's because the read picks up *py but the reads alias-set is not equal to or a subset of the *py stores alias-set (but it conflicts since the subset relation is the other way around). What we know from the union read is that the dynamic type is either the union type (so the read can pun) or the dynamic type is the union member type (then the read cannot pun). 6.5/7 allows accesses using "an aggregate or union type that includes [a type compatible with the effective type of the object]". Though the 'aggregate' case appears quite odd to me, does it really allow 'int i; struct { float f; int i; } *p = float x = p->f;'? GCC doesn't allow this already based on the fact that sizeof (*p) is larger than 'i', thus would even disallow it in case the 'f' member was 'int' as well.
[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #2 from Richard Biener --- And the DOM copy has been split out to refs_same_for_tbaa_p, now also used by RTL CSE and DSE (which do the same transform), see the fix for PR93946.
[Bug tree-optimization/101641] Bogus redundant store removal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2021-07-27 Status|UNCONFIRMED |ASSIGNED Keywords||wrong-code --- Comment #1 from Richard Biener --- We're already going great lengths trying to preserve stores that change the dynamic type in a way relevant for downstream users (which we actually do not see). In particular we've settled to alias_set_type set = ao_ref_alias_set (_ref); alias_set_type base_set = ao_ref_base_alias_set (_ref); if ((vnresult->set != set && ! alias_set_subset_of (set, vnresult->set)) || (vnresult->base_set != base_set && ! alias_set_subset_of (base_set, vnresult->base_set))) resultsame = false; but this case is so degenerate that all the alias sets and base alias sets of both pu->y (vnresult) and pu->x (lhs_ref) are 2 since we use the alias set of the union type for all accesses that have the union in their path. The only remaining chance we have here is to look at the actual types and reject punning attempts. We are really missing a way to preserve the TBAA effect of a store but not the store itself.