[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 Jakub Jelinek changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #14 from Jakub Jelinek --- Fixed.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #13 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Fri Mar 11 12:08:01 2016 New Revision: 234138 URL: https://gcc.gnu.org/viewcvs?rev=234138=gcc=rev Log: Fix PR/70013 gcc: PR tree-optimization/70013 * tree-sra.c (analyze_access_subtree): Also set grp_unscalarized_data for constant-pool entries. gcc/testsuite: * gcc.dg/tree-ssa/sra-20.c: New. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-sra.c
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #12 from alalaw01 at gcc dot gnu.org --- Thanks, Martin - yes, I see. Patch posted at https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00680.html after full regtest.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #11 from Martin Jambor --- (In reply to alalaw01 from comment #10) > which is much saner. But I don't really understand why the PARM_DECL case > that I'm adding to here is that way SRA tries to avoid generating unnecessary aggregate loads if it can figure out that that an area of an aggregate does not have any (unscalarized) data in it. In fact, it even removes existing loads of uninitialized stuff. The pass is not flow sensitive, so it can only figure out that a bit of an aggregate does not have anything sensible in it if there is no store to it anywhere in the function. However, if the aggregate in question is a parameter, then of course there is a store it cannot see, which happens in the caller. The constants are of course very similar, they contain value even though there is no store to them anywhere in the function body. So that is why they must be considered as containing (unscalarized) data, which is why you must also set the flag for them. Does this explain things?
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #10 from alalaw01 at gcc dot gnu.org --- Hmmm, so this fixes the ICE, generating: SR.5_12 = MEM[(struct S0[2] *)&*.LC0].f0; MEM[(struct S0[2] *)&*.LC0].f0 = SR.5_12; d = *.LC0; d$3$f0_14 = MEM[(struct S0[2] *)&*.LC0 + 3B].f0; d$0$f0_7 = SR.5_12; e$f0_9 = d$3$f0_14; _3 = (int) d$0$f0_7; c = _3; _5 = (int) e$f0_9; __builtin_printf ("%x\n", _5); d ={v} {CLOBBER}; return 0; which in -fdump-tree-optimized (at -O1) looks like: SR.5_12 = MEM[(struct S0[2] *)&*.LC0].f0; d$3$f0_14 = MEM[(struct S0[2] *)&*.LC0 + 3B].f0; _3 = (int) SR.5_12; c = _3; _5 = (int) d$3$f0_14; __builtin_printf ("%x\n", _5); return 0; which is much saner. But I don't really understand why the PARM_DECL case that I'm adding to here is that way (since r147980 "New implementation of SRA" in 2009, https://gcc.gnu.org/ml/gcc-patches/2009-04/msg02218.html)... Bootstrapped+regtest on AArch64 (c,c++) and ARM (c,c++,ada), no regressions. (Constants don't get pushed into the pool on x86.) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 72157edd02e3235e57b786bbf460c94b0c52b2c5..24eac6ae7c4dcd41358b1a020047076afe1a8106 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -2427,7 +2427,8 @@ analyze_access_subtree (struct access *root, struct access *parent, if (!hole || root->grp_total_scalarization) root->grp_covered = 1; - else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL) + else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL + || constant_decl_p (root->base)) root->grp_unscalarized_data = 1; /* not covered and written to */ return sth_created; }
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #9 from alalaw01 at gcc dot gnu.org --- In analyze_access_subtree (since r147980, "New implementation of SRA", 2009): else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL) root->grp_unscalarized_data = 1; /* not covered and written to */ adding a case for constant_decl_p alongside the PARM_DECL case, fixes the ICE; AArch64 bootstrap in progress.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #8 from Martin Jambor --- That is what I suspected. Please have at look why analyze_access_subtree (which has to set the grp_unscalarized_data flag) acts differently then.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #7 from alalaw01 at gcc dot gnu.org --- *second* half, sorry. grp_to_be_replaced is here true, but grp_unscalarized_data is false, so handle_unscalarized_data_in_subtree sets sad->refreshed=UDH_LEFT and we build the access to the LHS. (Then, load_assign_lhs_subreplacements exits, and the caller sees UDH_LEFT and removes the original block move statement.) In contrast, on a similar testcase using a parameter rather than *.LC0, grp_unscalarized_data is true, handle_unscalarized_data_in_subtree sets sad->refreshed=UDH_RIGHT and we build an access to the RHS, which is OK; and leave the block move statement in place, hence correctness.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #6 from alalaw01 at gcc dot gnu.org --- Ugh, initializing the scalar replacement for the first half of d, with a value read from the first half of d (should be from the first half of *.LC0).
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #5 from alalaw01 at gcc dot gnu.org --- Prior to SRA, we have d = *.LC0; d$0$f0_7 = MEM[(struct S0[2] *)&*.LC0].f0; e$f0_9 = MEM[(struct S0[2] *) + 3B].f0; _3 = (int) d$0$f0_7; c = _3; _5 = (int) e$f0_9; __builtin_printf ("%x\n", _5); sra_modify_assign for d=*.LC0 ends up in load_assign_lhs_subreplacements, where d has two children; the second is grp_to_be_replaced, but because we did not completely_scalarize LC0, there is an access to only the first half of *.LC0, and no corresponding RHS for the second half of d ('racc = find_access_in_subtree (sad->top_racc, offset, lacc->size' returns null). So we generate the bad d$3$f0_14 = MEM[(struct S0[2] *) + 3B].f0; that is, initializing the scalar replacement for the second half of d, with a value read from the first half of d.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org --- Comment #4 from alalaw01 at gcc dot gnu.org --- Hmmm. First thing I notice is that the type of d (struct S0[2]) is not scalarizable_type_p, but passes type_internals_preclude_sra_p. Changing the latter to bail out on DECL_BIT_FIELD (as the former does) fixes the ICE, but I'm not yet sure we want to do that.
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 Martin Jambor changed: What|Removed |Added CC||alan.lawrence at arm dot com --- Comment #3 from Martin Jambor --- This started with r232506 (Make SRA scalarize constant-pool loads).
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 Andrew Pinski changed: What|Removed |Added Keywords||wrong-code Target||aarch64-linux-gnu ||powerpc64el-linux-gnu Known to work||4.8.2 Target Milestone|--- |6.0 Summary|packed structure tree-sra |[6 Regression] packed |loses initialization|structure tree-sra loses ||initialization --- Comment #2 from Andrew Pinski --- This used to work with 4.8.2 on aarch64-linux-gnu but fails on the trunk the same way as mentioned on powerpc64el.