[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #13 from alalaw01 at gcc dot gnu.org ---
Author: alalaw01
Date: Fri Mar 11 12:08:01 2016
New Revision: 234138

URL: https://gcc.gnu.org/viewcvs?rev=234138=gcc=rev
Log:
Fix PR/70013

gcc:

PR tree-optimization/70013
* tree-sra.c (analyze_access_subtree): Also set grp_unscalarized_data
for constant-pool entries.

gcc/testsuite:

* gcc.dg/tree-ssa/sra-20.c: New.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/sra-20.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-sra.c

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #12 from alalaw01 at gcc dot gnu.org ---
Thanks, Martin - yes, I see.

Patch posted at https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00680.html after
full regtest.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-10 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #11 from Martin Jambor  ---
(In reply to alalaw01 from comment #10)
> which is much saner. But I don't really understand why the PARM_DECL case
> that I'm adding to here is that way 

SRA tries to avoid generating unnecessary aggregate loads if it can
figure out that that an area of an aggregate does not have any
(unscalarized) data in it.  In fact, it even removes existing loads of
uninitialized stuff.  The pass is not flow sensitive, so it can only
figure out that a bit of an aggregate does not have anything sensible
in it if there is no store to it anywhere in the function.

However, if the aggregate in question is a parameter, then of course
there is a store it cannot see, which happens in the caller.  The
constants are of course very similar, they contain value even though
there is no store to them anywhere in the function body.  So that is
why they must be considered as containing (unscalarized) data, which
is why you must also set the flag for them.  Does this explain things?

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-10 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-09 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #10 from alalaw01 at gcc dot gnu.org ---
Hmmm, so this fixes the ICE, generating:

  SR.5_12 = MEM[(struct S0[2] *)&*.LC0].f0;
  MEM[(struct S0[2] *)&*.LC0].f0 = SR.5_12;
  d = *.LC0;
  d$3$f0_14 = MEM[(struct S0[2] *)&*.LC0 + 3B].f0;
  d$0$f0_7 = SR.5_12;
  e$f0_9 = d$3$f0_14;
  _3 = (int) d$0$f0_7;
  c = _3;
  _5 = (int) e$f0_9;
  __builtin_printf ("%x\n", _5);
  d ={v} {CLOBBER};
  return 0;

which in -fdump-tree-optimized (at -O1) looks like:

  SR.5_12 = MEM[(struct S0[2] *)&*.LC0].f0;
  d$3$f0_14 = MEM[(struct S0[2] *)&*.LC0 + 3B].f0;
  _3 = (int) SR.5_12;
  c = _3;
  _5 = (int) d$3$f0_14;
  __builtin_printf ("%x\n", _5);
  return 0;

which is much saner. But I don't really understand why the PARM_DECL case that
I'm adding to here is that way (since r147980 "New implementation of SRA" in
2009, https://gcc.gnu.org/ml/gcc-patches/2009-04/msg02218.html)...

Bootstrapped+regtest on AArch64 (c,c++) and ARM (c,c++,ada), no regressions.
(Constants don't get pushed into the pool on x86.)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index
72157edd02e3235e57b786bbf460c94b0c52b2c5..24eac6ae7c4dcd41358b1a020047076afe1a8106
100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2427,7 +2427,8 @@ analyze_access_subtree (struct access *root, struct
access *parent,

   if (!hole || root->grp_total_scalarization)
 root->grp_covered = 1;
-  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
+  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL
+  || constant_decl_p (root->base))
 root->grp_unscalarized_data = 1; /* not covered and written to */
   return sth_created;
 }

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #9 from alalaw01 at gcc dot gnu.org ---
In analyze_access_subtree (since r147980, "New implementation of SRA", 2009):

  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
root->grp_unscalarized_data = 1; /* not covered and written to */

adding a case for constant_decl_p alongside the PARM_DECL case, fixes the ICE;
AArch64 bootstrap in progress.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #8 from Martin Jambor  ---
That is what I suspected.  Please have at look why
analyze_access_subtree (which has to set the grp_unscalarized_data
flag) acts differently then.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #7 from alalaw01 at gcc dot gnu.org ---
*second* half, sorry. grp_to_be_replaced is here true, but
grp_unscalarized_data is false, so handle_unscalarized_data_in_subtree sets
sad->refreshed=UDH_LEFT and we build the access to the LHS. (Then,
load_assign_lhs_subreplacements exits, and the caller sees UDH_LEFT and removes
the original block move statement.)

In contrast, on a similar testcase using a parameter rather than *.LC0,
grp_unscalarized_data is true, handle_unscalarized_data_in_subtree sets
sad->refreshed=UDH_RIGHT and we build an access to the RHS, which is OK; and
leave the block move statement in place, hence correctness.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #6 from alalaw01 at gcc dot gnu.org ---
Ugh, initializing the scalar replacement for the first half of d, with a value
read from the first half of d (should be from the first half of *.LC0).

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #5 from alalaw01 at gcc dot gnu.org ---
Prior to SRA, we have
  d = *.LC0;
  d$0$f0_7 = MEM[(struct S0[2] *)&*.LC0].f0;
  e$f0_9 = MEM[(struct S0[2] *) + 3B].f0;
  _3 = (int) d$0$f0_7;
  c = _3;
  _5 = (int) e$f0_9;
  __builtin_printf ("%x\n", _5);

sra_modify_assign for d=*.LC0 ends up in load_assign_lhs_subreplacements, where
d has two children; the second is grp_to_be_replaced, but because we did not 
completely_scalarize LC0, there is an access to only the first half of *.LC0,
and no corresponding RHS for the second half of d ('racc =
find_access_in_subtree (sad->top_racc, offset, lacc->size' returns null). So we
generate the bad

d$3$f0_14 = MEM[(struct S0[2] *) + 3B].f0;

that is, initializing the scalar replacement for the second half of d, with a
value read from the first half of d.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

alalaw01 at gcc dot gnu.org changed:

   What|Removed |Added

 CC||alalaw01 at gcc dot gnu.org

--- Comment #4 from alalaw01 at gcc dot gnu.org ---
Hmmm. First thing I notice is that the type of d (struct S0[2]) is not
scalarizable_type_p, but passes type_internals_preclude_sra_p. Changing the
latter to bail out on DECL_BIT_FIELD (as the former does) fixes the ICE, but
I'm not yet sure we want to do that.

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-04 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

Martin Jambor  changed:

   What|Removed |Added

 CC||alan.lawrence at arm dot com

--- Comment #3 from Martin Jambor  ---
This started with r232506 (Make SRA scalarize constant-pool loads).

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-02 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||aarch64-linux-gnu
   ||powerpc64el-linux-gnu
  Known to work||4.8.2
   Target Milestone|--- |6.0
Summary|packed structure tree-sra   |[6 Regression] packed
   |loses initialization|structure tree-sra loses
   ||initialization

--- Comment #2 from Andrew Pinski  ---
This used to work with 4.8.2 on aarch64-linux-gnu but fails on the trunk the
same way as mentioned on powerpc64el.