Re: [PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-26 Thread Richard Biener
On Tue, Aug 25, 2015 at 9:50 PM, Jeff Law l...@redhat.com wrote:
 On 08/25/2015 05:06 AM, Alan Lawrence wrote:

 When SRA completely scalarizes an array, this patch changes the
 generated accesses from e.g.

 MEM[(int[8] *)a + 4B] = 1;

 to

 a[1] = 1;

 This overcomes a limitation in dom2, that accesses to equivalent
 chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with
 accesses to e.g. MEM[(int[8] *)a]. This is necessary for constant
 propagation in the ssa-dom-cse-2.c testcase (after the next patch
 that makes SRA handle constant-pool loads).

 I tried to work around this by making dom2's hashable_expr_equal_p
 less conservative, but found that on platforms without AArch64's
 vectorized reductions (specifically Alpha, hppa, PowerPC, and SPARC,
 mentioned in ssa-dom-cse-2.c), I also needed to make MEM[(int[8]
 *)a] equivalent to a[0], etc.; a complete overhaul of
 hashable_expr_equal_p seems like a larger task than this patch
 series.

 I can't see how to write a testcase for this in C though as direct
 assignment to an array is not possible; such assignments occur only
 with constant pool data, which is dealt with in the next patch.

 It's a general issue that if there's  1 common way to represent an
 expression, then DOM will often miss discovery of the CSE opportunity
 because of the way it hashes expressions.

 Ideally we'd be moving to a canonical form, but I also realize that in
 the case of memory references like this, that may not be feasible.

 It does make me wonder how many CSEs we're really missing due to the two
 ways to represent array accesses.


 Bootstrap + check-gcc on x86-none-linux-gnu,
 arm-none-linux-gnueabihf, aarch64-none-linux-gnu.

 gcc/ChangeLog:

 * tree-sra.c (completely_scalarize): Move some code into:
 (get_elem_size): New. (build_ref_for_offset): Build ARRAY_REF if base
 is aligned array. --- gcc/tree-sra.c | 110
 - 1 file
 changed, 69 insertions(+), 41 deletions(-)

 diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 08fa8dc..af35fcc
 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -957,6 +957,20 @@
 scalarizable_type_p (tree type) } }

 +static bool +get_elem_size (const_tree type, unsigned HOST_WIDE_INT
 *sz_out)

 Function comment needed.

 I may have missed it in the earlier patches, but can you please make
 sure any new functions you created have comments in those as well.  Such
 patches are pre-approved.

 With the added function comment, this patch is fine.

Err ... you generally _cannot_ create ARRAY_REFs out of thin air because
of correctness issues with data-ref and data dependence analysis.  You can
of course keep ARRAY_REFs if the original access was an ARRAY_REF.

But I'm not convinced this is what the pass does.

We've went through great lengths removing all the code from gimplification
and folding that tried to be clever in producing array refs from accesses to
sth with an ARRAY_TYPE - this all eventually lead to wrong-code issues
later.

So I'd rather _not_ have this patch.  (as always I'm too slow responding
and Jeff is too fast ;))

Thanks,
Richard.

 jeff




Re: [PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-26 Thread Richard Biener
On Wed, 26 Aug 2015, Bin.Cheng wrote:

 On Wed, Aug 26, 2015 at 3:50 AM, Jeff Law l...@redhat.com wrote:
  On 08/25/2015 05:06 AM, Alan Lawrence wrote:
 
  When SRA completely scalarizes an array, this patch changes the
  generated accesses from e.g.
 
  MEM[(int[8] *)a + 4B] = 1;
 
  to
 
  a[1] = 1;
 
  This overcomes a limitation in dom2, that accesses to equivalent
  chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with
  accesses to e.g. MEM[(int[8] *)a]. This is necessary for constant
  propagation in the ssa-dom-cse-2.c testcase (after the next patch
  that makes SRA handle constant-pool loads).
 
  I tried to work around this by making dom2's hashable_expr_equal_p
  less conservative, but found that on platforms without AArch64's
  vectorized reductions (specifically Alpha, hppa, PowerPC, and SPARC,
  mentioned in ssa-dom-cse-2.c), I also needed to make MEM[(int[8]
  *)a] equivalent to a[0], etc.; a complete overhaul of
  hashable_expr_equal_p seems like a larger task than this patch
  series.
 
  I can't see how to write a testcase for this in C though as direct
  assignment to an array is not possible; such assignments occur only
  with constant pool data, which is dealt with in the next patch.
 
  It's a general issue that if there's  1 common way to represent an
  expression, then DOM will often miss discovery of the CSE opportunity
  because of the way it hashes expressions.
 
  Ideally we'd be moving to a canonical form, but I also realize that in
  the case of memory references like this, that may not be feasible.
 IIRC, there were talks about lowering all memory reference on GIMPLE?
 Which is the reverse approach.  Since SRA is in quite early
 compilation stage, don't know if lowered memory reference has impact
 on other optimizers.

Yeah, I'd only do the lowering after loop opts.  Which also may make
the DOM issue moot as the array refs would be lowered as well and thus
DOM would see a consistent set of references again.  The lowering should
also simplify SLSR and expose address computation redundancies to DOM.

I'd place such lowering before the late reassoc (any takers?  I suppose
you can pick up one of the bitfield lowering passes posted in the
previous years as this should also handle bitfield accesses correctly).

Thanks,
Richard.

 Thanks,
 bin
 
  It does make me wonder how many CSEs we're really missing due to the two
  ways to represent array accesses.
 
 
  Bootstrap + check-gcc on x86-none-linux-gnu,
  arm-none-linux-gnueabihf, aarch64-none-linux-gnu.
 
  gcc/ChangeLog:
 
  * tree-sra.c (completely_scalarize): Move some code into:
  (get_elem_size): New. (build_ref_for_offset): Build ARRAY_REF if base
  is aligned array. --- gcc/tree-sra.c | 110
  - 1 file
  changed, 69 insertions(+), 41 deletions(-)
 
  diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 08fa8dc..af35fcc
  100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -957,6 +957,20 @@
  scalarizable_type_p (tree type) } }
 
  +static bool +get_elem_size (const_tree type, unsigned HOST_WIDE_INT
  *sz_out)
 
  Function comment needed.
 
  I may have missed it in the earlier patches, but can you please make
  sure any new functions you created have comments in those as well.  Such
  patches are pre-approved.
 
  With the added function comment, this patch is fine.
 
  jeff
 
 
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-26 Thread Bin.Cheng
On Wed, Aug 26, 2015 at 3:29 PM, Richard Biener rguent...@suse.de wrote:
 On Wed, 26 Aug 2015, Bin.Cheng wrote:

 On Wed, Aug 26, 2015 at 3:50 AM, Jeff Law l...@redhat.com wrote:
  On 08/25/2015 05:06 AM, Alan Lawrence wrote:
 
  When SRA completely scalarizes an array, this patch changes the
  generated accesses from e.g.
 
  MEM[(int[8] *)a + 4B] = 1;
 
  to
 
  a[1] = 1;
 
  This overcomes a limitation in dom2, that accesses to equivalent
  chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with
  accesses to e.g. MEM[(int[8] *)a]. This is necessary for constant
  propagation in the ssa-dom-cse-2.c testcase (after the next patch
  that makes SRA handle constant-pool loads).
 
  I tried to work around this by making dom2's hashable_expr_equal_p
  less conservative, but found that on platforms without AArch64's
  vectorized reductions (specifically Alpha, hppa, PowerPC, and SPARC,
  mentioned in ssa-dom-cse-2.c), I also needed to make MEM[(int[8]
  *)a] equivalent to a[0], etc.; a complete overhaul of
  hashable_expr_equal_p seems like a larger task than this patch
  series.
 
  I can't see how to write a testcase for this in C though as direct
  assignment to an array is not possible; such assignments occur only
  with constant pool data, which is dealt with in the next patch.
 
  It's a general issue that if there's  1 common way to represent an
  expression, then DOM will often miss discovery of the CSE opportunity
  because of the way it hashes expressions.
 
  Ideally we'd be moving to a canonical form, but I also realize that in
  the case of memory references like this, that may not be feasible.
 IIRC, there were talks about lowering all memory reference on GIMPLE?
 Which is the reverse approach.  Since SRA is in quite early
 compilation stage, don't know if lowered memory reference has impact
 on other optimizers.

 Yeah, I'd only do the lowering after loop opts.  Which also may make
 the DOM issue moot as the array refs would be lowered as well and thus
 DOM would see a consistent set of references again.  The lowering should
 also simplify SLSR and expose address computation redundancies to DOM.

 I'd place such lowering before the late reassoc (any takers?  I suppose
 you can pick up one of the bitfield lowering passes posted in the
 previous years as this should also handle bitfield accesses correctly).
I ran into several issues related to lowered memory references (some
of them are about slsr), and want to have a look at this.  But only
after finishing major issues in IVO...

As for slsr, I think the problem is more about we need to prove
equality of expressions by diving into definition chain of ssa_var,
just like tree_to_affine_expand.  I think this has already been
discussed too.  Anyway, lowering memory reference provides a canonical
form and should benefit other optimizers.

Thanks,
bin

 Thanks,
 Richard.

 Thanks,
 bin
 
  It does make me wonder how many CSEs we're really missing due to the two
  ways to represent array accesses.
 
 
  Bootstrap + check-gcc on x86-none-linux-gnu,
  arm-none-linux-gnueabihf, aarch64-none-linux-gnu.
 
  gcc/ChangeLog:
 
  * tree-sra.c (completely_scalarize): Move some code into:
  (get_elem_size): New. (build_ref_for_offset): Build ARRAY_REF if base
  is aligned array. --- gcc/tree-sra.c | 110
  - 1 file
  changed, 69 insertions(+), 41 deletions(-)
 
  diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 08fa8dc..af35fcc
  100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -957,6 +957,20 @@
  scalarizable_type_p (tree type) } }
 
  +static bool +get_elem_size (const_tree type, unsigned HOST_WIDE_INT
  *sz_out)
 
  Function comment needed.
 
  I may have missed it in the earlier patches, but can you please make
  sure any new functions you created have comments in those as well.  Such
  patches are pre-approved.
 
  With the added function comment, this patch is fine.
 
  jeff
 
 



 --
 Richard Biener rguent...@suse.de
 SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
 21284 (AG Nuernberg)


Re: [PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-25 Thread Bin.Cheng
On Wed, Aug 26, 2015 at 3:50 AM, Jeff Law l...@redhat.com wrote:
 On 08/25/2015 05:06 AM, Alan Lawrence wrote:

 When SRA completely scalarizes an array, this patch changes the
 generated accesses from e.g.

 MEM[(int[8] *)a + 4B] = 1;

 to

 a[1] = 1;

 This overcomes a limitation in dom2, that accesses to equivalent
 chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with
 accesses to e.g. MEM[(int[8] *)a]. This is necessary for constant
 propagation in the ssa-dom-cse-2.c testcase (after the next patch
 that makes SRA handle constant-pool loads).

 I tried to work around this by making dom2's hashable_expr_equal_p
 less conservative, but found that on platforms without AArch64's
 vectorized reductions (specifically Alpha, hppa, PowerPC, and SPARC,
 mentioned in ssa-dom-cse-2.c), I also needed to make MEM[(int[8]
 *)a] equivalent to a[0], etc.; a complete overhaul of
 hashable_expr_equal_p seems like a larger task than this patch
 series.

 I can't see how to write a testcase for this in C though as direct
 assignment to an array is not possible; such assignments occur only
 with constant pool data, which is dealt with in the next patch.

 It's a general issue that if there's  1 common way to represent an
 expression, then DOM will often miss discovery of the CSE opportunity
 because of the way it hashes expressions.

 Ideally we'd be moving to a canonical form, but I also realize that in
 the case of memory references like this, that may not be feasible.
IIRC, there were talks about lowering all memory reference on GIMPLE?
Which is the reverse approach.  Since SRA is in quite early
compilation stage, don't know if lowered memory reference has impact
on other optimizers.

Thanks,
bin

 It does make me wonder how many CSEs we're really missing due to the two
 ways to represent array accesses.


 Bootstrap + check-gcc on x86-none-linux-gnu,
 arm-none-linux-gnueabihf, aarch64-none-linux-gnu.

 gcc/ChangeLog:

 * tree-sra.c (completely_scalarize): Move some code into:
 (get_elem_size): New. (build_ref_for_offset): Build ARRAY_REF if base
 is aligned array. --- gcc/tree-sra.c | 110
 - 1 file
 changed, 69 insertions(+), 41 deletions(-)

 diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 08fa8dc..af35fcc
 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -957,6 +957,20 @@
 scalarizable_type_p (tree type) } }

 +static bool +get_elem_size (const_tree type, unsigned HOST_WIDE_INT
 *sz_out)

 Function comment needed.

 I may have missed it in the earlier patches, but can you please make
 sure any new functions you created have comments in those as well.  Such
 patches are pre-approved.

 With the added function comment, this patch is fine.

 jeff




Re: [PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-25 Thread Martin Jambor
Hi,

On Tue, Aug 25, 2015 at 12:06:15PM +0100, Alan Lawrence wrote:
 When SRA completely scalarizes an array, this patch changes the
 generated accesses from e.g.
 
MEM[(int[8] *)a + 4B] = 1;
 
 to
 
a[1] = 1;
 
 This overcomes a limitation in dom2, that accesses to equivalent
 chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with
 accesses to e.g. MEM[(int[8] *)a]. This is necessary for constant
 propagation in the ssa-dom-cse-2.c testcase (after the next patch
 that makes SRA handle constant-pool loads).
 
 I tried to work around this by making dom2's hashable_expr_equal_p
 less conservative, but found that on platforms without AArch64's
 vectorized reductions (specifically Alpha, hppa, PowerPC, and SPARC,
 mentioned in ssa-dom-cse-2.c), I also needed to make MEM[(int[8]
 *)a] equivalent to a[0], etc.; a complete overhaul of
 hashable_expr_equal_p seems like a larger task than this patch
 series.

Uff.  Well, while I am obviously not excited about such workaround
landing in SRA, if maintainers agree that it is reasonable, I suppose
I'll manage to live with it.

I also have more specific comments:

 
 I can't see how to write a testcase for this in C though as direct assignment 
 to an array is not possible; such assignments occur only with constant pool 
 data, which is dealt with in the next patch.
 
 Bootstrap + check-gcc on x86-none-linux-gnu, arm-none-linux-gnueabihf, 
 aarch64-none-linux-gnu.
 
 gcc/ChangeLog:
 
   * tree-sra.c (completely_scalarize): Move some code into:
   (get_elem_size): New.
   (build_ref_for_offset): Build ARRAY_REF if base is aligned array.
 ---
  gcc/tree-sra.c | 110 
 -
  1 file changed, 69 insertions(+), 41 deletions(-)
 
 diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
 index 08fa8dc..af35fcc 100644
 --- a/gcc/tree-sra.c
 +++ b/gcc/tree-sra.c
 @@ -957,6 +957,20 @@ scalarizable_type_p (tree type)
}
  }
  
 +static bool
 +get_elem_size (const_tree type, unsigned HOST_WIDE_INT *sz_out)

As Jeff already pointed out, this function needs a comment.

 +{
 +  gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
 +  tree t_size = TYPE_SIZE (TREE_TYPE (type));
 +  if (!t_size || !tree_fits_uhwi_p (t_size))
 +return false;
 +  unsigned HOST_WIDE_INT sz = tree_to_uhwi (t_size);
 +  if (!sz)
 +return false;
 +  *sz_out = sz;
 +  return true;
 +}
 +
  static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree);
  
  /* Create total_scalarization accesses for all scalar fields of a member
 @@ -985,10 +999,9 @@ completely_scalarize (tree base, tree decl_type, 
 HOST_WIDE_INT offset, tree ref)
  case ARRAY_TYPE:
{
   tree elemtype = TREE_TYPE (decl_type);
 - tree elem_size = TYPE_SIZE (elemtype);
 - gcc_assert (elem_size  tree_fits_uhwi_p (elem_size));
 - int el_size = tree_to_uhwi (elem_size);
 - gcc_assert (el_size);
 + unsigned HOST_WIDE_INT el_size;
 + if (!get_elem_size (decl_type, el_size))
 +   gcc_assert (false);

This is usually written as gcc_unreachable ()

  
   tree minidx = TYPE_MIN_VALUE (TYPE_DOMAIN (decl_type));
   tree maxidx = TYPE_MAX_VALUE (TYPE_DOMAIN (decl_type));
 @@ -1563,7 +1576,7 @@ build_ref_for_offset (location_t loc, tree base, 
 HOST_WIDE_INT offset,
tree off;
tree mem_ref;
HOST_WIDE_INT base_offset;
 -  unsigned HOST_WIDE_INT misalign;
 +  unsigned HOST_WIDE_INT misalign, el_sz;
unsigned int align;
  
gcc_checking_assert (offset % BITS_PER_UNIT == 0);
 @@ -1572,47 +1585,62 @@ build_ref_for_offset (location_t loc, tree base, 
 HOST_WIDE_INT offset,
  
/* get_addr_base_and_unit_offset returns NULL for references with a 
 variable
   offset such as array[var_index].  */
 -  if (!base)
 -{
 -  gassign *stmt;
 -  tree tmp, addr;
 -
 -  gcc_checking_assert (gsi);
 -  tmp = make_ssa_name (build_pointer_type (TREE_TYPE (prev_base)));
 -  addr = build_fold_addr_expr (unshare_expr (prev_base));
 -  STRIP_USELESS_TYPE_CONVERSION (addr);
 -  stmt = gimple_build_assign (tmp, addr);
 -  gimple_set_location (stmt, loc);
 -  if (insert_after)
 - gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
 -  else
 - gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
 -
 -  off = build_int_cst (reference_alias_ptr_type (prev_base),
 -offset / BITS_PER_UNIT);
 -  base = tmp;
 -}
 -  else if (TREE_CODE (base) == MEM_REF)
 -{
 -  off = build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)),
 -base_offset + offset / BITS_PER_UNIT);
 -  off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1), off);
 -  base = unshare_expr (TREE_OPERAND (base, 0));
 +  if (base
 +   TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE
 +   misalign == 0
 +   get_elem_size (TREE_TYPE (base), el_sz)
 +   ((offset % el_sz) == 0)
 +   useless_type_conversion_p (exp_type, TREE_TYPE (TREE_TYPE (base)))
 +

Re: [PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-25 Thread Jeff Law

On 08/25/2015 05:06 AM, Alan Lawrence wrote:

When SRA completely scalarizes an array, this patch changes the
generated accesses from e.g.

MEM[(int[8] *)a + 4B] = 1;

to

a[1] = 1;

This overcomes a limitation in dom2, that accesses to equivalent
chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with
accesses to e.g. MEM[(int[8] *)a]. This is necessary for constant
propagation in the ssa-dom-cse-2.c testcase (after the next patch
that makes SRA handle constant-pool loads).

I tried to work around this by making dom2's hashable_expr_equal_p
less conservative, but found that on platforms without AArch64's
vectorized reductions (specifically Alpha, hppa, PowerPC, and SPARC,
mentioned in ssa-dom-cse-2.c), I also needed to make MEM[(int[8]
*)a] equivalent to a[0], etc.; a complete overhaul of
hashable_expr_equal_p seems like a larger task than this patch
series.

I can't see how to write a testcase for this in C though as direct
assignment to an array is not possible; such assignments occur only
with constant pool data, which is dealt with in the next patch.

It's a general issue that if there's  1 common way to represent an
expression, then DOM will often miss discovery of the CSE opportunity
because of the way it hashes expressions.

Ideally we'd be moving to a canonical form, but I also realize that in
the case of memory references like this, that may not be feasible.

It does make me wonder how many CSEs we're really missing due to the two
ways to represent array accesses.



Bootstrap + check-gcc on x86-none-linux-gnu,
arm-none-linux-gnueabihf, aarch64-none-linux-gnu.

gcc/ChangeLog:

* tree-sra.c (completely_scalarize): Move some code into:
(get_elem_size): New. (build_ref_for_offset): Build ARRAY_REF if base
is aligned array. --- gcc/tree-sra.c | 110
- 1 file
changed, 69 insertions(+), 41 deletions(-)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 08fa8dc..af35fcc
100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -957,6 +957,20 @@
scalarizable_type_p (tree type) } }

+static bool +get_elem_size (const_tree type, unsigned HOST_WIDE_INT
*sz_out)

Function comment needed.

I may have missed it in the earlier patches, but can you please make
sure any new functions you created have comments in those as well.  Such 
patches are pre-approved.


With the added function comment, this patch is fine.

jeff




[PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-25 Thread Alan Lawrence
When SRA completely scalarizes an array, this patch changes the generated 
accesses from e.g.

   MEM[(int[8] *)a + 4B] = 1;

to

   a[1] = 1;

This overcomes a limitation in dom2, that accesses to equivalent chunks of e.g. 
MEM[(int[8] *)a] are not hashable_expr_equal_p with accesses to e.g. 
MEM[(int[8] *)a]. This is necessary for constant propagation in the 
ssa-dom-cse-2.c testcase (after the next patch that makes SRA handle 
constant-pool loads).

I tried to work around this by making dom2's hashable_expr_equal_p less 
conservative, but found that on platforms without AArch64's vectorized 
reductions (specifically Alpha, hppa, PowerPC, and SPARC, mentioned in 
ssa-dom-cse-2.c), I also needed to make MEM[(int[8] *)a] equivalent to a[0], 
etc.; a complete overhaul of hashable_expr_equal_p seems like a larger task 
than this patch series.

I can't see how to write a testcase for this in C though as direct assignment 
to an array is not possible; such assignments occur only with constant pool 
data, which is dealt with in the next patch.

Bootstrap + check-gcc on x86-none-linux-gnu, arm-none-linux-gnueabihf, 
aarch64-none-linux-gnu.

gcc/ChangeLog:

* tree-sra.c (completely_scalarize): Move some code into:
(get_elem_size): New.
(build_ref_for_offset): Build ARRAY_REF if base is aligned array.
---
 gcc/tree-sra.c | 110 -
 1 file changed, 69 insertions(+), 41 deletions(-)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 08fa8dc..af35fcc 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -957,6 +957,20 @@ scalarizable_type_p (tree type)
   }
 }
 
+static bool
+get_elem_size (const_tree type, unsigned HOST_WIDE_INT *sz_out)
+{
+  gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
+  tree t_size = TYPE_SIZE (TREE_TYPE (type));
+  if (!t_size || !tree_fits_uhwi_p (t_size))
+return false;
+  unsigned HOST_WIDE_INT sz = tree_to_uhwi (t_size);
+  if (!sz)
+return false;
+  *sz_out = sz;
+  return true;
+}
+
 static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree);
 
 /* Create total_scalarization accesses for all scalar fields of a member
@@ -985,10 +999,9 @@ completely_scalarize (tree base, tree decl_type, 
HOST_WIDE_INT offset, tree ref)
 case ARRAY_TYPE:
   {
tree elemtype = TREE_TYPE (decl_type);
-   tree elem_size = TYPE_SIZE (elemtype);
-   gcc_assert (elem_size  tree_fits_uhwi_p (elem_size));
-   int el_size = tree_to_uhwi (elem_size);
-   gcc_assert (el_size);
+   unsigned HOST_WIDE_INT el_size;
+   if (!get_elem_size (decl_type, el_size))
+ gcc_assert (false);
 
tree minidx = TYPE_MIN_VALUE (TYPE_DOMAIN (decl_type));
tree maxidx = TYPE_MAX_VALUE (TYPE_DOMAIN (decl_type));
@@ -1563,7 +1576,7 @@ build_ref_for_offset (location_t loc, tree base, 
HOST_WIDE_INT offset,
   tree off;
   tree mem_ref;
   HOST_WIDE_INT base_offset;
-  unsigned HOST_WIDE_INT misalign;
+  unsigned HOST_WIDE_INT misalign, el_sz;
   unsigned int align;
 
   gcc_checking_assert (offset % BITS_PER_UNIT == 0);
@@ -1572,47 +1585,62 @@ build_ref_for_offset (location_t loc, tree base, 
HOST_WIDE_INT offset,
 
   /* get_addr_base_and_unit_offset returns NULL for references with a variable
  offset such as array[var_index].  */
-  if (!base)
-{
-  gassign *stmt;
-  tree tmp, addr;
-
-  gcc_checking_assert (gsi);
-  tmp = make_ssa_name (build_pointer_type (TREE_TYPE (prev_base)));
-  addr = build_fold_addr_expr (unshare_expr (prev_base));
-  STRIP_USELESS_TYPE_CONVERSION (addr);
-  stmt = gimple_build_assign (tmp, addr);
-  gimple_set_location (stmt, loc);
-  if (insert_after)
-   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
-  else
-   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
-
-  off = build_int_cst (reference_alias_ptr_type (prev_base),
-  offset / BITS_PER_UNIT);
-  base = tmp;
-}
-  else if (TREE_CODE (base) == MEM_REF)
-{
-  off = build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)),
-  base_offset + offset / BITS_PER_UNIT);
-  off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1), off);
-  base = unshare_expr (TREE_OPERAND (base, 0));
+  if (base
+   TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE
+   misalign == 0
+   get_elem_size (TREE_TYPE (base), el_sz)
+   ((offset % el_sz) == 0)
+   useless_type_conversion_p (exp_type, TREE_TYPE (TREE_TYPE (base)))
+   (align = TYPE_ALIGN (exp_type)))
+{
+  tree idx = build_int_cst (TYPE_DOMAIN (TREE_TYPE (base)), offset / 
el_sz);
+  base = unshare_expr (base);
+  mem_ref = build4 (ARRAY_REF, exp_type, base, idx, NULL_TREE, NULL_TREE);
 }
   else
 {
-  off = build_int_cst (reference_alias_ptr_type (base),
-  base_offset + offset / BITS_PER_UNIT);
-  base = build_fold_addr_expr (unshare_expr (base));
-}
+  if