This makes sure to handle MEM[p + 4] and MEM[p].j, with j at offset 4,
as the same ref in store motion.  For hashing we need to be more
restrictive in what we handle since there are no poly-int handlers for
inchash.  For comparison we can compare the combined poly offsets
directly.
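
In essence, equality combines a MEM_REF base's byte offset with the
decomposed bit offset and compares the results as poly offsets.  A
minimal sketch of the idea, using GCC's poly_int API (base1/off1 and
base2/off2 are hypothetical names, not the patched code):

  /* MEM[p + 4] and MEM[p].j (j at byte offset 4) both yield a combined
     bit offset of 32 off the same pointer, so they compare equal.  */
  poly_offset_int coff1 = mem_ref_offset (base1) * BITS_PER_UNIT + off1;
  poly_offset_int coff2 = mem_ref_offset (base2) * BITS_PER_UNIT + off2;
  bool same = known_eq (coff1, coff2);

For hashing, the patch only folds the MEM_REF offset into the hash when
the combined offset fits a HOST_WIDE_INT constant, and otherwise falls
back to hashing base and offset separately as before.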

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

2021-07-02  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/101293
        * tree-ssa-loop-im.c (mem_ref_hasher::equal): Compare MEM_REF bases
        with combined offsets.
        (gather_mem_refs_stmt): Hash MEM_REFs as if their offset were
        combined with the rest of the offset.

        * gcc.dg/tree-ssa/ssa-lim-15.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c | 18 +++++++++++++++
 gcc/tree-ssa-loop-im.c                     | 27 ++++++++++++++++++----
 2 files changed, 41 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c
new file mode 100644
index 00000000000..5efb95627ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c
@@ -0,0 +1,18 @@
+/* PR/101293 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-lim2-details" } */
+
+struct X { int i; int j; };
+
+void foo(struct X *x, int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      int *p = &x->j;
+      int tem = *p;
+      x->j += tem * i;
+    }
+}
+
+/* Make sure LIM can handle unifying MEM[x, 4] and MEM[x].j  */
+/* { dg-final { scan-tree-dump "Executing store motion" "lim2" } } */
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 48c952a1eac..e7a3050ba9d 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -194,8 +194,14 @@ mem_ref_hasher::equal (const im_mem_ref *mem1, const ao_ref *obj2)
 {
   if (obj2->max_size_known_p ())
     return (mem1->ref_decomposed
-           && operand_equal_p (mem1->mem.base, obj2->base, 0)
-           && known_eq (mem1->mem.offset, obj2->offset)
+           && ((TREE_CODE (mem1->mem.base) == MEM_REF
+                && TREE_CODE (obj2->base) == MEM_REF
+                && operand_equal_p (TREE_OPERAND (mem1->mem.base, 0),
+                                    TREE_OPERAND (obj2->base, 0), 0)
+                && known_eq (mem_ref_offset (mem1->mem.base) * BITS_PER_UNIT + mem1->mem.offset,
+                             mem_ref_offset (obj2->base) * BITS_PER_UNIT + obj2->offset))
+               || (operand_equal_p (mem1->mem.base, obj2->base, 0)
+                   && known_eq (mem1->mem.offset, obj2->offset)))
            && known_eq (mem1->mem.size, obj2->size)
            && known_eq (mem1->mem.max_size, obj2->max_size)
            && mem1->mem.volatile_p == obj2->volatile_p
@@ -1500,8 +1506,21 @@ gather_mem_refs_stmt (class loop *loop, gimple *stmt)
          && (mem_base = get_addr_base_and_unit_offset (aor.ref, &mem_off)))
        {
          ref_decomposed = true;
-         hash = iterative_hash_expr (ao_ref_base (&aor), 0);
-         hash = iterative_hash_host_wide_int (offset, hash);
+         tree base = ao_ref_base (&aor);
+         poly_int64 moffset;
+         HOST_WIDE_INT mcoffset;
+         if (TREE_CODE (base) == MEM_REF
+             && (mem_ref_offset (base) * BITS_PER_UNIT + offset).to_shwi (&moffset)
+             && moffset.is_constant (&mcoffset))
+           {
+             hash = iterative_hash_expr (TREE_OPERAND (base, 0), 0);
+             hash = iterative_hash_host_wide_int (mcoffset, hash);
+           }
+         else
+           {
+             hash = iterative_hash_expr (base, 0);
+             hash = iterative_hash_host_wide_int (offset, hash);
+           }
          hash = iterative_hash_host_wide_int (size, hash);
        }
       else
-- 
2.26.2