On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> Sorry, Should have replied to gcc-patches list.
>>
>> Thanks,
>> bin
>>
>> ---------- Forwarded message ----------
>> From: "Bin.Cheng" <amker.ch...@gmail.com>
>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>> DR against its innermost loop bahavior if possible
>> To: Richard Biener <richard.guent...@gmail.com>
>>
>> On 3/17/16, Richard Biener <richard.guent...@gmail.com> wrote:
>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>>
>>>>> Hmm.
>>>> Hi,
>>>> Thanks for reviewing.
>>>>>
>>>>> +  equal_p = true;
>>>>> +  if (e1->base_address && e2->base_address)
>>>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>>> +  if (e1->offset && e2->offset)
>>>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>>
>>>>> surely better to return false early.
>>>>>
>>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>>
>>>>> @@ -615,15 +619,29 @@
>>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>>> (data_reference_p a)
>>>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>>> creating the DR (or adjust the equality function
>>>> and hashing
>>>>>    tree ref = DR_REF (a);
>>>>>    tree base_ref = DR_BASE_OBJECT (a);
>>>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>>    bool exist1, exist2;
>>>>>
>>>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>>>> -    ref = TREE_OPERAND (ref, 0);
>>>>> +  /* If reference in DR has innermost loop behavior and it is not
>>>>> +     a compound memory reference, we store it to innermost_DR_map,
>>>>> +     otherwise to ref_DR_map.  */
>>>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>>> +    {
>>>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>>>> +       ref = TREE_OPERAND (ref, 0);
>>>>> +
>>>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>>> +    }
>>>>> +  else
>>>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>>
>>>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>>>> need to
>>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>>> and REALPART) before creating the DR (or adjust the equality function
>>>>> and hashing
>>>>> to disregard them which means subtracting their offset from DR_INIT.
>>>> I am not sure if I understand correctly.  But for component reference,
>>>> it is the base object that we want to record/track.  For example,
>>>>
>>>>   for (i = 0; i < N; i++) {
>>>>     m = *data++;
>>>>
>>>>     m1 = p1->x - m;
>>>>     m2 = p2->x + m;
>>>>
>>>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>>
>>>>     p1++;
>>>>     p2++;
>>>>     p3++;
>>>>   }
>>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>>> because there are unconditional reads of the structures, though the
>>>> unconditional reads are actual of other sub-objects.  Here it is the
>>>> invariant part of address that we want to track.
>>>
>>> Well, the variant parts - we want to strip invariant parts as far as we can
>>> (offsetof (x) and offsetof (y))
>>>
>>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>>> here.  Because in gathering/scattering cases, the address could be not
>>>> affine at all.
>>>
>>> Sure, but that's a different issue.
>>>
>>>>>
>>>>> To adjust the references we collect you'd maybe could use a callback
>>>>> to get_references_in_stmt
>>>>> to adjust them.
>>>>>
>>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>>> as
>>>> Is this a part of the method you suggested above, or is it an
>>>> alternative one?  If it's the latter, then I have below questions
>>>> embedded.
>>>
>>> It is an alternative to adding a hook to get_references_in_stmt and
>>> probably "easier".
>>>
>>>>>
>>>>> Index: tree-if-conv.c
>>>>> ===================================================================
>>>>> --- tree-if-conv.c      (revision 234215)
>>>>> +++ tree-if-conv.c      (working copy)
>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>
>>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>>      {
>>>>> +      tree *refp = &DR_REF (dr);
>>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>>> +      if (refp != &DR_REF (dr))
>>>>> +       {
>>>>> +         tree saved_base = *refp;
>>>>> +         *refp = integer_zero_node;
>>>>> +
>>>>> +         if (DR_INIT (dr))
>>>>> +           {
>>>>> +             tree poffset;
>>>>> +             int punsignedp, preversep, pvolatilep;
>>>>> +             machine_mode pmode;
>>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>> &poffset,
>>>>> +                                  &pmode, &punsignedp, &preversep,
>>>>> &pvolatilep,
>>>>> +                                  false);
>>>>> +             gcc_assert (poffset == NULL_TREE);
>>>>> +
>>>>> +             DR_INIT (dr)
>>>>> +               = wide_int_to_tree (ssizetype,
>>>>> +                                   wi::sub (DR_INIT (dr),
>>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>>> +           }
>>>>> +
>>>>> +         *refp = saved_base;
>>>>> +         DR_REF (dr) = *refp;
>>>>> +       }
>>>> Looks to me the code is trying to resolve difference between two (or
>>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>>> is not the only thing needs to be handled.  For a structure containing
>>>> two sub-arrays, DR_OFFSET may be different too.
>>>
>>> Yes, but we can't say that if
>>>
>>>   a->a[i]
>>>
>>> doesn't trap that then
>>>
>>>   a->b[i]
>>>
>>> doesn't trap either.  We can only "strip" outermost
>>> non-variable-offset components.
>>>
>>> But maybe I'm missing what example you are thinking of.
>> Hmm, this was the case I meant.  What I don't understand is current
>> code logic does infer trap information for a.b[i] from a.a[i].  Given
>> below example:
>> struct str
>> {
>>   int a[10];
>>   int b[20];
>>   char c;
>> };
>>
>> void bar (struct str *);
>> int foo (int x, int n)
>> {
>>   int i;
>>   struct str s;
>>   bar (&s);
>>   for (i = 0; i < n; i++)
>>     {
>>       s.a[i] = s.b[i];
>>       if (x > i)
>>         s.b[i] = 0;
>>     }
>>   bar (&s);
>>   return 0;
>> }
>> The loop is convertible because of below code in function
>> ifcvt_memrefs_wont_trap:
>>
>>   /* If a is unconditionally accessed then ... */
>>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>>     {
>>       /* an unconditional read won't trap.  */
>>       if (DR_IS_READ (a))
>>         return true;
>>
>>       /* an unconditionaly write won't trap if the base is written
>>          to unconditionally.  */
>>       if (base_master_dr
>>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>       else
>>         {
>>           /* or the base is know to be not readonly.  */
>>           tree base_tree = get_base_address (DR_REF (a));
>>           if (DECL_P (base_tree)
>>               && decl_binds_to_current_def_p (base_tree)
>>               && ! TREE_READONLY (base_tree))
>>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>         }
>>     }
>> It is the main object '&s' that is recorded in base_master_dr, so
>> s.b[i] is considered not trap.  Even it's not, I suppose
>> get_base_address will give same result?
>
> Well, for this case it sees that s.b[i] is read from so it can't be an
> out-of-bound
> access.  And s.a[i] is written to unconditionally so 's' cannot be a
> readonly object.
> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.

Hi Richard,
Attachment is the updated patch.  I made below changes wrto your
review comments:
1) Moved hash class for innermost loop behavior from tree-data-ref.h
to tree-if-conv.c.
    I also removed check on innermost_loop_behavior.aligned_to which
seems redundant to me because it's computed from offset and we have
already checked equality for offset.
2) Replaced ref_DR_map with new map innermost_DR_map.
3) Post-processed DR in if_convertible_loop_p_1 for compound reference
or references don't have innermost behavior.  This cleans up following
code using DR information.
4) Added a vectorization test for PR56625.

I didn't incorporate your proposed code because I think it handles
COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.
Looks to me it is another ifcvt opportunity different from PR69489.
Anyway, fix is easy, I can send another patch in GCC7.

Bootstrap and test on x86_64/AArch64, so any comments on this version?

Thanks,
bin

2016-03-30  Bin Cheng  <bin.ch...@arm.com>

    PR tree-optimization/56625
    PR tree-optimization/69489
    * tree-data-ref.h (DR_INNERMOST): New macro.
    * tree-if-conv.c (innermost_loop_behavior_hash): New class for
    hashing struct innermost_loop_behavior.
    (ref_DR_map): Remove.
    (innermost_DR_map): New map.
    (baseref_DR_map): Revise comment.
    (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
    to innermost_DR_map accroding to its innermost loop behavior.
    (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according
    to its innermost loop behavior.
    (if_convertible_loop_p_1): Remove intialization for ref_DR_map.
    Add initialization for innermost_DR_map.  Record memory reference
    in DR_BASE_ADDRESS if the reference is compound one or it doesn't
    have innermost loop behavior.
    (if_convertible_loop_p): Remove release for ref_DR_map.  Release
    innermost_DR_map.

gcc/testsuite/ChangeLog
2016-03-30  Bin Cheng  <bin.ch...@arm.com>

    PR tree-optimization/56625
    PR tree-optimization/69489
    * gcc.dg/vect/pr56625.c: New test.
    * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index eb24d16..856dd58 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -144,6 +144,7 @@ struct data_reference
 #define DR_STEP(DR)                (DR)->innermost.step
 #define DR_PTR_INFO(DR)            (DR)->alias.ptr_info
 #define DR_ALIGNED_TO(DR)          (DR)->innermost.aligned_to
+#define DR_INNERMOST(DR)           (DR)->innermost
 
 typedef struct data_reference *data_reference_p;
 
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 9e305c7..884006c 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -114,16 +114,68 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "params.h"
 
+/* Hash for struct innermost_loop_behavior.  It depends on the user to
+   free the memory.  */
+
+struct innermost_loop_behavior_hash : nofree_ptr_hash <innermost_loop_behavior>
+{
+  static inline hashval_t hash (const value_type &);
+  static inline bool equal (const value_type &,
+                           const compare_type &);
+};
+
+inline hashval_t
+innermost_loop_behavior_hash::hash (const value_type &e)
+{
+  hashval_t hash;
+
+  hash = iterative_hash_expr (e->base_address, 0);
+  hash = iterative_hash_expr (e->offset, hash);
+  hash = iterative_hash_expr (e->init, hash);
+  return iterative_hash_expr (e->step, hash);
+}
+
+inline bool
+innermost_loop_behavior_hash::equal (const value_type &e1,
+                                    const compare_type &e2)
+{
+  if ((e1->base_address && !e2->base_address)
+      || (!e1->base_address && e2->base_address)
+      || (!e1->offset && e2->offset)
+      || (e1->offset && !e2->offset)
+      || (!e1->init && e2->init)
+      || (e1->init && !e2->init)
+      || (!e1->step && e2->step)
+      || (e1->step && !e2->step))
+    return false;
+
+  if (e1->base_address && e2->base_address
+      && !operand_equal_p (e1->base_address, e2->base_address, 0))
+    return false;
+  if (e1->offset && e2->offset
+      && !operand_equal_p (e1->offset, e2->offset, 0))
+    return false;
+  if (e1->init && e2->init
+      && !operand_equal_p (e1->init, e2->init, 0))
+    return false;
+  if (e1->step && e2->step
+      && !operand_equal_p (e1->step, e2->step, 0))
+    return false;
+
+  return true;
+}
+
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
 
 /* Apply more aggressive (extended) if-conversion if true.  */
 static bool aggressive_if_conv;
 
-/* Hash table to store references, DR pairs.  */
-static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map;
+/* Hash table to store <DR's innermost loop behavior, DR> pairs.  */
+static hash_map<innermost_loop_behavior_hash,
+               data_reference_p> *innermost_DR_map;
 
-/* Hash table to store base reference, DR pairs.  */
+/* Hash table to store <base reference, DR> pairs.  */
 static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map;
 
 /* Structure used to predicate basic blocks.  This is attached to the
@@ -613,17 +665,12 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info 
(data_reference_p a)
 {
 
   data_reference_p *master_dr, *base_master_dr;
-  tree ref = DR_REF (a);
   tree base_ref = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
   tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
   bool exist1, exist2;
 
-  while (TREE_CODE (ref) == COMPONENT_REF
-        || TREE_CODE (ref) == IMAGPART_EXPR
-        || TREE_CODE (ref) == REALPART_EXPR)
-    ref = TREE_OPERAND (ref, 0);
-
-  master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+  master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
   if (!exist1)
     *master_dr = a;
 
@@ -685,21 +732,18 @@ ifcvt_memrefs_wont_trap (gimple *stmt, 
vec<data_reference_p> drs)
   data_reference_p *master_dr, *base_master_dr;
   data_reference_p a = drs[gimple_uid (stmt) - 1];
 
-  tree ref_base_a = DR_REF (a);
   tree base = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
 
   gcc_assert (DR_STMT (a) == stmt);
+  gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+              || DR_INIT (a) || DR_STEP (a));
 
-  while (TREE_CODE (ref_base_a) == COMPONENT_REF
-        || TREE_CODE (ref_base_a) == IMAGPART_EXPR
-        || TREE_CODE (ref_base_a) == REALPART_EXPR)
-    ref_base_a = TREE_OPERAND (ref_base_a, 0);
+  master_dr = innermost_DR_map->get (innermost);
+  gcc_assert (master_dr != NULL);
 
-  master_dr = ref_DR_map->get (ref_base_a);
   base_master_dr = baseref_DR_map->get (base);
 
-  gcc_assert (master_dr != NULL);
-
   /* If a is unconditionally written to it doesn't trap.  */
   if (DR_W_UNCONDITIONALLY (*master_dr))
     return true;
@@ -1228,13 +1272,16 @@ if_convertible_loop_p_1 (struct loop *loop,
 
   data_reference_p dr;
 
-  ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
+  innermost_DR_map
+         = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
   predicate_bbs (loop);
 
   for (i = 0; refs->iterate (i, &dr); i++)
     {
+      tree ref = DR_REF (dr);
+
       dr->aux = XNEW (struct ifc_dr);
       DR_BASE_W_UNCONDITIONALLY (dr) = false;
       DR_RW_UNCONDITIONALLY (dr) = false;
@@ -1244,6 +1291,27 @@ if_convertible_loop_p_1 (struct loop *loop,
       IFC_DR (dr)->base_w_predicate = boolean_false_node;
       if (gimple_uid (DR_STMT (dr)) == 0)
        gimple_set_uid (DR_STMT (dr), i + 1);
+
+      /* If DR doesn't have innermost loop behavior or it's a compound
+         memory reference, we synthesize its innermost loop behavior
+         for hashing.  */
+      if (TREE_CODE (ref) == COMPONENT_REF
+          || TREE_CODE (ref) == IMAGPART_EXPR
+          || TREE_CODE (ref) == REALPART_EXPR
+          || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr)
+              || DR_INIT (dr) || DR_STEP (dr)))
+        {
+          while (TREE_CODE (ref) == COMPONENT_REF
+                || TREE_CODE (ref) == IMAGPART_EXPR
+                || TREE_CODE (ref) == REALPART_EXPR)
+           ref = TREE_OPERAND (ref, 0);
+
+          DR_BASE_ADDRESS (dr) = ref;
+          DR_OFFSET (dr) = NULL;
+          DR_INIT (dr) = NULL;
+          DR_STEP (dr) = NULL;
+          DR_ALIGNED_TO (dr) = NULL;
+        }
       hash_memrefs_baserefs_and_store_DRs_read_written_info (dr);
     }
 
@@ -1338,8 +1406,8 @@ if_convertible_loop_p (struct loop *loop, bool 
*any_mask_load_store)
 
   free_data_refs (refs);
 
-  delete ref_DR_map;
-  ref_DR_map = NULL;
+  delete innermost_DR_map;
+  innermost_DR_map = NULL;
 
   delete baseref_DR_map;
   baseref_DR_map = NULL;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
new file mode 100644
index 0000000..02a6731
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* 
} } */
+
+void foo (int a[], int b[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    {
+      if (a[i] == 0)
+       a[i] = b[i]*4;
+      else
+       a[i] = b[i]*3;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr56625.c 
b/gcc/testsuite/gcc.dg/vect/pr56625.c
new file mode 100644
index 0000000..b903be3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr56625.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+void foo (int a[], int b[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    {
+      if (a[i] == 0)
+       a[i] = b[i]*4;
+      else
+       a[i] = b[i]*3;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

Reply via email to