Thanks Richard for your comments.
I changed the algorithm to remove dead scalar statements as you proposed.

Bootstrap and regression testing did not show any new failures on x86-64.
Is it OK for trunk?

Changelog:
2016-02-10  Yuri Rumyantsev  <ysrum...@gmail.com>

PR tree-optimization/69652
* tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
to nested loop, re-format source, skip debug statements, add check
for statements with a volatile operand, remove dead scalar
statements.

gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr69652.c: New test.


2016-02-09 15:33 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Hi All,
>>
>> Here is the updated patch - I went back to moving call statements as well,
>> since masked loads are represented by internal calls.  I also assume that for
>> the following simple loop
>>   for (i = 0; i < n; i++)
>>     if (b1[i])
>>       a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]);
>> motion must be done for all vector statements in the semi-hammock, including SQRT.
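
For readers less familiar with the vectorized IL, the semi-hammock for the loop above looks roughly like the following hand-written sketch (illustrative only, not an actual GCC dump; internal-call names and argument order are approximate):

```
  mask     = (vect_b != { 0, 0, 0, 0 });       /* mask from load of b1[i..] */
  vect_a2  = MASK_LOAD (&a2[i], align, mask);  /* masked load: internal call */
  vect_a3  = MASK_LOAD (&a3[i], align, mask);
  tmp1     = vect_a2 * vect_a2;
  tmp2     = vect_a3 * vect_a3;
  tmp3     = tmp1 + tmp2;
  vect_res = SQRT (tmp3);                      /* vectorized sqrtf */
  MASK_STORE (&a1[i], align, mask, vect_res);  /* masked store: internal call */
```

Since the masked loads and the SQRT are calls feeding the masked store, the whole chain must be moved together when the store is sunk into the new basic block.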
>>
>> Bootstrap and regression testing did not show any new failures.
>> Is it OK for trunk?
>
> The patch is incredibly hard to parse due to the re-indenting.  Please
> consider sending
> diffs with -b.
>
> This issue exposes that you are moving (masked) stores across loads without
> checking aliasing.  In the specific case those loads are dead and thus
> this is safe
> but in general I thought we were checking that we are using the same VUSE
> during the sinking operation.
>
> Thus, I'd rather have
>
> +             /* Check that LHS does not have uses outside of STORE_BB.  */
> +             res = true;
> +             FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
> +               {
> +                 gimple *use_stmt;
> +                 use_stmt = USE_STMT (use_p);
> +                 if (is_gimple_debug (use_stmt))
> +                   continue;
> +                 if (gimple_bb (use_stmt) != store_bb)
> +                   {
> +                     res = false;
> +                     break;
> +                   }
> +               }
>
> also check for the dead code case and DCE those stmts here.  Like so:
>
>    if (has_zero_uses (lhs))
>     {
>       gsi_remove (&gsi_from, true);
>       continue;
>     }
>
> before the above loop.
>
> Richard.
>
>> ChangeLog:
>>
>> 2016-02-05  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> PR tree-optimization/69652
>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all
>> skipped scalar statements, introduce variable LAST_VUSE to keep
>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the
>> beginning of current masked store processing, did source re-formatting,
>> skip parsing of debug gimples, stop processing if a gimple with
>> volatile operand has been encountered, save scalar statement
>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE
>> iterator, change vuse of all saved scalar statements to LAST_VUSE if
>> it makes sense.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.dg/torture/pr69652.c: New test.
>>
>> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek <ja...@redhat.com>:
>>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote:
>>>> Here is a patch that cures the issues with non-correct vuse for scalar
>>>> statements during code motion, i.e. if vuse of scalar statement is
>>>> vdef of masked store which has been sunk to new basic block, we must
>>>> fix it up.  The patch also fixed almost all remarks pointed out by
>>>> Jakub.
>>>>
>>>> Bootstrapping and regression testing on x86-64 did not show any new
>>>> failures.
>>>> Is it OK for trunk?
>>>>
>>>> ChangeLog:
>>>> 2016-02-04  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>
>>>> PR tree-optimization/69652
>>>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
>>>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all
>>>> skipped scalar statements, introduce variable LAST_VUSE that has
>>>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the
>>>> beginning of current masked store processing, did source re-formatting,
>>>> skip parsing of debug gimples, stop processing when a call or a gimple
>>>> with a volatile operand has been encountered, save scalar statement
>>>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE
>>>> iterator, change vuse of all saved scalar statements to LAST_VUSE if
>>>> it makes sense.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>> * gcc.dg/torture/pr69652.c: New test.
>>>
>>> Your mailer breaks ChangeLog formatting, so it is hard to check the
>>> formatting of the ChangeLog entry.
>>>
>>> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c 
>>> b/gcc/testsuite/gcc.dg/torture/pr69652.c
>>> new file mode 100644
>>> index 0000000..91f30cf
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c
>>> @@ -0,0 +1,14 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */
>>> +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */
>>> +
>>> +void fn1(double **matrix, int column, int row, int n)
>>> +{
>>> +  int k;
>>> +  for (k = 0; k < n; k++)
>>> +    if (matrix[row][k] != matrix[column][k])
>>> +      {
>>> +       matrix[column][k] = -matrix[column][k];
>>> +       matrix[row][k] = matrix[row][k] - matrix[column][k];
>>> +      }
>>> +}
>>> \ No newline at end of file
>>>
>>> Please make sure the last line of the test is a new-line.
>>>
>>> @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop)
>>>            gsi_next (&gsi))
>>>         {
>>>           stmt = gsi_stmt (gsi);
>>> +         if (is_gimple_debug (stmt))
>>> +           continue;
>>>           if (is_gimple_call (stmt)
>>>               && gimple_call_internal_p (stmt)
>>>               && gimple_call_internal_fn (stmt) == IFN_MASK_STORE)
>>>
>>> This is not needed, you do something only for is_gimple_call,
>>> which is never true if is_gimple_debug, so the code used to be fine as is.
>>>
>>> +             /* Skip debug sstatements.  */
>>>
>>> s/ss/s/
>>>
>>> +             if (is_gimple_debug (gsi_stmt (gsi)))
>>> +               continue;
>>> +             stmt1 = gsi_stmt (gsi);
>>> +             /* Do not consider writing to memory,volatile and call
>>>
>>> Missing space after ,
>>>
>>> +             /* Skip scalar statements.  */
>>> +             if (!VECTOR_TYPE_P (TREE_TYPE (lhs)))
>>> +               {
>>> +                 /* If scalar statement has vuse we need to modify it
>>> +                    when another masked store will be sunk.  */
>>> +                 if (gimple_vuse (stmt1))
>>> +                   scalar_vuse.safe_push (stmt1);
>>>                   continue;
>>> +               }
>>>
>>> I don't think it is safe to take for granted that the scalar stmts are all
>>> going to be DCEd, but I could be wrong.
>>>
>>> +             /* Check that LHS does not have uses outside of STORE_BB.  */
>>> +             res = true;
>>> +             FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
>>> +               {
>>> +                 gimple *use_stmt;
>>> +                 use_stmt = USE_STMT (use_p);
>>> +                 if (is_gimple_debug (use_stmt))
>>> +                   continue;
>>>
>>> Ignoring debug stmts to make decision whether you move or not is
>>> of course the right thing to do.  But IMHO you should remember if
>>> you saw any is_gimple_debug stmts in some bool var.
>>>
>>> +                 if (gimple_bb (use_stmt) != store_bb)
>>> +                   {
>>> +                     res = false;
>>> +                     break;
>>> +                   }
>>> +               }
>>> +             if (!res)
>>> +               break;
>>>
>>> -               if (gimple_vuse (stmt1)
>>> -                   && gimple_vuse (stmt1) != gimple_vuse (last_store))
>>> -                 break;
>>> +             if (gimple_vuse (stmt1)
>>> +                 && gimple_vuse (stmt1) != gimple_vuse (last_store))
>>> +               break;
>>>
>>> +             /* Can move STMT1 to STORE_BB.  */
>>> +             if (dump_enabled_p ())
>>> +               {
>>> +                 dump_printf_loc (MSG_NOTE, vect_location,
>>> +                                  "Move stmt to created bb\n");
>>> +                 dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0);
>>> +               }
>>>
>>> And if yes, invalidate them here, because the move would otherwise
>>> generate invalid IL.
>>>
>>> +             gsi_move_before (&gsi_from, &gsi_to);
>>> +             /* Shift GSI_TO for further insertion.  */
>>> +             gsi_prev (&gsi_to);
>>> +           }
>>> +         /* Put other masked stores with the same mask to STORE_BB.  */
>>> +         if (worklist.is_empty ()
>>> +             || gimple_call_arg (worklist.last (), 2) != mask
>>> +             || worklist.last () != stmt1)
>>> +           break;
>>> +         last = worklist.pop ();
>>>         }
>>>        add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION);
>>> +      /* Mask stores motion could crossing scalar statements with vuse
>>> +        which should be corrected.  */
>>>
>>> s/crossing/cross/
>>> That said, I'm not really sure if without some verification if such
>>> reads are really dead it is safe to skip them and update this way.
>>> Richard?
>>>
>>> +      last_vuse = gimple_vuse (last_store);
>>> +      while (!scalar_vuse.is_empty ())
>>> +       {
>>> +          stmt = scalar_vuse.pop ();
>>> +          if (gimple_vuse (stmt) != last_vuse)
>>> +             {
>>> +               gimple_set_vuse (stmt, last_vuse);
>>> +               update_stmt (stmt);
>>> +             }
>>> +       }
>>>      }
>>>  }
>>>
>>>         Jakub

Attachment: patch.2