Richard Biener <rguent...@suse.de> writes:
> On Tue, 10 Jan 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguent...@suse.de> writes:
>> > The extract-last reduction internal function expects the then and
>> > else clause as vector and scalar and thus we cannot perform optimization
>> > of the inversion of the condition by swapping the then/else clauses.
>> >
>> > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
>> 
>> Sorry for not having found the time to look at the PR yet.
>> Like you say in the trail, it seems kind-of familiar.
>> 
>> I think we should instead prevent the else in:
>> 
>>        scalar_cond_masked_key cond (cond_expr, ncopies);
>>        if (loop_vinfo->scalar_cond_masked_set.contains (cond))
>>          masks = &LOOP_VINFO_MASKS (loop_vinfo);
>>        else
>>          {
>> 
>> for EXTRACT_LAST.  We've lost as soon as swap_cond_operands gets
>> set to true.
>
> But we're not getting there - the above is guarded with
>
>       if (reduction_type == EXTRACT_LAST_REDUCTION) 
>         masks = &LOOP_VINFO_MASKS (loop_vinfo); 
>       else
>         {
>
> instead we run into
>
>       if (masked)
>         vec_compare = vec_cond_lhs;
>       else
>         {
>           vec_cond_rhs = vec_oprnds1[i];
>           if (bitop1 == NOP_EXPR)
>             {
> ...
>           else
>             {
> ...
>               else if (bitop2 == BIT_NOT_EXPR
>                 {
>                   /* Instead of doing ~x ? y : z do x ? z : y.  */
>                   vec_compare = new_temp;
>                   std::swap (vec_then_clause, vec_else_clause);
>
> so we could instead reject vectorizing for EQ_EXPR but then
> applying the negation to the condition allows this to be
> vectorized just fine (which is what the patch does)?

Ah, OK.  I wasn't sure which of the paths we were going down to get here.

So yeah, I agree the patch is OK.  Sorry for the noise.

Richard

> Richard.
>
>> Thanks,
>> Richard
>> 
>> > Thanks,
>> > Richard.
>> >
>> >    PR tree-optimization/108314
>> >    * tree-vect-stmts.cc (vectorizable_condition): Do not
>> >    perform BIT_NOT_EXPR optimization for EXTRACT_LAST_REDUCTION.
>> >
>> >    * gcc.dg/vect/pr108314.c: New testcase.
>> > ---
>> >  gcc/testsuite/gcc.dg/vect/pr108314.c | 16 ++++++++++++++++
>> >  gcc/tree-vect-stmts.cc               | 13 +++++++++----
>> >  2 files changed, 25 insertions(+), 4 deletions(-)
>> >  create mode 100644 gcc/testsuite/gcc.dg/vect/pr108314.c
>> >
>> > diff --git a/gcc/testsuite/gcc.dg/vect/pr108314.c 
>> > b/gcc/testsuite/gcc.dg/vect/pr108314.c
>> > new file mode 100644
>> > index 00000000000..07260e06915
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.dg/vect/pr108314.c
>> > @@ -0,0 +1,16 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */
>> > +
>> > +int x, y, z;
>> > +
>> > +void f(void)
>> > +{
>> > +  int t = 4;
>> > +  for (; x; x++)
>> > +    {
>> > +      if (y)
>> > +  continue;
>> > +      t = 0;
>> > +    }
>> > +  z = t;
>> > +}
>> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> > index 6ddd41fb473..eb4ca1f184e 100644
>> > --- a/gcc/tree-vect-stmts.cc
>> > +++ b/gcc/tree-vect-stmts.cc
>> > @@ -10677,7 +10677,8 @@ vectorizable_condition (vec_info *vinfo,
>> >          vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
>> >          if (bitop2 == NOP_EXPR)
>> >            vec_compare = new_temp;
>> > -        else if (bitop2 == BIT_NOT_EXPR)
>> > +        else if (bitop2 == BIT_NOT_EXPR
>> > +                 && reduction_type != EXTRACT_LAST_REDUCTION)
>> >            {
>> >              /* Instead of doing ~x ? y : z do x ? z : y.  */
>> >              vec_compare = new_temp;
>> > @@ -10686,9 +10687,13 @@ vectorizable_condition (vec_info *vinfo,
>> >          else
>> >            {
>> >              vec_compare = make_ssa_name (vec_cmp_type);
>> > -            new_stmt
>> > -              = gimple_build_assign (vec_compare, bitop2,
>> > -                                     vec_cond_lhs, new_temp);
>> > +            if (bitop2 == BIT_NOT_EXPR)
>> > +              new_stmt
>> > +                = gimple_build_assign (vec_compare, bitop2, new_temp);
>> > +            else
>> > +              new_stmt
>> > +                = gimple_build_assign (vec_compare, bitop2,
>> > +                                       vec_cond_lhs, new_temp);
>> >              vect_finish_stmt_generation (vinfo, stmt_info,
>> >                                           new_stmt, gsi);
>> >            }
>> 

Reply via email to