[Bug tree-optimization/98339] GCC could not vectorize loop with conditional reduced add and store

rguenth at gcc dot gnu.org via Gcc-bugs Mon, 04 Jan 2021 07:58:00 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98339


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-*
     Ever confirmed|0                           |1
             Blocks|                            |53947
   Last reconfirmed|                            |2021-01-04

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we need to vectorize this as a reduction, and since there's no
"masked scalar store" on GIMPLE, LIM by itself doesn't help.  The reason
LIM doesn't apply store-motion here is the _load_, which can trap.  LIM would
like to do

  ret0 = ret[0];
  bool stored = false;
  for (int i = 0; i < n; i++)
    {
      int pos = start + i;
      if (pos <= m)
        {
          ret0 += x[i];
          stored = true;
        }
    }
  if (stored)
    ret[0] = ret0;

but as you can see, the unconditional load breaks this: ret[0] would be read
even when the condition never holds.  LIM would need to be changed to handle
the whole load-update-store sequence, delaying the load as well (thereby
re-associating the reduction).
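
As a rough illustration of what delaying the load would mean, here is a
hand-written C sketch.  The function names and the signature are hypothetical
(reconstructed from the snippets in this comment, not taken from the bug's
testcase), and it assumes ret does not alias x.  The sum is accumulated from
zero in unsigned arithmetic so that re-associating it cannot introduce signed
overflow, and ret[0] is loaded and stored once after the loop, only if some
iteration fired.

```c
/* Assumed shape of the loop under discussion (hypothetical name/signature). */
void sum_cond_original(const int *x, int *ret, int n, int m, int start)
{
    for (int i = 0; i < n; i++) {
        int pos = start + i;
        if (pos <= m)
            ret[0] += x[i];   /* ret[0] is only touched when the condition holds */
    }
}

/* Hand-written equivalent with the load delayed past the loop.
   Assumes ret does not alias x. */
void sum_cond_delayed(const int *x, int *ret, int n, int m, int start)
{
    unsigned int sum = 0;     /* partial sum, started from zero */
    int stored = 0;           /* did any iteration take the branch? */
    for (int i = 0; i < n; i++) {
        int pos = start + i;
        if (pos <= m) {
            sum += (unsigned int)x[i];
            stored = 1;
        }
    }
    if (stored)               /* single load-update-store, still conditional */
        ret[0] = (int)((unsigned int)ret[0] + sum);
}
```

Neither version touches ret when no iteration takes the branch, so the
rewritten loop body is a plain conditional reduction into a local.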

An alternative would be to split the loop and apply store-motion to the tail.

    int i;
    for (i = 0; i < n; i++)
      {
        int pos = start + i;
        if (pos <= m)
          break;
      }
    if (i < n)
      {
        ret0 = ret[0];
        for (int j = 0; j < n; j++)
          {
            int pos = start + j;
            if (pos <= m)
              ret0 += x[j];
          }
        ret[0] = ret0;
      }

we can then vectorize the second loop.

At the source level the fix is to make sure the load from ret[0] doesn't trap.
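
One way this might look, as a sketch only: accumulate into a local and access
ret[0] unconditionally before and after the loop.  The function name and
signature are hypothetical, and the rewrite is only valid if ret always points
to readable and writable storage, even when no iteration takes the branch, and
if an unconditional store to ret[0] is acceptable for the caller.

```c
/* Hypothetical source-level rewrite: ret[0] is loaded and stored
   unconditionally, so the loop itself only updates a local. */
void sum_cond_local(const int *x, int *ret, int n, int m, int start)
{
    int ret0 = ret[0];        /* unconditional load: ret must be valid */
    for (int i = 0; i < n; i++) {
        int pos = start + i;
        if (pos <= m)
            ret0 += x[i];     /* conditional reduction into a local */
    }
    ret[0] = ret0;            /* unconditional store */
}
```

Whether a given GCC version then vectorizes the loop still depends on the
target and options used; the sketch only removes the conditional memory access
discussed above.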


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
