https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98339
            Bug ID: 98339
           Summary: GCC could not vectorize loop with conditional reduced add and store
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wwwhhhyyy333 at gmail dot com
  Target Milestone: ---

For the testcase

void foo( int* restrict x, int n, int start, int m, int* restrict ret )
{
    for (int i = 0; i < n; i++)
    {
        int pos = start + i;
        if ( pos <= m)
            ret[0] += x[i];
    }
}

compiled with -O3 -mavx2, the loop is not vectorized: ret[0] += x[i] becomes a
zero-step MASK_STORE inside the loop, and data reference (dr) analysis fails
for a zero-step store.

But with loop store motion applied manually,

void foo2( int* restrict x, int n, int start, int m, int* restrict ret )
{
    int tmp = 0;

    for (int i = 0; i < n; i++)
    {
        int pos = start + i;
        if (pos <= m)
            tmp += x[i];
    }

    ret[0] += tmp;
}

the loop can be vectorized.

godbolt: https://godbolt.org/z/Kcv8hP

There is no LIM pass between ifcvt and vect, and the current LIM cannot handle
MASK_STORE.

Is there any possibility of vectorizing foo, for example by doing loop store
motion in ifcvt instead of creating a MASK_STORE?
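
For illustration only, here is a source-level sketch of what the loop might look
like if the store were moved out of the loop and the condition turned into a
select, so that no store remains in the loop body. The names foo_sm and
check_same_result are hypothetical and do not come from GCC or from the
testcase; this is not a description of what ifcvt does today. Note that, like
foo2, this form writes ret[0] unconditionally after the loop, even when no
iteration satisfies the condition, which is one thing a compiler would have to
prove safe before applying it automatically.

```c
#include <assert.h>

/* Original form from the report: conditional read-modify-write of ret[0]. */
void foo(int *restrict x, int n, int start, int m, int *restrict ret)
{
    for (int i = 0; i < n; i++) {
        int pos = start + i;
        if (pos <= m)
            ret[0] += x[i];
    }
}

/* Hypothetical store-motion form: accumulate in a local, selecting 0 when
   the condition is false, and store to ret[0] once after the loop. */
void foo_sm(int *restrict x, int n, int start, int m, int *restrict ret)
{
    int tmp = 0;

    for (int i = 0; i < n; i++) {
        int pos = start + i;
        tmp += (pos <= m) ? x[i] : 0;
    }

    ret[0] += tmp;
}

/* Small sanity check that both forms agree on one input. */
void check_same_result(void)
{
    int x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    int a = 10, b = 10;

    foo(x, 8, 0, 3, &a);     /* condition holds for i = 0..3 */
    foo_sm(x, 8, 0, 3, &b);
    assert(a == b);
}
```

Whether a given GCC version vectorizes foo_sm can be checked by compiling with
-O3 -mavx2 -fopt-info-vec and looking for the loop in the report.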