https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122790
Bug ID: 122790
Summary: conditional OMP SIMD clone does not handle large simdlen properly
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
#pragma omp declare simd simdlen(32) inbranch
int __attribute__((const)) baz (int x);

short a[1024];

void __attribute__((noipa))
foo (int n, int * __restrict b)
{
  for (int i = 0; i < n; ++i)
    if (a[i])
      b[i] = baz (b[i]);
}
ICEs with
> ./cc1 -quiet t.c -O3 -mavx512bw -fopt-info-vec -fopenmp-simd --param vect-partial-vector-usage=2
t.c:9:21: optimized: loop vectorized using masked 64 byte vectors and unroll factor 32
during GIMPLE pass: vect
t.c: In function ‘foo’:
t.c:7:1: internal compiler error: in exact_div, at poly-int.h:2179
    7 | foo (int n, int * __restrict b)
      | ^~~
0x37eeeba internal_error(char const*, ...)
        ../../src/gcc/gcc/diagnostic-global-context.cc:787
There's a mismatch in the 'ncopies' we use for querying the mask argument
and also in the position used. 'ncopies' is the number of SIMD clone calls
we emit.
For AVX512 masking
  atype = bestn->simdclone->args[i].vector_type;
  /* Guess the number of lanes represented by atype.  */
  poly_uint64 atype_subparts
    = exact_div (bestn->simdclone->simdlen,
                 num_mask_args);
  o = vector_unroll_factor (nunits, atype_subparts);
is wrong. I do not see anything in the simdclone info that records which
"multiplicity" there is on arguments, in particular on the mask argument.
For a vector data type we can use vector_type and simdlen, but for
SIMD_CLONE_ARG_TYPE_MASK there is not enough info(?).
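
To illustrate why the guess can trip the exact_div assertion, here is a
minimal standalone sketch of the arithmetic. It is not GCC code: the toy_*
names are hypothetical, plain unsigned integers stand in for poly_uint64,
and it assumes vector_unroll_factor is itself an exact division of its
first argument by its second. The input values mentioned afterwards are
illustrative, not taken from a debugging session.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for exact_div on constants (hypothetical helper): reports
   whether A is an exact multiple of B instead of asserting, and stores
   the quotient in *Q when it is.  */
static bool
toy_exact_div (unsigned a, unsigned b, unsigned *q)
{
  assert (b != 0);
  if (a % b != 0)
    return false;
  *q = a / b;
  return true;
}

/* Mirrors the shape of the quoted computation: guess the lanes per mask
   argument as simdlen / num_mask_args, then divide nunits by that guess.
   Returns false where the real code would hit the exact_div assertion.  */
static bool
toy_mask_unroll (unsigned simdlen, unsigned num_mask_args,
                 unsigned nunits, unsigned *o)
{
  unsigned atype_subparts;
  if (!toy_exact_div (simdlen, num_mask_args, &atype_subparts))
    return false;
  return toy_exact_div (nunits, atype_subparts, o);
}
```

With hypothetical inputs simdlen = 32, num_mask_args = 1 and nunits = 16,
the guess is 32 lanes per mask argument and 16 / 32 is not exact, so
toy_mask_unroll returns false. With num_mask_args = 2 the guess becomes 16
and the division succeeds with o = 1.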