On Fri, 21 Nov 2025, Jakub Jelinek wrote:

> On Fri, Nov 21, 2025 at 12:47:22PM +0100, Richard Biener wrote:
> > I do wonder if there's a way to figure the number of mask
> > arguments we expect for a SIMD clone? Consider
> >
> > #pragma omp declare simd simdlen(32) inbranch
> > int __attribute__((const)) baz ();
> >
> > where there's only the mask argument or a case with mixed type
> > arguments or return?
>
> Given sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK, if
> sc->args[i].vector_type is a VECTOR_TYPE, then easily, divide
> the number of lanes handled by that single call (sc->simdlen)
> by number of elements in that VECTOR_TYPE.
> For integer masks that is harder, I'm afraid you'd need to repeat
> what simd_clone_adjust_argument_types did in the sc->inbranch
> handling to compute veclen, because orig_type is set in that
> case to boolean_type_node and the precision of the INTEGRAL_TYPE_P
> vector_type could be larger than veclen.
> But we could store the veclen (I think it must be constant in that case)
> or k (i.e. the number of the mask arguments) e.g. in
> sc->args[i].linear_step.
> That is currently documented
>   /* For arg_type SIMD_CLONE_ARG_TYPE_LINEAR_*CONSTANT_STEP this is
>      the constant linear step, if arg_type is
>      SIMD_CLONE_ARG_TYPE_LINEAR_*VARIABLE_STEP, this is index of
>      the uniform argument holding the step, otherwise 0. */
>   HOST_WIDE_INT linear_step;
> and so it could be reused/abused for one of the veclen or number
> of mask arguments, whatever is easier for the vectorizer, if
> this comment is adjusted to say what it means for
> SIMD_CLONE_ARG_TYPE_MASK and let the code just store it, i.e.
> either
> --- gcc/omp-simd-clone.cc 2025-04-08 14:08:57.115200232 +0200
> +++ gcc/omp-simd-clone.cc 2025-11-21 13:09:10.733282017 +0100
> @@ -892,6 +892,7 @@ simd_clone_adjust_argument_types (struct
>            sc->args[i].orig_type = base_type;
>            sc->args[i].arg_type = SIMD_CLONE_ARG_TYPE_MASK;
>            sc->args[i].vector_type = mask_type;
> +          sc->args[i].linear_step = k;
>          }
>
>        if (!node->definition)
> or = veclen.to_constant (); (though that would need to be conditional
> on mask_type being INTEGRAL_TYPE_P or sc->mask_mode != VOIDmode.

That would be quite useful indeed. I thought of using mask_mode,
but for 'double' and simdlen == 16 (so two vector double args)
we get there SImode and not QImode. That is, I thought of
simply ceil dividing simdlen by GET_MODE_BITSIZE (mask_mode) ...

I'll include the above with a comment and adjust the docs accordingly.

Richard.
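
For illustration only, the two counting approaches discussed above can
be sketched as standalone C. This is not GCC code: the helper names
mask_args_vector and mask_args_from_mode_bits are hypothetical, plain
unsigned values stand in for poly_int quantities, and the figure of 8
lanes per vector is an assumption inferred from "two vector double args"
at simdlen == 16.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function: when the mask is passed as
   vectors, the number of mask arguments is the number of lanes handled
   by one call divided by the number of elements in one mask vector.  */
unsigned
mask_args_vector (unsigned simdlen, unsigned lanes_per_vector)
{
  assert (lanes_per_vector != 0 && simdlen % lanes_per_vector == 0);
  return simdlen / lanes_per_vector;
}

/* Hypothetical helper: the guess obtained by ceil dividing simdlen by
   the bit size of an integer mask mode.  */
unsigned
mask_args_from_mode_bits (unsigned simdlen, unsigned mask_mode_bits)
{
  assert (mask_mode_bits != 0);
  return (simdlen + mask_mode_bits - 1) / mask_mode_bits;
}
```

With simdlen == 16 and 8 lanes per vector, mask_args_vector (16, 8)
gives 2. The mode-based guess gives 1 for a 32-bit mode such as SImode
and 2 only for an 8-bit mode such as QImode, which is why the mode's bit
size alone is not a reliable source for the count and storing k (or
veclen) explicitly is attractive.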
