On Fri, 21 Nov 2025, Jakub Jelinek wrote:

> On Fri, Nov 21, 2025 at 12:47:22PM +0100, Richard Biener wrote:
> > I do wonder if there's a way to figure the number of mask
> > arguments we expect for a SIMD clone?  Consider
> > 
> > #pragma omp declare simd simdlen(32) inbranch
> > int __attribute__((const)) baz ();
> > 
> > where there's only the mask argument or a case with mixed type
> > arguments or return?
> 
> Given sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK, if
> sc->args[i].vector_type is a VECTOR_TYPE, then easily, divide
> the number of lanes handled by that single call (sc->simdlen)
> by number of elements in that VECTOR_TYPE.
> For integer masks that is harder, I'm afraid you'd need to repeat
> what simd_clone_adjust_argument_types did in the sc->inbranch
> handling to compute veclen, because orig_type is set in that
> case to boolean_type_node and the precision of the INTEGRAL_TYPE_P
> vector_type could be larger than veclen.
> But we could store the veclen (I think it must be constant in that case)
> or k (i.e. the number of the mask arguments) e.g. in
> sc->args[i].linear_step.
> That is currently documented
>   /* For arg_type SIMD_CLONE_ARG_TYPE_LINEAR_*CONSTANT_STEP this is
>      the constant linear step, if arg_type is
>      SIMD_CLONE_ARG_TYPE_LINEAR_*VARIABLE_STEP, this is index of
>      the uniform argument holding the step, otherwise 0.  */
>   HOST_WIDE_INT linear_step;
> and so it could be reused/abused for one of the veclen or number
> of mask arguments, whatever is easier for the vectorizer, if
> this comment is adjusted to say what it means for
> SIMD_CLONE_ARG_TYPE_MASK and let the code just store it, i.e.
> either
> --- gcc/omp-simd-clone.cc     2025-04-08 14:08:57.115200232 +0200
> +++ gcc/omp-simd-clone.cc     2025-11-21 13:09:10.733282017 +0100
> @@ -892,6 +892,7 @@ simd_clone_adjust_argument_types (struct
>        sc->args[i].orig_type = base_type;
>        sc->args[i].arg_type = SIMD_CLONE_ARG_TYPE_MASK;
>        sc->args[i].vector_type = mask_type;
> +      sc->args[i].linear_step = k;
>      }
>  
>    if (!node->definition)
> or = veclen.to_constant (); (though that would need to be conditional
> on mask_type being INTEGRAL_TYPE_P or sc->mask_mode != VOIDmode).

That would be quite useful indeed.  I thought of using mask_mode,
but for 'double' and simdlen == 16 (so two vector double args)
we get SImode there and not QImode.  That is, I had thought of simply
ceil-dividing simdlen by GET_MODE_BITSIZE (mask_mode), which does not
give the number of mask arguments in that case ...
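
To make the mismatch concrete, here is a small standalone sketch in plain
C.  It is not GCC code: ceil_div and mask_args_for_vector_mask are
hypothetical helper names, and the 8 lanes per mask argument is an
assumption that follows from two arguments covering simdlen == 16.

```c
#include <assert.h>

/* Hypothetical helper (not a GCC function): ceiling division of A by B.  */
static unsigned
ceil_div (unsigned a, unsigned b)
{
  assert (b > 0);
  return (a + b - 1) / b;
}

/* Hypothetical helper: number of mask arguments when the mask is a
   VECTOR_TYPE, i.e. lanes handled per call (simdlen) divided by the
   number of elements in one mask vector.  */
static unsigned
mask_args_for_vector_mask (unsigned simdlen, unsigned lanes_per_vector)
{
  assert (lanes_per_vector > 0 && simdlen % lanes_per_vector == 0);
  return simdlen / lanes_per_vector;
}

/* The 'double', simdlen == 16 case from above, assuming 8 lanes per
   mask argument:
     mask_args_for_vector_mask (16, 8)  is 2, the expected count;
     ceil_div (16, 32)                  is 1, what a 32-bit SImode
                                        mask_mode would suggest;
     ceil_div (16, 8)                   is 2, what an 8-bit QImode
                                        mask_mode would have given.  */
```

So dividing by the mask_mode bitsize only works when the mode is exactly
as wide as veclen, which is why recording k (or veclen) directly looks
more robust.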

I'll include the above with a comment and adjust the docs accordingly.

Richard.
