On Thu, Jun 5, 2025 at 2:34 PM Robin Dapp <rdapp....@gmail.com> wrote:
>
> > So I do wonder how this interacts with vector_vector_composition_type,
> > in fact the difference is that for strided_load we know the composition
> > happens as part of a load, so how about instead extending
> > this function, pass it VLS_LOAD/STORE and also consider
> > strided_loads as composition kind there?  This would avoid duplication
> > and I think at least some cases of non-power-of-two groups would
> > be handled this way already (permuting out gaps).
>
> What would we do if vector_vector_composition_type says strided loads are OK
> but the alignment check doesn't agree?  Right now we could still fall back to
> vector-vector composition.

But that would not pass the alignment check either, no?  In fact,
I assume that for strided loads we have a scalar type as component
(ptype), so we always get supported unaligned accesses here?

> I guess moving the alignment check into vector_vector_composition_type isn't
> great either.
>
> Maybe a mask of "wanted" composition types, remove strided from the mask if it
> fails and call vector_vector_composition_type again?  Doesn't make the function
> signature slimmer, though.  We'd be needing an additional vec_load_store_type
> argument, vect_composition_type wanted_types, as well as an elsvals argument.
>
> So somehow like:
>
>   HOST_WIDE_INT composition_types = vect_strided | vect_vec_init | vect_elt_init;
>
>   vector_vector_composition_type (..., VLS_LOAD, &composition_types, &elsvals);
>
>   if (composition_types & vect_strided)
>     {
>       if (!alignment_ok)
>         {
>           composition_types &= ~vect_strided;
>           vector_vector_composition_type (..., VLS_LOAD, composition_types,
>                                           nullptr);
>         }
>     }
>
> Or maybe leave the elsvals checking to the caller (although we're getting it
> anyway from internal_strided_fn_supported_p)?

Handling elsvals in vector_vector_composition_type is OK I think.  But yes,
the function API should probably return a composition kind, not the type.
There are also users of the existing API elsewhere, so possibly keep the
original API as an alternate entry.

I'm mostly worried about the complication of the handling in the strided-SLP
load case where I don't really see how a strided load is "special".

It might also be interesting to think of using a gather (though at least
on x86 that will hardly be efficient).

Richard.

> --
> Regards
>  Robin
>
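
For illustration only, here is a minimal standalone sketch of the "return a composition kind, retry with a narrower mask" idea discussed above. It is not GCC code: the enumerator names are borrowed from the quoted pseudocode, while `target_caps`, `pick_composition_kind` and `choose_with_alignment` are hypothetical names invented for this sketch, and the preference order (strided, then vector pieces, then elements) is an assumption.

```cpp
// Hypothetical composition kinds, named after the quoted pseudocode.
// These are NOT existing GCC enumerators.
enum vect_composition_kind : unsigned
{
  vect_none     = 0,
  vect_strided  = 1u << 0,  // compose as part of a strided load
  vect_vec_init = 1u << 1,  // compose from smaller vector pieces
  vect_elt_init = 1u << 2   // compose element by element
};

// Stand-in for target queries; a real implementation would ask the
// target (optabs, internal functions) instead of reading two flags.
struct target_caps
{
  bool strided_load_supported;
  bool vec_init_supported;
};

// Hypothetical entry point that returns a kind rather than a type.
// It picks the first supported kind from the WANTED mask, assuming the
// order strided > vector pieces > elements.
static constexpr vect_composition_kind
pick_composition_kind (const target_caps &caps, unsigned wanted)
{
  if ((wanted & vect_strided) && caps.strided_load_supported)
    return vect_strided;
  if ((wanted & vect_vec_init) && caps.vec_init_supported)
    return vect_vec_init;
  if (wanted & vect_elt_init)
    return vect_elt_init;
  return vect_none;
}

// Caller side: request every kind first.  If strided is chosen but the
// alignment check fails, drop it from the mask and ask again.
static constexpr vect_composition_kind
choose_with_alignment (const target_caps &caps, bool alignment_ok)
{
  unsigned wanted = vect_strided | vect_vec_init | vect_elt_init;
  vect_composition_kind kind = pick_composition_kind (caps, wanted);
  if (kind == vect_strided && !alignment_ok)
    {
      wanted &= ~static_cast<unsigned> (vect_strided);
      kind = pick_composition_kind (caps, wanted);
    }
  return kind;
}
```

In this model the retry stays entirely in the caller, so the selection function does not need to know about alignment. If strided loads always use a scalar component type with supported unaligned accesses, as suggested above, the retry branch would never trigger and could be dropped.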