> On Wed, Sep 17, 2025 at 9:22 AM Robin Dapp <rdapp....@gmail.com> wrote:
>>
>> > We are supposed to not get into
>> >
>> >       if (mask_element != index)
>> >         noop_p = false;
>>
>> I guess the problem is the vectype mismatch.  We're checking the permutation
>> for e.g. V16QI = {0, 1, 2, 3, 8, 9, 10, 11, ...} which, in isolation, is not
>> a nop.  That's because nelts_to_build = vf * group_size = 16.
>>
>> So either we need to check monotonicity etc. for each punned element later or
>> we somehow need to pun earlier (as you suggested yesterday).
>
> I don't think that would help - the issue is that the group_size is 8 but the
> elements 4, 5, 6, 7 are gaps that we simply do not load.  That is, the
> permute code does not anticipate that we turned the contiguous load
> into a strided one where we do not load a trailing gap, so effectively have
> group_size == 4?  That is, it's dr_group_size that is "wrong" if we want
> to apply the load-permutation after our way of gathering the to be permuted
> elements, as we are not building vectors that have those gaps represented
> but skipped.
>
> Of course this means the early vect_transform_slp_perm_load call computing
> n_perms cannot anticipate whether we are "re-interpreting" the DR group as
> strided.  It also means we cannot simply perform a permutation using this
> function without adjusting this.  But this means we're not actually
> repeating_p right now, correct?

Yes.

> One could add a gap_skipped parameter to the function and adjust
>
>       dr_group_size = DR_GROUP_SIZE (stmt_info);
>
> to
>
>       dr_group_size = DR_GROUP_SIZE (stmt_info)
>                       - (gap_skipped ? DR_GROUP_GAP (stmt_info) : 0);

Hmm, guess I'm lost.  I'm only ever seeing a group gap of 0 or 1.  As we're
analyzing the datarefs, all elements are present and, AFAIK, there is no
traditional group gap (like e.g. when just accessing the first 6 elements of a
group of 8).

The number of SLP lanes is 4, though.
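
To make the numbers above concrete, here is a rough standalone model of the
nop check.  It is only a sketch and not the code in
vect_transform_slp_perm_load: the function name perm_is_noop and its
parameters are made up for illustration, and it assumes an identity load
permutation (lane i reads group element i).

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the nop check discussed above.
   DR_GROUP_SIZE is the group size used to compute element indices,
   LANES the number of SLP lanes, VF the vectorization factor.
   Assumes an identity load permutation.  */
static bool
perm_is_noop (unsigned dr_group_size, unsigned lanes, unsigned vf)
{
  unsigned index = 0;

  /* We build vf * lanes mask elements in total.  */
  for (unsigned iter = 0; iter < vf; iter++)
    for (unsigned lane = 0; lane < lanes; lane++)
      {
	/* Element selected for this lane in this iteration.  */
	unsigned mask_element = iter * dr_group_size + lane;

	/* Same test as in the snippet quoted at the top of the thread.  */
	if (mask_element != index)
	  return false;
	index++;
      }

  return true;
}
```

With dr_group_size = 8, lanes = 4 and vf = 4 the mask is
{0, 1, 2, 3, 8, 9, 10, 11, ...} and the check fails at the fifth element
(8 != 4).  With dr_group_size = 4 the mask is {0, 1, ..., 15} and the
permutation is a nop.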

-- 
Regards
 Robin
