>> > The patch looks OK, but it would be nice to have some overall
>> > documentation on this length + bias thing somewhere. IMO this
>> > belongs to md.texi, but there's a PR about the patterns being
>> > an unordered mess without structure. Possibly amending the
>> > first paragraph is an option, short of subdividing the
>> > standard pattern name listing into logical parts.
>>
>> I've been trying to understand this myself, so lets see if my current
>> understanding is in the right direction. My understanding is that on some
>> targets len == 0 isn't a valid option, and so for those target a bias is
>> applied
>> such that the minimum len is biased so the operation is valid.
>>
>> At least that's my naïve understanding. How far off am I Robin?
>
> What adds to my confusion is that we pass down both 'len' and 'bias'
> but (IIRC for optimization reasons) apply the bias already to 'len'?
>
> So I thought vect_get_loop_len would always return biased len and
> to arrive at the actual "len" we have to un-apply the bias?
Apologies, I was out for most of the day so can catch up only now.
Given how later we are in the cycle the patch looks reasonable to me but I
would have preferred it to untangle things a bit.
The whole len handling is indeed confusing and very ripe for refactoring.
I'll take this on in stage 1.
Biasing/adjusting the length is an optimization. Subtracting 1 from the
length for every load and store was too costly when I benchmarked it.
Adding length and bias to the more advanced length-based load/store IFNs
(gather, lanes) was done for "symmetry". Neither s390 nor power support
those. As all other IFNs had a bias we also added it to vec_extract even
though that doesn't make a lot of sense. vec_extract as it's currently
defined can only reasonably operate on units/lanes and I hope we won't
ever see a vector ISA that cannot extract the first element but every other.
And even though we added the bias for comprehensiveness, only the code for
consecutive loads/stores (that s390 and power support) properly handles the
byte-based indexing at all, no other IFN does. So it's not like we have been
comprehensive but rather showed a bit of good will :)
IMHO what we need for legibility is
enum {
VECT_LEN_TYPE_BYTES,
VECT_LEN_TYPE_UNITS
} vect_len_type;
that we pass to record/get_loop_len instead of "factor" and "adjusted".
A vec_extract would ask for LEN_TYPE_UNITS, as would riscv. s390/power
would use LEN_TYPE_BYTES for their loads and stores.
We already have two "lens", the adjusted one and the regular one so all
that's left is gift-wrap the scaling from bytes to units and vice-versa
a bit nicer. What we could do as an optimization on s390 is keep the
regular len in units and the adjusted len in bytes.
--
Regards
Robin