store bias handling

Robin Dapp Tue, 10 Feb 2026 07:24:23 -0800

>> > The patch looks OK, but it would be nice to have some overall
>> > documentation on this length + bias thing somewhere.  IMO this
>> > belongs to md.texi, but there's a PR about the patterns being
>> > an unordered mess without structure.  Possibly amending the
>> > first paragraph is an option, short of subdividing the
>> > standard pattern name listing into logical parts.
>> 
>> I've been trying to understand this myself, so lets see if my current
>> understanding is in the right direction. My understanding is that on some
>> targets len == 0 isn't a valid option, and so for those target a bias is 
>> applied
>> such that the minimum len is biased so the operation is valid.
>> 
>> At least that's my naïve understanding. How far off am I Robin?
>
> What adds to my confusion is that we pass down both 'len' and 'bias'
> but (IIRC for optimization reasons) apply the bias already to 'len'?
>
> So I thought vect_get_loop_len would always return biased len and
> to arrive at the actual "len" we have to un-apply the bias?


Apologies, I was out for most of the day so can catch up only now.

Given how later we are in the cycle the patch looks reasonable to me but I 
would have preferred it to untangle things a bit.

The whole len handling is indeed confusing and very ripe for refactoring.
I'll take this on in stage 1.

Biasing/adjusting the length is an optimization.  Subtracting 1 from the
length for every load and store was too costly when I benchmarked it.

Adding length and bias to the more advanced length-based load/store IFNs 
(gather, lanes) was done for "symmetry".  Neither s390 nor power support
those.  As all other IFNs had a bias we also added it to vec_extract even 
though that doesn't make a lot of sense.  vec_extract as it's currently
defined can only reasonably operate on units/lanes and I hope we won't
ever see a vector ISA that cannot extract the first element but every other.

And even though we added the bias for comprehensiveness, only the code for 
consecutive loads/stores (that s390 and power support) properly handles the 
byte-based indexing at all, no other IFN does.  So it's not like we have been 
comprehensive but rather showed a bit of good will :)

IMHO what we need for legibility is

  enum {
    VECT_LEN_TYPE_BYTES,
    VECT_LEN_TYPE_UNITS
  } vect_len_type;

that we pass to record/get_loop_len instead of "factor" and "adjusted".
A vec_extract would ask for LEN_TYPE_UNITS, as would riscv.  s390/power
would use LEN_TYPE_BYTES for their loads and stores.

We already have two "lens", the adjusted one and the regular one so all
that's left is gift-wrap the scaling from bytes to units and vice-versa
a bit nicer.  What we could do as an optimization on s390 is keep the
regular len in units and the adjusted len in bytes.

-- 
Regards
 Robin

Re: [PATCH] tree-optimization/122297 - fix load/store bias handling

Reply via email to