Re: [PATCH] RISC-V: Allow quadratic LMUL cost for unknown niter loops

Kito Cheng Mon, 18 May 2026 00:54:37 -0700

> Quadratic is chosen so that higher LMULs are penalized more than lower LMULs.
> When a loop has a low number of iterations (say, 6) at runtime, and the
> vectorized loop only iterates once for LMUL=1, the higher the LMUL, the
> slower the code.


That's not true for all cores. SiFive cores are implemented as `Olvt`, so
VL=1 results in the same latency for both LMUL=1 and LMUL=8.

I am not opposed to adding this as a new parameter, but I do oppose making it
the default. It should be disabled by default and enabled only for cores whose
owners explicitly confirm that this model is appropriate.
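
To make the opt-in shape concrete, here is a minimal sketch. It is not actual
GCC code: every name below (`lmul_penalty`, `core_tune`, `scale_body_cost`) is
hypothetical, and the quadratic formula is only an assumption about what the
patch computes. The point is that the default tuning leaves the cost
untouched, and only a core that explicitly selects the quadratic model gets
the penalty.

```c
#include <stdbool.h>

/* Hypothetical knob; NOT an existing GCC option or --param. */
enum lmul_penalty
{
  LMUL_PENALTY_NONE,      /* default: cost does not depend on LMUL */
  LMUL_PENALTY_QUADRATIC  /* opt-in: cost grows with LMUL * LMUL */
};

/* Hypothetical per-core tuning record. */
struct core_tune
{
  enum lmul_penalty unknown_niter_penalty;
};

/* Default tuning: the quadratic model is disabled. */
static const struct core_tune default_tune = { LMUL_PENALTY_NONE };

/* A core whose owner has confirmed the model is appropriate. */
static const struct core_tune opt_in_tune = { LMUL_PENALTY_QUADRATIC };

/* Scale the vector loop body cost.  LMUL is expected to be 1, 2, 4 or 8.
   The penalty applies only when the iteration count is unknown and the
   core has opted in; otherwise the cost is returned unchanged.  */
static unsigned
scale_body_cost (const struct core_tune *tune, unsigned body_cost,
                 unsigned lmul, bool niter_known)
{
  if (niter_known || tune->unknown_niter_penalty == LMUL_PENALTY_NONE)
    return body_cost;
  return body_cost * lmul * lmul;
}
```

With this shape, a core where VL=1 has the same latency at LMUL=1 and LMUL=8
keeps `default_tune` and sees no change in vectorizer decisions.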
