> This patch introduces a vector cost model for the Spacemit-X60 core,
> using dynamic LMUL scaling with the -madjust-lmul-cost flag.
>
> Compared to the previous patch, I dropped the local 'vector_lmul'
> attribute and the corresponding LMUL-aware cost logic in spacemit-x60.md.
> Instead, Spacemit-X60 tuning now enables -madjust-lmul-cost implicitly,
> and riscv_sched_adjust_cost is updated so that the adjustment applies to
> spacemit_x60 in addition to the generic out-of-order model.
>
> The stress tests I previously used to tune individual instruction costs
> (with the LMUL-aware logic implemented directly in spacemit-x60.md)
> now show a performance regression. The most likely cause is the implicit
> -madjust-lmul-cost scaling: some instructions performed better with
> non-power-of-two scaling (or with no LMUL scaling at all), so the
> uniform ×(1,2,4,8) adjustment hurts them.
>
> Updated performance results:
>
> | Benchmark        | Metric | Trunk            | Vector Cost Model | Δ (%)   |
> |------------------|--------|------------------|-------------------|---------|
> | SciMark2-C       | cycles | 311,450,555,453  | 313,278,899,107   | +0.56%  |
> | tramp3d-v4       | cycles | 23,788,980,247   | 21,073,526,428    | -12.89% |
> | Freebench/neural | cycles | 471,707,641      | 435,842,612       | -8.23%  |
>
> Benchmarks were run from the LLVM test-suite
> (MultiSource/Benchmarks) using:
>
> taskset -c 0 perf stat -r 10 ./...

How sure are we about these results?  It has been notoriously difficult to 
obtain reliable benchmark numbers on the BPI.  Do the results hold after a 
reboot or on the next day?  What about an even higher number of iterations?

I find it difficult to understand why two benchmarks improve considerably 
while one regresses.  If the LMUL scaling is incorrect, wouldn't we expect 
similar behavior for all three?  Or does SciMark have a different instruction 
footprint, e.g. using some insns more heavily for which the uniform scaling 
doesn't hold?

-- 
Regards
 Robin
