On Thu, Sep 25, 2025 at 5:36 PM 钟居哲 <[email protected]> wrote:
> Originally I designed the VSETVL pass to be able to do fusion under this
> condition:
>
> insn 1: TAIL_ANY
> insn 2: TAIL_UNDISTURBED
>
> fuse successfully -> TAIL_UNDISTURBED

On out-of-order (OoO) uarchs, we don't want fusion in this case, as it
would unnecessarily apply the tu policy to many subsequent vector
operations, which hurts performance.
We only want `vsetvli ... tu` before operations that must use this
policy, even if this results in more vset instructions.
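
To make the intended behavior concrete, here is a rough sketch of the
fusion decision. It is only a toy model in Python, not the actual GCC
code: `TailPolicy`, `fuse_tail_policy` and the `avoid_tu_fusion` flag are
hypothetical names made up for illustration.

```python
from enum import Enum


class TailPolicy(Enum):
    ANY = "ta"          # tail agnostic
    UNDISTURBED = "tu"  # tail undisturbed


def fuse_tail_policy(first, second, avoid_tu_fusion):
    """Return the fused tail policy, or None if the demands stay separate.

    avoid_tu_fusion is a hypothetical stand-in for the per-core tuning
    flag discussed in this thread; it is not the real option name.
    """
    if first == second:
        # Identical demands can always share one vsetvli.
        return first
    # From here on, one side is ANY and the other is UNDISTURBED.
    if avoid_tu_fusion:
        # Keep two vsetvli instructions so that tu is only in effect
        # for the operation that actually requires it.
        return None
    # Original behavior: tu also satisfies a TAIL_ANY demand, so fuse.
    return TailPolicy.UNDISTURBED
```

With the flag off, TAIL_ANY + TAIL_UNDISTURBED fuses to tu as in the
original design; with it on, the pair is left unfused.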

After applying this patch, our uarch testing shows a 30% performance
uplift for SPEC2017 510.parest_r.

BTW, I've set it to false for all in-order cores and true for all
out-of-order cores in tune_info.
