> Using pr115763-2.c as an example:
>
> ; w/o this patch, one vec load
> fsh fa0,14(sp)
> addi a5,sp,14
> vsetivli zero,2,e16,mf4,ta,ma
> vlse16.v v1,0(a5),zero
>
> vs
>
> ; w/ this patch, two vector instructions
> fcvt.s.h        fa0,fa0
> vsetivli        zero,2,e32,mf2,ta,ma
> vfmv.v.f        v1,fa0
> vsetvli zero,zero,e16,mf4,ta,ma
> vfncvt.f.f.w    v1,v1

To add a little on this part: I am not saying the single vlse16.v is
always better than vfmv.v.f + vfncvt.f.f.w, but the strided-load path
should at least be guarded by strided_load_broadcast_p rather than
simply removed.
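
To make the suggestion concrete, here is a rough sketch in plain C. It is
not GCC code: every name below is hypothetical, including
strided_load_broadcast_p_model, which only stands in for the real
strided_load_broadcast_p query (whose actual signature and data source I am
not reproducing here). The vector instruction counts are taken from the two
listings quoted above.

```c
#include <stdbool.h>

/* Hypothetical model of the two broadcast strategies from the listings. */
typedef enum {
    BROADCAST_STRIDED_LOAD, /* fsh + vlse16.v with a zero stride */
    BROADCAST_FMV_NARROW    /* vfmv.v.f followed by vfncvt.f.f.w */
} broadcast_strategy;

/* Hypothetical stand-in for strided_load_broadcast_p: the flag models
   whatever tuning information the real query consults. */
static bool strided_load_broadcast_p_model(bool tune_prefers_strided_load)
{
    return tune_prefers_strided_load;
}

/* Keep the strided-load path, but only behind the guard, instead of
   removing it unconditionally. */
static broadcast_strategy choose_hf_broadcast(bool tune_prefers_strided_load)
{
    if (strided_load_broadcast_p_model(tune_prefers_strided_load))
        return BROADCAST_STRIDED_LOAD;
    return BROADCAST_FMV_NARROW;
}

/* Vector instructions (excluding vsetvli/vsetivli) per strategy, as seen
   in the quoted assembly: one load versus a move plus a narrowing convert. */
static int vector_insn_count(broadcast_strategy s)
{
    return s == BROADCAST_STRIDED_LOAD ? 1 : 2;
}
```

With the guard in place, a tuning that prefers strided-load broadcasts
would still get the single vlse16.v, and everything else would get the
vfmv.v.f + vfncvt.f.f.w sequence from the patch.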
