> Use pr115763-2.c as example:
>
> ; w/o this patch, one vec load
> fsh fa0,14(sp)
> addi a5,sp,14
> vsetivli zero,2,e16,mf4,ta,ma
> vlse16.v v1,0(a5),zero
>
> vs
>
> ; w/ this patch, two vector instruction
> fcvt.s.h fa0,fa0
> vsetivli zero,2,e32,mf2,ta,ma
> vfmv.v.f v1,fa0
> vsetvli zero,zero,e16,mf4,ta,ma
> vfncvt.f.f.w v1,v1

To add a little on this part: I am not saying that the single vlse16.v is always better than vfmv.v.f + vfncvt.f.f.w, but that path should at least be guarded with strided_load_broadcast_p rather than simply removed.