Committed to trunk, thanks :)

On Tue, Apr 18, 2023 at 9:50 PM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 3/13/23 02:19, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong <juzhe.zh...@rivai.ai>
> >
> > Co-authored-by: kito-cheng <kito.ch...@sifive.com>
> > Co-authored-by: kito-cheng <kito.ch...@gmail.com>
> >
> > Consider this case:
> > void f19 (void *base,void *base2,void *out,size_t vl, int n)
> > {
> >     vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
> >     for (int i = 0; i < n; i++){
> >       vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
> >       vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
> >       vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
> >       vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
> >       vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
> >       __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
> >       __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
> >     }
> > }
> >
> > Due to the current unreasonable reg order, this case produce unnecessary
> > register spillings.
> >
> > Fix the order can help for RA.
> Note that this is likely a losing game -- over time you're likely to
> find that one ordering works better for one set of inputs while another
> ordering works better for a different set of inputs.
>
> So while I don't object to the patch, in general we try to find a
> reasonable setting, knowing that it's likely not to be optimal in all cases.
>
> Probably the most important aspect of this patch in my mind is moving
> the vector mask register to the end so that it's only used for vectors
> when we've exhausted the whole vector register file.  Thus it's more
> likely to be usable as a mask when we need it for that purpose.
>
> OK for the trunk and backporting to the shared RISC-V sub-branch off
> gcc-13 (once it's created).
>
> jeff