> >> -  riscv_vector::emit_vlmax_insn (code_for_pred_broadcast (<MODE>mode),
> >> -                                 riscv_vector::UNARY_OP, operands);
> >> +  /* We cannot do anything with a Float16 mode apart from converting.
> >> +     So convert to float, broadcast and truncate.  */
> >> +  if (TARGET_ZVFHMIN && !TARGET_ZVFH && <VEL>mode == HFmode)
> >> +    {
> >> +      rtx tmpsf = gen_reg_rtx (SFmode);
> >> +      emit_insn (gen_extendhfsf2 (tmpsf, operands[1]));
> >> +      poly_uint64 nunits = GET_MODE_NUNITS (<MODE>mode);
> >> +      machine_mode vmodesf
> >> +        = riscv_vector::get_vector_mode (SFmode, nunits).require ();
> >> +      rtx tmp = gen_reg_rtx (vmodesf);
> >> +      rtx ops[] = {tmp, tmpsf};
> >> +      riscv_vector::emit_vlmax_insn (code_for_pred_broadcast (vmodesf),
> >> +                                     riscv_vector::UNARY_OP, ops);
> >> +      rtx ops2[] = {operands[0], tmp};
> >> +      riscv_vector::emit_vlmax_insn (code_for_pred_trunc (vmodesf),
> >> +                                     riscv_vector::UNARY_OP_FRM_DYN, ops2);
> >
> > I disagree with this part, especially the comment: vlse for an HF
> > vector is just a 16-bit load, and a load does not really care about
> > the data format, only the size.
> Hmm, we certainly can do a bit more.  Don't we have fmacs defined on
> HF and/or BF types through one of those obscure HF/BF extensions?

Yeah, we don't have fmacs for HF and BF if we have ZVFHMIN only, but
this part is a deoptimization.

The code gen path was:

If the value is in memory -> vlse
If the value is in either GPR or FPR -> spill to stack -> vlse

And now:

If the value is in memory -> load to FPR -> extendhfsf -> vfmv.v.f
(broadcast) -> vfncvt.f.f.w (trunc)
If the value is in FPR -> extendhfsf -> vfmv.v.f (broadcast) ->
vfncvt.f.f.w (trunc)

Use pr115763-2.c as an example:

; w/o this patch, one vector load
        fsh     fa0,14(sp)
        addi    a5,sp,14
        vsetivli        zero,2,e16,mf4,ta,ma
        vlse16.v        v1,0(a5),zero

vs

; w/ this patch, two vector instructions
        fcvt.s.h        fa0,fa0
        vsetivli        zero,2,e32,mf2,ta,ma
        vfmv.v.f        v1,fa0
        vsetvli zero,zero,e16,mf4,ta,ma
        vfncvt.f.f.w    v1,v1

> > Also we can put HF in GPR rather than FPR for those splat/broadcast
> > patterns in theory.
> In theory, yes.  But I don't think any of the patterns in the backend
> have constraints that would allow a GPR to hold a BF16 value.

We do have patterns that allow a GPR to hold BF16 and F16 values, and
riscv_hard_regno_mode_ok does not forbid GPRs from holding those modes
either:

(define_insn "*mov<mode>_hardfloat"
  [(set (match_operand:HFBF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r, *r,*r,*m")
        (match_operand:HFBF 1 "move_operand" " f,zfli,G,m,f,G,*r,*f,*G*r,*m,*r"))]
  "((TARGET_ZFHMIN && <MODE>mode == HFmode)
    || (TARGET_ZFBFMIN && <MODE>mode == BFmode))
   && (register_operand (operands[0], <MODE>mode)
       || reg_or_0_operand (operands[1], <MODE>mode))"
  { return riscv_output_move (operands[0], operands[1]); }
  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
   (set_attr "type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
   (set_attr "mode" "<MODE>")])

(define_insn "*mov<mode>_softfloat"
  [(set (match_operand:HFBF 0 "nonimmediate_operand" "=f, r,r,m,*f,*r")
        (match_operand:HFBF 1 "move_operand" " f,Gr,m,r,*r,*f"))]
  "((!TARGET_ZFHMIN && <MODE>mode == HFmode)
    || (<MODE>mode == BFmode))
   && (register_operand (operands[0], <MODE>mode)
       || reg_or_0_operand (operands[1], <MODE>mode))"
  { return riscv_output_move (operands[0], operands[1]); }
  [(set_attr "move_type" "fmove,move,load,store,mtc,mfc")
   (set_attr "type" "fmove,move,load,store,mtc,mfc")
   (set_attr "mode" "<MODE>")])

>
>
> Given the objections, clearly this shouldn't be committed until those
> are resolved.

My objection is to removing "*pred_broadcast<mode>_zvfhmin" and to
those HF/BF16 changes.  I propose splitting that part into a separate
patch, since it appears in neither the title nor the git commit
message.

>
> Jeff
>
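
As a rough illustration of the "a load only cares about size" point
above, here is a scalar C model of what a zero-stride vlse16.v does.
This is only a sketch: the function name splat16_from_memory is made up
for this example and is not anything in GCC, and the only assumption is
that the element is 16 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a zero-stride vlse16.v: every element is
   read from the same address, so the 16-bit pattern is copied verbatim
   and never interpreted as a floating-point value.  */
void
splat16_from_memory (uint16_t *dst, const void *src, size_t n)
{
  assert (dst != NULL && src != NULL);

  uint16_t bits;
  memcpy (&bits, src, sizeof bits);   /* One 16-bit load, format-agnostic.  */

  for (size_t i = 0; i < n; i++)
    dst[i] = bits;                    /* Same bits in every element.  */
}
```

The same routine broadcasts an IEEE half (for example 0x3C00, which is
1.0) and a BF16 value (for example 0x3F80, also 1.0) without any
conversion step, which is why the memory path needs neither ZVFH nor an
extend/truncate pair.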