Committed, thanks Juzhe. Pan
From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai> Sent: Monday, October 23, 2023 3:56 PM To: Li, Pan2 <pan2...@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org> Cc: Li, Pan2 <pan2...@intel.com>; Wang, Yanzhang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com> Subject: Re: [PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc LGTM。 ________________________________ juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai> From: pan2.li<mailto:pan2...@intel.com> Date: 2023-10-23 15:53 To: gcc-patches<mailto:gcc-patches@gcc.gnu.org> CC: juzhe.zhong<mailto:juzhe.zh...@rivai.ai>; pan2.li<mailto:pan2...@intel.com>; yanzhang.wang<mailto:yanzhang.w...@intel.com>; kito.cheng<mailto:kito.ch...@gmail.com> Subject: [PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc From: Pan Li <pan2...@intel.com<mailto:pan2...@intel.com>> For trunc function autovec, there will be one step like below take MU for the merge operand. rtx tmp = gen_reg_rtx (vec_int_mode); emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode); The MU will leave the tmp (aka dest register) register unmasked elements unchanged and it is undefined here. This patch would like to adjust the MU to MA. gcc/ChangeLog: * config/riscv/riscv-v.cc (emit_vec_cvt_x_f_rtz): Add insn type arg. (expand_vec_trunc): Take MA instead of MU for cvt_x_f_rtz. Signed-off-by: Pan Li <pan2...@intel.com<mailto:pan2...@intel.com>> --- gcc/config/riscv/riscv-v.cc | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 91ad6a61fa8..fb6a4e561db 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4144,12 +4144,20 @@ emit_vec_cvt_f_x (rtx op_dest, rtx op_src, rtx mask, static void emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask, - machine_mode vec_mode) + insn_type type, machine_mode vec_mode) { - rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src}; insn_code icode = code_for_pred (FIX, vec_mode); - emit_vlmax_insn (icode, UNARY_OP_TAMU, cvt_x_ops); + if (type & USE_VUNDEF_MERGE_P) + { + rtx cvt_x_ops[] = {op_dest, mask, op_src}; + emit_vlmax_insn (icode, type, cvt_x_ops); + } + else + { + rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src}; + emit_vlmax_insn (icode, type, cvt_x_ops); + } } void @@ -4285,7 +4293,7 @@ expand_vec_trunc (rtx op_0, rtx op_1, machine_mode vec_fp_mode, /* Step-3: Convert to integer on mask, rounding to zero (aka truncate). */ rtx tmp = gen_reg_rtx (vec_int_mode); - emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode); + emit_vec_cvt_x_f_rtz (tmp, op_1, mask, UNARY_OP_TAMA, vec_fp_mode); /* Step-4: Convert to floating-point on mask for the rint result. */ emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode); -- 2.34.1