Re: [PATCH V3, rs6000] Disable generation of scalar modulo instructions
Hi Pat, > On 6/27/23 1:52 PM, Pat Haugen via Gcc-patches wrote: >> Updated from prior version to address review comments (update >> rs6000_rtx_cost, >> update scan strings of mod-1.c/mod-2.c)l. >> >> Disable generation of scalar modulo instructions. >> >> It was recently discovered that the scalar modulo instructions can suffer >> noticeable performance issues for certain input values. This patch disables >> their generation since the equivalent div/mul/sub sequence does not suffer >> the same problem. >> >> Bootstrapped and regression tested on powerpc64/powerpc64le. >> Ok for master and backports after burn in? >> >> -Pat >> >> >> 2023-06-27 Pat Haugen >> >> gcc/ >> * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check if disabling >> scalar modulo. >> * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New. >> * config/rs6000/rs6000.md (mod3, *mod3): Disable. >> (define_expand umod3): New. >> (define_insn umod3): Rename to *umod3 and disable. >> (umodti3, modti3): Disable. >> >> gcc/testsuite/ >> * gcc.target/powerpc/clone1.c: Add xfails. >> * gcc.target/powerpc/clone3.c: Likewise. >> * gcc.target/powerpc/mod-1.c: Update scan strings and add xfails. >> * gcc.target/powerpc/mod-2.c: Likewise. >> * gcc.target/powerpc/p10-vdivq-vmodq.c: Add xfails. >> > > Attaching patch since my mailer apparently messed up some formatting again. > > -Pat > > no-modulo.diff > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 546c353029b..2dae217bf64 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -22127,7 +22127,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > outer_code, > *total = rs6000_cost->divsi; > } >/* Add in shift and subtract for MOD unless we have a mod instruction. > */ > - if (!TARGET_MODULO && (code == MOD || code == UMOD)) > + if ((!TARGET_MODULO > +|| (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode))) > + && (code == MOD || code == UMOD)) > *total += COSTS_N_INSNS (2); >return false; > > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > index 3503614efbd..22595f6ebd7 100644 > --- a/gcc/config/rs6000/rs6000.h > +++ b/gcc/config/rs6000/rs6000.h > @@ -2492,3 +2492,9 @@ while (0) > rs6000_asm_output_opcode (STREAM);\ > } > \ >while (0) > + > +/* Disable generation of scalar modulo instructions due to performance issues > + with certain input values. This can be removed in the future when the > + issues have been resolved. */ > +#define RS6000_DISABLE_SCALAR_MODULO 1 > + > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index b0db8ae508d..6c2f237a539 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -3421,6 +3421,17 @@ (define_expand "mod3" > FAIL; > >operands[2] = force_reg (mode, operands[2]); > + > + if (RS6000_DISABLE_SCALAR_MODULO) > + { > + temp1 = gen_reg_rtx (mode); > + temp2 = gen_reg_rtx (mode); > + > + emit_insn (gen_div3 (temp1, operands[1], operands[2])); > + emit_insn (gen_mul3 (temp2, temp1, operands[2])); > + emit_insn (gen_sub3 (operands[0], operands[1], temp2)); > + DONE; > + } > } >else > { > @@ -3440,17 +3451,42 @@ (define_insn "*mod3" >[(set (match_operand:GPR 0 "gpc_reg_operand" "=,r") > (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") >(match_operand:GPR 2 "gpc_reg_operand" "r,r")))] > - "TARGET_MODULO" > + "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" >"mods %0,%1,%2" >[(set_attr "type" "div") > (set_attr "size" "")]) > > +;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is > +;; removed. > +(define_expand "umod3" > + [(set (match_operand:GPR 0 "gpc_reg_operand") > + (umod:GPR (match_operand:GPR 1 "gpc_reg_operand") > + (match_operand:GPR 2 "gpc_reg_operand")))] > + "" > +{ > + rtx temp1; > + rtx temp2; > + > + if (!TARGET_MODULO) > + FAIL; > Nit: Sorry that I didn't catch this in the previous review, it seems we can just put TARGET_MODULO in the condition of define_expand. Besides, it seems the declarations of temp1 and temp2 can be merged into their assignments below. The others look pretty good to me, thanks! As I know this patch is still on Segher's list to review, I'd defer the final approval to Segher. BR, Kewen
Re: [PATCH V3, rs6000] Disable generation of scalar modulo instructions
On 6/27/23 1:52 PM, Pat Haugen via Gcc-patches wrote: Updated from prior version to address review comments (update rs6000_rtx_cost, update scan strings of mod-1.c/mod-2.c)l. Disable generation of scalar modulo instructions. It was recently discovered that the scalar modulo instructions can suffer noticeable performance issues for certain input values. This patch disables their generation since the equivalent div/mul/sub sequence does not suffer the same problem. Bootstrapped and regression tested on powerpc64/powerpc64le. Ok for master and backports after burn in? -Pat 2023-06-27 Pat Haugen gcc/ * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check if disabling scalar modulo. * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New. * config/rs6000/rs6000.md (mod3, *mod3): Disable. (define_expand umod3): New. (define_insn umod3): Rename to *umod3 and disable. (umodti3, modti3): Disable. gcc/testsuite/ * gcc.target/powerpc/clone1.c: Add xfails. * gcc.target/powerpc/clone3.c: Likewise. * gcc.target/powerpc/mod-1.c: Update scan strings and add xfails. * gcc.target/powerpc/mod-2.c: Likewise. * gcc.target/powerpc/p10-vdivq-vmodq.c: Add xfails. Attaching patch since my mailer apparently messed up some formatting again. -Pat diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 546c353029b..2dae217bf64 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -22127,7 +22127,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, *total = rs6000_cost->divsi; } /* Add in shift and subtract for MOD unless we have a mod instruction. */ - if (!TARGET_MODULO && (code == MOD || code == UMOD)) + if ((!TARGET_MODULO + || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode))) +&& (code == MOD || code == UMOD)) *total += COSTS_N_INSNS (2); return false; diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index 3503614efbd..22595f6ebd7 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -2492,3 +2492,9 @@ while (0) rs6000_asm_output_opcode (STREAM); \ } \ while (0) + +/* Disable generation of scalar modulo instructions due to performance issues + with certain input values. This can be removed in the future when the + issues have been resolved. */ +#define RS6000_DISABLE_SCALAR_MODULO 1 + diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index b0db8ae508d..6c2f237a539 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -3421,6 +3421,17 @@ (define_expand "mod3" FAIL; operands[2] = force_reg (mode, operands[2]); + + if (RS6000_DISABLE_SCALAR_MODULO) + { + temp1 = gen_reg_rtx (mode); + temp2 = gen_reg_rtx (mode); + + emit_insn (gen_div3 (temp1, operands[1], operands[2])); + emit_insn (gen_mul3 (temp2, temp1, operands[2])); + emit_insn (gen_sub3 (operands[0], operands[1], temp2)); + DONE; + } } else { @@ -3440,17 +3451,42 @@ (define_insn "*mod3" [(set (match_operand:GPR 0 "gpc_reg_operand" "=,r") (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] - "TARGET_MODULO" + "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" "mods %0,%1,%2" [(set_attr "type" "div") (set_attr "size" "")]) +;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is +;; removed. +(define_expand "umod3" + [(set (match_operand:GPR 0 "gpc_reg_operand") + (umod:GPR (match_operand:GPR 1 "gpc_reg_operand") + (match_operand:GPR 2 "gpc_reg_operand")))] + "" +{ + rtx temp1; + rtx temp2; + + if (!TARGET_MODULO) + FAIL; -(define_insn "umod3" + if (RS6000_DISABLE_SCALAR_MODULO) +{ + temp1 = gen_reg_rtx (mode); + temp2 = gen_reg_rtx (mode); + + emit_insn (gen_udiv3 (temp1, operands[1], operands[2])); + emit_insn (gen_mul3 (temp2, temp1, operands[2])); + emit_insn (gen_sub3 (operands[0], operands[1], temp2)); + DONE; +} +}) + +(define_insn "*umod3" [(set (match_operand:GPR 0 "gpc_reg_operand" "=,r") (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] - "TARGET_MODULO" + "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" "modu %0,%1,%2" [(set_attr "type" "div") (set_attr "size" "")]) @@ -3507,7 +3543,7 @@ (define_insn "umodti3" [(set (match_operand:TI 0 "altivec_register_operand" "=v") (umod:TI (match_operand:TI 1 "altivec_register_operand" "v") (match_operand:TI 2 "altivec_register_operand" "v")))] - "TARGET_POWER10 && TARGET_POWERPC64" + "TARGET_POWER10 &&
[PATCH V3, rs6000] Disable generation of scalar modulo instructions
Updated from prior version to address review comments (update rs6000_rtx_cost, update scan strings of mod-1.c/mod-2.c)l. Disable generation of scalar modulo instructions. It was recently discovered that the scalar modulo instructions can suffer noticeable performance issues for certain input values. This patch disables their generation since the equivalent div/mul/sub sequence does not suffer the same problem. Bootstrapped and regression tested on powerpc64/powerpc64le. Ok for master and backports after burn in? -Pat 2023-06-27 Pat Haugen gcc/ * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check if disabling scalar modulo. * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New. * config/rs6000/rs6000.md (mod3, *mod3): Disable. (define_expand umod3): New. (define_insn umod3): Rename to *umod3 and disable. (umodti3, modti3): Disable. gcc/testsuite/ * gcc.target/powerpc/clone1.c: Add xfails. * gcc.target/powerpc/clone3.c: Likewise. * gcc.target/powerpc/mod-1.c: Update scan strings and add xfails. * gcc.target/powerpc/mod-2.c: Likewise. * gcc.target/powerpc/p10-vdivq-vmodq.c: Add xfails. diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 546c353029b..2dae217bf64 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -22127,7 +22127,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, *total = rs6000_cost->divsi; } /* Add in shift and subtract for MOD unless we have a mod instruction. */ - if (!TARGET_MODULO && (code == MOD || code == UMOD)) + if ((!TARGET_MODULO + || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode))) +&& (code == MOD || code == UMOD)) *total += COSTS_N_INSNS (2); return false; diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index 3503614efbd..22595f6ebd7 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -2492,3 +2492,9 @@ while (0) rs6000_asm_output_opcode (STREAM); \ } \ while (0) + +/* Disable generation of scalar modulo instructions due to performance issues + with certain input values. This can be removed in the future when the + issues have been resolved. */ +#define RS6000_DISABLE_SCALAR_MODULO 1 + diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index b0db8ae508d..6c2f237a539 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -3421,6 +3421,17 @@ (define_expand "mod3" FAIL; operands[2] = force_reg (mode, operands[2]); + + if (RS6000_DISABLE_SCALAR_MODULO) + { + temp1 = gen_reg_rtx (mode); + temp2 = gen_reg_rtx (mode); + + emit_insn (gen_div3 (temp1, operands[1], operands[2])); + emit_insn (gen_mul3 (temp2, temp1, operands[2])); + emit_insn (gen_sub3 (operands[0], operands[1], temp2)); + DONE; + } } else { @@ -3440,17 +3451,42 @@ (define_insn "*mod3" [(set (match_operand:GPR 0 "gpc_reg_operand" "=,r") (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] - "TARGET_MODULO" + "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" "mods %0,%1,%2" [(set_attr "type" "div") (set_attr "size" "")]) +;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is +;; removed. +(define_expand "umod3" + [(set (match_operand:GPR 0 "gpc_reg_operand") + (umod:GPR (match_operand:GPR 1 "gpc_reg_operand") + (match_operand:GPR 2 "gpc_reg_operand")))] + "" +{ + rtx temp1; + rtx temp2; + + if (!TARGET_MODULO) + FAIL; -(define_insn "umod3" + if (RS6000_DISABLE_SCALAR_MODULO) +{ + temp1 = gen_reg_rtx (mode); + temp2 = gen_reg_rtx (mode); + + emit_insn (gen_udiv3 (temp1, operands[1], operands[2])); + emit_insn (gen_mul3 (temp2, temp1, operands[2])); + emit_insn (gen_sub3 (operands[0], operands[1], temp2)); + DONE; +} +}) + +(define_insn "*umod3" [(set (match_operand:GPR 0 "gpc_reg_operand" "=,r") (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r") (match_operand:GPR 2 "gpc_reg_operand" "r,r")))] - "TARGET_MODULO" + "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO" "modu %0,%1,%2" [(set_attr "type" "div") (set_attr "size" "")]) @@ -3507,7 +3543,7 @@ (define_insn "umodti3" [(set (match_operand:TI 0 "altivec_register_operand" "=v") (umod:TI (match_operand:TI 1 "altivec_register_operand" "v") (match_operand:TI 2 "altivec_register_operand" "v")))] - "TARGET_POWER10 && TARGET_POWERPC64" + "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO" "vmoduq %0,%1,%2" [(set_attr "type" "vecdiv")