Re: [PATCH, V2] PR target/105325, Make load/cmp fusion know about prefixed load
On Mon, Mar 27, 2023 at 03:03:17PM +0800, Kewen.Lin wrote: > ... instead I suggested moving these three lines to below else arm for CCUNS, > since the arm for CC already has those variables redefined, so it's something > like: I did those changes in the 3rd version of the patch. | Date: Mon, 27 Mar 2023 23:19:55 -0400 | From: Michael Meissner | Subject: [PATCH, V3] PR target/105325, Make load/cmp fusion know about prefixed loads | Message-ID: ... > In the previous review, I put a comment that "lp64 seems not necessary.". > Did you try to test without it? (if yes, any fallouts?) Yes, I tried it without the lp64, and I removed it from V3 of the patch. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com
Re: [PATCH, V2] PR target/105325, Make load/cmp fusion know about prefixed load
Hi Mike, on 2023/3/25 07:06, Michael Meissner wrote: > I posted a version of patch on March 21st. This patch makes some code changes > suggested in the genfusion.pl code. The only change is in genfusion.pl. The > fusion.md that it makes is the same. > > The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion > optimization generates illegal assembler code. > > Ultimately the code was dying because the fusion load + compare -1/0/1 > patterns > did not handle the possibility that the load might be prefixed. > > The main cause is the constraints for the individual loads in the fusion did > not > match the machine. In particular, LWA is a ds format instruction when it is > unprefixed. The code did not also set the prefixed attribute correctly. > > This patch rewrites the genfusion.pl script so that it will have more accurate > constraints for the LWA and LD instructions (which are DS instructions). The > updated genfusion.pl was then run to update fusion.md. Finally, the code for > the "prefixed" attribute is modified so that it considers load + compare > immediate patterns to be like the normal load insns in checking whether > operand[1] is a prefixed instruction. > > I am re-running the tests right now, but they should have the same results > since fsuion.md is the same, and only code in genfusion.pl that makes > fusion.md > was modified. Assuming these runs pass can I check this into the master > branch? > > I will also need to check these same patches into GCC 11 and GCC 12 after a > waiting period (the patch applied to those branches as well). > > 2023-03-21 Michael Meissner > > gcc/ > > PR target/105325 > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > of the ld and lwa instructions which use the DS encoding instead of D. > Use the YZ constraint for these loads. Handle prefixed loads better. > Set the sign_extend attribute as appropriate. > * gcc/config/rs6000/fusion.md: Regenerate. > * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi > instructions to the list of instructions that might have a prefixed load > instruction. > > gcc/testsuite/ > > PR target/105325 > * g++.target/powerpc/pr105325.C: New test. > * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts. > --- > gcc/config/rs6000/fusion.md | 17 ++ > gcc/config/rs6000/genfusion.pl| 32 +++ > gcc/config/rs6000/rs6000.md | 2 +- > gcc/testsuite/g++.target/powerpc/pr105325.C | 24 ++ > .../gcc.target/powerpc/fusion-p10-ldcmpi.c| 4 +-- > 5 files changed, 62 insertions(+), 17 deletions(-) > create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C > > diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md > index d45fb138a70..da9953d9ad9 100644 > --- a/gcc/config/rs6000/fusion.md > +++ b/gcc/config/rs6000/fusion.md > @@ -22,7 +22,7 @@ > ;; load mode is DI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" >[(set (match_operand:CC 2 "cc_reg_operand" "=x") > -(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > +(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] >"(TARGET_P10_FUSION)" > @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > ;; load mode is DI result mode is clobber compare mode is CCUNS extend is > none > (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" >[(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > -(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > +(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] >"(TARGET_P10_FUSION)" > @@ -64,7 +64,7 @@ (define_insn_and_split > "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > ;; load mode is DI result mode is DI compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" >[(set (match_operand:CC 2 "cc_reg_operand" "=x") > -(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > +(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] >"(TARGET_P10_FUSION)" > @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > ;; load mode is DI result mode is DI compare mode is CCUNS extend is none > (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" >[(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > -(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand"
[PATCH, V2] PR target/105325, Make load/cmp fusion know about prefixed load
I posted a version of patch on March 21st. This patch makes some code changes suggested in the genfusion.pl code. The only change is in genfusion.pl. The fusion.md that it makes is the same. The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion optimization generates illegal assembler code. Ultimately the code was dying because the fusion load + compare -1/0/1 patterns did not handle the possibility that the load might be prefixed. The main cause is the constraints for the individual loads in the fusion did not match the machine. In particular, LWA is a ds format instruction when it is unprefixed. The code did not also set the prefixed attribute correctly. This patch rewrites the genfusion.pl script so that it will have more accurate constraints for the LWA and LD instructions (which are DS instructions). The updated genfusion.pl was then run to update fusion.md. Finally, the code for the "prefixed" attribute is modified so that it considers load + compare immediate patterns to be like the normal load insns in checking whether operand[1] is a prefixed instruction. I am re-running the tests right now, but they should have the same results since fsuion.md is the same, and only code in genfusion.pl that makes fusion.md was modified. Assuming these runs pass can I check this into the master branch? I will also need to check these same patches into GCC 11 and GCC 12 after a waiting period (the patch applied to those branches as well). 2023-03-21 Michael Meissner gcc/ PR target/105325 * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation of the ld and lwa instructions which use the DS encoding instead of D. Use the YZ constraint for these loads. Handle prefixed loads better. Set the sign_extend attribute as appropriate. * gcc/config/rs6000/fusion.md: Regenerate. * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi instructions to the list of instructions that might have a prefixed load instruction. gcc/testsuite/ PR target/105325 * g++.target/powerpc/pr105325.C: New test. * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts. --- gcc/config/rs6000/fusion.md | 17 ++ gcc/config/rs6000/genfusion.pl| 32 +++ gcc/config/rs6000/rs6000.md | 2 +- gcc/testsuite/g++.target/powerpc/pr105325.C | 24 ++ .../gcc.target/powerpc/fusion-p10-ldcmpi.c| 4 +-- 5 files changed, 62 insertions(+), 17 deletions(-) create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md index d45fb138a70..da9953d9ad9 100644 --- a/gcc/config/rs6000/fusion.md +++ b/gcc/config/rs6000/fusion.md @@ -22,7 +22,7 @@ ;; load mode is DI result mode is clobber compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") -(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") +(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") -(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") +(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -64,7 +64,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" ;; load mode is DI result mode is DI compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") -(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") +(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" ;; load mode is DI result mode is DI compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") -(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") +(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]