This patch fixes a reload bug that is hard to reproduce reliably (so far I've only observed it on the OG13 branch, with testcase gcc.c-torture/compile/pr70355.c), but which causes an infinite loop in reload when it does occur.

For some reason reload wants to save a value from AVGPRs to memory. This can't happen directly on CDNA1, so secondary reload moves the value to VGPRs, but instead of proceeding to memory, LRA just moves the value right back into AVGPRs. Disparaging this move (when a reload is needed) fixes the issue, but I don't know if this is the intended or optimal solution in these cases.
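To illustrate why disparaging the alternative can break the cycle, here is a toy model of alternative selection. It is only a sketch: the cost numbers, `SEVERE_PENALTY`, and both function names are hypothetical and do not reflect LRA's real cost function or data structures. The one thing it takes from GCC is the documented meaning of '$': disparage the alternative severely, but only when the operand carrying '$' needs a reload.

```python
# Toy model only: NOT LRA's real algorithm. All names and numbers are hypothetical.
SEVERE_PENALTY = 100  # illustrative weight for a '$'-disparaged alternative


def alternative_cost(constraint, base_cost, operand_needs_reload):
    """Cost of one alternative; '$' adds a penalty only if a reload is needed."""
    disparaged = constraint.startswith("$")
    if disparaged and operand_needs_reload:
        return base_cost + SEVERE_PENALTY
    return base_cost


def pick(alternatives, operand_needs_reload):
    """Return the cheapest (constraint, base_cost) pair."""
    return min(
        alternatives,
        key=lambda alt: alternative_cost(alt[0], alt[1], operand_needs_reload),
    )


# Destination alternatives for a value currently held in VGPRs.
before_patch = [("a", 1), ("v", 2)]   # AVGPR destination looks cheapest
after_patch = [("$a", 1), ("v", 2)]   # AVGPR destination disparaged on reload

choice_before = pick(before_patch, operand_needs_reload=True)
choice_after = pick(after_patch, operand_needs_reload=True)
choice_no_reload = pick(after_patch, operand_needs_reload=False)
```

In this model the unpatched table sends the value back to AVGPRs whenever a reload is in progress, while the patched table keeps it in VGPRs. When no reload is needed, the AVGPR alternative is still chosen freely.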

Andrew
amdgcn: Fix vector TImode reload loop

I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also.  The mov insns for the other modes
already have '$', so this completes the set.

gcc/ChangeLog:

        * config/gcn/gcn-valu.md (*mov<mode>_4reg): Disparage AVGPR use when a
        reload is required.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 23f2bbe454b..a928decd408 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -566,10 +566,10 @@ (define_insn "*mov<mode>_4reg"
        (match_operand:V_4REG 1 "general_operand"))]
   ""
   {@ [cons: =0, 1; attrs: type, length, gcn_version]
-  [v,vDB;vmult,16,*    ]           v_mov_b32\t%L0, %L1\;          v_mov_b32\t%H0, %H1\;          v_mov_b32\t%J0, %J1\;          v_mov_b32\t%K0, %K1
-  [v,a  ;vmult,32,*    ]  v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
-  [a,v  ;vmult,32,*    ] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
-  [a,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  v_accvgpr_mov_b32\t%K0, %K1
+  [v ,vDB;vmult,16,*    ]           v_mov_b32\t%L0, %L1\;          v_mov_b32\t%H0, %H1\;          v_mov_b32\t%J0, %J1\;          v_mov_b32\t%K0, %K1
+  [v ,a  ;vmult,32,*    ]  v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
+  [$a,v  ;vmult,32,*    ] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
+  [a ,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  v_accvgpr_mov_b32\t%K0, %K1
   })
 
 (define_insn "mov<mode>_exec"
