Re: Reload fix for an old aarch64 issue
On 03/14/2017 05:22 PM, Bernd Schmidt wrote: This triggered a kernel miscompilation with an old (4.8 I think) aarch64 toolchain. Here's the reloads for the insn where things go wrong: Reloads for insn # 210 Reload 0: reload_in (DI) = (reg/v/f:DI 80 [ pgdata ]) GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0) reload_in_reg: (reg/v/f:DI 80 [ pgdata ]) reload_reg_rtx: (reg:DI 5 x5) Reload 1: reload_in (DI) = (plus:DI (reg/v/f:DI 80 [ pgdata ]) (const_int 7172 [0x1c04])) GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 0), inc by 2 reload_in_reg: (plus:DI (reg/v/f:DI 80 [ pgdata ]) (const_int 7172 [0x1c04])) reload_reg_rtx: (reg:DI 5 x5) Note how the input and input_address reloads use the same reload register. This is intended as the two categories aren't supposed to conflict. The problem is that reload 1 is a PLUS expression, and when reloading these, gen_reload may have to resort to tricks, such as loading one of the operands (the constant 7172) into the reload register, and then adding the other operand to it. In effect that's an earlyclobber of the reload register, and that is not represented in the for_input/for_input_address classification. Most likely we haven't tripped over this earlier because on other machines the addition can be done in a single insn. The following fixes this by using RELOAD_OTHER in this case. This is pessimistic, but at that point I don't think we can really know what gen_reload will have to do. Bootstrapped and tested on x86_64-linux on the 4.7 branch - that's the best method of testing that I can think of (but I think I'll also run some c6x tests with trunk). If no one objects, I'll check this in soonish. Bernd rld-other.diff * reload.c (find_reloads): When reloading a nonoffsettable address, use RELOAD_OTHER for it and its address reloads. I've already got state on this... OK. jeff
Re: Reload fix for an old aarch64 issue
On Tue, Mar 14, 2017 at 4:22 PM, Bernd Schmidt wrote: > This triggered a kernel miscompilation with an old (4.8 I think) aarch64 > toolchain. Yes RHEL's 4.8 has the issue. So did Cavium/MontaVista's GCC 4.7 with AARCH64 support backported to it. We would work around the bug in the kernel some of the time. I never looked into it since I knew it was in some part of reload and I am not a person to look into a reload issue. Thanks, Andrew > > Here's the reloads for the insn where things go wrong: > > Reloads for insn # 210 > Reload 0: reload_in (DI) = (reg/v/f:DI 80 [ pgdata ]) > GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0) > reload_in_reg: (reg/v/f:DI 80 [ pgdata ]) > reload_reg_rtx: (reg:DI 5 x5) > Reload 1: reload_in (DI) = (plus:DI (reg/v/f:DI 80 [ pgdata ]) > (const_int 7172 > [0x1c04])) > GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 0), inc by 2 > reload_in_reg: (plus:DI (reg/v/f:DI 80 [ pgdata ]) > (const_int 7172 > [0x1c04])) > reload_reg_rtx: (reg:DI 5 x5) > > Note how the input and input_address reloads use the same reload register. > This is intended as the two categories aren't supposed to conflict. The > problem is that reload 1 is a PLUS expression, and when reloading these, > gen_reload may have to resort to tricks, such as loading one of the operands > (the constant 7172) into the reload register, and then adding the other > operand to it. In effect that's an earlyclobber of the reload register, and > that is not represented in the for_input/for_input_address classification. > > Most likely we haven't tripped over this earlier because on other machines > the addition can be done in a single insn. > > The following fixes this by using RELOAD_OTHER in this case. This is > pessimistic, but at that point I don't think we can really know what > gen_reload will have to do. Bootstrapped and tested on x86_64-linux on the > 4.7 branch - that's the best method of testing that I can think of (but I > think I'll also run some c6x tests with trunk). If no one objects, I'll > check this in soonish. > > > Bernd
Reload fix for an old aarch64 issue
This triggered a kernel miscompilation with an old (4.8 I think) aarch64 toolchain. Here's the reloads for the insn where things go wrong: Reloads for insn # 210 Reload 0: reload_in (DI) = (reg/v/f:DI 80 [ pgdata ]) GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0) reload_in_reg: (reg/v/f:DI 80 [ pgdata ]) reload_reg_rtx: (reg:DI 5 x5) Reload 1: reload_in (DI) = (plus:DI (reg/v/f:DI 80 [ pgdata ]) (const_int 7172 [0x1c04])) GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 0), inc by 2 reload_in_reg: (plus:DI (reg/v/f:DI 80 [ pgdata ]) (const_int 7172 [0x1c04])) reload_reg_rtx: (reg:DI 5 x5) Note how the input and input_address reloads use the same reload register. This is intended as the two categories aren't supposed to conflict. The problem is that reload 1 is a PLUS expression, and when reloading these, gen_reload may have to resort to tricks, such as loading one of the operands (the constant 7172) into the reload register, and then adding the other operand to it. In effect that's an earlyclobber of the reload register, and that is not represented in the for_input/for_input_address classification. Most likely we haven't tripped over this earlier because on other machines the addition can be done in a single insn. The following fixes this by using RELOAD_OTHER in this case. This is pessimistic, but at that point I don't think we can really know what gen_reload will have to do. Bootstrapped and tested on x86_64-linux on the 4.7 branch - that's the best method of testing that I can think of (but I think I'll also run some c6x tests with trunk). If no one objects, I'll check this in soonish. Bernd * reload.c (find_reloads): When reloading a nonoffsettable address, use RELOAD_OTHER for it and its address reloads. Index: gcc/reload.c === --- gcc/reload.c (revision 211480) +++ gcc/reload.c (working copy) @@ -3989,14 +3989,14 @@ find_reloads (rtx insn, int replace, int &XEXP (recog_data.operand[i], 0), (rtx*) 0, base_reg_class (VOIDmode, as, MEM, SCRATCH), address_mode, - VOIDmode, 0, 0, i, RELOAD_FOR_INPUT); + VOIDmode, 0, 0, i, RELOAD_OTHER); rld[operand_reloadnum[i]].inc = GET_MODE_SIZE (GET_MODE (recog_data.operand[i])); /* If this operand is an output, we will have made any reloads for its address as RELOAD_FOR_OUTPUT_ADDRESS, but now we are treating part of the operand as an input, so - we must change these to RELOAD_FOR_INPUT_ADDRESS. */ + we must change these to RELOAD_FOR_OTHER_ADDRESS. */ if (modified[i] == RELOAD_WRITE) { @@ -4005,10 +4005,10 @@ find_reloads (rtx insn, int replace, int if (rld[j].opnum == i) { if (rld[j].when_needed == RELOAD_FOR_OUTPUT_ADDRESS) - rld[j].when_needed = RELOAD_FOR_INPUT_ADDRESS; + rld[j].when_needed = RELOAD_FOR_OTHER_ADDRESS; else if (rld[j].when_needed == RELOAD_FOR_OUTADDR_ADDRESS) - rld[j].when_needed = RELOAD_FOR_INPADDR_ADDRESS; + rld[j].when_needed = RELOAD_FOR_OTHER_ADDRESS; } } }