[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 --- Comment #11 from Tamar Christina --- Author: tnfchris Date: Thu Feb 14 17:17:20 2019 New Revision: 268884 URL: https://gcc.gnu.org/viewcvs?rev=268884&root=gcc&view=rev Log: Arm: Add HF modes to ANY iterators The iterator ANY64 are used in various general split patterns and is supposed to contain all 64 bit modes. For some reason the pattern has HI but not HF. This adds HF so that general 64 bit splits are generated for these modes as well. These are required by various split patterns that expect them to be there. gcc/ChangeLog: PR target/88850 * config/arm/iterators.md (ANY64): Add V4HF. gcc/testsuite/ChangeLog: PR target/88850 * gcc.target/arm/pr88850-2.c: New test. * lib/target-supports.exp (check_effective_target_arm_neon_softfp_fp16_ok_nocache, check_effective_target_arm_neon_softfp_fp16_ok, add_options_for_arm_neon_softfp_fp16): New. Added: trunk/gcc/testsuite/gcc.target/arm/pr88850-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/iterators.md trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/lib/target-supports.exp
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Tamar Christina --- Screwed up the PR format, but committed to trunk 268612 | tnfchris | 2019-02-07 10:05:57 + (Thu, 07 Feb 2019) | 21 lines Arm: Fix NEON REG to REG reload failures. (PR/target 88850) We currently return cost 2 for NEON REG to REG moves, which would be incorrect for 64 bit moves. We currently don't have a pattern for this in the neon_move alternatives because this is a bit of a special case. We would almost never want it to use this r -> r pattern unless it really has no choice. As such we add a new neon r -> r move pattern but also hide it from being used to determine register preferences and also disparage it during LRA. gcc/ChangeLog: PR/target 88850 * config/arm/neon.md (*neon_mov): Add r -> r case. gcc/testsuite/ChangeLog: PR/target 88850 * gcc.target/arm/pr88850.c: New test.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 --- Comment #9 from Tamar Christina --- Yes, sorry for the delay. A different patch which doesn't change the costs is now validating. Will post today or tomorrow.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 Alexandre Oliva changed: What|Removed |Added Status|NEW |ASSIGNED CC||aoliva at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org --- Comment #8 from Alexandre Oliva --- AFAICT Christina is working on it, so I'm changing the bug status so that this becomes apparent in bug lists
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 --- Comment #7 from Tamar Christina --- Thanks for the information Vladimir, I wasn't aware of this special treatment of cost 2. Changing the cost does indeed fix the ICE. Working on a sensible patch now.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 --- Comment #6 from Vladimir Makarov --- (In reply to Tamar Christina from comment #5) > So yeah it seems that there are three issues here: > > 1) We should probably have an r -> r alternative for *neon_mov. > 2) The costs are now flipped from what they were before, for some reason the > VFP regs are now way more expensive. > 3) reload shouldn't have ICEd since it says > > r113: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS > > so it hasn't excluded ALL_REGS as an alternative, which should have either > a) used the VPF register again or > b) spilled the register since we have a m -> r and r -> m pattern. p113 is used in 2 move insns, both containing 2 general hard regs (r0 and r2). So it is natural to use general regs for p113. At least arm_register_move_cost says this. LRA does not check constraints because arm_register_move_cost for general regs in any mode returns 2. Such LRA/reload behaviour for cost 2 is described in gcc documentation. So adding r,r alternative to neon_movv8qi or increasing move cost for GENERAL_REGS or both will solve the problem. Actually the cost should be increased in anyway. It can not be 2 because we need 2 general hard regs for V8QImode. And this is a work for arm target maintainers.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 --- Comment #5 from Tamar Christina --- So yeah it seems that there are three issues here: 1) We should probably have an r -> r alternative for *neon_mov. 2) The costs are now flipped from what they were before, for some reason the VFP regs are now way more expensive. 3) reload shouldn't have ICEd since it says r113: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS so it hasn't excluded ALL_REGS as an alternative, which should have either a) used the VPF register again or b) spilled the register since we have a m -> r and r -> m pattern.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P1 Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-18 CC||vmakarov at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #4 from Jakub Jelinek --- Started with r266385, which changed the cost: -r113: preferred VFP_REGS, alternative NO_REGS, allocno VFP_REGS -r112: preferred VFP_REGS, alternative NO_REGS, allocno VFP_REGS +r113: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS +r112: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS - a0(r113,l0) costs: LO_REGS:34000,34000 HI_REGS:34000,34000 CALLER_SAVE_REGS:34000,34000 GENERAL_REGS:34000,34000 VFP_D0_D7_REGS:4000,4000 VFP_LO_REGS:4000,4000 VFP_HI_REGS:4000,4000 VFP_REGS:4000,4000 ALL_REGS:34000,34000 MEM:24000,24000 - a1(r112,l0) costs: LO_REGS:34000,34000 HI_REGS:34000,34000 CALLER_SAVE_REGS:34000,34000 GENERAL_REGS:34000,34000 VFP_D0_D7_REGS:4000,4000 VFP_LO_REGS:4000,4000 VFP_HI_REGS:4000,4000 VFP_REGS:4000,4000 ALL_REGS:34000,34000 MEM:24000,24000 + a0(r113,l0) costs: GENERAL_REGS:4000,4000 VFP_D0_D7_REGS:6,6 VFP_LO_REGS:6,6 VFP_HI_REGS:6,6 VFP_REGS:6,6 ALL_REGS:3,3 MEM:4,4 + a1(r112,l0) costs: GENERAL_REGS:4000,4000 VFP_D0_D7_REGS:6,6 VFP_LO_REGS:6,6 VFP_HI_REGS:6,6 VFP_REGS:6,6 ALL_REGS:3,3 MEM:4,4 We have: (insn 13 4 14 2 (set (reg:V8QI 112) (reg:V8QI 0 r0 [ x ])) "pr88850-2.c":6:1 936 {*neon_movv8qi} (expr_list:REG_DEAD (reg:V8QI 0 r0 [ x ]) (nil))) (insn 14 13 7 2 (set (reg:V8QI 113) (reg:V8QI 2 r2 [ y ])) "pr88850-2.c":6:1 936 {*neon_movv8qi} (expr_list:REG_DEAD (reg:V8QI 2 r2 [ y ]) (nil))) (insn 7 14 8 2 (set (reg:V8QI 2 r2) (reg:V8QI 112)) "pr88850-2.c":7:3 936 {*neon_movv8qi} (expr_list:REG_DEAD (reg:V8QI 112) (nil))) (insn 8 7 9 2 (set (reg:V8QI 0 r0) (reg:V8QI 113)) "pr88850-2.c":7:3 936 {*neon_movv8qi} (expr_list:REG_DEAD (reg:V8QI 113) (nil))) and the move instructions don't have alternatives for GPR to GPR move, that can be done only through a VFP_REGS register.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- typedef __builtin_neon_qi int8x8_t __attribute__ ((__vector_size__ (8))); void bar (int8x8_t *, int8x8_t, int8x8_t); void foo (int8x8_t z, int8x8_t x, int8x8_t v) { bar (&v, z, x); } ICEs too with these options and so does: typedef __builtin_neon_qi int8x8_t __attribute__ ((__vector_size__ (8))); void bar (int8x8_t, int8x8_t); void foo (int8x8_t x, int8x8_t y) { bar (y, x); } The last one works with x, y because the *neon_movv8qi instructions are eliminated by fwprop1. In any case, this looks like a target bug to me, there is nothing wrong in how the expansion works here, saving hard registers in which the parameters are passed into pseudos and then loading the hard registers in which callee expects parameters from the pseudos is the usual thing. If the *neon_movv8qi instructions need -mfloat-abi=hard or whatever else, then guess __builtin_neon_qi vectors can't be passed where gcc wants to pass them unless that condition is met - if there are no instructions to move those data from/to these registers or to/from memory, then those registers can't be really used period. Or just error if user does this?
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 --- Comment #2 from Tamar Christina --- Author: tnfchris Date: Thu Jan 17 15:17:57 2019 New Revision: 268033 URL: https://gcc.gnu.org/viewcvs?rev=268033&root=gcc&view=rev Log: Fix Arm testcase by using NEON. gcc/testsuite/ChangeLog: PR target/88850 * gcc.target/arm/pr51968.c: Use neon intrinsics. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/arm/pr51968.c
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 Tamar Christina changed: What|Removed |Added Keywords||ice-on-invalid-code --- Comment #1 from Tamar Christina --- The function __builtin_neon_vuzpv8qi is undefined since the builtin was removed, hence the call being emitted. The testcase hides this by passing `-Wno-implicit-function-declaration`. The testcase was a reduced case from pr51968. There is still an issue here that the hard register shouldn't have been emitted, but I'll fix the testcase in pr51968 in the meantime.
[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850 Richard Biener changed: What|Removed |Added Version|unknown |9.0 Target Milestone|--- |9.0