[Bug rtl-optimization/89271] gcc.target/powerpc/vsx-simode2.c stopped working in GCC 9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89271

Alan Modra changed:

           What    |Removed                       |Added
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2019-02-14
                 CC|                              |amodra at gmail dot com
           Assignee|unassigned at gcc dot gnu.org |amodra at gmail dot com
     Ever confirmed|0                             |1
[Bug rtl-optimization/89271] gcc.target/powerpc/vsx-simode2.c stopped working in GCC 9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89271

--- Comment #3 from Alan Modra ---
I believe this is a bug in rs6000_register_move_cost. Testing a fix.
[Bug rtl-optimization/89271] gcc.target/powerpc/vsx-simode2.c stopped working in GCC 9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89271

--- Comment #2 from Segher Boessenkool ---
Thanks for the analysis!

(In reply to Vladimir Makarov from comment #1)
> The very first ira-costs pass runs in sched1 and it generates the following
> costs
> r125 costs: BASE_REGS:1 GENERAL_REGS:1 LINK_REGS:2
> CTR_REGS:2 LINK_OR_CTR_REGS:2 SPEC_OR_GEN_REGS:2 MEM:8000
> Why does IRA calculate such costs? We have 2 insns involving p125
> 15: r125:DI=%3:DI
>     REG_DEAD %3:DI
> 7: {r123:SI=asm_operands/*r125 with constraint 'v'*/;clobber ca:SI;}
>
> Cost of moving r3 into p125 is 2 for base regs and 4 for memory
> Cost of moving p125 into a v reg is 8 for base regs and 4 for memory

Ah, so we need to improve that. This is power8, where mtvsr insns are still
a bit expensive, but not more expensive than memory.

> Therefore cost of p125 is 10 for base reg and 8 for memory (multiplied by BB
> frequency 1).
>
> Therefore IRA chooses memory for p125.
>
> In this particular case insn 15 can go away when we assign r3 to p125. It
> means the cost for base reg could be 8 as for memory but IRA can not say in
> general case that it can assign r3 to p125.
>
> I think increasing memory move cost at least to 5 could solve the problem.
> But probably it needs benchmarking SPEC, besides running GCC testsuite, to
> see that the performance is not worse after such change.

Agreed.
[Bug rtl-optimization/89271] gcc.target/powerpc/vsx-simode2.c stopped working in GCC 9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89271

--- Comment #1 from Vladimir Makarov ---
> This is because IRA does
>
> r125: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
>
> a1(r125,l0) costs: BASE_REGS:14004,14004 GENERAL_REGS:14004,14004-
> LINK_REGS:24010,24010 CTR_REGS:24010,24010 LINK_OR_CTR_REGS:24010,24010-
> SPEC_OR_GEN_REGS:24010,24010 MEM:12000,12000
>
> and it then chooses disposition mem for r125.
>
> In GCC 8 and before combine already has decided to use GPR3 (the first
> argument register) for this, so there was no RA here before.

The very first ira-costs pass runs in sched1 and it generates the following
costs

r125 costs: BASE_REGS:1 GENERAL_REGS:1 LINK_REGS:2 CTR_REGS:2
LINK_OR_CTR_REGS:2 SPEC_OR_GEN_REGS:2 MEM:8000

The costs in IRA are a bit different because we calculate costs knowing our
preference for p125 as NO_REGS.

Why does IRA calculate such costs? We have 2 insns involving p125

15: r125:DI=%3:DI
    REG_DEAD %3:DI
7: {r123:SI=asm_operands/*r125 with constraint 'v'*/;clobber ca:SI;}

Cost of moving r3 into p125 is 2 for base regs and 4 for memory
Cost of moving p125 into a v reg is 8 for base regs and 4 for memory

Therefore cost of p125 is 10 for base reg and 8 for memory (multiplied by BB
frequency 1).

Therefore IRA chooses memory for p125.

In this particular case insn 15 can go away when we assign r3 to p125. It
means the cost for base reg could be 8 as for memory but IRA can not say in
general case that it can assign r3 to p125.

I think increasing memory move cost at least to 5 could solve the problem.
But probably it needs benchmarking SPEC, besides running GCC testsuite, to
see that the performance is not worse after such change.
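
The arithmetic in comment #1 can be replayed with a small illustrative
sketch. This is not GCC code: the function name pseudo_costs and its
parameters are hypothetical, and it assumes that both insns touching p125 are
charged the same memory move cost and that the per-insn costs are simply
summed and scaled by the basic-block frequency, as the comment describes.

```python
# Illustrative model of the cost comparison from comment #1.
# NOT GCC code: names and structure are assumptions made for this sketch.

def pseudo_costs(gpr_to_pseudo, pseudo_to_vsx, mem_move_cost, bb_freq=1):
    """Return (base_reg_cost, mem_cost) for a pseudo used by two insns.

    gpr_to_pseudo : cost of insn 15 (r3 -> p125) if p125 lives in a base reg
    pseudo_to_vsx : cost of insn 7 (p125 -> 'v' reg) if p125 lives in a base reg
    mem_move_cost : cost charged per insn if p125 lives in memory
    bb_freq       : basic-block frequency multiplier
    """
    base_reg_cost = (gpr_to_pseudo + pseudo_to_vsx) * bb_freq
    mem_cost = (mem_move_cost + mem_move_cost) * bb_freq
    return base_reg_cost, mem_cost


# Numbers quoted in the comment: 2 and 8 for base regs, 4 per memory move.
base, mem = pseudo_costs(2, 8, 4)
print("current: base =", base, "mem =", mem)

# The suggested change: raise the memory move cost to 5.
base5, mem5 = pseudo_costs(2, 8, 5)
print("proposed: base =", base5, "mem =", mem5)
```

With the quoted numbers the model gives 10 for a base register against 8 for
memory, which matches the comment. With a memory move cost of 5 the memory
total rises to 10, so memory is no longer strictly cheaper in this model. How
IRA breaks such a tie is not stated in the thread.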
[Bug rtl-optimization/89271] gcc.target/powerpc/vsx-simode2.c stopped working in GCC 9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89271

Segher Boessenkool changed:

           What    |Removed                     |Added
      Known to work|                            |8.0
            Version|unknown                     |9.0
   Target Milestone|---                         |9.0
      Known to fail|                            |9.0