[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2018-10-17 00:00:00         |2021-8-19

--- Comment #7 from Andrew Pinski ---
clang and MSVC get this "correct".
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

--- Comment #6 from Alexander Monakov ---
FWIW the following CSE enhancement cleans this up, but I'm unhappy with this
patch because it's too narrowly targeted; in particular, won't clean up

void g(int a, int *b, int c);
void f(int a, int *b, int c)
{
    if (a)
        *b = c;
    g(a, b, c);
}

Had to special-case for PARM_DECL because, in general, automatic variables
that are not address-taken on GIMPLE can become addressable on RTL when ABI
requires passing a large argument by reference.

--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -2232,6 +2232,15 @@ hash_rtx_string (const char *ps)
   return hash;
 }
 
+static bool
+mem_escapes_p (const_rtx x)
+{
+  tree decl = MEM_EXPR (x);
+  if (!decl || TREE_CODE (decl) != PARM_DECL)
+    return true;
+  return may_be_aliased (decl);
+}
+
 /* Same as hash_rtx, but call CB on each rtx if it is not NULL.
    When the callback returns true, we continue with the new rtx.  */
 
@@ -2421,7 +2430,8 @@ hash_rtx_cb (const_rtx x, machine_mode mode,
               return 0;
             }
           if (hash_arg_in_memory_p && !MEM_READONLY_P (x))
-            *hash_arg_in_memory_p = 1;
+            if (*hash_arg_in_memory_p != 1)
+              *hash_arg_in_memory_p = mem_escapes_p (x) ? 1 : 2;
 
           /* Now that we have already found this special case,
              might as well speed it up as much as possible.  */
@@ -6127,7 +6137,7 @@ invalidate_memory (void)
       for (p = table[i]; p; p = next)
         {
           next = p->next_same_hash;
-          if (p->in_memory)
+          if (p->in_memory == 1)
             remove_from_table (p, i);
         }
     }
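The effect of the patch's tri-state in_memory value can be illustrated with a small standalone model. This is only a sketch and not GCC code: `toy_entry`, `toy_invalidate_memory`, and the enum names are hypothetical stand-ins for the CSE hash table and invalidate_memory, and the values 0/1/2 are taken from the patch above (0 = not a memory reference, 1 = memory that may escape, 2 = memory known not to escape).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the patch's tri-state in_memory value.
   The names are illustrative; only the 0/1/2 encoding follows the patch.  */
enum toy_mem_kind {
    TOY_NOT_IN_MEMORY  = 0,  /* expression does not reference memory */
    TOY_MEM_MAY_ESCAPE = 1,  /* memory that a callee might modify */
    TOY_MEM_NO_ESCAPE  = 2   /* memory known not to escape (e.g. a
                                non-aliased incoming parameter slot) */
};

/* Toy stand-in for one remembered expression in the CSE table.  */
struct toy_entry {
    const char *expr;            /* human-readable description only */
    enum toy_mem_kind in_memory;
    int valid;                   /* 1 while the entry is still usable */
};

/* Model of what happens when a call is processed: only entries whose
   memory may escape are dropped.  Returns the number of entries removed.  */
size_t toy_invalidate_memory(struct toy_entry *table, size_t n)
{
    size_t removed = 0;

    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].valid && table[i].in_memory == TOY_MEM_MAY_ESCAPE) {
            table[i].valid = 0;
            removed++;
        }
    }
    return removed;
}
```

In this model, an entry describing an incoming stack argument tagged TOY_MEM_NO_ESCAPE survives a call, which is the property the patch relies on to keep the argument values available across the call on 32-bit x86. Before the patch every in-memory entry would have been dropped.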
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

--- Comment #5 from Alexander Monakov ---
I've spent some time looking at this again, and I couldn't find a way to
preserve REG_EQUIV notes (it's actually unclear what REG_EQUIV means
precisely).

What I think could help in simple cases like this one, and might also be
helpful in other situations, is to have mem_attrs indicate that memory does
not escape. RTL CSE would not need to invalidate such MEMs when processing a
call.
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

--- Comment #4 from Rich Felker ---
Thanks, that's helpful!

For 64-bit what I mean is that it emits:

        pushq   %r12
        movl    %edx, %r12d
        pushq   %rbp
        movl    %esi, %ebp
        pushq   %rbx
        movl    %edi, %ebx
        call    bar
        movl    %r12d, %edx
        movl    %ebp, %esi
        movl    %ebx, %edi
        popq    %rbx
        popq    %rbp
        popq    %r12
        jmp     bah

whereas it would be much more efficient to do:

        pushq   %rdx
        pushq   %rsi
        pushq   %rdi
        call    bar
        popq    %rdi
        popq    %rsi
        popq    %rdx
        jmp     bah
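For readers without the original report at hand, the assembly above is consistent with a function of roughly the following shape. This is a reconstruction inferred from the listing, not the testcase from the report: the callee names bar and bah come from the assembly, while the caller name `foo`, the exact signatures, and the bookkeeping variables are assumptions. The callees are given bodies here only so the sketch is self-contained; to reproduce the code generation they would presumably need to be external declarations (or otherwise not inlinable).

```c
#include <assert.h>

/* Bookkeeping so the sketch can be checked; not part of the bug.  */
static int bar_calls;
static int last_a, last_b, last_c;

/* Stand-in callees.  In a real reproduction these would live in another
   translation unit so the compiler cannot inline them.  */
void bar(void)
{
    bar_calls++;
}

void bah(int a, int b, int c)
{
    last_a = a;
    last_b = b;
    last_c = c;
}

/* Assumed shape of the function under discussion.  */
void foo(int a, int b, int c)
{
    bar();         /* ordinary call: a, b and c must survive it */
    bah(a, b, c);  /* call in tail position: candidate for "jmp bah" */
}

/* Small self-check: the arguments reach bah unchanged after bar returns.  */
void check_foo(void)
{
    foo(1, 2, 3);
    assert(bar_calls == 1);
    assert(last_a == 1 && last_b == 2 && last_c == 3);
}
```

On x86-64 the three arguments arrive in %edi, %esi and %edx, all of which are call-clobbered, so they have to be kept somewhere across the call to bar. The disagreement in this comment is about whether call-saved registers or the stack is the cheaper place to keep them.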
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-10-17
            Version|unknown                     |9.0
     Ever confirmed|0                           |1
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #3 from Alexander Monakov ---
Noticed this back when working on -fno-plt patches:
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00229.html

Emitting a tailcall on RTL drops REG_EQUIV notes (perhaps because in the
general case equivalences might not hold just before the sibcall when the new
arguments are being prepared), and this penalizes code generation for the
whole function.

I'm not sure why you say "Results are similarly bad for 64-bit", there's
nothing to improve in this example with three arguments all of which are on
registers and thus need to be somehow saved/restored anyway?
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

--- Comment #2 from Rich Felker ---
Further trial-and-error shows that it seems to be the sibcall itself that
causes the mess. My first guess is that something in the RTL considers the
whole argument area as clobbered/belonging to the sibcallee as soon as it
starts setting up for the sibcall, thereby forcing the arguments to be backed
up somewhere else and restored, but I'm not sure why that wouldn't affect the
case where there's no intervening call.
[Bug target/87627] GCC generates rube-goldberg machine for trivial tail call on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87627

--- Comment #1 from Rich Felker ---
Results are similarly bad for 64-bit, except at -Os where it effectively just
pushes/pops the argument registers around the call to bar rather than
allocating call-saved registers for them. Using -Os on 32-bit does not help.
-O0 does suppress the register shuffling but also suppresses the tail call.