http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46228

--- Comment #6 from Zeev Tarantov <zeev.tarantov at gmail dot com> 2010-10-29 23:44:49 UTC ---
Setting -finline-limit high didn't produce different code.

This function:

  4007f8:       48 8b 77 10             mov    0x10(%rdi),%rsi
  4007fc:       e9 c5 ff ff ff          jmpq   4007c6 <std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_erase(std::_Rb_tree_node<int>*)>
  400801:       90                      nop

is called on every run of the program. The function is 10 bytes of code
(counting the padding nop) and the call to it is 5 bytes, 15 bytes altogether.
Inlined, it would be 9 bytes. I'm not counting the debug data for the extra
symbol or the potential slowdown from calling code that might not be in cache.
If this decision is arbitrary and won't be changed, too bad. But could someone
please explain the purpose of this code:

  400939:       89 54 24 18             mov    %edx,0x18(%rsp)
  40093d:       8a 54 24 18             mov    0x18(%rsp),%dl
  400941:       88 54 24 28             mov    %dl,0x28(%rsp)
  400945:       8b 54 24 28             mov    0x28(%rsp),%edx
  400949:       48 83 c4 48             add    $0x48,%rsp
  40094d:       c3                      retq   

It writes to stack slots that are about to be discarded. The net result is to
leave "%edx & 0xff" in %edx, using 16 bytes of code. The 72 bytes of stack are
allocated and go unused. How is that in any way good? The pair variable in main
is also saved to the stack without ever being read later. Why doesn't the
compiler eliminate writes to memory that is never read from, and then eliminate
the stack slot itself?
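
For anyone who wants to check what the four movs above compute, here is a
minimal C++ sketch that models them on a byte array. The names emulate_tail,
store32 and load32 are made up for illustration and are not part of the test
case. The frame is assumed to start zero-filled; in the real program the bytes
at 0x29..0x2b(%rsp) hold whatever was there before, so the result equals
"%edx & 0xff" only under that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kFrameSize = 0x48;  // 72 bytes, from "add $0x48,%rsp"

// Little-endian 32-bit store into the modelled stack frame.
static void store32(unsigned char* frame, std::size_t off, std::uint32_t v) {
    assert(off + 4 <= kFrameSize);
    for (int i = 0; i < 4; ++i)
        frame[off + i] = static_cast<unsigned char>((v >> (8 * i)) & 0xffu);
}

// Little-endian 32-bit load from the modelled stack frame.
static std::uint32_t load32(const unsigned char* frame, std::size_t off) {
    assert(off + 4 <= kFrameSize);
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(frame[off + i]) << (8 * i);
    return v;
}

// Hypothetical model of the instructions at 400939..400945.
std::uint32_t emulate_tail(std::uint32_t edx) {
    unsigned char frame[kFrameSize] = {};      // ASSUMPTION: frame starts zeroed

    store32(frame, 0x18, edx);                 // mov %edx,0x18(%rsp)
    unsigned char dl = frame[0x18];            // mov 0x18(%rsp),%dl
    edx = (edx & 0xffffff00u) | dl;            //   only the low byte of %edx is written
    frame[0x28] = dl;                          // mov %dl,0x28(%rsp)
    edx = load32(frame, 0x28);                 // mov 0x28(%rsp),%edx
    return edx;                                // add $0x48,%rsp ; retq
}
```

Under the zeroed-frame assumption, emulate_tail(0x12345601) returns 0x01, the
same value a single "movzbl %dl,%edx" would leave in %edx with no stack
traffic at all.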

> Oh this is not really a regression

Is it your opinion that the code produced by gcc 4.5.1 is as good as the code
produced by gcc 4.4.5 (and clang 2.8)?

section          size
.text             952
.text             856
