[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-10-17 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

--- Comment #10 from vries at gcc dot gnu.org ---
Author: vries
Date: Fri Oct 17 06:36:45 2014
New Revision: 216365

URL: https://gcc.gnu.org/viewcvs?rev=216365&root=gcc&view=rev
Log:
Use fuse-caller-save info in cprop-hardreg

2014-10-17  Tom de Vries  t...@codesourcery.com

PR rtl-optimization/61605
* regcprop.c (copyprop_hardreg_forward_1): Use
regs_invalidated_by_this_call instead of regs_invalidated_by_call.

* gcc.target/i386/fuse-caller-save.c: Update addition check.  Add movl
absence check.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/regcprop.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/fuse-caller-save.c


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-10-17 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

--- Comment #9 from vries at gcc dot gnu.org ---
Author: vries
Date: Fri Oct 17 06:36:35 2014
New Revision: 216364

URL: https://gcc.gnu.org/viewcvs?rev=216364&root=gcc&view=rev
Log:
Handle copy cycles in pass_cprop_hardreg

2014-10-17  Tom de Vries  t...@codesourcery.com

PR rtl-optimization/61605
* regcprop.c (copyprop_hardreg_forward_1): Add copy_p and noop_p.  Don't
notice stores for noops.  Don't regard noops as copies.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/regcprop.c


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-10-17 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from vries at gcc dot gnu.org ---
Patches committed, test-case updated. Resolving as fixed.


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-10-16 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

--- Comment #8 from vries at gcc dot gnu.org ---
Patches submitted at:
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01513.html
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01514.html


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-30 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

--- Comment #6 from vries at gcc dot gnu.org ---
Created attachment 33618
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33618&action=edit
[1/2] Use fuse-caller-save-info in cprop-hardreg


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-30 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

--- Comment #7 from vries at gcc dot gnu.org ---
Created attachment 33619
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33619&action=edit
[2/2] Don't regard a copy with identical src and dest as killing dest

This patch adds handling of copies with identical source and destination
register in copyprop_hardreg_forward_1.

Using this patch series we get the desired code.


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-30 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |vries at gcc dot gnu.org


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-29 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-09-29
                 CC|                            |vries at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #4 from vries at gcc dot gnu.org ---
> If a function is known to not clobber an argument register then the caller
> shouldn't have to save/reload that register across the function call.

If a function is known to not clobber a call_used_reg then the caller
can use it as a non-call_used_reg across the function call.

This diff shows the example with -fno-use-caller-save vs -fuse-caller-save:
...
 foo:
 .LFB1:
         .cfi_startproc
-        pushq   %rbx
-        .cfi_def_cfa_offset 16
-        .cfi_offset 3, -16
-        movl    %edi, %ebx
+        movl    %edi, %edx
         call    bar
-        addl    %ebx, %eax
-        popq    %rbx
-        .cfi_def_cfa_offset 8
+        addl    %edx, %eax
         ret
         .cfi_endproc
 .LFE1:
...
-fuse-caller-save removes the entry/exit save/restore pair
'pushq %rbx'/'popq %rbx'.
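
For readers without the test case at hand, a C source of roughly this shape
would give the foo/bar pair in the diff above. This is a sketch reconstructed
from the assembly, not necessarily the exact contents of
gcc.target/i386/fuse-caller-save.c. The attributes and the body of bar are
assumptions, chosen so that bar stays a real call whose register usage the
compiler can see.

```c
/* Hypothetical reproducer, assumed from the assembly in this comment.  */

/* static + noinline: the call survives, and the compiler can still see
   which registers bar actually clobbers.  The body is made up.  */
static __attribute__((noinline, noclone)) int
bar (int x)
{
  return x + 3;
}

/* y arrives in %edi and is needed again after the call, so it must be
   kept somewhere that bar leaves alone.  */
__attribute__((noinline, noclone)) int
foo (int y)
{
  return y + bar (y);
}
```

Compiling something like this at -O2 with and without -fuse-caller-save and
diffing the assembly should show whether the push/pop pair goes away.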

The 'movl %edi, %edx' is indeed non-optimal, but it's not a 'save' in the sense
of a save/restore pair generated at function entry/exit or around function
calls. It's a copy at function entry of a hard reg argument to a pseudo reg,
generated at expand, which is followed by a copy of the pseudo reg to the same
hard register to set the argument for the function call:
...
(insn 2 4 3 2 (set (reg/v:SI 86 [ yD.1755 ])
        (reg:SI 5 di [ yD.1755 ])) test.c:9 -1
     (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 5 di)
        (reg/v:SI 86 [ yD.1755 ])) test.c:10 -1
     (nil))
...
The second insn is removed in pass_fast_rtl_dce. The reg-alloc choice for
pseudo 86 in the first insn is dx, and the insn remains.

I think there could be two ways to address this:
1. Teach a pass after ira, like pass_cprop_hardreg or pass_gcse2, to use the
   information collected by fuse-caller-save.
2. Teach ira to prefer di to dx in this case.

My guess would be pass_cprop_hardreg.
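
As a rough illustration of option 1, here is a toy model of what "using the
fuse-caller-save information" could mean for a pass that tracks values in hard
registers: at a call, kill only the registers that this particular callee is
known to clobber, instead of everything the ABI allows a call to clobber. All
names below (regset, regs_killed_by_call, value_survives_call) are invented for
the sketch and are not GCC's internal API. The register numbers follow the
dumps in this bug only loosely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy hard register numbers (ax = 0, dx = 1, di = 5 as in the dumps;
   the others are assumptions).  */
enum { AX = 0, DX = 1, CX = 2, SI = 4, DI = 5 };

typedef uint32_t regset;                 /* one bit per hard register */

#define REG_BIT(r) ((regset) 1u << (r))

/* Toy subset of what the ABI lets any call clobber.  */
static const regset abi_call_clobbered =
  REG_BIT (AX) | REG_BIT (CX) | REG_BIT (DX) | REG_BIT (SI) | REG_BIT (DI);

/* Registers to treat as killed by one specific call.  If the callee's
   real clobbers are known, use them (limited to call-clobbered regs);
   otherwise fall back to the conservative ABI-wide set.  */
static regset
regs_killed_by_call (bool callee_known, regset callee_clobbers)
{
  if (callee_known)
    return callee_clobbers & abi_call_clobbered;
  return abi_call_clobbered;
}

/* True if a value held in REG is still valid after the call.  */
static bool
value_survives_call (int reg, bool callee_known, regset callee_clobbers)
{
  regset killed = regs_killed_by_call (callee_known, callee_clobbers);
  return (killed & REG_BIT (reg)) == 0;
}
```

In this model, a callee known to clobber only ax leaves a value in di usable
after the call, which is the property the add after the call to bar needs.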


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-29 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

--- Comment #5 from vries at gcc dot gnu.org ---
Created attachment 33610
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33610&action=edit
proof-of-concept patch

Using this proof-of-concept patch, we manage to get the desired code. The patch
uses the fuse-caller-save information in cprop-hardreg, and runs cprop-hardreg
one more time, after pass_fast_rtl_dce.

Obviously it's not desirable to run cprop-hardreg twice. But the pass has
problems with this code:
...
(insn 2 18 3 2 (set (reg/v:SI 1 dx [orig:86 yD.1749 ] [86])
        (reg:SI 5 di [ yD.1749 ])) test.c:9 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 5 di [ yD.1749 ])
        (nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 5 di)
        (reg/v:SI 1 dx [orig:86 yD.1749 ] [86])) test.c:10 90 {*movsi_internal}
     (nil))
...

The first time cprop-hardreg runs, it manages to propagate the first copy (insn
2) to the second (insn 6):
...
rescanning insn with uid = 6.
insn 6: replaced reg 1 with 5
...

So insn 6 looks like:
...
(insn 6 3 7 2 (set (reg:SI 5 di)
        (reg:SI 5 di [orig:86 yD.1749 ] [86])) test.c:10 90 {*movsi_internal}
     (nil))
...

That insn is removed by pass_fast_rtl_dce:
...
DCE: Deleting insn 6
deleting insn with uid = 6.
...

And only the second time we run it, we propagate the first copy to the add:
...
insn 9: replaced reg 1 with 5
rescanning insn with uid = 9.
...
which then looks like this:
...
(insn 9 7 15 2 (parallel [
            (set (reg:SI 0 ax [orig:87 D.1767 ] [87])
                (plus:SI (reg:SI 0 ax [orig:83 D.1767 ] [83])
                    (reg:SI 5 di [orig:86 yD.1749 ] [86])))
            (clobber (reg:CC 17 flags))
        ]) test.c:10 220 {*addsi_1}
     (expr_list:REG_DEAD (reg/v:SI 1 dx [orig:86 yD.1749 ] [86])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
...

That leaves insn 2 dead, which is deleted by dce during sched2:
...
DCE: Deleting insn 2
deleting insn with uid = 2.
...

I'm not sure yet why cprop-hardreg doesn't handle both cases the first time,
but it's probably that the store to di by insn 6 is registered as a kill by
cprop-hardreg, not taking into account that it stores the same value.
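
To make that guess concrete, here is a small stand-alone model of copy
tracking over hard registers. It is not the regcprop.c code: copy_state,
note_copy and the other names are invented for this sketch, and the data
structure is far simpler than the real value chains. It only shows why
treating 'di = di' as a store to di throws away the fact that dx still equals
di, and how skipping such no-op copies keeps it.

```c
#include <stdbool.h>

#define NREGS 8
#define NO_REG (-1)

/* copy_of[r] == s means "r currently holds the same value as s".  */
struct copy_state { int copy_of[NREGS]; };

static void
init_state (struct copy_state *st)
{
  for (int r = 0; r < NREGS; r++)
    st->copy_of[r] = NO_REG;
}

/* A store to REG invalidates every recorded relation involving REG.  */
static void
kill_reg (struct copy_state *st, int reg)
{
  st->copy_of[reg] = NO_REG;
  for (int r = 0; r < NREGS; r++)
    if (st->copy_of[r] == reg)
      st->copy_of[r] = NO_REG;
}

/* Process "dest = src".  With skip_noops false, a copy with identical
   source and destination is still treated as a store that kills dest.  */
static void
note_copy (struct copy_state *st, int dest, int src, bool skip_noops)
{
  if (skip_noops && dest == src)
    return;                     /* value unchanged: keep all relations */
  kill_reg (st, dest);
  if (dest != src)
    st->copy_of[dest] = src;
}

/* Register to use in place of REG (REG itself if nothing is known).  */
static int
propagate (const struct copy_state *st, int reg)
{
  return st->copy_of[reg] != NO_REG ? st->copy_of[reg] : reg;
}

/* Replay insn 2 (dx = di) and the rewritten insn 6 (di = di), then ask
   which register the add (insn 9) may use instead of dx.  */
static int
reg_for_add (bool skip_noops)
{
  struct copy_state st;
  init_state (&st);
  note_copy (&st, 1 /* dx */, 5 /* di */, skip_noops);
  note_copy (&st, 5 /* di */, 5 /* di */, skip_noops);
  return propagate (&st, 1 /* dx */);
}
```

In this model, reg_for_add (false) leaves the add using dx (1), while
reg_for_add (true) lets it use di (5) in a single run.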


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-28 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

Andi Kleen <andi-gcc at firstfloor dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org,
                   |                            |tom at codesourcery dot com

--- Comment #1 from Andi Kleen <andi-gcc at firstfloor dot org> ---
This is in theory implemented in mainline with -fuse-caller-save
It doesn't seem to work for me though. I also didn't see the option doing
anything on a larger program.


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-28 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andi Kleen from comment #1)
> This is in theory implemented in mainline with -fuse-caller-save
> It doesn't seem to work for me though. I also didn't see the option doing
> anything on a larger program.

Most likely because it is not fully implemented for x86 :).


[Bug rtl-optimization/61605] Potential optimization: Keep unclobbered argument registers live across function calls

2014-09-28 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61605

--- Comment #3 from Andi Kleen <andi-gcc at firstfloor dot org> ---
It was supposed to be enabled with this commit:

Date:   Fri May 30 11:39:49 2014 +

-fuse-caller-save - Enable for i386

2014-05-30  Tom de Vries  t...@codesourcery.com

* config/i386/i386.c (TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS):
Redefine as true.

* gcc.target/i386/fuse-caller-save.c: New test.
* gcc.dg/ira-shrinkwrap-prep-1.c: Run with -fno-use-caller-save.
* gcc.dg/ira-shrinkwrap-prep-2.c: Same.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@211078
138bc75d-0d04-0410-961f-82ee72b054a4