[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread arjan at linux dot intel.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #9 from Arjan van de Ven  ---
I don't have recent measurements, since we did this work quite some time ago.

Basically, at the CPU level (speaking for Intel-style CPUs at least), a CPU can
eliminate (meaning: no execution resources used) 1 to 3 (depending on
generation) register-to-register moves per clock cycle. There's ALSO a path in
the hardware for optimizing XOR sequences to avoid execution resources. When we
did both, we maximized the total number of these eliminations, while with only
XOR you can get bottlenecked on execution if you have too many.
(All the MOVs should have no other instructions depending on them, so even
though they depend on the XOR, they're still fully 'orphan' for the
out-of-order engine.)
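
Since no measurements are given in the thread, the following is a minimal sketch of how one might time the two sequences on one's own machine. It is not the tuning methodology used originally. The register set, the function names, and the use of clock() are assumptions made for illustration, and on non-x86-64 targets the sequences compile to nothing.

```c
#include <assert.h>
#include <time.h>

#if defined(__x86_64__) && defined(__GNUC__)
#define HAVE_X86_64_ASM 1
#else
#define HAVE_X86_64_ASM 0   /* other targets: the sequences become no-ops */
#endif

/* Older pattern: one XOR, then MOVs that copy the zeroed register.
   AT&T syntax, so "movl %eax, %ecx" copies eax into ecx. */
static inline void zero_xor_mov(void)
{
#if HAVE_X86_64_ASM
    __asm__ volatile(
        "xorl %%eax, %%eax\n\t"
        "movl %%eax, %%ecx\n\t"
        "movl %%eax, %%edx\n\t"
        "movl %%eax, %%esi\n\t"
        "movl %%eax, %%edi\n\t"
        : /* no outputs */
        : /* no inputs */
        : "rax", "rcx", "rdx", "rsi", "rdi", "cc");
#endif
}

/* Newer pattern: every register is cleared with its own XOR. */
static inline void zero_xor_only(void)
{
#if HAVE_X86_64_ASM
    __asm__ volatile(
        "xorl %%eax, %%eax\n\t"
        "xorl %%ecx, %%ecx\n\t"
        "xorl %%edx, %%edx\n\t"
        "xorl %%esi, %%esi\n\t"
        "xorl %%edi, %%edi\n\t"
        : /* no outputs */
        : /* no inputs */
        : "rax", "rcx", "rdx", "rsi", "rdi", "cc");
#endif
}

/* Returns the CPU seconds spent running `iters` copies of one sequence.
   use_xor_only != 0 selects the XOR-only pattern. */
double time_sequence(int use_xor_only, unsigned long iters)
{
    assert(iters > 0);
    clock_t start = clock();
    if (use_xor_only) {
        for (unsigned long i = 0; i < iters; i++)
            zero_xor_only();
    } else {
        for (unsigned long i = 0; i < iters; i++)
            zero_xor_mov();
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_sequence(0, n) and time_sequence(1, n) with a large n gives two numbers to compare. A loop this small is dominated by loop overhead and front-end effects, so any difference should be read with caution and confirmed with hardware performance counters.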

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread qing.zhao at oracle dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #8 from Qing Zhao  ---
> On May 24, 2022, at 11:42 AM, arjan at linux dot intel.com wrote:
> 
> --- Comment #7 from Arjan van de Ven ---
> from a performance angle, the xor-only sequence is not so great at all; modern
> CPUs are really good at eliminating mov's so that's why the code originally was
> added to do a combo of xor and mov..

Are you saying that the XOR-only sequence is slower than the previous XOR + MOV
sequence?
If so, can you explain a little more about why? And do you have any data to
support this claim?

Thanks.

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread arjan at linux dot intel.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

Arjan van de Ven  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arjan at linux dot intel.com

--- Comment #7 from Arjan van de Ven  ---
from a performance angle, the xor-only sequence is not so great at all; modern
CPUs are really good at eliminating mov's so that's why the code originally was
added to do a combo of xor and mov..

I can understand the security versus performance tradeoff.
(the original tuning was done to basically make it entirely free, so that it
could just be always enabled)

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

qinzhao at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from qinzhao at gcc dot gnu.org ---
Fixed in GCC 11 and GCC 12 too.

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #5 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Qing Zhao :

https://gcc.gnu.org/g:c8d636cbe38b6a369528f58227c96b2b77b1fd3a

commit r11-10029-gc8d636cbe38b6a369528f58227c96b2b77b1fd3a
Author: Qing Zhao 
Date:   Tue May 24 15:54:06 2022 +

i386: Adjust -fzero-call-used-regs to always use XOR [PR101891]

Currently on i386, -fzero-call-used-regs uses a pattern of:

XOR regA,regA
MOV regA,regB
MOV regA,regC
...
RET

However, this introduces a register ordering dependency (e.g. the CPU
cannot clear regB without clearing regA first), and while it greatly reduces
available ROP gadgets, it does technically leave a set of "MOV" ROP gadgets
at the end of functions (e.g. "MOV regA,regC; RET").

This patch will switch to always use XOR on i386:

XOR regA,regA
XOR regB,regB
XOR regC,regC
...
RET

gcc/ChangeLog:

PR target/101891
* config/i386/i386.c (zero_call_used_regno_mode): use V2SImode
as a generic MMX mode instead of V4HImode.
(zero_all_mm_registers): Use SET to zero instead of MOV for
zeroing scratch registers.
(ix86_zero_call_used_regs): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
* gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-16.c: Likewise.
* gcc.target/i386/zero-scratch-regs-17.c: Likewise.
* gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector
-fno-PIC, adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
* gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector
-fno-PIC, Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-23.c: Likewise.
* gcc.target/i386/zero-scratch-regs-26.c: Likewise.
* gcc.target/i386/zero-scratch-regs-27.c: Likewise.
* gcc.target/i386/zero-scratch-regs-28.c: Likewise.
* gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
* gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-5.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-6.c: Add -fno-stack-protector.
* gcc.target/i386/zero-scratch-regs-7.c: Likewise.
* gcc.target/i386/zero-scratch-regs-8.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-9.c: Add -fno-stack-protector.

(cherry picked from commit 0b86943aca51175968e40bbb6f2662dfe3fbfe59)
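
The "MOV" gadget concern in the commit message can be illustrated with a byte-level sketch. The scanner below is a hypothetical helper, not part of GCC or of any ROP tool. It only recognizes the two-byte 32-bit register-to-register form (opcode 0x89 with a register-direct ModRM byte) immediately followed by RET (0xC3), and it ignores REX prefixes and other encodings. The two byte arrays are hand-assembled examples of the old and new epilogue patterns.

```c
#include <assert.h>
#include <stddef.h>

/* Counts "MOV r32,r32; RET" byte patterns: opcode 0x89, a ModRM byte with
   mod == 11 (register-direct), then RET (0xC3). */
size_t count_mov_ret_gadgets(const unsigned char *code, size_t len)
{
    assert(code != NULL || len == 0);
    size_t count = 0;
    for (size_t i = 0; i + 2 < len; i++) {
        if (code[i] == 0x89 &&
            (code[i + 1] & 0xC0) == 0xC0 &&
            code[i + 2] == 0xC3)
            count++;
    }
    return count;
}

/* Old pattern: xor %eax,%eax; mov %eax,%ecx; mov %eax,%edx; ret */
const unsigned char xor_mov_epilogue[] = {
    0x31, 0xC0, 0x89, 0xC1, 0x89, 0xC2, 0xC3
};

/* New pattern: xor %eax,%eax; xor %ecx,%ecx; xor %edx,%edx; ret */
const unsigned char xor_only_epilogue[] = {
    0x31, 0xC0, 0x31, 0xC9, 0x31, 0xD2, 0xC3
};
```

On these two arrays the scanner finds one MOV-then-RET pattern in the old epilogue and none in the XOR-only one. A real gadget search would also have to consider every byte offset of a full binary and many more instruction encodings.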

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #4 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Qing Zhao :

https://gcc.gnu.org/g:79ae75cc252154cf4ad75d28c3c909ff90f0cc76

commit r12-8413-g79ae75cc252154cf4ad75d28c3c909ff90f0cc76
Author: Qing Zhao 
Date:   Tue May 24 15:03:40 2022 +

i386: Adjust -fzero-call-used-regs to always use XOR [PR101891]

Currently on i386, -fzero-call-used-regs uses a pattern of:

XOR regA,regA
MOV regA,regB
MOV regA,regC
...
RET

However, this introduces a register ordering dependency (e.g. the CPU
cannot clear regB without clearing regA first), and while it greatly reduces
available ROP gadgets, it does technically leave a set of "MOV" ROP gadgets
at the end of functions (e.g. "MOV regA,regC; RET").

This patch will switch to always use XOR on i386:

XOR regA,regA
XOR regB,regB
XOR regC,regC
...
RET

gcc/ChangeLog:

PR target/101891
* config/i386/i386.cc (zero_call_used_regno_mode): use V2SImode
as a generic MMX mode instead of V4HImode.
(zero_all_mm_registers): Use SET to zero instead of MOV for
zeroing scratch registers.
(ix86_zero_call_used_regs): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
* gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-16.c: Likewise.
* gcc.target/i386/zero-scratch-regs-17.c: Likewise.
* gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector
-fno-PIC, adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
* gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector
-fno-PIC, Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-23.c: Likewise.
* gcc.target/i386/zero-scratch-regs-26.c: Likewise.
* gcc.target/i386/zero-scratch-regs-27.c: Likewise.
* gcc.target/i386/zero-scratch-regs-28.c: Likewise.
* gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
* gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-5.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-6.c: Add -fno-stack-protector.
* gcc.target/i386/zero-scratch-regs-7.c: Likewise.
* gcc.target/i386/zero-scratch-regs-8.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-9.c: Add -fno-stack-protector.

(cherry picked from commit 0b86943aca51175968e40bbb6f2662dfe3fbfe59)

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-09 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #3 from qinzhao at gcc dot gnu.org ---
Fixed in GCC 13.

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Qing Zhao :

https://gcc.gnu.org/g:0b86943aca51175968e40bbb6f2662dfe3fbfe59

commit r13-213-g0b86943aca51175968e40bbb6f2662dfe3fbfe59
Author: Qing Zhao 
Date:   Mon May 9 15:34:34 2022 +

i386: Adjust -fzero-call-used-regs to always use XOR [PR101891]

Currently on i386, -fzero-call-used-regs uses a pattern of:

XOR regA,regA
MOV regA,regB
MOV regA,regC
...
RET

However, this introduces a register ordering dependency (e.g. the CPU
cannot clear regB without clearing regA first), and while it greatly reduces
available ROP gadgets, it does technically leave a set of "MOV" ROP gadgets
at the end of functions (e.g. "MOV regA,regC; RET").

This patch will switch to always use XOR on i386:

XOR regA,regA
XOR regB,regB
XOR regC,regC
...
RET

gcc/ChangeLog:

PR target/101891
* config/i386/i386.cc (zero_call_used_regno_mode): use V2SImode
as a generic MMX mode instead of V4HImode.
(zero_all_mm_registers): Use SET to zero instead of MOV for
zeroing scratch registers.
(ix86_zero_call_used_regs): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
* gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-16.c: Likewise.
* gcc.target/i386/zero-scratch-regs-17.c: Likewise.
* gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector
-fno-PIC, adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
* gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector
-fno-PIC, Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-23.c: Likewise.
* gcc.target/i386/zero-scratch-regs-26.c: Likewise.
* gcc.target/i386/zero-scratch-regs-27.c: Likewise.
* gcc.target/i386/zero-scratch-regs-28.c: Likewise.
* gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
* gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector
-fno-PIC.
* gcc.target/i386/zero-scratch-regs-5.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-6.c: Add -fno-stack-protector.
* gcc.target/i386/zero-scratch-regs-7.c: Likewise.
* gcc.target/i386/zero-scratch-regs-8.c: Adjust mov to xor.
* gcc.target/i386/zero-scratch-regs-9.c: Add -fno-stack-protector.
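
To see the effect of this change on generated code, a small test function can be compiled and its assembly inspected. The sketch below is not one of the testsuite files listed above; the function name sum3 and the macro ZERO_USED_GPR are made up for illustration. It uses the zero_call_used_regs function attribute, which is the per-function counterpart of the -fzero-call-used-regs option, and falls back to a plain function on compilers that lack it.

```c
#include <assert.h>

#ifndef __has_attribute
#define __has_attribute(x) 0   /* compilers without __has_attribute */
#endif

#if __has_attribute(zero_call_used_regs)
/* Zero the call-used general-purpose registers that the function used. */
#define ZERO_USED_GPR __attribute__((zero_call_used_regs("used-gpr")))
#else
#define ZERO_USED_GPR          /* attribute unavailable: no zeroing */
#endif

ZERO_USED_GPR
int sum3(int a, int b, int c)
{
    return a + b + c;
}

/* Small self-check: the zeroing must not change the return value. */
void sum3_selfcheck(void)
{
    assert(sum3(1, 2, 3) == 6);
}
```

Compiling this with something like "gcc -O2 -S" and reading the instructions just before "ret" in sum3 shows which zeroing pattern a given compiler emits. With a GCC that contains this commit one would expect only xor instructions there, though the exact registers depend on the compiler version and options.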

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-05-05 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

qinzhao at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2022-01-28 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

qinzhao at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-01-28
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1

[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR

2021-08-12 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |target
             Target|                            |x86_64-linux-gnu

--- Comment #1 from Andrew Pinski  ---
The target emits the RTL this way.