[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #9 from Arjan van de Ven ---
I don't have recent measurements, since we did this work quite some time ago.

Basically, at the CPU level (speaking for Intel-style CPUs at least), a CPU can eliminate (meaning: no execution resources used) 1 to 3 (depending on generation) register-to-register moves per clock cycle. There's ALSO a path in the hardware for optimizing XOR sequences to avoid execution resources. When we did both, we maximized the total number of these eliminations, while with only XOR you can get bottlenecked on execution if you have too many.

(All the movs should have no other instructions depending on them, so even though they depend on the XOR, they're still fully 'orphan' for the out-of-order engine.)
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #8 from Qing Zhao ---
> On May 24, 2022, at 11:42 AM, arjan at linux dot intel.com wrote:
>
> --- Comment #7 from Arjan van de Ven ---
> from a performance angle, the xor-only sequence is not so great at all; modern
> CPUs are really good at eliminating mov's so that's why the code originally was
> added to do a combo of xor and mov..

Are you saying that the XOR-only sequence is slower than the previous XOR + MOV sequence? If so, can you explain a little bit more on why? And do you have any data for this claim?

Thanks.
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

Arjan van de Ven changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arjan at linux dot intel.com

--- Comment #7 from Arjan van de Ven ---
From a performance angle, the XOR-only sequence is not so great at all; modern CPUs are really good at eliminating movs, which is why the code was originally added to do a combination of XOR and MOV.

I can understand the security versus performance tradeoff. (The original tuning was done to make it basically entirely free, so that it could just be always enabled.)
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

qinzhao at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from qinzhao at gcc dot gnu.org ---
Fixed in GCC 11 and GCC 12 too.
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #5 from CVS Commits ---
The releases/gcc-11 branch has been updated by Qing Zhao:

https://gcc.gnu.org/g:c8d636cbe38b6a369528f58227c96b2b77b1fd3a

commit r11-10029-gc8d636cbe38b6a369528f58227c96b2b77b1fd3a
Author: Qing Zhao
Date:   Tue May 24 15:54:06 2022 +

    i386: Adjust -fzero-call-used-regs to always use XOR [PR101891]

    Currently on i386, -fzero-call-used-regs uses a pattern of:

    XOR regA,regA
    MOV regA,regB
    MOV regA,regC
    ...
    RET

    However, this introduces both a register ordering dependency (e.g. the CPU
    cannot clear regB without clearing regA first), and while greatly reduces
    available ROP gadgets, it does technically leave a set of "MOV" ROP gadgets
    at the end of functions (e.g. "MOV regA,regC; RET").

    This patch will switch to always use XOR on i386:

    XOR regA,regA
    XOR regB,regB
    XOR regC,regC
    ...
    RET

    gcc/ChangeLog:

        PR target/101891
        * config/i386/i386.c (zero_call_used_regno_mode): use V2SImode
        as a generic MMX mode instead of V4HImode.
        (zero_all_mm_registers): Use SET to zero instead of MOV for
        zeroing scratch registers.
        (ix86_zero_call_used_regs): Likewise.

    gcc/testsuite/ChangeLog:

        * gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
        * gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-16.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-17.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector -fno-PIC, adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
        * gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector -fno-PIC, Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-23.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-26.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-27.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-28.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
        * gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-5.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-6.c: Add -fno-stack-protector.
        * gcc.target/i386/zero-scratch-regs-7.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-8.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-9.c: Add -fno-stack-protector.

    (cherry picked from commit 0b86943aca51175968e40bbb6f2662dfe3fbfe59)
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #4 from CVS Commits ---
The releases/gcc-12 branch has been updated by Qing Zhao:

https://gcc.gnu.org/g:79ae75cc252154cf4ad75d28c3c909ff90f0cc76

commit r12-8413-g79ae75cc252154cf4ad75d28c3c909ff90f0cc76
Author: Qing Zhao
Date:   Tue May 24 15:03:40 2022 +

    i386: Adjust -fzero-call-used-regs to always use XOR [PR101891]

    Currently on i386, -fzero-call-used-regs uses a pattern of:

    XOR regA,regA
    MOV regA,regB
    MOV regA,regC
    ...
    RET

    However, this introduces both a register ordering dependency (e.g. the CPU
    cannot clear regB without clearing regA first), and while greatly reduces
    available ROP gadgets, it does technically leave a set of "MOV" ROP gadgets
    at the end of functions (e.g. "MOV regA,regC; RET").

    This patch will switch to always use XOR on i386:

    XOR regA,regA
    XOR regB,regB
    XOR regC,regC
    ...
    RET

    gcc/ChangeLog:

        PR target/101891
        * config/i386/i386.cc (zero_call_used_regno_mode): use V2SImode
        as a generic MMX mode instead of V4HImode.
        (zero_all_mm_registers): Use SET to zero instead of MOV for
        zeroing scratch registers.
        (ix86_zero_call_used_regs): Likewise.

    gcc/testsuite/ChangeLog:

        * gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
        * gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-16.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-17.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector -fno-PIC, adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
        * gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector -fno-PIC, Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-23.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-26.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-27.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-28.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
        * gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-5.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-6.c: Add -fno-stack-protector.
        * gcc.target/i386/zero-scratch-regs-7.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-8.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-9.c: Add -fno-stack-protector.

    (cherry picked from commit 0b86943aca51175968e40bbb6f2662dfe3fbfe59)
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #3 from qinzhao at gcc dot gnu.org ---
Fixed in GCC 13.
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

--- Comment #2 from CVS Commits ---
The master branch has been updated by Qing Zhao:

https://gcc.gnu.org/g:0b86943aca51175968e40bbb6f2662dfe3fbfe59

commit r13-213-g0b86943aca51175968e40bbb6f2662dfe3fbfe59
Author: Qing Zhao
Date:   Mon May 9 15:34:34 2022 +

    i386: Adjust -fzero-call-used-regs to always use XOR [PR101891]

    Currently on i386, -fzero-call-used-regs uses a pattern of:

    XOR regA,regA
    MOV regA,regB
    MOV regA,regC
    ...
    RET

    However, this introduces both a register ordering dependency (e.g. the CPU
    cannot clear regB without clearing regA first), and while greatly reduces
    available ROP gadgets, it does technically leave a set of "MOV" ROP gadgets
    at the end of functions (e.g. "MOV regA,regC; RET").

    This patch will switch to always use XOR on i386:

    XOR regA,regA
    XOR regB,regB
    XOR regC,regC
    ...
    RET

    gcc/ChangeLog:

        PR target/101891
        * config/i386/i386.cc (zero_call_used_regno_mode): use V2SImode
        as a generic MMX mode instead of V4HImode.
        (zero_all_mm_registers): Use SET to zero instead of MOV for
        zeroing scratch registers.
        (ix86_zero_call_used_regs): Likewise.

    gcc/testsuite/ChangeLog:

        * gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
        * gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-16.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-17.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector -fno-PIC, adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
        * gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector -fno-PIC, Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-23.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-26.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-27.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-28.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
        * gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector -fno-PIC.
        * gcc.target/i386/zero-scratch-regs-5.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-6.c: Add -fno-stack-protector.
        * gcc.target/i386/zero-scratch-regs-7.c: Likewise.
        * gcc.target/i386/zero-scratch-regs-8.c: Adjust mov to xor.
        * gcc.target/i386/zero-scratch-regs-9.c: Add -fno-stack-protector.
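
The zeroing sequence described in the commit message above can be inspected with a small test function. This is only a sketch, assuming an x86-64 target and a compiler that knows the `zero_call_used_regs` attribute (GCC 11 or later). The function `mix` and the macro `ZERO_USED_REGS` are hypothetical names chosen for illustration; they are not taken from the patch or from the zero-scratch-regs testsuite files.

```c
/* Hypothetical example: request zeroing of the call-used registers that
 * the function actually used, then inspect the epilogue in the assembly. */

#if defined(__has_attribute)
#  if __has_attribute(zero_call_used_regs)
     /* Per-function equivalent of -fzero-call-used-regs=used. */
#    define ZERO_USED_REGS __attribute__((zero_call_used_regs("used")))
#  endif
#endif
#ifndef ZERO_USED_REGS
   /* Attribute not available: fall back to an ordinary function. */
#  define ZERO_USED_REGS
#endif

/* Uses a few argument registers so there is something to clear on return.
 * The return value register is never zeroed. */
ZERO_USED_REGS
int mix(int a, int b, int c)
{
    return (a * 31 + b) ^ c;
}
```

Compiling this with something like `gcc -O2 -S` and reading the instructions just before `ret` in `mix` shows how the registers are cleared. Going by the commit message, a compiler without the patch may emit one XOR followed by MOVs from the zeroed register, while a patched compiler should emit one XOR per register. Which registers appear depends on the compiler version and options.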
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

qinzhao at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

qinzhao at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-01-28
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
[Bug target/101891] Adjust -fzero-call-used-regs to always use XOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |target
             Target|                            |x86_64-linux-gnu

--- Comment #1 from Andrew Pinski ---
The target emits the RTL this way.