Re: [tip:x86/pti] x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro

Denys Vlasenko Mon, 12 Feb 2018 05:44:46 -0800

On 02/12/2018 02:36 PM, David Laight wrote:

From: Denys Vlasenko

Sent: 12 February 2018 13:29

...


x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro

Those instances where ALLOC_PT_GPREGS_ON_STACK is called just before
SAVE_AND_CLEAR_REGS can trivially be replaced by PUSH_AND_CLEAN_REGS.
This macro uses PUSH instead of MOV and should therefore be faster, at
least on newer CPUs.

...

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
   arch/x86/entry/calling.h  | 36 ++++++++++++++++++++++++++++++++++++
   arch/x86/entry/entry_64.S |  6 ++----
   2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index a05cbb8..57b1b87 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -137,6 +137,42 @@ For 32-bit we have the following conventions - kernel is built with
        UNWIND_HINT_REGS offset=\offset
        .endm

+       .macro PUSH_AND_CLEAR_REGS
+       /*
+        * Push registers and sanitize registers of values that a
+        * speculation attack might otherwise want to exploit. The
+        * lower registers are likely clobbered well before they
+        * could be put to use in a speculative execution gadget.
+        * Interleave XOR with PUSH for better uop scheduling:
+        */
+       pushq   %rdi            /* pt_regs->di */
+       pushq   %rsi            /* pt_regs->si */
+       pushq   %rdx            /* pt_regs->dx */
+       pushq   %rcx            /* pt_regs->cx */
+       pushq   %rax            /* pt_regs->ax */
+       pushq   %r8             /* pt_regs->r8 */
+       xorq    %r8, %r8        /* nospec   r8 */


xorq's are slower than xorl's on Silvermont/Knights Landing.
I propose using xorl instead.


Does using movq to copy the first zero to the other registers make
the code any faster?

ISTR mov reg-reg is often implemented as a register rename rather than an
alu operation.


xorl is implemented in register rename as well. For some reason, though,
xorq did not get the same treatment on those CPUs.
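
For illustration only, here is a minimal user-space sketch (not part of the
patch) of why xorl can stand in for xorq when zeroing a register: in 64-bit
mode a write to a 32-bit register zero-extends into the full 64-bit
register, so both forms leave the register at zero. The helper names
zero_with_xorq, zero_with_xorl and check_xor_zeroing are hypothetical, and
the sketch assumes GCC or Clang extended asm. It checks only the
architectural result and says nothing about timing on Silvermont or
Knights Landing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: zero a 64-bit value with the 64-bit XOR form. */
static uint64_t zero_with_xorq(uint64_t v)
{
#if defined(__x86_64__)
	__asm__ volatile("xorq %0, %0" : "+r"(v) : : "cc");
#else
	v = 0;	/* fallback so the sketch still builds on other targets */
#endif
	return v;
}

/* Hypothetical helper: zero the same value with the 32-bit XOR form. */
static uint64_t zero_with_xorl(uint64_t v)
{
#if defined(__x86_64__)
	/*
	 * %k0 prints the 32-bit name of the operand register
	 * (for example %r8d instead of %r8). Writing the 32-bit
	 * register clears bits 63:32 as well.
	 */
	__asm__ volatile("xorl %k0, %k0" : "+r"(v) : : "cc");
#else
	v = 0;	/* fallback so the sketch still builds on other targets */
#endif
	return v;
}

/* Both forms must clear all 64 bits, including the upper half. */
static void check_xor_zeroing(void)
{
	const uint64_t pattern = 0xdeadbeefcafebabeULL;

	assert(zero_with_xorq(pattern) == 0);
	assert(zero_with_xorl(pattern) == 0);
}
```

In the macro itself, the equivalent change would be to write the 32-bit
register name in each XOR, e.g. %r8d in place of %r8.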
