Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize

2016-12-01 Thread Ingo Molnar

* Borislav Petkov  wrote:

> On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
> > Aside from being excessively slow, CPUID is problematic: Linux runs
> > on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
> > available, so use it instead.
> > 
> > Signed-off-by: Andy Lutomirski 
> > ---
> >  arch/x86/include/asm/processor.h | 31 ---
> >  1 file changed, 8 insertions(+), 23 deletions(-)
> 
> Looks nice.
> 
> I'm wondering if we should leave this one in tip for an additional cycle
> to have it tested on more hw. I know, it is architectural and so on but
> who knows what every implementation actually does...

I think -tip and "upstream of the day" mostly gets tested on relatively recent
x86 hardware - proven by the fact that these regressions are many months old.

The reason v4.9 got extra testing is the announced Long Term Support (LTS)
aspect: more, older, weirder hardware is being tested because it's going to be
a very popular base kernel.

So the best option would be to get these fixes into -tip, make sure it's sane
all around and works on hardware that gets tested on bleeding edge kernels,
then push it upstream sooner rather than later and also have Cc:stable tags on
the obvious fixes, and handle any eventual fallout as it happens.

That's the best we can do I think.

Thanks,

Ingo


Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize

2016-12-01 Thread Andrew Cooper
On 01/12/16 17:08, Andy Lutomirski wrote:
> On Thu, Dec 1, 2016 at 1:22 AM, Borislav Petkov  wrote:
>> On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
>>> Aside from being excessively slow, CPUID is problematic: Linux runs
>>> on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
>>> available, so use it instead.
>>>
>>> Signed-off-by: Andy Lutomirski 
>>> ---
>>>  arch/x86/include/asm/processor.h | 31 ---
>>>  1 file changed, 8 insertions(+), 23 deletions(-)
>> Looks nice.
>>
>> I'm wondering if we should leave this one in tip for an additional cycle
>> to have it tested on more hw. I know, it is architectural and so on but
>> who knows what every implementation actually does...
> I want the Xen opinion as well.
>
> Xen folks, can Linux use write_cr2 to serialize the CPU core on Xen PV
> or do we need something a bit heavier weight like native_write_cr2?

To sum up our conversation on IRC.

xen_write_cr2() is not serialising; it is just a write into a shared page.
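
A rough user-space model of that difference, for illustration only. This is
not kernel or Xen code: every name below (struct cpu_ops_model,
model_native_write_cr2, model_xen_write_cr2, model_sync_core) is hypothetical,
and a counter stands in for the serializing side effect of a real MOV to CR2.

```c
#include <stdio.h>

/* Hypothetical stand-in for the per-vCPU area a PV guest shares with Xen. */
struct shared_vcpu_area_model {
	unsigned long cr2;
};

/* Hypothetical, much-reduced model of a paravirt operations table. */
struct cpu_ops_model {
	const char *name;
	void (*write_cr2)(unsigned long val);
};

static struct shared_vcpu_area_model shared_area;
static unsigned long modelled_hw_cr2;       /* models the real CR2 register */
static unsigned int modelled_serialize_cnt; /* models "the core serialized" */

/* Models a real MOV to CR2: the register changes and the core serializes. */
static void model_native_write_cr2(unsigned long val)
{
	modelled_hw_cr2 = val;
	modelled_serialize_cnt++;
}

/* Models the PV hook: a plain store to shared memory, nothing else. */
static void model_xen_write_cr2(unsigned long val)
{
	shared_area.cr2 = val;
}

static const struct cpu_ops_model native_ops_model = {
	.name = "native", .write_cr2 = model_native_write_cr2,
};
static const struct cpu_ops_model xen_pv_ops_model = {
	.name = "xen-pv", .write_cr2 = model_xen_write_cr2,
};

/* Models the patched sync_core(): it only calls whatever hook is installed. */
static void model_sync_core(const struct cpu_ops_model *ops)
{
	ops->write_cr2(0xbf172b23UL);
	printf("%s: serialize events so far = %u\n",
	       ops->name, modelled_serialize_cnt);
}
```

Calling model_sync_core(&xen_pv_ops_model) leaves the counter untouched, which
is the problem in miniature: going through the hook gives no ordering at all.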

native_write_cr2() would trap and be emulated.  This will incur #GP[0]
due to cpl, although not necessarily an iret on the way back out of Xen.

Something like an iret-to-self would be far quicker, and avoid trapping
into the hypervisor.
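
For illustration only, an iret-to-self can be sketched in user space on x86-64
roughly as follows. This is a sketch under stated assumptions, not the kernel's
sync_core(): the name iret_to_self() is made up, the 128-byte red-zone skip is
only needed because this runs as ordinary user code, GNU-style inline asm is
assumed, and on other architectures the function falls back to a plain
compiler barrier.

```c
/*
 * Hypothetical helper: build an IRET frame that returns to the very next
 * instruction with the current CS, SS, RFLAGS and stack pointer, then
 * execute IRETQ.  IRET is architecturally a serializing instruction.
 */
static inline void iret_to_self(void)
{
#if defined(__x86_64__)
	unsigned long tmp, saved_sp;

	__asm__ volatile(
		"movq %%rsp, %1\n\t"          /* remember the original RSP  */
		"leaq -128(%%rsp), %%rsp\n\t" /* step over the red zone     */
		"mov  %%ss, %0\n\t"
		"pushq %0\n\t"                /* frame: SS                  */
		"pushq %1\n\t"                /* frame: RSP to return to    */
		"pushfq\n\t"                  /* frame: RFLAGS              */
		"mov  %%cs, %0\n\t"
		"pushq %0\n\t"                /* frame: CS                  */
		"leaq 1f(%%rip), %0\n\t"
		"pushq %0\n\t"                /* frame: RIP = label below   */
		"iretq\n\t"                   /* pops the frame, lands on 1 */
		"1:"
		: "=&r" (tmp), "=&r" (saved_sp)
		:
		: "cc", "memory");
#else
	/* Assumption: elsewhere this sketch is only a compiler barrier. */
	__asm__ volatile("" ::: "memory");
#endif
}
```

The frame restores the original stack pointer, so the caller sees no change
other than the serialization itself.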

~Andrew


Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize

2016-12-01 Thread Andy Lutomirski
On Thu, Dec 1, 2016 at 1:22 AM, Borislav Petkov  wrote:
> On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
>> Aside from being excessively slow, CPUID is problematic: Linux runs
>> on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
>> available, so use it instead.
>>
>> Signed-off-by: Andy Lutomirski 
>> ---
>>  arch/x86/include/asm/processor.h | 31 ---
>>  1 file changed, 8 insertions(+), 23 deletions(-)
>
> Looks nice.
>
> I'm wondering if we should leave this one in tip for an additional cycle
> to have it tested on more hw. I know, it is architectural and so on but
> who knows what every implementation actually does...

I want the Xen opinion as well.

Xen folks, can Linux use write_cr2 to serialize the CPU core on Xen PV
or do we need something a bit heavier weight like native_write_cr2?

--Andy


Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize

2016-12-01 Thread Borislav Petkov
On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
> Aside from being excessively slow, CPUID is problematic: Linux runs
> on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
> available, so use it instead.
> 
> Signed-off-by: Andy Lutomirski 
> ---
>  arch/x86/include/asm/processor.h | 31 ---
>  1 file changed, 8 insertions(+), 23 deletions(-)

Looks nice.

I'm wondering if we should leave this one in tip for an additional cycle
to have it tested on more hw. I know, it is architectural and so on but
who knows what every implementation actually does...

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


[PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize

2016-11-30 Thread Andy Lutomirski
Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
available, so use it instead.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/processor.h | 31 ++++++++-----------------------
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64fbc937d586..0388f3d85700 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -593,31 +593,16 @@ static __always_inline void cpu_relax(void)
 /* Stop speculative execution and prefetching of modified code. */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
-	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
-	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
-#else
 	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
+	 * MOV to CR2 is architecturally defined as a serializing
+	 * instruction.  It's nice because it works on all CPUs, it
+	 * doesn't clobber registers, and (unlike CPUID) it won't force
+	 * a VM exit.
+	 *
+	 * 0xbf172b23 is random poison just in case something ends up
+	 * caring about this value.
 	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
-#endif
+	write_cr2(0xbf172b23);
 }
 
 extern void select_idle_routine(const struct cpuinfo_x86 *c);
-- 
2.9.3


