Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize
* Borislav Petkov wrote:

> On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
> > Aside from being excessively slow, CPUID is problematic: Linux runs
> > on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
> > available, so use it instead.
> >
> > Signed-off-by: Andy Lutomirski
> > ---
> >  arch/x86/include/asm/processor.h | 31 ---
> >  1 file changed, 8 insertions(+), 23 deletions(-)
>
> Looks nice.
>
> I'm wondering if we should leave this one in tip for an additional cycle
> to have it tested on more hw. I know, it is architectural and so on but
> who knows what every implementation actually does...

I think -tip and "upstream of the day" mostly gets tested on relatively
recent x86 hardware - proven by the fact that these regressions are many
months old.

The reason v4.9 got extra testing is the announced Long Term Support (LTS)
aspect: more, older, weirder hardware is being tested because it's going to
be a very popular base kernel.

So the best option would be to get these fixes into -tip, make sure it's
sane all around and works on hardware that gets tested on bleeding edge
kernels, then push it upstream sooner rather than later and also have
Cc:stable tags on the obvious fixes, and handle any eventual fallout as it
happens.

That's the best we can do I think.

Thanks,

	Ingo
Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize
On 01/12/16 17:08, Andy Lutomirski wrote:
> On Thu, Dec 1, 2016 at 1:22 AM, Borislav Petkov wrote:
>> On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
>>> Aside from being excessively slow, CPUID is problematic: Linux runs
>>> on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
>>> available, so use it instead.
>>>
>>> Signed-off-by: Andy Lutomirski
>>> ---
>>>  arch/x86/include/asm/processor.h | 31 ---
>>>  1 file changed, 8 insertions(+), 23 deletions(-)
>> Looks nice.
>>
>> I'm wondering if we should leave this one in tip for an additional cycle
>> to have it tested on more hw. I know, it is architectural and so on but
>> who knows what every implementation actually does...
> I want the Xen opinion as well.
>
> Xen folks, can Linux use write_cr2 to serialize the CPU core on Xen PV
> or do we need something a bit heavier weight like native_write_cr2?

To sum up our conversation on IRC.

xen_write_cr2() is not serialising; it is just a write into a shared page.

native_write_cr2() would trap and be emulated.  This will incur #GP[0]
due to cpl, although not necessarily an iret on the way back out of Xen.

Something like an iret-to-self would be far quicker, and avoid trapping
into the hypervisor.

~Andrew
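[Editor's illustration] The distinction Andrew draws can be pictured with a small user-space model in C. This is only a sketch, not kernel or Xen code: every name in it (shared_vcpu_page, model_pv_write_cr2, model_native_write_cr2, model_cpu_ops, trap_count) is hypothetical, and the "trap" is just a counter standing in for the #GP and emulation described above. The only point it makes is that a paravirtualized write_cr2 can be an ordinary store, which gives no serialization, while the native instruction would have to go through the hypervisor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the page a PV guest shares with the hypervisor. */
struct shared_vcpu_page {
	uint64_t cr2;
};

static struct shared_vcpu_page shared_page;
static unsigned int trap_count;	/* simulated traps into the hypervisor */

/* Model of the PV path: a plain store, no trap, nothing serializing. */
static void model_pv_write_cr2(uint64_t val)
{
	shared_page.cr2 = val;
}

/* Model of the native path at reduced privilege: fault, then emulation. */
static void model_native_write_cr2(uint64_t val)
{
	trap_count++;		/* stands in for the #GP and the emulation */
	shared_page.cr2 = val;
}

/* Hypothetical ops table, loosely in the spirit of a paravirt dispatch. */
struct model_cpu_ops {
	void (*write_cr2)(uint64_t val);
};

static const struct model_cpu_ops model_xen_pv_ops = {
	.write_cr2 = model_pv_write_cr2,
};

static const struct model_cpu_ops model_native_ops = {
	.write_cr2 = model_native_write_cr2,
};

/* Small self-check: the PV write never traps, the native write does. */
static void model_demo(void)
{
	model_xen_pv_ops.write_cr2(0xbf172b23u);
	assert(shared_page.cr2 == 0xbf172b23u && trap_count == 0);

	model_native_ops.write_cr2(0x1000u);
	assert(shared_page.cr2 == 0x1000u && trap_count == 1);
}
```

Calling model_demo() from a test program exercises both paths. The model says nothing about the iret-to-self approach, which depends on real CPU behaviour and cannot be imitated this way.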
Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize
On Thu, Dec 1, 2016 at 1:22 AM, Borislav Petkov wrote:
> On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
>> Aside from being excessively slow, CPUID is problematic: Linux runs
>> on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
>> available, so use it instead.
>>
>> Signed-off-by: Andy Lutomirski
>> ---
>>  arch/x86/include/asm/processor.h | 31 ---
>>  1 file changed, 8 insertions(+), 23 deletions(-)
>
> Looks nice.
>
> I'm wondering if we should leave this one in tip for an additional cycle
> to have it tested on more hw. I know, it is architectural and so on but
> who knows what every implementation actually does...

I want the Xen opinion as well.

Xen folks, can Linux use write_cr2 to serialize the CPU core on Xen PV
or do we need something a bit heavier weight like native_write_cr2?

--Andy
Re: [PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize
On Wed, Nov 30, 2016 at 12:34:55PM -0800, Andy Lutomirski wrote:
> Aside from being excessively slow, CPUID is problematic: Linux runs
> on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
> available, so use it instead.
>
> Signed-off-by: Andy Lutomirski
> ---
>  arch/x86/include/asm/processor.h | 31 ---
>  1 file changed, 8 insertions(+), 23 deletions(-)

Looks nice.

I'm wondering if we should leave this one in tip for an additional cycle
to have it tested on more hw. I know, it is architectural and so on but
who knows what every implementation actually does...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
[PATCH 4/4] x86/asm: Change sync_core() to use MOV to CR2 to serialize
Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  MOV to CR2 is always
available, so use it instead.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/processor.h | 31 ---
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64fbc937d586..0388f3d85700 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -593,31 +593,16 @@ static __always_inline void cpu_relax(void)
 /* Stop speculative execution and prefetching of modified code. */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
-	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
-	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
-#else
 	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
+	 * MOV to CR2 is architecturally defined as a serializing
+	 * instruction.  It's nice because it works on all CPUs, it
+	 * doesn't clobber registers, and (unlike CPUID) it won't force
+	 * a VM exit.
+	 *
+	 * 0xbf172b23 is random poison just in case something ends up
+	 * caring about this value.
 	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
-#endif
+	write_cr2(0xbf172b23);
 }
 
 extern void select_idle_routine(const struct cpuinfo_x86 *c);
-- 
2.9.3
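[Editor's illustration] The register-clobber point in the new comment is easier to see in a user-space sketch of the CPUID variant that the patch removes. The helper name cpuid_serialize_sketch is hypothetical and this is not the kernel's sync_core(). It assumes GCC or Clang inline assembly on x86-64 and simply returns 0 elsewhere. MOV to CR2 is privileged and cannot be shown the same way from user space.

```c
#include <stdint.h>

/*
 * Hypothetical helper: execute CPUID leaf 1 and return the EAX result.
 * CPUID overwrites EAX, EBX, ECX and EDX, so all four must be declared
 * to the compiler; the "memory" clobber keeps the compiler from moving
 * memory accesses across the instruction.
 */
static uint32_t cpuid_serialize_sketch(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	uint32_t eax = 1, ebx, ecx = 0, edx;

	__asm__ volatile("cpuid"
			 : "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx)
			 :
			 : "memory");
	(void)ebx;
	(void)edx;
	return eax;
#else
	/* Assumption: on other targets there is nothing to demonstrate. */
	return 0;
#endif
}
```

The removed kernel code got the same effect with one output operand and explicit "ebx", "ecx", "edx" clobbers. Either way four registers are lost to the caller, whereas write_cr2() needs only its one input value.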