Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Nadav Amit
> On Jul 11, 2019, at 8:08 AM, Kees Cook  wrote:
> 
> On Thu, Jul 11, 2019 at 10:01:34AM +0200, Peter Zijlstra wrote:
>> On Thu, Jul 11, 2019 at 07:11:19AM +, Nadav Amit wrote:
>>>> On Jul 10, 2019, at 7:22 AM, Jiri Kosina  wrote:
>>>> 
>>>> On Wed, 10 Jul 2019, Peter Zijlstra wrote:
>>>> 
>>>>> If we mark the key as RO after init, and then try and modify the key to
>>>>> link module usage sites, things might go bang as described.
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> 
>>>>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>>>>> index 27d7864e7252..5bf7a8354da2 100644
>>>>> --- a/arch/x86/kernel/cpu/common.c
>>>>> +++ b/arch/x86/kernel/cpu/common.c
>>>>> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
>>>>> cpuinfo_x86 *c)
>>>>>   cr4_clear_bits(X86_CR4_UMIP);
>>>>> }
>>>>> 
>>>>> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
>>>>> +DEFINE_STATIC_KEY_FALSE(cr_pinning);
>>>> 
>>>> Good catch, I guess that is going to fix it.
>>>> 
>>>> At the same time though, it sort of destroys the original intent of Kees' 
>>>> patch, right? The exploits will just have to call static_key_disable() 
>>>> prior to calling native_write_cr4() again, and the protection is gone.
>>> 
>>> Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call
>>> set_memory_rw(), make the page that holds the key writable, and then call
>>> static_key_disable(), followed by a call to native_write_cr4().
>> 
>> Or call text_poke_bp() with the right set of arguments.
> 
> Right -- the point is to make it defended against an arbitrary write,
> not arbitrary execution. Nothing is safe from arbitrary exec, but we can
> do our due diligence on making things read-only.

I don’t understand.

In the PoC that motivated this patch [1], the attacker gained the ability
to call a function and control its first argument, and used it to disable
SMEP/SMAP by calling native_write_cr4(), which he did before doing an
arbitrary write (another ability he gained).

After this patch, the attacker can instead call three functions and, by
controlling their first arguments (*), disable SMEP. I didn't see anything
in the motivating PoC that would prevent calling three functions, one at a
time.

So it seems to me that it raised the bar for an attack by very little.

--

(*) set_memory_rw() has a second argument - the number of pages - but many
non-zero values may be fine (if not, a warning from __cpa_process_fault()
might appear).

[1] 
https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
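
Concretely, the three calls being counted would look roughly like this (a
hedged sketch assuming the PoC's call-a-kernel-function-with-a-controlled-
first-argument primitive and a known address for cr_pinning; hypothetical
illustration, not code from the PoC or from the kernel):

	/* 1: make the __ro_after_init page holding the key writable */
	set_memory_rw((unsigned long)&cr_pinning, 1);

	/* 2: with the page writable again, the pinning branch can be disabled */
	static_key_disable(&cr_pinning.key);

	/* 3: pinning is now inert, so the cleared bits stay cleared
	 * (the CR4 value would be attacker-chosen in practice) */
	native_write_cr4(__read_cr4() & ~(X86_CR4_SMEP | X86_CR4_SMAP));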



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Kees Cook
On Thu, Jul 11, 2019 at 10:01:34AM +0200, Peter Zijlstra wrote:
> On Thu, Jul 11, 2019 at 07:11:19AM +, Nadav Amit wrote:
> > > On Jul 10, 2019, at 7:22 AM, Jiri Kosina  wrote:
> > > 
> > > On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> > > 
> > >> If we mark the key as RO after init, and then try and modify the key to
> > >> link module usage sites, things might go bang as described.
> > >> 
> > >> Thanks!
> > >> 
> > >> 
> > >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > >> index 27d7864e7252..5bf7a8354da2 100644
> > >> --- a/arch/x86/kernel/cpu/common.c
> > >> +++ b/arch/x86/kernel/cpu/common.c
> > >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
> > >> cpuinfo_x86 *c)
> > >>  cr4_clear_bits(X86_CR4_UMIP);
> > >> }
> > >> 
> > >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> > >> +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> > > 
> > > Good catch, I guess that is going to fix it.
> > > 
> > > At the same time though, it sort of destroys the original intent of Kees' 
> > > patch, right? The exploits will just have to call static_key_disable() 
> > > prior to calling native_write_cr4() again, and the protection is gone.
> > 
> > Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call
> > set_memory_rw(), make the page that holds the key writable, and then call
> > static_key_disable(), followed by a call to native_write_cr4().
> 
> Or call text_poke_bp() with the right set of arguments.

Right -- the point is to make it defended against an arbitrary write,
not arbitrary execution. Nothing is safe from arbitrary exec, but we can
do our due diligence on making things read-only.

-- 
Kees Cook


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Peter Zijlstra
On Thu, Jul 11, 2019 at 07:11:19AM +, Nadav Amit wrote:
> > On Jul 10, 2019, at 7:22 AM, Jiri Kosina  wrote:
> > 
> > On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> > 
> >> If we mark the key as RO after init, and then try and modify the key to
> >> link module usage sites, things might go bang as described.
> >> 
> >> Thanks!
> >> 
> >> 
> >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> >> index 27d7864e7252..5bf7a8354da2 100644
> >> --- a/arch/x86/kernel/cpu/common.c
> >> +++ b/arch/x86/kernel/cpu/common.c
> >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
> >> cpuinfo_x86 *c)
> >>cr4_clear_bits(X86_CR4_UMIP);
> >> }
> >> 
> >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> >> +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> > 
> > Good catch, I guess that is going to fix it.
> > 
> > At the same time though, it sort of destroys the original intent of Kees' 
> > patch, right? The exploits will just have to call static_key_disable() 
> > prior to calling native_write_cr4() again, and the protection is gone.
> 
> Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call
> set_memory_rw(), make the page that holds the key writable, and then call
> static_key_disable(), followed by a call to native_write_cr4().

Or call text_poke_bp() with the right set of arguments.


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Thomas Gleixner
On Thu, 11 Jul 2019, Nadav Amit wrote:
> > On Jul 10, 2019, at 7:22 AM, Jiri Kosina  wrote:
> > 
> > On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> > 
> >> If we mark the key as RO after init, and then try and modify the key to
> >> link module usage sites, things might go bang as described.
> >> 
> >> Thanks!
> >> 
> >> 
> >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> >> index 27d7864e7252..5bf7a8354da2 100644
> >> --- a/arch/x86/kernel/cpu/common.c
> >> +++ b/arch/x86/kernel/cpu/common.c
> >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
> >> cpuinfo_x86 *c)
> >>cr4_clear_bits(X86_CR4_UMIP);
> >> }
> >> 
> >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> >> +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> > 
> > Good catch, I guess that is going to fix it.
> > 
> > At the same time though, it sort of destroys the original intent of Kees' 
> > patch, right? The exploits will just have to call static_key_disable() 
> > prior to calling native_write_cr4() again, and the protection is gone.
> 
> Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call
> set_memory_rw(), make the page that holds the key writable, and then call
> static_key_disable(), followed by a call to native_write_cr4().

That's true, but it's not worth the trouble.

Thanks,

tglx


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Nadav Amit
> On Jul 10, 2019, at 7:22 AM, Jiri Kosina  wrote:
> 
> On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> 
>> If we mark the key as RO after init, and then try and modify the key to
>> link module usage sites, things might go bang as described.
>> 
>> Thanks!
>> 
>> 
>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> index 27d7864e7252..5bf7a8354da2 100644
>> --- a/arch/x86/kernel/cpu/common.c
>> +++ b/arch/x86/kernel/cpu/common.c
>> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
>> cpuinfo_x86 *c)
>>  cr4_clear_bits(X86_CR4_UMIP);
>> }
>> 
>> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
>> +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> 
> Good catch, I guess that is going to fix it.
> 
> At the same time though, it sort of destroys the original intent of Kees' 
> patch, right? The exploits will just have to call static_key_disable() 
> prior to calling native_write_cr4() again, and the protection is gone.

Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call
set_memory_rw(), make the page that holds the key writable, and then call
static_key_disable(), followed by a call to native_write_cr4().

Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 10:15 PM Linus Torvalds
 wrote:
>
> I think there is _another_ problem too, and maybe it's the ACPI one,
> but this one triggers some issue before the other issue even gets to
> play..

No, the other problem was the keyring ACL support from David Howells,
which seems to have broken encrypted disk setups.

So apparently I wasn't impacted by the ACPI thing, but had two other
issues instead. Not a great start to the merge window.

 Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Xi Ruoyao
On 2019-07-10 17:13 +0200, Thomas Gleixner wrote:
> Something like the below. Builds and boots, must be perfect.
> 
> Thanks,
> 
>   tglx

Tested-by: Xi Ruoyao 

> 8<
> 
>  arch/x86/include/asm/processor.h |1 
>  arch/x86/include/asm/special_insns.h |   41 ---
>  arch/x86/kernel/cpu/common.c |   72 +++
> 
>  arch/x86/kernel/smpboot.c|   14 --
>  arch/x86/xen/smp_pv.c|1 
>  5 files changed, 61 insertions(+), 68 deletions(-)
> 
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -741,6 +741,7 @@ extern void load_direct_gdt(int);
>  extern void load_fixmap_gdt(int);
>  extern void load_percpu_segment(int);
>  extern void cpu_init(void);
> +extern void cr4_init(void);
>  
>  static inline unsigned long get_debugctlmsr(void)
>  {
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -18,9 +18,7 @@
>   */
>  extern unsigned long __force_order;
>  
> -/* Starts false and gets enabled once CPU feature detection is done. */
> -DECLARE_STATIC_KEY_FALSE(cr_pinning);
> -extern unsigned long cr4_pinned_bits;
> +void native_write_cr0(unsigned long val);
>  
>  static inline unsigned long native_read_cr0(void)
>  {
> @@ -29,24 +27,6 @@ static inline unsigned long native_read_
>   return val;
>  }
>  
> -static inline void native_write_cr0(unsigned long val)
> -{
> - unsigned long bits_missing = 0;
> -
> -set_register:
> - asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
> -
> - if (static_branch_likely(&cr_pinning)) {
> - if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
> - bits_missing = X86_CR0_WP;
> - val |= bits_missing;
> - goto set_register;
> - }
> - /* Warn after we've set the missing bits. */
> - WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
> - }
> -}
> -
>  static inline unsigned long native_read_cr2(void)
>  {
>   unsigned long val;
> @@ -91,24 +71,7 @@ static inline unsigned long native_read_
>   return val;
>  }
>  
> -static inline void native_write_cr4(unsigned long val)
> -{
> - unsigned long bits_missing = 0;
> -
> -set_register:
> - asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> -
> - if (static_branch_likely(&cr_pinning)) {
> - if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> - bits_missing = ~val & cr4_pinned_bits;
> - val |= bits_missing;
> - goto set_register;
> - }
> - /* Warn after we've set the missing bits. */
> - WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> -   bits_missing);
> - }
> -}
> +void native_write_cr4(unsigned long val);
>  
>  #ifdef CONFIG_X86_64
>  static inline unsigned long native_read_cr8(void)
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -366,10 +366,62 @@ static __always_inline void setup_umip(s
>   cr4_clear_bits(X86_CR4_UMIP);
>  }
>  
> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> -EXPORT_SYMBOL(cr_pinning);
> -unsigned long cr4_pinned_bits __ro_after_init;
> -EXPORT_SYMBOL(cr4_pinned_bits);
> +static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> +static unsigned long cr4_pinned_bits __ro_after_init;
> +
> +void native_write_cr0(unsigned long val)
> +{
> + unsigned long bits_missing = 0;
> +
> +set_register:
> + asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
> +
> + if (static_branch_likely(&cr_pinning)) {
> + if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
> + bits_missing = X86_CR0_WP;
> + val |= bits_missing;
> + goto set_register;
> + }
> + /* Warn after we've set the missing bits. */
> + WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
> + }
> +}
> +EXPORT_SYMBOL(native_write_cr0);
> +
> +void native_write_cr4(unsigned long val)
> +{
> + unsigned long bits_missing = 0;
> +
> +set_register:
> + asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> +
> + if (static_branch_likely(&cr_pinning)) {
> + if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> + bits_missing = ~val & cr4_pinned_bits;
> + val |= bits_missing;
> + goto set_register;
> + }
> + /* Warn after we've set the missing bits. */
> + WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> +   bits_missing);
> + }
> +}
> +EXPORT_SYMBOL(native_write_cr4);
> +
> +void cr4_init(void)
> +{
> + unsigned long cr4 = __read_cr4();
> +
> + if (boot_cpu_has(X86_FEATURE_PCID))
> + cr4 |= X86_CR4_PCIDE;
> + if 

Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Thomas Gleixner
On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> On Wed, Jul 10, 2019 at 04:22:51PM +0200, Jiri Kosina wrote:
> > On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> > 
> > > If we mark the key as RO after init, and then try and modify the key to
> > > link module usage sites, things might go bang as described.
> > > 
> > > Thanks!
> > > 
> > > 
> > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > > index 27d7864e7252..5bf7a8354da2 100644
> > > --- a/arch/x86/kernel/cpu/common.c
> > > +++ b/arch/x86/kernel/cpu/common.c
> > > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
> > > cpuinfo_x86 *c)
> > >   cr4_clear_bits(X86_CR4_UMIP);
> > >  }
> > >  
> > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> > > +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> > 
> > Good catch, I guess that is going to fix it.
> > 
> > At the same time though, it sort of destroys the original intent of Kees' 
> > patch, right? The exploits will just have to call static_key_disable() 
> > prior to calling native_write_cr4() again, and the protection is gone.
> 
> This is fixable by moving native_write_cr*() out-of-line, such that they
> never end up in a module.

Something like the below. Builds and boots, must be perfect.

Thanks,

tglx

8<

 arch/x86/include/asm/processor.h |1 
 arch/x86/include/asm/special_insns.h |   41 ---
 arch/x86/kernel/cpu/common.c |   72 +++
 arch/x86/kernel/smpboot.c|   14 --
 arch/x86/xen/smp_pv.c|1 
 5 files changed, 61 insertions(+), 68 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -741,6 +741,7 @@ extern void load_direct_gdt(int);
 extern void load_fixmap_gdt(int);
 extern void load_percpu_segment(int);
 extern void cpu_init(void);
+extern void cr4_init(void);
 
 static inline unsigned long get_debugctlmsr(void)
 {
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -18,9 +18,7 @@
  */
 extern unsigned long __force_order;
 
-/* Starts false and gets enabled once CPU feature detection is done. */
-DECLARE_STATIC_KEY_FALSE(cr_pinning);
-extern unsigned long cr4_pinned_bits;
+void native_write_cr0(unsigned long val);
 
 static inline unsigned long native_read_cr0(void)
 {
@@ -29,24 +27,6 @@ static inline unsigned long native_read_
return val;
 }
 
-static inline void native_write_cr0(unsigned long val)
-{
-   unsigned long bits_missing = 0;
-
-set_register:
-   asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
-
-   if (static_branch_likely(&cr_pinning)) {
-   if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
-   bits_missing = X86_CR0_WP;
-   val |= bits_missing;
-   goto set_register;
-   }
-   /* Warn after we've set the missing bits. */
-   WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
-   }
-}
-
 static inline unsigned long native_read_cr2(void)
 {
unsigned long val;
@@ -91,24 +71,7 @@ static inline unsigned long native_read_
return val;
 }
 
-static inline void native_write_cr4(unsigned long val)
-{
-   unsigned long bits_missing = 0;
-
-set_register:
-   asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
-
-   if (static_branch_likely(&cr_pinning)) {
-   if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
-   bits_missing = ~val & cr4_pinned_bits;
-   val |= bits_missing;
-   goto set_register;
-   }
-   /* Warn after we've set the missing bits. */
-   WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
- bits_missing);
-   }
-}
+void native_write_cr4(unsigned long val);
 
 #ifdef CONFIG_X86_64
 static inline unsigned long native_read_cr8(void)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -366,10 +366,62 @@ static __always_inline void setup_umip(s
cr4_clear_bits(X86_CR4_UMIP);
 }
 
-DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
-EXPORT_SYMBOL(cr_pinning);
-unsigned long cr4_pinned_bits __ro_after_init;
-EXPORT_SYMBOL(cr4_pinned_bits);
+static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
+static unsigned long cr4_pinned_bits __ro_after_init;
+
+void native_write_cr0(unsigned long val)
+{
+   unsigned long bits_missing = 0;
+
+set_register:
+   asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
+
+   if (static_branch_likely(&cr_pinning)) {
+   if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
+   bits_missing = X86_CR0_WP;
+   val |= bits_missing;
+   goto set_register;
+   }
+   /* Warn after we've set the missing bits. */
+   WARN_ONCE(bits_missing, "CR0 WP bit went 

Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Xi Ruoyao
On 2019-07-10 16:22 +0200, Jiri Kosina wrote:
> On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> 
> > If we mark the key as RO after init, and then try and modify the key to
> > link module usage sites, things might go bang as described.
> > 
> > Thanks!
> > 
> > 
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index 27d7864e7252..5bf7a8354da2 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct
> > cpuinfo_x86 *c)
> > cr4_clear_bits(X86_CR4_UMIP);
> >  }
> >  
> > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> > +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> 
> Good catch, I guess that is going to fix it.

Yes it works.

> At the same time though, it sort of destroys the original intent of Kees' 
> patch, right? The exploits will just have to call static_key_disable() 
> prior to calling native_write_cr4() again, and the protection is gone.

I think I should do some study and try to understand the full story of Kees'
change...
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Peter Zijlstra
On Wed, Jul 10, 2019 at 04:22:51PM +0200, Jiri Kosina wrote:
> On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> 
> > If we mark the key as RO after init, and then try and modify the key to
> > link module usage sites, things might go bang as described.
> > 
> > Thanks!
> > 
> > 
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index 27d7864e7252..5bf7a8354da2 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct 
> > cpuinfo_x86 *c)
> > cr4_clear_bits(X86_CR4_UMIP);
> >  }
> >  
> > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> > +DEFINE_STATIC_KEY_FALSE(cr_pinning);
> 
> Good catch, I guess that is going to fix it.
> 
> At the same time though, it sort of destroys the original intent of Kees' 
> patch, right? The exploits will just have to call static_key_disable() 
> prior to calling native_write_cr4() again, and the protection is gone.

This is fixable by moving native_write_cr*() out-of-line, such that they
never end up in a module.



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Thomas Gleixner
On Wed, 10 Jul 2019, Thomas Gleixner wrote:

> On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> 
> > On Wed, Jul 10, 2019 at 09:25:16PM +0800, Xi Ruoyao wrote:
> > > On 2019-07-10 14:31 +0200, Jiri Kosina wrote:
> > > > Adding Daniel to check whether this couldn't be some fallout of 
> > > > jumplabel 
> > > > batching.
> > > 
> > > I don't think so.  I tried to revert Daniel's jumplabel batching commits 
> > > and the
> > > issue wasn't solved.  But reverting Kees' CR0 and CR4 commits can "fix" it
> > > (apparently).
> > 
> > Xi, could you please try the below instead?
> > 
> > If we mark the key as RO after init, and then try and modify the key to
> > link module usage sites, things might go bang as described.
> 
> Right. I finally was able to reproduce that with Linus' config (slightly
> modified). Applying your patch makes it go away.
> 
> Now what puzzles me is that this never exploded in my face before and with
> a debian config it JustWorks.
> 
> Both configs have:
> 
>  CONFIG_KVM=m
>  CONFIG_KVM_INTEL=m
> 
> so I'd expect both to crash and burn when KVM_INTEL is loaded as that has a
> cr4 operation inside.
> 
> So something papers over that ... Still looking.

Just found it. All my usual configs have the paravirt muck enabled. Linus's
config does not, and neither does Xi's.

With paravirt the CR pinning key is never in the module. It's in the
paravirt function and obviously always built in.

Thanks,

tglx
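
A rough sketch of why that config difference matters (simplified; not the
exact special_insns.h/paravirt.h of that tree, and the guard may be
CONFIG_PARAVIRT_XXL rather than CONFIG_PARAVIRT):

#ifdef CONFIG_PARAVIRT
/* Modules emit only an indirect call; the pinning static branch lives in the
 * built-in native_write_cr4() backend, so no module ever carries a jump entry
 * referencing cr_pinning and module load never has to touch the RO key.
 */
static inline void __write_cr4(unsigned long x)
{
	pv_ops.cpu.write_cr4(x);	/* roughly PVOP_VCALL1(cpu.write_cr4, x) */
}
#else
/* The inline body -- including static_branch_likely(&cr_pinning) -- is
 * compiled into every caller, so kvm_intel.ko ends up with a jump entry
 * referencing the built-in key, and jump_label_add_module() must write that
 * (read-only) key when the module is loaded: the reported #PF.
 */
static inline void __write_cr4(unsigned long x)
{
	native_write_cr4(x);
}
#endif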


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Jiri Kosina
On Wed, 10 Jul 2019, Peter Zijlstra wrote:

> If we mark the key as RO after init, and then try and modify the key to
> link module usage sites, things might go bang as described.
> 
> Thanks!
> 
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 27d7864e7252..5bf7a8354da2 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct cpuinfo_x86 
> *c)
>   cr4_clear_bits(X86_CR4_UMIP);
>  }
>  
> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> +DEFINE_STATIC_KEY_FALSE(cr_pinning);

Good catch, I guess that is going to fix it.

At the same time though, it sort of destroys the original intent of Kees' 
patch, right? The exploits will just have to call static_key_disable() 
prior to calling native_write_cr4() again, and the protection is gone.

-- 
Jiri Kosina
SUSE Labs



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Thomas Gleixner
On Wed, 10 Jul 2019, Peter Zijlstra wrote:

> On Wed, Jul 10, 2019 at 09:25:16PM +0800, Xi Ruoyao wrote:
> > On 2019-07-10 14:31 +0200, Jiri Kosina wrote:
> > > Adding Daniel to check whether this couldn't be some fallout of jumplabel 
> > > batching.
> > 
> > I don't think so.  I tried to revert Daniel's jumplabel batching commits 
> > and the
> > issue wasn't solved.  But reverting Kees' CR0 and CR4 commits can "fix" it
> > (apparently).
> 
> Xi, could you please try the below instead?
> 
> If we mark the key as RO after init, and then try and modify the key to
> link module usage sites, things might go bang as described.

Right. I finally was able to reproduce that with Linus' config (slightly
modified). Applying your patch makes it go away.

Now what puzzles me is that this never exploded in my face before and with
a debian config it JustWorks.

Both configs have:

 CONFIG_KVM=m
 CONFIG_KVM_INTEL=m

so I'd expect both to crash and burn when KVM_INTEL is loaded as that has a
cr4 operation inside.

So something papers over that ... Still looking.

Thanks,

tglx




Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Peter Zijlstra
On Wed, Jul 10, 2019 at 09:25:16PM +0800, Xi Ruoyao wrote:
> On 2019-07-10 14:31 +0200, Jiri Kosina wrote:
> > Adding Daniel to check whether this couldn't be some fallout of jumplabel 
> > batching.
> 
> I don't think so.  I tried to revert Daniel's jumplabel batching commits and 
> the
> issue wasn't solved.  But reverting Kees' CR0 and CR4 commits can "fix" it
> (apparently).

Xi, could you please try the below instead?

If we mark the key as RO after init, and then try and modify the key to
link module usage sites, things might go bang as described.

Thanks!


diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 27d7864e7252..5bf7a8354da2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct cpuinfo_x86 
*c)
cr4_clear_bits(X86_CR4_UMIP);
 }
 
-DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
+DEFINE_STATIC_KEY_FALSE(cr_pinning);
 EXPORT_SYMBOL(cr_pinning);
 unsigned long cr4_pinned_bits __ro_after_init;
 EXPORT_SYMBOL(cr4_pinned_bits);
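
For reference, the only difference between the two macros is where the key is
placed; roughly (simplified from include/linux/jump_label.h and
include/linux/cache.h of that era, not verbatim):

#define DEFINE_STATIC_KEY_FALSE(name)				\
	struct static_key_false name = STATIC_KEY_FALSE_INIT

#define DEFINE_STATIC_KEY_FALSE_RO(name)			\
	struct static_key_false name __ro_after_init = STATIC_KEY_FALSE_INIT

/* __ro_after_init drops the object into a section that mark_rodata_ro()
 * write-protects once init is complete:
 */
#define __ro_after_init __attribute__((__section__(".data..ro_after_init")))

So with the _RO variant the key is already read-only by the time kvm_intel is
loaded, and the write jump_label_add_module() does to link the module's jump
entries into the key is exactly the #PF seen in the reports.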


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Xi Ruoyao
On 2019-07-10 15:28 +0200, Jiri Kosina wrote:
> On Wed, 10 Jul 2019, Jiri Kosina wrote:

> > > > > BUG: unable to handle page fault for address: 9edc1598
> > > > > #PF: supervisor write access in kernel mode
> > > > > #PF: error_code(0x0003) - permissions violation

> > Hm, and it seems to explode on dereferencing the static_key* in %rsi
> 
> ^^^ %rdi of course
> 
> >   21:   48 8b 37    mov    (%rdi),%rsi
> >   24:   83 e6 03    and    $0x3,%esi
> >   27:   48 09 c6    or     %rax,%rsi
> >   2a:*  48 89 37    mov    %rsi,(%rdi)    <-- trapping instruction
> > 
> > which looks odd, as it dereferenced it successfully just 3 instructions ago.

It seems the MMU (I guess?) allows reading it but disallows writing it:
"supervisor write access in kernel mode".
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
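
The decoded page-fault error code says the same thing; as a quick reference
(the kernel's names for these bits are X86_PF_*, from memory; hedged, not a
quote of the headers):

/* #PF error_code(0x0003) from the oops, bit by bit: */
#define X86_PF_PROT	0x1	/* bit 0: page was present -> protection violation */
#define X86_PF_WRITE	0x2	/* bit 1: the faulting access was a write          */
#define X86_PF_USER	0x4	/* bit 2: fault came from user mode (clear here)   */

/* 0x0003 = X86_PF_PROT | X86_PF_WRITE: a kernel-mode write to a page that is
 * mapped and readable but not writable -- the __ro_after_init page holding
 * the static key.  That is why the read a few instructions earlier succeeded
 * and only the store faulted.
 */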



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Jiri Kosina
On Wed, 10 Jul 2019, Jiri Kosina wrote:

> On Wed, 10 Jul 2019, Peter Zijlstra wrote:
> 
> > > > BUG: unable to handle page fault for address: 9edc1598
> > > > #PF: supervisor write access in kernel mode
> > > > #PF: error_code(0x0003) - permissions violation
> > > > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 
> > > > Oops: 0003 [#1] SMP PTI
> > > > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54
> > > > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013
> > > > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
> > > > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 
> > > > 00 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 
> > > > c3 0f 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
> > > > RSP: :a606c032bc98 EFLAGS: 00010286
> > > > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
> > > > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
> > > > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
> > > > R10:  R11: 00028b56 R12: c06f8880
> > > > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
> > > > FS:  7f992dd9a680() GS:9981e708() 
> > > > knlGS:
> > > > CS:  0010 DS:  ES:  CR0: 80050033
> > > > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
> > > > Call Trace:
> > > >   jump_label_module_notify+0x1e7/0x2b0
> > > >   notifier_call_chain+0x44/0x70
> > > >   blocking_notifier_call_chain+0x43/0x60
> > > >   load_module+0x1bcb/0x2490
> > > >   ? vfs_read+0x11f/0x150
> > > >   ? __do_sys_finit_module+0xbf/0xe0
> > > >   __do_sys_finit_module+0xbf/0xe0
> > > >   do_syscall_64+0x43/0x110
> > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > 
> > > > Josh, didn't you mention that yesterday or so?
> > > 
> > > That's what Tony yesterday indicated on IRC that his system is suffering 
> > > from as well.
> > > 
> > > Adding Daniel to check whether this couldn't be some fallout of jumplabel 
> > > batching.
> > 
> > AFAICT this is _before_ we get to patching. The function that explodes
> > is static_key_set_mod(), as called from jump_label_add_module().
> > 
> > What that function does is for all patch sites in the module, find the
> > corresponding key; if that key is not also in that module, allocate a
> > static_key_mod structure and link the module entries to the key. Such
> > that we can find all instances from a given key.
> > 
> > I don't think anything here has changed in a while.
> 
> Hm, and it seems to explode on dereferencing the static_key* in %rsi

  ^^^ %rdi of course

>   21:   48 8b 37    mov    (%rdi),%rsi
>   24:   83 e6 03    and    $0x3,%esi
>   27:   48 09 c6    or     %rax,%rsi
>   2a:*  48 89 37    mov    %rsi,(%rdi)    <-- trapping instruction
> 
> which looks odd, as it dereferenced it successfully just 3 instructions ago.
> 
> -- 
> Jiri Kosina
> SUSE Labs
> 
> 

-- 
Jiri Kosina
SUSE Labs



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Jiri Kosina
On Wed, 10 Jul 2019, Peter Zijlstra wrote:

> > > BUG: unable to handle page fault for address: 9edc1598
> > > #PF: supervisor write access in kernel mode
> > > #PF: error_code(0x0003) - permissions violation
> > > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 
> > > Oops: 0003 [#1] SMP PTI
> > > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54
> > > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013
> > > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
> > > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 
> > > 00 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 
> > > 0f 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
> > > RSP: :a606c032bc98 EFLAGS: 00010286
> > > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
> > > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
> > > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
> > > R10:  R11: 00028b56 R12: c06f8880
> > > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
> > > FS:  7f992dd9a680() GS:9981e708() 
> > > knlGS:
> > > CS:  0010 DS:  ES:  CR0: 80050033
> > > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
> > > Call Trace:
> > >   jump_label_module_notify+0x1e7/0x2b0
> > >   notifier_call_chain+0x44/0x70
> > >   blocking_notifier_call_chain+0x43/0x60
> > >   load_module+0x1bcb/0x2490
> > >   ? vfs_read+0x11f/0x150
> > >   ? __do_sys_finit_module+0xbf/0xe0
> > >   __do_sys_finit_module+0xbf/0xe0
> > >   do_syscall_64+0x43/0x110
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > 
> > > Josh, didn't you mention that yesterday or so?
> > 
> > That's what Tony yesterday indicated on IRC that his system is suffering 
> > from as well.
> > 
> > Adding Daniel to check whether this couldn't be some fallout of jumplabel 
> > batching.
> 
> AFAICT this is _before_ we get to patching. The function that explodes
> is static_key_set_mod(), as called from jump_label_add_module().
> 
> What that function does is for all patch sites in the module, find the
> corresponding key; if that key is not also in that module, allocate a
> static_key_mod structure and link the module entries to the key. Such
> that we can find all instances from a given key.
> 
> I don't think anything here has changed in a while.

Hm, and it seems to explode on dereferencing the static_key* in %rsi

  21:   48 8b 37    mov    (%rdi),%rsi
  24:   83 e6 03    and    $0x3,%esi
  27:   48 09 c6    or     %rax,%rsi
  2a:*  48 89 37    mov    %rsi,(%rdi)    <-- trapping instruction

which looks odd, as it dereferenced it successfully just 3 instructions ago.

-- 
Jiri Kosina
SUSE Labs



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Xi Ruoyao
On 2019-07-10 14:31 +0200, Jiri Kosina wrote:
> Adding Daniel to check whether this couldn't be some fallout of jumplabel 
> batching.

I don't think so.  I tried to revert Daniel's jumplabel batching commits and the
issue wasn't solved.  But reverting Kees' CR0 and CR4 commits can "fix" it
(apparently).
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Peter Zijlstra
On Wed, Jul 10, 2019 at 02:31:29PM +0200, Jiri Kosina wrote:
> On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> 
> > From the log:
> > 
> > BUG: unable to handle page fault for address: 9edc1598
> > #PF: supervisor write access in kernel mode
> > #PF: error_code(0x0003) - permissions violation
> > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 
> > Oops: 0003 [#1] SMP PTI
> > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54
> > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013
> > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
> > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 
> > 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 
> > 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
> > RSP: :a606c032bc98 EFLAGS: 00010286
> > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
> > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
> > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
> > R10:  R11: 00028b56 R12: c06f8880
> > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
> > FS:  7f992dd9a680() GS:9981e708() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
> > Call Trace:
> >   jump_label_module_notify+0x1e7/0x2b0
> >   notifier_call_chain+0x44/0x70
> >   blocking_notifier_call_chain+0x43/0x60
> >   load_module+0x1bcb/0x2490
> >   ? vfs_read+0x11f/0x150
> >   ? __do_sys_finit_module+0xbf/0xe0
> >   __do_sys_finit_module+0xbf/0xe0
> >   do_syscall_64+0x43/0x110
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > Josh, didn't you mention that yesterday or so?
> 
> That's what Tony yesterday indicated on IRC that his system is suffering 
> from as well.
> 
> Adding Daniel to check whether this couldn't be some fallout of jumplabel 
> batching.

AFAICT this is _before_ we get to patching. The function that explodes
is static_key_set_mod(), as called from jump_label_add_module().

What that function does is for all patch sites in the module, find the
corresponding key; if that key is not also in that module, allocate a
static_key_mod structure and link the module entries to the key. Such
that we can find all instances from a given key.

I don't think anything here has changed in a while.
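
For reference, the store that faults is the pointer/type update in
static_key_set_mod(); roughly (paraphrased from kernel/jump_label.c of that
era, lining up with the disassembly Jiri posted; not verbatim):

static void static_key_set_mod(struct static_key *key,
			       struct static_key_mod *mod)
{
	unsigned long type;

	/* the low two bits of key->type carry the JUMP_TYPE_* flags */
	type = key->type & JUMP_TYPE_MASK;	/* mov (%rdi),%rsi ; and $0x3,%esi */
	key->type = (unsigned long)mod | type;	/* or %rax,%rsi ; mov %rsi,(%rdi) -- the faulting store */
}

The key itself lives in the core kernel; with DEFINE_STATIC_KEY_FALSE_RO it
sits in .data..ro_after_init, so the load succeeds but the store trips the
write protection.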


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Jiri Kosina
On Wed, 10 Jul 2019, Thomas Gleixner wrote:

> From the log:
> 
> BUG: unable to handle page fault for address: 9edc1598
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0003) - permissions violation
> PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 
> Oops: 0003 [#1] SMP PTI
> 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54
> Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013
> RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
> Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 
> 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 
> 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
> RSP: :a606c032bc98 EFLAGS: 00010286
> RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
> RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
> RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
> R10:  R11: 00028b56 R12: c06f8880
> R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
> FS:  7f992dd9a680() GS:9981e708() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
> Call Trace:
>   jump_label_module_notify+0x1e7/0x2b0
>   notifier_call_chain+0x44/0x70
>   blocking_notifier_call_chain+0x43/0x60
>   load_module+0x1bcb/0x2490
>   ? vfs_read+0x11f/0x150
>   ? __do_sys_finit_module+0xbf/0xe0
>   __do_sys_finit_module+0xbf/0xe0
>   do_syscall_64+0x43/0x110
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Josh, didn't you mention that yesterday or so?

That's what Tony yesterday indicated on IRC that his system is suffering 
from as well.

Adding Daniel to check whether this couldn't be some fallout of jumplabel 
batching.

> 
> 
> RIP: 0033:0x7f992e2eeaf9
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 
> 01 c3 48 8b 0d 67 73 0d 00 f7 d8 64 89 01 48
> RSP: 002b:7ffca220d288 EFLAGS: 0246 ORIG_RAX: 0139
> RAX: ffda RBX: 009b8da0 RCX: 7f992e2eeaf9
> RDX:  RSI: 7f992e464885 RDI: 0010
> RBP: 0002 R08:  R09: 009c45c0
> R10: 0010 R11: 0246 R12: 7f992e464885
> R13:  R14: 009acc50 R15: 009b8da0
> Modules linked in: kvm_intel(+) kvm irqbypass hid_sensor_hub crc32_pclmul 
> mfd_core i2c_i801 snd_hda_intel i915(+) intel_gtt snd_hda_codec i2c_algo_bit 
> snd_hwdep snd_hda_core drm_kms_helper snd_pcm syscopyarea sysfillrect 
> sysimgblt fb_sys_fops drm hid_multitouch ideapad_laptop sparse_keymap 
> hid_generic wmi efivarfs
> CR2: 9edc1598
> [ end trace dbeb7e66daa9bdca ]---
> 
> RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
> Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 
> 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 
> 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
> RSP: :a606c032bc98 EFLAGS: 00010286
> RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
> RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
> RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
> R10:  R11: 00028b56 R12: c06f8880
> R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
> FS:  7f992dd9a680() GS:9981e708() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
> 

-- 
Jiri Kosina
SUSE Labs



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Thomas Gleixner
On Wed, 10 Jul 2019, Xi Ruoyao wrote:
> On 2019-07-10 19:27 +0800, Xi Ruoyao wrote:
> > On 2019-07-09 17:31 -0700, Kees Cook wrote:
> > > On Wed, Jul 10, 2019 at 01:17:11AM +0200, Thomas Gleixner wrote:
> > > > On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> > > > > That still does not explain the cr4/0 issue you have. Can you send me
> > > > > your
> > > > > .config please?
> > > > 
> > > > Does your machine have UMIP support? None of my test boxes has. So 
> > > > that'd
> > > > be the difference of bits enforced in CR4. Should not matter because 
> > > > it's
> > > > User mode instruction prevention, but who knows.
> > > 
> > > Ew. Yeah, I don't have i9 nor i7 for testing this. I did try everything
> > > else I had (and hibernation). Is only Linus able to reproduce this so far?
> > 
> > I can, too.
> > 
> > > To rule out (in?) UMIP, this would remove UMIP from the pinning:
> > > 
> > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > > index 309b6b9b49d4..f3beedb6da8a 100644
> > > --- a/arch/x86/kernel/cpu/common.c
> > > +++ b/arch/x86/kernel/cpu/common.c
> > > @@ -380,7 +380,7 @@ static void __init setup_cr_pinning(void)
> > >  {
> > >   unsigned long mask;
> > >  
> > > - mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP);
> > > + mask = (X86_CR4_SMEP | X86_CR4_SMAP);
> > >   cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask;
> > >   static_key_enable(&cr_pinning.key);
> > >  }
> > 
> > I'll try it.
> 
> That doesn't work, sadly.
>
> My laptop is an old i3-3217u.
> 
> My .config and syslog snip are attached.

From the log:

BUG: unable to handle page fault for address: 9edc1598
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 
Oops: 0003 [#1] SMP PTI
2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54
Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013
RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 
66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 8b 
37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
RSP: :a606c032bc98 EFLAGS: 00010286
RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
R10:  R11: 00028b56 R12: c06f8880
R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
FS:  7f992dd9a680() GS:9981e708() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
Call Trace:
  jump_label_module_notify+0x1e7/0x2b0
  notifier_call_chain+0x44/0x70
  blocking_notifier_call_chain+0x43/0x60
  load_module+0x1bcb/0x2490
  ? vfs_read+0x11f/0x150
  ? __do_sys_finit_module+0xbf/0xe0
  __do_sys_finit_module+0xbf/0xe0
  do_syscall_64+0x43/0x110
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Josh, didn't you mention that yesterday or so?


RIP: 0033:0x7f992e2eeaf9
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 8b 0d 67 73 0d 00 f7 d8 64 89 01 48
RSP: 002b:7ffca220d288 EFLAGS: 0246 ORIG_RAX: 0139
RAX: ffda RBX: 009b8da0 RCX: 7f992e2eeaf9
RDX:  RSI: 7f992e464885 RDI: 0010
RBP: 0002 R08:  R09: 009c45c0
R10: 0010 R11: 0246 R12: 7f992e464885
R13:  R14: 009acc50 R15: 009b8da0
Modules linked in: kvm_intel(+) kvm irqbypass hid_sensor_hub crc32_pclmul 
mfd_core i2c_i801 snd_hda_intel i915(+) intel_gtt snd_hda_codec i2c_algo_bit 
snd_hwdep snd_hda_core drm_kms_helper snd_pcm syscopyarea sysfillrect sysimgblt 
fb_sys_fops drm hid_multitouch ideapad_laptop sparse_keymap hid_generic wmi 
efivarfs
CR2: 9edc1598
[ end trace dbeb7e66daa9bdca ]---

RIP: 0010:static_key_set_mod.isra.0+0x10/0x30
Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 
66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 8b 
37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e
RSP: :a606c032bc98 EFLAGS: 00010286
RAX: 9981ddce30a0 RBX: 9edc1590 RCX: 
RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598
RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0
R10:  R11: 00028b56 R12: c06f8880
R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0
FS:  7f992dd9a680() GS:9981e708() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Xi Ruoyao
On 2019-07-09 17:31 -0700, Kees Cook wrote:
> On Wed, Jul 10, 2019 at 01:17:11AM +0200, Thomas Gleixner wrote:
> > On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> > > That still does not explain the cr4/0 issue you have. Can you send me your
> > > .config please?
> > 
> > Does your machine have UMIP support? None of my test boxes has. So that'd
> > be the difference of bits enforced in CR4. Should not matter because it's
> > User mode instruction prevention, but who knows.
> 
> Ew. Yeah, I don't have i9 nor i7 for testing this. I did try everything
> else I had (and hibernation). Is only Linus able to reproduce this so far?

I can, too.

> To rule out (in?) UMIP, this would remove UMIP from the pinning:
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 309b6b9b49d4..f3beedb6da8a 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -380,7 +380,7 @@ static void __init setup_cr_pinning(void)
>  {
>   unsigned long mask;
>  
> - mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP);
> + mask = (X86_CR4_SMEP | X86_CR4_SMAP);
>   cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask;
>   static_key_enable(&cr_pinning.key);
>  }

I'll try it.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



Re: [GIT PULL] x86/topology changes for v5.3

2019-07-10 Thread Rafael J. Wysocki
On Wednesday, July 10, 2019 1:00:32 AM CEST Thomas Gleixner wrote:
> On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> > On Tue, 9 Jul 2019, Linus Torvalds wrote:
> > > I also don't have any logs, because the boot never gets far enough. I
> > > assume that there was a problem bringing up a non-boot CPU, and the
> > > eventual hang ends up being due to that.
> > 
> > Hrm. I just built the tip of your tree and booted it. It hangs at:
> > 
> > [4.788678] ACPI: 4 ACPI AML tables successfully acquired and loaded
> > [4.793860] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> > [4.821476] ACPI: Dynamic OEM Table Load:
> > 
> > That's way after the secondary CPUs have been brought up. tip/master boots
> > without problems with the same config. Let me try x86/asm alone and also a
> > revert of the CR4/CR0 stuff.
> 
> x86/asm works alone
> 
> revert of cr4/0 pinning on top of your tree gets stuck as well
> 
> > And while writing this the softlockup detector muttered:
> 
> Have not yet fully bisected it, but Jiri did and he ended up with
> 
>   c522ad0637ca ("ACPICA: Update table load object initialization")
> 
> Reverting that on top of your tree makes it work again. Tony has the same
> issue.

I have a revert of this one queued up (with a plan to push it later today).

Cheers,
Rafael






Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Kees Cook
On Tue, Jul 09, 2019 at 10:15:21PM -0700, Linus Torvalds wrote:
> On Tue, Jul 9, 2019 at 8:21 PM Linus Torvalds
>  wrote:
> >
> > Whee. It looks like it's bisecting to the same thing. Only partway
> > done, but it feels very much like my desktop.
> 
> Confirmed.
> 
> With that config, I get this
> 
>   c21ac93288f0 (refs/bisect/bad) Merge tag 'v5.2-rc6' into x86/asm, to
> refresh the branch
>   8dbec27a242c (HEAD) x86/asm: Pin sensitive CR0 bits
>   873d50d58f67 x86/asm: Pin sensitive CR4 bits
> 
> ie those "pin sensitive bits" merge is bad, but before the commits is good.
> 
> I think there is _another_ problem too, and maybe it's the ACPI one,
> but this one triggers some issue before the other issue even gets to
> play..

Okay, fun. Thanks for confirming it. Well, I guess I get to try to
find some more modern hardware to track this down. Does booting with
init=/bin/sh get far enough to see if all the CPUs are online or anything
like that? I'm baffled about the "gets mostly to userspace" part. I'd
expect this to explode very badly if it misbehaved. Or maybe something
gets confused between how many CPUs are expected and how many actually
show up to the party. Hmmm.

-- 
Kees Cook


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 8:21 PM Linus Torvalds
 wrote:
>
> Whee. It looks like it's bisecting to the same thing. Only partway
> done, but it feels very much like my desktop.

Confirmed.

With that config, I get this

  c21ac93288f0 (refs/bisect/bad) Merge tag 'v5.2-rc6' into x86/asm, to
refresh the branch
  8dbec27a242c (HEAD) x86/asm: Pin sensitive CR0 bits
  873d50d58f67 x86/asm: Pin sensitive CR4 bits

ie those "pin sensitive bits" merge is bad, but before the commits is good.

I think there is _another_ problem too, and maybe it's the ACPI one,
but this one triggers some issue before the other issue even gets to
play..

 Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 5:59 PM Linus Torvalds
 wrote:
>
> On my laptop (which I am at right now), the hang is different, and
> maybe it's similar to your ACPI hang issue. I will try that revert,
> and at least see if that solves the laptop side.

Nope, that wasn't it.  Apparently there are three different issues.

I guess I'll just have to do a full bisect on the laptop too. That's
going to take forever.

Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 4:00 PM Thomas Gleixner  wrote:
>
> That still does not explain the cr4/0 issue you have. Can you send me your
> .config please?

Gaah. Now I'm off my desktop and won't be at it again until tomorrow.
And that's the one that bisected to the cr0/cr4 bits (and I did all my
bisection on that because it's so much faster).

Anyway, the kernel config itself is pretty simple. It's basically a
F30 kernel config that has been through "make localmodconfig" and then
pared down some more to remove stuff I don't use (paravirt etc). I
would expect that if it's a configuration difference, it's more likely
to be about the user mode side - since the hang does seem to happen in
user mode, not in the kernel (but a very early problem at cpu bringup
time could easily have been hidden).

On my laptop (which I am at right now), the hang is different, and
maybe it's similar to your ACPI hang issue. I will try that revert,
and at least see if that solves the laptop side.

Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Kees Cook
On Wed, Jul 10, 2019 at 01:17:11AM +0200, Thomas Gleixner wrote:
> On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> > 
> > That still does not explain the cr4/0 issue you have. Can you send me your
> > .config please?
> 
> Does your machine have UMIP support? None of my test boxes has. So that'd
> be the difference of bits enforced in CR4. Should not matter because it's
> User mode instruction prevention, but who knows.

Ew. Yeah, I don't have i9 nor i7 for testing this. I did try everything
else I had (and hibernation). Is only Linus able to reproduce this so far?

To rule out (in?) UMIP, this would remove UMIP from the pinning:

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 309b6b9b49d4..f3beedb6da8a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -380,7 +380,7 @@ static void __init setup_cr_pinning(void)
 {
unsigned long mask;
 
-   mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP);
+   mask = (X86_CR4_SMEP | X86_CR4_SMAP);
cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask;
static_key_enable(&cr_pinning.key);
 }


-- 
Kees Cook


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Thomas Gleixner
On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> 
> That still does not explain the cr4/0 issue you have. Can you send me your
> .config please?

Does your machine have UMIP support? None of my test boxes has. So that'd
be the difference of bits enforced in CR4. Should not matter because it's
User mode instruction prevention, but who knows.

Thanks,

tglx


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Thomas Gleixner
On Wed, 10 Jul 2019, Thomas Gleixner wrote:
> On Tue, 9 Jul 2019, Linus Torvalds wrote:
> > I also don't have any logs, because the boot never gets far enough. I
> > assume that there was a problem bringing up a non-boot CPU, and the
> > eventual hang ends up being due to that.
> 
> Hrm. I just built the tip of your tree and booted it. It hangs at:
> 
> [4.788678] ACPI: 4 ACPI AML tables successfully acquired and loaded
> [4.793860] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [4.821476] ACPI: Dynamic OEM Table Load:
> 
> That's way after the secondary CPUs have been brought up. tip/master boots
> without problems with the same config. Let me try x86/asm alone and also a
> revert of the CR4/CR0 stuff.

x86/asm works alone

revert of cr4/0 pinning on top of your tree gets stuck as well

> And while writing this the softlockup detector muttered:

Have not yet fully bisected it, but Jiri did and he ended up with

  c522ad0637ca ("ACPICA: Update table load object initialization")

Reverting that on top of your tree makes it work again. Tony has the same
issue.

That still does not explain the cr4/0 issue you have. Can you send me your
.config please?

Thanks,

tglx





Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Thomas Gleixner
On Tue, 9 Jul 2019, Linus Torvalds wrote:

> On Tue, Jul 9, 2019 at 2:45 PM Linus Torvalds
>  wrote:
> >
> > and I suspect it's the sensitive bit pinning. But I'll delve all the way.
> 
> Confirmed. Bisection says
> 
> 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2 is the first bad commit
> commit 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2
> Author: Kees Cook 
> Date:   Mon Jun 17 21:55:02 2019 -0700
> 
> x86/asm: Pin sensitive CR4 bits
> 
> this is on a bog-standard Intel setup with F30, both desktop and
> laptop (i9-9900k and i7-8565u respectively).
> 
> I haven't confirmed yet whether reverting just that one commit is
> required, or if I need to revert the cr0 one too.
> 
> I also don't have any logs, because the boot never gets far enough. I
> assume that there was a problem bringing up a non-boot CPU, and the
> eventual hang ends up being due to that.

Hrm. I just built the tip of your tree and booted it. It hangs at:

[4.788678] ACPI: 4 ACPI AML tables successfully acquired and loaded
[4.793860] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[4.821476] ACPI: Dynamic OEM Table Load:

That's way after the secondary CPUs have been brought up. tip/master boots
without problems with the same config. Let me try x86/asm alone and also a
revert of the CR4/CR0 stuff.

And while writing this the softlockup detector muttered:

[  245.849408] INFO: task swapper/0:1 blocked for more than 120 seconds.
[  245.853406]   Not tainted 5.2.0+ #69
[  245.857318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[  245.865405] swapper/0   D0 1  0 0x80004000
[  245.869405] Call Trace:
[  245.871859]  ? __schedule+0x2bb/0x690
[  245.877410]  ? acpi_ps_complete_op+0x259/0x279
[  245.881406]  schedule+0x29/0x90
[  245.884540]  schedule_timeout+0x20d/0x310
[  245.889409]  ? acpi_os_release_object+0xa/0x10
[  245.893406]  ? acpi_ps_delete_parse_tree+0x2d/0x59
[  245.897405]  __down_timeout+0x9b/0x100
[  245.901150]  down_timeout+0x43/0x50
[  245.905407]  acpi_os_wait_semaphore+0x48/0x60
[  245.909408]  acpi_ut_acquire_mutex+0x45/0x89
[  245.913406]  ? acpi_ns_init_one_package+0x44/0x44
[  245.917406]  acpi_walk_namespace+0x62/0xc2
[  245.921406]  acpi_ns_initialize_objects+0x40/0x7b
[  245.925407]  acpi_ex_load_table_op+0x157/0x1b9
[  245.933406]  acpi_ex_opcode_6A_0T_1R+0x158/0x1c2
[  245.937406]  acpi_ds_exec_end_op+0xca/0x401
[  245.941406]  acpi_ps_parse_loop+0x492/0x5c6
[  245.945406]  acpi_ps_parse_aml+0x91/0x2b8
[  245.949405]  acpi_ps_execute_method+0x15d/0x191
[  245.953406]  acpi_ns_evaluate+0x1bf/0x24c
[  245.957406]  acpi_evaluate_object+0x137/0x240
[  245.961408]  ? kmem_cache_alloc_trace+0x15b/0x1c0
[  245.965406]  acpi_processor_set_pdc+0x135/0x180
[  245.969408]  early_init_pdc+0xb3/0xd0
[  245.973064]  acpi_ns_walk_namespace+0xda/0x1aa
[  245.977407]  ? set_no_mwait+0x23/0x23
[  245.981062]  ? set_no_mwait+0x23/0x23
[  245.985406]  acpi_walk_namespace+0x9a/0xc2
[  245.989406]  ? acpi_sleep_proc_init+0x24/0x24
[  245.993407]  ? do_early_param+0x8e/0x8e
[  245.997235]  acpi_early_processor_set_pdc+0x31/0x49
[  246.001406]  acpi_init+0x17c/0x321


Thanks,

tglx




Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 3:00 PM Linus Torvalds
 wrote:
>
> I haven't confirmed yet whether reverting just that one commit is
> required, or if I need to revert the cr0 one too.

Oh, I can't revert just the cr4 one, because the cr0 one depends on it.

But reverting both gets my desktop back to life. My laptop (that hung
earlier) seems to be another (possibly additional) issue.

  Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 2:45 PM Linus Torvalds
 wrote:
>
> and I suspect it's the sensitive bit pinning. But I'll delve all the way.

Confirmed. Bisection says

873d50d58f67ef15d2777b5e7f7a5268bb1fbae2 is the first bad commit
commit 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2
Author: Kees Cook 
Date:   Mon Jun 17 21:55:02 2019 -0700

x86/asm: Pin sensitive CR4 bits

this is on a bog-standard Intel setup with F30, both desktop and
laptop (i9-9900k and i7-8565u respectively).

I haven't confirmed yet whether reverting just that one commit is
required, or if I need to revert the cr0 one too.

I also don't have any logs, because the boot never gets far enough. I
assume that there was a problem bringing up a non-boot CPU, and the
eventual hang ends up being due to that.

   Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 2:26 PM Linus Torvalds
 wrote:
>
> but still bisecting to pinpoint the exact thing.

One of:

  c21ac93288f0 (refs/bisect/bad) Merge tag 'v5.2-rc6' into x86/asm, to
refresh the branch
  8dbec27a242c x86/asm: Pin sensitive CR0 bits
  873d50d58f67 x86/asm: Pin sensitive CR4 bits
  7b347ad4938d (HEAD) Merge tag 'v5.2-rc5' into x86/asm, to refresh the branch
  9db9b76767f1 Documentation/x86: Fix path to entry_32.S
  7231d0165df3 x86/asm: Remove unused TASK_TI_flags from asm-offsets.c

and I suspect it's the sensitive bit pinning. But I'll delve all the way.

Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2019 at 2:20 PM Linus Torvalds
 wrote:
>
> Will keep you updated as bisection narrows it down.

Hmm. Already narrowed down to either of

Pull x86 asm updates from Ingo Molnar:
Pull scheduler updates from Ingo Molnar:

but still bisecting to pinpoint the exact thing.

   Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-09 Thread Linus Torvalds
After doing a lot of build testing and merging, I finally got around
to actually boot-testing the result.

Something in this series has broken booting for me. Or rather, things
*boot*, but then it hangs in user space, with

   A start job is running for udev Wait for Complete Device Initialization

it does this on both my desktop and laptop, although the exact hang is
different (on my laptop, it hangs earlier - I don't even get to input
the disk encryption keys).

I'm bisecting, and it will take a while to pinpoint, but I can already
tell that it's from one of these pulls:

Pull x86 topology updates from Ingo Molnar:
Pull x86 platform updates from Ingo Molnar:
Pull x86 paravirt updates from Ingo Molnar:
Pull x86 AVX512 status update from Ingo Molnar:
Pull x86 cleanups from Ingo Molnar:
Pull x86 cache resource control update from Ingo Molnar:
Pull x86 build updates from Ingo Molnar:
Pull x86 asm updates from Ingo Molnar:
Pull scheduler updates from Ingo Molnar:

Will keep you updated as bisection narrows it down.

  Linus


Re: [GIT PULL] x86/topology changes for v5.3

2019-07-08 Thread pr-tracker-bot
The pull request you sent on Mon, 8 Jul 2019 18:27:56 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> x86-topology-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/222a21d29521d144f3dd7a0bc4d4020e448f0126

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker