Re: [GIT PULL] x86/topology changes for v5.3
> On Jul 11, 2019, at 8:08 AM, Kees Cook wrote: > > On Thu, Jul 11, 2019 at 10:01:34AM +0200, Peter Zijlstra wrote: >> On Thu, Jul 11, 2019 at 07:11:19AM +, Nadav Amit wrote: On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: On Wed, 10 Jul 2019, Peter Zijlstra wrote: > If we mark the key as RO after init, and then try and modify the key to > link module usage sites, things might go bang as described. > > Thanks! > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > index 27d7864e7252..5bf7a8354da2 100644 > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > cpuinfo_x86 *c) > cr4_clear_bits(X86_CR4_UMIP); > } > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > +DEFINE_STATIC_KEY_FALSE(cr_pinning); Good catch, I guess that is going to fix it. At the same time though, it sort of destroys the original intent of Kees' patch, right? The exploits will just have to call static_key_disable() prior to calling native_write_cr4() again, and the protection is gone. >>> >>> Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call >>> set_memory_rw(), make the page that holds the key writable, and then call >>> static_key_disable(), followed by a call to native_write_cr4(). >> >> Or call text_poke_bp() with the right set of arguments. > > Right -- the point is to make it defended against an arbitrary write, > not arbitrary execution. Nothing is safe from arbitrary exec, but we can > do our due diligence on making things read-only. I don’t understand. In the PoC that motivated this this patch [1], the attacker gained the ability to call a function, control its first argument and used it to disable SMEP/SMAP by calling native_write_cr4(), which he did before doing an arbitrary write (another ability he gain). After this patch, the attacker can instead call three functions, and by controlling their first arguments (*) disable SMEP. I didn’t see something in the motivating PoC that prevented calling 3 functions one at a time. So it seems to me that it raised the bar for an attack by very little. -- (*) set_memory_rw() has a second argument - the number of pages - but many non-zero values may be fine (if not a warning from __cpa_process_fault() might appear). [1] https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
Re: [GIT PULL] x86/topology changes for v5.3
On Thu, Jul 11, 2019 at 10:01:34AM +0200, Peter Zijlstra wrote: > On Thu, Jul 11, 2019 at 07:11:19AM +, Nadav Amit wrote: > > > On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: > > > > > > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > > > >> If we mark the key as RO after init, and then try and modify the key to > > >> link module usage sites, things might go bang as described. > > >> > > >> Thanks! > > >> > > >> > > >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > >> index 27d7864e7252..5bf7a8354da2 100644 > > >> --- a/arch/x86/kernel/cpu/common.c > > >> +++ b/arch/x86/kernel/cpu/common.c > > >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > > >> cpuinfo_x86 *c) > > >> cr4_clear_bits(X86_CR4_UMIP); > > >> } > > >> > > >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > > >> +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > > > > > Good catch, I guess that is going to fix it. > > > > > > At the same time though, it sort of destroys the original intent of Kees' > > > patch, right? The exploits will just have to call static_key_disable() > > > prior to calling native_write_cr4() again, and the protection is gone. > > > > Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call > > set_memory_rw(), make the page that holds the key writable, and then call > > static_key_disable(), followed by a call to native_write_cr4(). > > Or call text_poke_bp() with the right set of arguments. Right -- the point is to make it defended against an arbitrary write, not arbitrary execution. Nothing is safe from arbitrary exec, but we can do our due diligence on making things read-only. -- Kees Cook
Re: [GIT PULL] x86/topology changes for v5.3
On Thu, Jul 11, 2019 at 07:11:19AM +, Nadav Amit wrote: > > On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: > > > > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > >> If we mark the key as RO after init, and then try and modify the key to > >> link module usage sites, things might go bang as described. > >> > >> Thanks! > >> > >> > >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > >> index 27d7864e7252..5bf7a8354da2 100644 > >> --- a/arch/x86/kernel/cpu/common.c > >> +++ b/arch/x86/kernel/cpu/common.c > >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > >> cpuinfo_x86 *c) > >>cr4_clear_bits(X86_CR4_UMIP); > >> } > >> > >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > >> +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > > > Good catch, I guess that is going to fix it. > > > > At the same time though, it sort of destroys the original intent of Kees' > > patch, right? The exploits will just have to call static_key_disable() > > prior to calling native_write_cr4() again, and the protection is gone. > > Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call > set_memory_rw(), make the page that holds the key writable, and then call > static_key_disable(), followed by a call to native_write_cr4(). Or call text_poke_bp() with the right set of arguments.
Re: [GIT PULL] x86/topology changes for v5.3
On Thu, 11 Jul 2019, Nadav Amit wrote: > > On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: > > > > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > >> If we mark the key as RO after init, and then try and modify the key to > >> link module usage sites, things might go bang as described. > >> > >> Thanks! > >> > >> > >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > >> index 27d7864e7252..5bf7a8354da2 100644 > >> --- a/arch/x86/kernel/cpu/common.c > >> +++ b/arch/x86/kernel/cpu/common.c > >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > >> cpuinfo_x86 *c) > >>cr4_clear_bits(X86_CR4_UMIP); > >> } > >> > >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > >> +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > > > Good catch, I guess that is going to fix it. > > > > At the same time though, it sort of destroys the original intent of Kees' > > patch, right? The exploits will just have to call static_key_disable() > > prior to calling native_write_cr4() again, and the protection is gone. > > Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call > set_memory_rw(), make the page that holds the key writable, and then call > static_key_disable(), followed by a call to native_write_cr4(). That's true, but it's not worth the trouble. Thanks, tglx
Re: [GIT PULL] x86/topology changes for v5.3
> On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: > > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > >> If we mark the key as RO after init, and then try and modify the key to >> link module usage sites, things might go bang as described. >> >> Thanks! >> >> >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c >> index 27d7864e7252..5bf7a8354da2 100644 >> --- a/arch/x86/kernel/cpu/common.c >> +++ b/arch/x86/kernel/cpu/common.c >> @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct >> cpuinfo_x86 *c) >> cr4_clear_bits(X86_CR4_UMIP); >> } >> >> -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); >> +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > Good catch, I guess that is going to fix it. > > At the same time though, it sort of destroys the original intent of Kees' > patch, right? The exploits will just have to call static_key_disable() > prior to calling native_write_cr4() again, and the protection is gone. Even with DEFINE_STATIC_KEY_FALSE_RO(), I presume you can just call set_memory_rw(), make the page that holds the key writable, and then call static_key_disable(), followed by a call to native_write_cr4().
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 10:15 PM Linus Torvalds wrote: > > I think there is _another_ problem too, and maybe it's the APCI one, > but this one triggers some issue before the other issue even gets to > play.. No, the other problem was the keyring ACL support from David Howells, which seems to have broken encrypted disk setups. So apparently I wasn't impacted by the ACPI thing, but had two other issues instead. Not a great start to the merge window. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On 2019-07-10 17:13 +0200, Thomas Gleixner wrote: > Something like the below. Builds and boots, must be perfect. > > Thanks, > > tglx Tested-by: Xi Ruoyao > 8< > > arch/x86/include/asm/processor.h |1 > arch/x86/include/asm/special_insns.h | 41 --- > arch/x86/kernel/cpu/common.c | 72 +++ > > arch/x86/kernel/smpboot.c| 14 -- > arch/x86/xen/smp_pv.c|1 > 5 files changed, 61 insertions(+), 68 deletions(-) > > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -741,6 +741,7 @@ extern void load_direct_gdt(int); > extern void load_fixmap_gdt(int); > extern void load_percpu_segment(int); > extern void cpu_init(void); > +extern void cr4_init(void); > > static inline unsigned long get_debugctlmsr(void) > { > --- a/arch/x86/include/asm/special_insns.h > +++ b/arch/x86/include/asm/special_insns.h > @@ -18,9 +18,7 @@ > */ > extern unsigned long __force_order; > > -/* Starts false and gets enabled once CPU feature detection is done. */ > -DECLARE_STATIC_KEY_FALSE(cr_pinning); > -extern unsigned long cr4_pinned_bits; > +void native_write_cr0(unsigned long val); > > static inline unsigned long native_read_cr0(void) > { > @@ -29,24 +27,6 @@ static inline unsigned long native_read_ > return val; > } > > -static inline void native_write_cr0(unsigned long val) > -{ > - unsigned long bits_missing = 0; > - > -set_register: > - asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order)); > - > - if (static_branch_likely(_pinning)) { > - if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) { > - bits_missing = X86_CR0_WP; > - val |= bits_missing; > - goto set_register; > - } > - /* Warn after we've set the missing bits. */ > - WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n"); > - } > -} > - > static inline unsigned long native_read_cr2(void) > { > unsigned long val; > @@ -91,24 +71,7 @@ static inline unsigned long native_read_ > return val; > } > > -static inline void native_write_cr4(unsigned long val) > -{ > - unsigned long bits_missing = 0; > - > -set_register: > - asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits)); > - > - if (static_branch_likely(_pinning)) { > - if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) { > - bits_missing = ~val & cr4_pinned_bits; > - val |= bits_missing; > - goto set_register; > - } > - /* Warn after we've set the missing bits. */ > - WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n", > - bits_missing); > - } > -} > +void native_write_cr4(unsigned long val); > > #ifdef CONFIG_X86_64 > static inline unsigned long native_read_cr8(void) > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -366,10 +366,62 @@ static __always_inline void setup_umip(s > cr4_clear_bits(X86_CR4_UMIP); > } > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > -EXPORT_SYMBOL(cr_pinning); > -unsigned long cr4_pinned_bits __ro_after_init; > -EXPORT_SYMBOL(cr4_pinned_bits); > +static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > +static unsigned long cr4_pinned_bits __ro_after_init; > + > +void native_write_cr0(unsigned long val) > +{ > + unsigned long bits_missing = 0; > + > +set_register: > + asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order)); > + > + if (static_branch_likely(_pinning)) { > + if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) { > + bits_missing = X86_CR0_WP; > + val |= bits_missing; > + goto set_register; > + } > + /* Warn after we've set the missing bits. */ > + WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n"); > + } > +} > +EXPORT_SYMBOL(native_write_cr0); > + > +void native_write_cr4(unsigned long val) > +{ > + unsigned long bits_missing = 0; > + > +set_register: > + asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits)); > + > + if (static_branch_likely(_pinning)) { > + if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) { > + bits_missing = ~val & cr4_pinned_bits; > + val |= bits_missing; > + goto set_register; > + } > + /* Warn after we've set the missing bits. */ > + WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n", > + bits_missing); > + } > +} > +EXPORT_SYMBOL(native_write_cr4); > + > +void cr4_init(void) > +{ > + unsigned long cr4 = __read_cr4(); > + > + if (boot_cpu_has(X86_FEATURE_PCID)) > + cr4 |= X86_CR4_PCIDE; > + if
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Peter Zijlstra wrote: > On Wed, Jul 10, 2019 at 04:22:51PM +0200, Jiri Kosina wrote: > > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > > > If we mark the key as RO after init, and then try and modify the key to > > > link module usage sites, things might go bang as described. > > > > > > Thanks! > > > > > > > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > > index 27d7864e7252..5bf7a8354da2 100644 > > > --- a/arch/x86/kernel/cpu/common.c > > > +++ b/arch/x86/kernel/cpu/common.c > > > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > > > cpuinfo_x86 *c) > > > cr4_clear_bits(X86_CR4_UMIP); > > > } > > > > > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > > > +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > > > Good catch, I guess that is going to fix it. > > > > At the same time though, it sort of destroys the original intent of Kees' > > patch, right? The exploits will just have to call static_key_disable() > > prior to calling native_write_cr4() again, and the protection is gone. > > This is fixable by moving native_write_cr*() out-of-line, such that they > never end up in a module. Something like the below. Builds and boots, must be perfect. Thanks, tglx 8< arch/x86/include/asm/processor.h |1 arch/x86/include/asm/special_insns.h | 41 --- arch/x86/kernel/cpu/common.c | 72 +++ arch/x86/kernel/smpboot.c| 14 -- arch/x86/xen/smp_pv.c|1 5 files changed, 61 insertions(+), 68 deletions(-) --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -741,6 +741,7 @@ extern void load_direct_gdt(int); extern void load_fixmap_gdt(int); extern void load_percpu_segment(int); extern void cpu_init(void); +extern void cr4_init(void); static inline unsigned long get_debugctlmsr(void) { --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -18,9 +18,7 @@ */ extern unsigned long __force_order; -/* Starts false and gets enabled once CPU feature detection is done. */ -DECLARE_STATIC_KEY_FALSE(cr_pinning); -extern unsigned long cr4_pinned_bits; +void native_write_cr0(unsigned long val); static inline unsigned long native_read_cr0(void) { @@ -29,24 +27,6 @@ static inline unsigned long native_read_ return val; } -static inline void native_write_cr0(unsigned long val) -{ - unsigned long bits_missing = 0; - -set_register: - asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order)); - - if (static_branch_likely(_pinning)) { - if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) { - bits_missing = X86_CR0_WP; - val |= bits_missing; - goto set_register; - } - /* Warn after we've set the missing bits. */ - WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n"); - } -} - static inline unsigned long native_read_cr2(void) { unsigned long val; @@ -91,24 +71,7 @@ static inline unsigned long native_read_ return val; } -static inline void native_write_cr4(unsigned long val) -{ - unsigned long bits_missing = 0; - -set_register: - asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits)); - - if (static_branch_likely(_pinning)) { - if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) { - bits_missing = ~val & cr4_pinned_bits; - val |= bits_missing; - goto set_register; - } - /* Warn after we've set the missing bits. */ - WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n", - bits_missing); - } -} +void native_write_cr4(unsigned long val); #ifdef CONFIG_X86_64 static inline unsigned long native_read_cr8(void) --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -366,10 +366,62 @@ static __always_inline void setup_umip(s cr4_clear_bits(X86_CR4_UMIP); } -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); -EXPORT_SYMBOL(cr_pinning); -unsigned long cr4_pinned_bits __ro_after_init; -EXPORT_SYMBOL(cr4_pinned_bits); +static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); +static unsigned long cr4_pinned_bits __ro_after_init; + +void native_write_cr0(unsigned long val) +{ + unsigned long bits_missing = 0; + +set_register: + asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order)); + + if (static_branch_likely(_pinning)) { + if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) { + bits_missing = X86_CR0_WP; + val |= bits_missing; + goto set_register; + } + /* Warn after we've set the missing bits. */ + WARN_ONCE(bits_missing, "CR0 WP bit went
Re: [GIT PULL] x86/topology changes for v5.3
On 2019-07-10 16:22 +0200, Jiri Kosina wrote: > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > If we mark the key as RO after init, and then try and modify the key to > > link module usage sites, things might go bang as described. > > > > Thanks! > > > > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > index 27d7864e7252..5bf7a8354da2 100644 > > --- a/arch/x86/kernel/cpu/common.c > > +++ b/arch/x86/kernel/cpu/common.c > > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > > cpuinfo_x86 *c) > > cr4_clear_bits(X86_CR4_UMIP); > > } > > > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > > +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > Good catch, I guess that is going to fix it. Yes it works. > At the same time though, it sort of destroys the original intent of Kees' > patch, right? The exploits will just have to call static_key_disable() > prior to calling native_write_cr4() again, and the protection is gone. I think I should do some study and try to understand the full story of Kees' change... -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, Jul 10, 2019 at 04:22:51PM +0200, Jiri Kosina wrote: > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > If we mark the key as RO after init, and then try and modify the key to > > link module usage sites, things might go bang as described. > > > > Thanks! > > > > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > index 27d7864e7252..5bf7a8354da2 100644 > > --- a/arch/x86/kernel/cpu/common.c > > +++ b/arch/x86/kernel/cpu/common.c > > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct > > cpuinfo_x86 *c) > > cr4_clear_bits(X86_CR4_UMIP); > > } > > > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > > +DEFINE_STATIC_KEY_FALSE(cr_pinning); > > Good catch, I guess that is going to fix it. > > At the same time though, it sort of destroys the original intent of Kees' > patch, right? The exploits will just have to call static_key_disable() > prior to calling native_write_cr4() again, and the protection is gone. This is fixable by moving native_write_cr*() out-of-line, such that they never end up in a module.
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Thomas Gleixner wrote: > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > On Wed, Jul 10, 2019 at 09:25:16PM +0800, Xi Ruoyao wrote: > > > On 2019-07-10 14:31 +0200, Jiri Kosina wrote: > > > > Adding Daniel to check whether this couldn't be some fallout of > > > > jumplabel > > > > batching. > > > > > > I don't think so. I tried to revert Daniel's jumplabel batching commits > > > and the > > > issue wasn't solved. But reverting Kees' CR0 and CR4 commits can "fix" it > > > (apprently). > > > > Xi, could you please try the below instead? > > > > If we mark the key as RO after init, and then try and modify the key to > > link module usage sites, things might go bang as described. > > Right. I finally was able to reproduce that with Linus' config (slightly > modified). Applying your patch makes it go away. > > Now what puzzles me is that this never exploded in my face before and with > a debian config it JustWorks. > > Both configs have: > > CONFIG_KVM=m > CONFIG_KVM_INTEL=m > > so I'd expect both to crash and burn when KVM_INTEL is loaded as that has a > cr4 operation inside. > > So something papers over that ... Still looking. Just found it. All my usual configs have the paravirt muck enabled. Linus's config does not, neither has Xi's. With paravirt the CR pinning key is never in the module. It's in the paravirt function and obviously always built in. Thanks, tglx
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Peter Zijlstra wrote: > If we mark the key as RO after init, and then try and modify the key to > link module usage sites, things might go bang as described. > > Thanks! > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > index 27d7864e7252..5bf7a8354da2 100644 > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct cpuinfo_x86 > *c) > cr4_clear_bits(X86_CR4_UMIP); > } > > -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); > +DEFINE_STATIC_KEY_FALSE(cr_pinning); Good catch, I guess that is going to fix it. At the same time though, it sort of destroys the original intent of Kees' patch, right? The exploits will just have to call static_key_disable() prior to calling native_write_cr4() again, and the protection is gone. -- Jiri Kosina SUSE Labs
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Peter Zijlstra wrote: > On Wed, Jul 10, 2019 at 09:25:16PM +0800, Xi Ruoyao wrote: > > On 2019-07-10 14:31 +0200, Jiri Kosina wrote: > > > Adding Daniel to check whether this couldn't be some fallout of jumplabel > > > batching. > > > > I don't think so. I tried to revert Daniel's jumplabel batching commits > > and the > > issue wasn't solved. But reverting Kees' CR0 and CR4 commits can "fix" it > > (apprently). > > Xi, could you please try the below instead? > > If we mark the key as RO after init, and then try and modify the key to > link module usage sites, things might go bang as described. Right. I finally was able to reproduce that with Linus' config (slightly modified). Applying your patch makes it go away. Now what puzzles me is that this never exploded in my face before and with a debian config it JustWorks. Both configs have: CONFIG_KVM=m CONFIG_KVM_INTEL=m so I'd expect both to crash and burn when KVM_INTEL is loaded as that has a cr4 operation inside. So something papers over that ... Still looking. Thanks, tglx
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, Jul 10, 2019 at 09:25:16PM +0800, Xi Ruoyao wrote: > On 2019-07-10 14:31 +0200, Jiri Kosina wrote: > > Adding Daniel to check whether this couldn't be some fallout of jumplabel > > batching. > > I don't think so. I tried to revert Daniel's jumplabel batching commits and > the > issue wasn't solved. But reverting Kees' CR0 and CR4 commits can "fix" it > (apprently). Xi, could you please try the below instead? If we mark the key as RO after init, and then try and modify the key to link module usage sites, things might go bang as described. Thanks! diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 27d7864e7252..5bf7a8354da2 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -366,7 +366,7 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c) cr4_clear_bits(X86_CR4_UMIP); } -DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); +DEFINE_STATIC_KEY_FALSE(cr_pinning); EXPORT_SYMBOL(cr_pinning); unsigned long cr4_pinned_bits __ro_after_init; EXPORT_SYMBOL(cr4_pinned_bits);
Re: [GIT PULL] x86/topology changes for v5.3
On 2019-07-10 15:28 +0200, Jiri Kosina wrote: > On Wed, 10 Jul 2019, Jiri Kosina wrote: > > > > > BUG: unable to handle page fault for address: 9edc1598 > > > > > #PF: supervisor write access in kernel mode > > > > > #PF: error_code(0x0003) - permissions violation > > Hm, and it seems to explode on dereferencing the static_key* in %rsi > > ^^^ %rdi of > course > > > 21: 48 8b 37mov(%rdi),%rsi > > 24: 83 e6 03and$0x3,%esi > > 27: 48 09 c6or %rax,%rsi > > 2a:* 48 89 37mov%rsi,(%rdi) <-- trapping > > instruction > > > > which looks odd, as it derefenced it successfully just 3 instructions ago. It seems the MMU (I guess ?) allows to read it, but disallows to write it: "supervisor write access in kernel mode". -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Jiri Kosina wrote: > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > > > BUG: unable to handle page fault for address: 9edc1598 > > > > #PF: supervisor write access in kernel mode > > > > #PF: error_code(0x0003) - permissions violation > > > > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 > > > > Oops: 0003 [#1] SMP PTI > > > > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54 > > > > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013 > > > > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 > > > > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 > > > > 00 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 > > > > c3 0f 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e > > > > RSP: :a606c032bc98 EFLAGS: 00010286 > > > > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: > > > > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 > > > > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 > > > > R10: R11: 00028b56 R12: c06f8880 > > > > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 > > > > FS: 7f992dd9a680() GS:9981e708() > > > > knlGS: > > > > CS: 0010 DS: ES: CR0: 80050033 > > > > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0 > > > > Call Trace: > > > > jump_label_module_notify+0x1e7/0x2b0 > > > > notifier_call_chain+0x44/0x70 > > > > blocking_notifier_call_chain+0x43/0x60 > > > > load_module+0x1bcb/0x2490 > > > > ? vfs_read+0x11f/0x150 > > > > ? __do_sys_finit_module+0xbf/0xe0 > > > > __do_sys_finit_module+0xbf/0xe0 > > > > do_syscall_64+0x43/0x110 > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > > > Josh, didn't you mention that yesterday or so? > > > > > > That's what Tony yesterday indicated on IRC that his system is suffering > > > from as well. > > > > > > Adding Daniel to check whether this couldn't be some fallout of jumplabel > > > batching. > > > > AFAICT this is _before_ we get to patching. The function that explodes > > is static_key_set_mod(), as called from jump_label_add_module(). > > > > What that function does is for all patch sites in the module, find the > > corresponding key; if that key is not also in that module, allocate a > > static_key_mod structure and link the module entries to the key. Such > > that we can find all instances from a given key. > > > > I don't think anything here has changed in a while. > > Hm, and it seems to explode on dereferencing the static_key* in %rsi ^^^ %rdi of course > 21: 48 8b 37mov(%rdi),%rsi > 24: 83 e6 03and$0x3,%esi > 27: 48 09 c6or %rax,%rsi > 2a:* 48 89 37mov%rsi,(%rdi) <-- trapping > instruction > > which looks odd, as it derefenced it successfully just 3 instructions ago. > > -- > Jiri Kosina > SUSE Labs > > -- Jiri Kosina SUSE Labs
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Peter Zijlstra wrote: > > > BUG: unable to handle page fault for address: 9edc1598 > > > #PF: supervisor write access in kernel mode > > > #PF: error_code(0x0003) - permissions violation > > > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 > > > Oops: 0003 [#1] SMP PTI > > > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54 > > > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013 > > > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 > > > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 > > > 00 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 > > > 0f 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e > > > RSP: :a606c032bc98 EFLAGS: 00010286 > > > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: > > > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 > > > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 > > > R10: R11: 00028b56 R12: c06f8880 > > > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 > > > FS: 7f992dd9a680() GS:9981e708() > > > knlGS: > > > CS: 0010 DS: ES: CR0: 80050033 > > > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0 > > > Call Trace: > > > jump_label_module_notify+0x1e7/0x2b0 > > > notifier_call_chain+0x44/0x70 > > > blocking_notifier_call_chain+0x43/0x60 > > > load_module+0x1bcb/0x2490 > > > ? vfs_read+0x11f/0x150 > > > ? __do_sys_finit_module+0xbf/0xe0 > > > __do_sys_finit_module+0xbf/0xe0 > > > do_syscall_64+0x43/0x110 > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > > > Josh, didn't you mention that yesterday or so? > > > > That's what Tony yesterday indicated on IRC that his system is suffering > > from as well. > > > > Adding Daniel to check whether this couldn't be some fallout of jumplabel > > batching. > > AFAICT this is _before_ we get to patching. The function that explodes > is static_key_set_mod(), as called from jump_label_add_module(). > > What that function does is for all patch sites in the module, find the > corresponding key; if that key is not also in that module, allocate a > static_key_mod structure and link the module entries to the key. Such > that we can find all instances from a given key. > > I don't think anything here has changed in a while. Hm, and it seems to explode on dereferencing the static_key* in %rsi 21: 48 8b 37mov(%rdi),%rsi 24: 83 e6 03and$0x3,%esi 27: 48 09 c6or %rax,%rsi 2a:* 48 89 37mov%rsi,(%rdi) <-- trapping instruction which looks odd, as it derefenced it successfully just 3 instructions ago. -- Jiri Kosina SUSE Labs
Re: [GIT PULL] x86/topology changes for v5.3
On 2019-07-10 14:31 +0200, Jiri Kosina wrote: > Adding Daniel to check whether this couldn't be some fallout of jumplabel > batching. I don't think so. I tried to revert Daniel's jumplabel batching commits and the issue wasn't solved. But reverting Kees' CR0 and CR4 commits can "fix" it (apprently). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, Jul 10, 2019 at 02:31:29PM +0200, Jiri Kosina wrote: > On Wed, 10 Jul 2019, Thomas Gleixner wrote: > > > From the log: > > > > BUG: unable to handle page fault for address: 9edc1598 > > #PF: supervisor write access in kernel mode > > #PF: error_code(0x0003) - permissions violation > > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 > > Oops: 0003 [#1] SMP PTI > > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54 > > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013 > > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 > > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 > > 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f > > 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e > > RSP: :a606c032bc98 EFLAGS: 00010286 > > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: > > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 > > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 > > R10: R11: 00028b56 R12: c06f8880 > > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 > > FS: 7f992dd9a680() GS:9981e708() knlGS: > > CS: 0010 DS: ES: CR0: 80050033 > > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0 > > Call Trace: > > jump_label_module_notify+0x1e7/0x2b0 > > notifier_call_chain+0x44/0x70 > > blocking_notifier_call_chain+0x43/0x60 > > load_module+0x1bcb/0x2490 > > ? vfs_read+0x11f/0x150 > > ? __do_sys_finit_module+0xbf/0xe0 > > __do_sys_finit_module+0xbf/0xe0 > > do_syscall_64+0x43/0x110 > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > > Josh, didn't you mention that yesterday or so? > > That's what Tony yesterday indicated on IRC that his system is suffering > from as well. > > Adding Daniel to check whether this couldn't be some fallout of jumplabel > batching. AFAICT this is _before_ we get to patching. The function that explodes is static_key_set_mod(), as called from jump_label_add_module(). What that function does is for all patch sites in the module, find the corresponding key; if that key is not also in that module, allocate a static_key_mod structure and link the module entries to the key. Such that we can find all instances from a given key. I don't think anything here has changed in a while.
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Thomas Gleixner wrote: > From the log: > > BUG: unable to handle page fault for address: 9edc1598 > #PF: supervisor write access in kernel mode > #PF: error_code(0x0003) - permissions violation > PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 > Oops: 0003 [#1] SMP PTI > 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54 > Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013 > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 > 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 > 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e > RSP: :a606c032bc98 EFLAGS: 00010286 > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 > R10: R11: 00028b56 R12: c06f8880 > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 > FS: 7f992dd9a680() GS:9981e708() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0 > Call Trace: > jump_label_module_notify+0x1e7/0x2b0 > notifier_call_chain+0x44/0x70 > blocking_notifier_call_chain+0x43/0x60 > load_module+0x1bcb/0x2490 > ? vfs_read+0x11f/0x150 > ? __do_sys_finit_module+0xbf/0xe0 > __do_sys_finit_module+0xbf/0xe0 > do_syscall_64+0x43/0x110 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > Josh, didn't you mention that yesterday or so? That's what Tony yesterday indicated on IRC that his system is suffering from as well. Adding Daniel to check whether this couldn't be some fallout of jumplabel batching. > > > RIP: 0033:0x7f992e2eeaf9 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 > 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 > 01 c3 48 8b 0d 67 73 0d 00 f7 d8 64 89 01 48 > RSP: 002b:7ffca220d288 EFLAGS: 0246 ORIG_RAX: 0139 > RAX: ffda RBX: 009b8da0 RCX: 7f992e2eeaf9 > RDX: RSI: 7f992e464885 RDI: 0010 > RBP: 0002 R08: R09: 009c45c0 > R10: 0010 R11: 0246 R12: 7f992e464885 > R13: R14: 009acc50 R15: 009b8da0 > Modules linked in: kvm_intel(+) kvm irqbypass hid_sensor_hub crc32_pclmul > mfd_core i2c_i801 snd_hda_intel i915(+) intel_gtt snd_hda_codec i2c_algo_bit > snd_hwdep snd_hda_core drm_kms_helper snd_pcm syscopyarea sysfillrect > sysimgblt fb_sys_fops drm hid_multitouch ideapad_laptop sparse_keymap > hid_generic wmi efivarfs > CR2: 9edc1598 > [ end trace dbeb7e66daa9bdca ]--- > > RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 > Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 > 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 > 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e > RSP: :a606c032bc98 EFLAGS: 00010286 > RAX: 9981ddce30a0 RBX: 9edc1590 RCX: > RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 > RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 > R10: R11: 00028b56 R12: c06f8880 > R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 > FS: 7f992dd9a680() GS:9981e708() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0 > -- Jiri Kosina SUSE Labs
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Xi Ruoyao wrote: > On 2019-07-10 19:27 +0800, Xi Ruoyao wrote: > > On 2019-07-09 17:31 -0700, Kees Cook wrote: > > > On Wed, Jul 10, 2019 at 01:17:11AM +0200, Thomas Gleixner wrote: > > > > On Wed, 10 Jul 2019, Thomas Gleixner wrote: > > > > > That still does not explain the cr4/0 issue you have. Can you send me > > > > > your > > > > > .config please? > > > > > > > > Does your machine have UMIP support? None of my test boxes has. So > > > > that'd > > > > be the difference of bits enforced in CR4. Should not matter because > > > > it's > > > > User mode instruction prevention, but who knows. > > > > > > Ew. Yeah, I don't have i9 nor i7 for testing this. I did try everything > > > else I had (and hibernation). Is only Linus able to reproduce this so far? > > > > I can, too. > > > > > To rule out (in?) UMIP, this would remove UMIP from the pinning: > > > > > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > > > index 309b6b9b49d4..f3beedb6da8a 100644 > > > --- a/arch/x86/kernel/cpu/common.c > > > +++ b/arch/x86/kernel/cpu/common.c > > > @@ -380,7 +380,7 @@ static void __init setup_cr_pinning(void) > > > { > > > unsigned long mask; > > > > > > - mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP); > > > + mask = (X86_CR4_SMEP | X86_CR4_SMAP); > > > cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask; > > > static_key_enable(_pinning.key); > > > } > > > > I'll try it. > > That doesn't work, sadly. > > My laptop is an old i3-3217u. > > My .config and syslog snip are attached. >From the log: BUG: unable to handle page fault for address: 9edc1598 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 1a20c067 P4D 1a20c067 PUD 1a20d063 PMD 800019e000e1 Oops: 0003 [#1] SMP PTI 2 PID: 151 Comm: systemd-udevd Not tainted 5.2.0+ #54 Hardware name: LENOVO 20175/INVALID, BIOS 66CN54WW 01/21/2013 RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e RSP: :a606c032bc98 EFLAGS: 00010286 RAX: 9981ddce30a0 RBX: 9edc1590 RCX: RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 R10: R11: 00028b56 R12: c06f8880 R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 FS: 7f992dd9a680() GS:9981e708() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0 Call Trace: jump_label_module_notify+0x1e7/0x2b0 notifier_call_chain+0x44/0x70 blocking_notifier_call_chain+0x43/0x60 load_module+0x1bcb/0x2490 ? vfs_read+0x11f/0x150 ? __do_sys_finit_module+0xbf/0xe0 __do_sys_finit_module+0xbf/0xe0 do_syscall_64+0x43/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Josh, didn't you mention that yesterday or so? RIP: 0033:0x7f992e2eeaf9 Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 67 73 0d 00 f7 d8 64 89 01 48 RSP: 002b:7ffca220d288 EFLAGS: 0246 ORIG_RAX: 0139 RAX: ffda RBX: 009b8da0 RCX: 7f992e2eeaf9 RDX: RSI: 7f992e464885 RDI: 0010 RBP: 0002 R08: R09: 009c45c0 R10: 0010 R11: 0246 R12: 7f992e464885 R13: R14: 009acc50 R15: 009b8da0 Modules linked in: kvm_intel(+) kvm irqbypass hid_sensor_hub crc32_pclmul mfd_core i2c_i801 snd_hda_intel i915(+) intel_gtt snd_hda_codec i2c_algo_bit snd_hwdep snd_hda_core drm_kms_helper snd_pcm syscopyarea sysfillrect sysimgblt fb_sys_fops drm hid_multitouch ideapad_laptop sparse_keymap hid_generic wmi efivarfs CR2: 9edc1598 [ end trace dbeb7e66daa9bdca ]--- RIP: 0010:static_key_set_mod.isra.0+0x10/0x30 Code: 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f0 a8 03 75 0d 48 8b 37 83 e6 03 48 09 c6 <48> 89 37 c3 0f 0b 48 8b 37 83 e6 03 48 09 c6 48 89 37 c3 66 66 2e RSP: :a606c032bc98 EFLAGS: 00010286 RAX: 9981ddce30a0 RBX: 9edc1590 RCX: RDX: 0020 RSI: 9981ddce30a0 RDI: 9edc1598 RBP: c06f4000 R08: 9981e6003980 R09: 9981ddce30a0 R10: R11: 00028b56 R12: c06f8880 R13: 9981ddce3080 R14: c06f4008 R15: c06f6dc0 FS: 7f992dd9a680() GS:9981e708() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 9edc1598 CR3: 0002233aa001 CR4: 001606e0
Re: [GIT PULL] x86/topology changes for v5.3
On 2019-07-09 17:31 -0700, Kees Cook wrote: > On Wed, Jul 10, 2019 at 01:17:11AM +0200, Thomas Gleixner wrote: > > On Wed, 10 Jul 2019, Thomas Gleixner wrote: > > > That still does not explain the cr4/0 issue you have. Can you send me your > > > .config please? > > > > Does your machine have UMIP support? None of my test boxes has. So that'd > > be the difference of bits enforced in CR4. Should not matter because it's > > User mode instruction prevention, but who knows. > > Ew. Yeah, I don't have i9 nor i7 for testing this. I did try everything > else I had (and hibernation). Is only Linus able to reproduce this so far? I can, too. > To rule out (in?) UMIP, this would remove UMIP from the pinning: > > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c > index 309b6b9b49d4..f3beedb6da8a 100644 > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -380,7 +380,7 @@ static void __init setup_cr_pinning(void) > { > unsigned long mask; > > - mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP); > + mask = (X86_CR4_SMEP | X86_CR4_SMAP); > cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask; > static_key_enable(_pinning.key); > } I'll try it. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [GIT PULL] x86/topology changes for v5.3
On Wednesday, July 10, 2019 1:00:32 AM CEST Thomas Gleixner wrote: > On Wed, 10 Jul 2019, Thomas Gleixner wrote: > > On Tue, 9 Jul 2019, Linus Torvalds wrote: > > > I also don't have any logs, because the boot never gets far enough. I > > > assume that there was a problem bringing up a non-boot CPU, and the > > > eventual hang ends up being due to that. > > > > Hrm. I just build the tip of your tree and bootet it. It hangs at: > > > > [4.788678] ACPI: 4 ACPI AML tables successfully acquired and loaded > > [4.793860] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored > > [4.821476] ACPI: Dynamic OEM Table Load: > > > > That's way after the secondary CPUs have been brought up. tip/master boots > > without problems with the same config. Let me try x86/asm alone and also a > > revert of the CR4/CR0 stuff. > > x86/asm works alone > > revert of cr4/0 pinning on top of your tree gets stuck as well > > > And while writing this the softlockup detector muttered: > > Have not yet fully bisected it, but Jiri did and he ended up with > > c522ad0637ca ("ACPICA: Update table load object initialization") > > Reverting that on top of your tree makes it work again. Tony has the same > issue. I have a revert of this one queued up (with a plan to push it later today). Cheers, Rafael
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 09, 2019 at 10:15:21PM -0700, Linus Torvalds wrote: > On Tue, Jul 9, 2019 at 8:21 PM Linus Torvalds > wrote: > > > > Whee. It looks like it's bisecting to the same thing. Only partway > > done, but it feels very much like my desktop. > > Confirmed. > > With that config, I get this > > c21ac93288f0 (refs/bisect/bad) Merge tag 'v5.2-rc6' into x86/asm, to > refresh the branch > 8dbec27a242c (HEAD) x86/asm: Pin sensitive CR0 bits > 873d50d58f67 x86/asm: Pin sensitive CR4 bits > > ie those "pin sensitive bits" merge is bad, but before the commits is good. > > I think there is _another_ problem too, and maybe it's the APCI one, > but this one triggers some issue before the other issue even gets to > play.. Okay, fun. Thanks for confirming it. Well, I guess I get to try to find some more modern hardware to track this down. Does booting with init=/bin/sh get far enough to see if all the CPUs are online or anything like that? I'm baffled about the "gets mostly to userspace" part. I'd expect this to explode very badly if it misbehaved. Or maybe something gets confused between how many CPUs are expected and how many actually show up to the party. Hmmm. -- Kees Cook
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 8:21 PM Linus Torvalds wrote: > > Whee. It looks like it's bisecting to the same thing. Only partway > done, but it feels very much like my desktop. Confirmed. With that config, I get this c21ac93288f0 (refs/bisect/bad) Merge tag 'v5.2-rc6' into x86/asm, to refresh the branch 8dbec27a242c (HEAD) x86/asm: Pin sensitive CR0 bits 873d50d58f67 x86/asm: Pin sensitive CR4 bits ie those "pin sensitive bits" merge is bad, but before the commits is good. I think there is _another_ problem too, and maybe it's the APCI one, but this one triggers some issue before the other issue even gets to play.. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 5:59 PM Linus Torvalds wrote: > > On my laptop (which I am at right now), the hang is different, and > maybe it's similar to your ACPI hang issue. I will try that revert, > and at least see if that solves the laptop side. Nope, that wasn't it. Apparently there are three different issues. I guess I'll just have to do a full bisect on the laptop too. That's going to take forever. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 4:00 PM Thomas Gleixner wrote: > > That still does not explain the cr4/0 issue you have. Can you send me your > .config please? Gaah. Now I'm off my desktop and won't be at it again until tomorrow. And that's the one that bisected to the cr0/cr4 bits (and I did all my bisection on that because it's so much faster). Anyway, the kernel config itself is pretty simple. It's basically a F30 kernel config that has been through "make localmodconfig" and then pared down some more to remove stuff I don't use (paravirt etc). I would expect that if it's a confiuration difference, it's more likely to be about the user mode side - since the hang does seem to happen in user mode, not in the kernel (but a very early problem at cpu bringup time could easily have been hidden). On my laptop (which I am at right now), the hang is different, and maybe it's similar to your ACPI hang issue. I will try that revert, and at least see if that solves the laptop side. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, Jul 10, 2019 at 01:17:11AM +0200, Thomas Gleixner wrote: > On Wed, 10 Jul 2019, Thomas Gleixner wrote: > > > > That still does not explain the cr4/0 issue you have. Can you send me your > > .config please? > > Does your machine have UMIP support? None of my test boxes has. So that'd > be the difference of bits enforced in CR4. Should not matter because it's > User mode instruction prevention, but who knows. Ew. Yeah, I don't have i9 nor i7 for testing this. I did try everything else I had (and hibernation). Is only Linus able to reproduce this so far? To rule out (in?) UMIP, this would remove UMIP from the pinning: diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 309b6b9b49d4..f3beedb6da8a 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -380,7 +380,7 @@ static void __init setup_cr_pinning(void) { unsigned long mask; - mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP); + mask = (X86_CR4_SMEP | X86_CR4_SMAP); cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask; static_key_enable(_pinning.key); } -- Kees Cook
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Thomas Gleixner wrote: > > That still does not explain the cr4/0 issue you have. Can you send me your > .config please? Does your machine have UMIP support? None of my test boxes has. So that'd be the difference of bits enforced in CR4. Should not matter because it's User mode instruction prevention, but who knows. Thanks, tglx
Re: [GIT PULL] x86/topology changes for v5.3
On Wed, 10 Jul 2019, Thomas Gleixner wrote: > On Tue, 9 Jul 2019, Linus Torvalds wrote: > > I also don't have any logs, because the boot never gets far enough. I > > assume that there was a problem bringing up a non-boot CPU, and the > > eventual hang ends up being due to that. > > Hrm. I just build the tip of your tree and bootet it. It hangs at: > > [4.788678] ACPI: 4 ACPI AML tables successfully acquired and loaded > [4.793860] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored > [4.821476] ACPI: Dynamic OEM Table Load: > > That's way after the secondary CPUs have been brought up. tip/master boots > without problems with the same config. Let me try x86/asm alone and also a > revert of the CR4/CR0 stuff. x86/asm works alone revert of cr4/0 pinning on top of your tree gets stuck as well > And while writing this the softlockup detector muttered: Have not yet fully bisected it, but Jiri did and he ended up with c522ad0637ca ("ACPICA: Update table load object initialization") Reverting that on top of your tree makes it work again. Tony has the same issue. That still does not explain the cr4/0 issue you have. Can you send me your .config please? Thanks, tglx
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, 9 Jul 2019, Linus Torvalds wrote: > On Tue, Jul 9, 2019 at 2:45 PM Linus Torvalds > wrote: > > > > and I suspect it's the sensitive bit pinning. But I'll delve all the way. > > Confirmed. Bisection says > > 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2 is the first bad commit > commit 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2 > Author: Kees Cook > Date: Mon Jun 17 21:55:02 2019 -0700 > > x86/asm: Pin sensitive CR4 bits > > this is on a bog-standard Intel setup with F30, both desktop and > laptop (i9-9900k and i7-8565u respectively). > > I haven't confirmed yet whether reverting just that one commit is > required, or if I need to revert the cr0 one too. > > I also don't have any logs, because the boot never gets far enough. I > assume that there was a problem bringing up a non-boot CPU, and the > eventual hang ends up being due to that. Hrm. I just build the tip of your tree and bootet it. It hangs at: [4.788678] ACPI: 4 ACPI AML tables successfully acquired and loaded [4.793860] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored [4.821476] ACPI: Dynamic OEM Table Load: That's way after the secondary CPUs have been brought up. tip/master boots without problems with the same config. Let me try x86/asm alone and also a revert of the CR4/CR0 stuff. And while writing this the softlockup detector muttered: [ 245.849408] INFO: task swapper/0:1 blocked for more than 120 seconds. [ 245.853406] Not tainted 5.2.0+ #69 [ 245.857318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 245.865405] swapper/0 D0 1 0 0x80004000 [ 245.869405] Call Trace: [ 245.871859] ? __schedule+0x2bb/0x690 [ 245.877410] ? acpi_ps_complete_op+0x259/0x279 [ 245.881406] schedule+0x29/0x90 [ 245.884540] schedule_timeout+0x20d/0x310 [ 245.889409] ? acpi_os_release_object+0xa/0x10 [ 245.893406] ? acpi_ps_delete_parse_tree+0x2d/0x59 [ 245.897405] __down_timeout+0x9b/0x100 [ 245.901150] down_timeout+0x43/0x50 [ 245.905407] acpi_os_wait_semaphore+0x48/0x60 [ 245.909408] acpi_ut_acquire_mutex+0x45/0x89 [ 245.913406] ? acpi_ns_init_one_package+0x44/0x44 [ 245.917406] acpi_walk_namespace+0x62/0xc2 [ 245.921406] acpi_ns_initialize_objects+0x40/0x7b [ 245.925407] acpi_ex_load_table_op+0x157/0x1b9 [ 245.933406] acpi_ex_opcode_6A_0T_1R+0x158/0x1c2 [ 245.937406] acpi_ds_exec_end_op+0xca/0x401 [ 245.941406] acpi_ps_parse_loop+0x492/0x5c6 [ 245.945406] acpi_ps_parse_aml+0x91/0x2b8 [ 245.949405] acpi_ps_execute_method+0x15d/0x191 [ 245.953406] acpi_ns_evaluate+0x1bf/0x24c [ 245.957406] acpi_evaluate_object+0x137/0x240 [ 245.961408] ? kmem_cache_alloc_trace+0x15b/0x1c0 [ 245.965406] acpi_processor_set_pdc+0x135/0x180 [ 245.969408] early_init_pdc+0xb3/0xd0 [ 245.973064] acpi_ns_walk_namespace+0xda/0x1aa [ 245.977407] ? set_no_mwait+0x23/0x23 [ 245.981062] ? set_no_mwait+0x23/0x23 [ 245.985406] acpi_walk_namespace+0x9a/0xc2 [ 245.989406] ? acpi_sleep_proc_init+0x24/0x24 [ 245.993407] ? do_early_param+0x8e/0x8e [ 245.997235] acpi_early_processor_set_pdc+0x31/0x49 [ 246.001406] acpi_init+0x17c/0x321 Thanks, tglx
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 3:00 PM Linus Torvalds wrote: > > I haven't confirmed yet whether reverting just that one commit is > required, or if I need to revert the cr0 one too. Oh, I can't revert just the cr4 one, because the cr0 one depends on it. But reverting both gets my desktop back to life. My laptop (that hung earlier) seems to be another (possibly additional) issue. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 2:45 PM Linus Torvalds wrote: > > and I suspect it's the sensitive bit pinning. But I'll delve all the way. Confirmed. Bisection says 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2 is the first bad commit commit 873d50d58f67ef15d2777b5e7f7a5268bb1fbae2 Author: Kees Cook Date: Mon Jun 17 21:55:02 2019 -0700 x86/asm: Pin sensitive CR4 bits this is on a bog-standard Intel setup with F30, both desktop and laptop (i9-9900k and i7-8565u respectively). I haven't confirmed yet whether reverting just that one commit is required, or if I need to revert the cr0 one too. I also don't have any logs, because the boot never gets far enough. I assume that there was a problem bringing up a non-boot CPU, and the eventual hang ends up being due to that. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 2:26 PM Linus Torvalds wrote: > > but still bisecting to pinpoint the exact thing. One of: c21ac93288f0 (refs/bisect/bad) Merge tag 'v5.2-rc6' into x86/asm, to refresh the branch 8dbec27a242c x86/asm: Pin sensitive CR0 bits 873d50d58f67 x86/asm: Pin sensitive CR4 bits 7b347ad4938d (HEAD) Merge tag 'v5.2-rc5' into x86/asm, to refresh the branch 9db9b76767f1 Documentation/x86: Fix path to entry_32.S 7231d0165df3 x86/asm: Remove unused TASK_TI_flags from asm-offsets.c and I suspect it's the sensitive bit pinning. But I'll delve all the way. Linus
Re: [GIT PULL] x86/topology changes for v5.3
On Tue, Jul 9, 2019 at 2:20 PM Linus Torvalds wrote: > > Will keep you updated as bisection narrows it down. Hmm. Already narrowed down to either of Pull x86 asm updates from Ingo Molnar: Pull scheduler updates from Ingo Molnar: but still bisecting to pinpoint the exact thing. Linus
Re: [GIT PULL] x86/topology changes for v5.3
After doing a lot of build testing and merging, I finally got around to actually boot-testing the result. Something in this series has broken booting for me. Or rather, things *boot*, but then it hangs in user space, with A start job is running for udev Wait for Complete Device Initialization it does this on both my desktop and laptop, although the exact hang is different (on my laptop, it hangs earlier - I don't even get to input the disk encryption keys). I'm bisecting, and it will take a while to pinpoint, but I can already tell that it's from one of these pulls: Pull x86 topology updates from Ingo Molnar: Pull x86 platform updayes from Ingo Molnar: Pull x86 paravirt updates from Ingo Molnar: Pull x86 AVX512 status update from Ingo Molnar: Pull x86 cleanups from Ingo Molnar: Pull x86 cache resource control update from Ingo Molnar: Pull x86 build updates from Ingo Molnar: Pull x86 asm updates from Ingo Molnar: Pull scheduler updates from Ingo Molnar: Will keep you updated as bisection narrows it down. Linus
Re: [GIT PULL] x86/topology changes for v5.3
The pull request you sent on Mon, 8 Jul 2019 18:27:56 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > x86-topology-for-linus has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/222a21d29521d144f3dd7a0bc4d4020e448f0126 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker