RE: [PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Luck, Tony
> Yeah, saw that, but that should be trivially fixable I'm thinking.

Trivial, maybe ... but then follows the audit of other get_user() calls:

git grep get_user\( | wc -l
2003

:-(

-Tony


Re: [PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Peter Zijlstra
On Fri, Jan 08, 2021 at 11:08:58PM +, Luck, Tony wrote:
> > I think this is horrid; why can't we have it return something different
> > then -EFAULT instead?
> 
> I did consider this ... but it appears that architectures aren't unified in 
> the
> return value from get_user()

But surely none are currently returning -EMEMERR or whatever name we
come up with.

> Here's another function involved in the futex call chain leading to this:
> 
> static int get_futex_value_locked(u32 *dest, u32 __user *from)
> {
> int ret;
> 
> pagefault_disable();
> ret = __get_user(*dest, from);
> pagefault_enable();
> 
> return ret ? -EFAULT : 0;
> }
> 
> It seems like the expectation here is just "zero or not" and we
> don't care what the "not" value is ... just turn it into -EFAULT.

Yeah, saw that, but that should be trivially fixable I'm thinking.


RE: [PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Luck, Tony
> I think this is horrid; why can't we have it return something different
> then -EFAULT instead?

I did consider this ... but it appears that architectures aren't unified in the
return value from get_user()

Here's another function involved in the futex call chain leading to this:

static int get_futex_value_locked(u32 *dest, u32 __user *from)
{
int ret;

pagefault_disable();
ret = __get_user(*dest, from);
pagefault_enable();

return ret ? -EFAULT : 0;
}

It seems like the expectation here is just "zero or not" and we
don't care what the "not" value is ... just turn it into -EFAULT.

-Tony


Re: [PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Peter Zijlstra
On Fri, Jan 08, 2021 at 02:22:51PM -0800, Tony Luck wrote:
> futex_wait_setup() first tries to read the user value with page faults
> disabled (because it holds a lock, and so cannot sleep). If that read
> fails it drops the lock and tries again.
> 
> But there are now two reasons why the user space read can fail. Either:
> 1) legacy case of a page fault, in which case it is reasonable to retry
> 2) machine check on the user address, bad idea to re-read
> 
> Add some infrastructure to differentiate these cases.

> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -2658,6 +2658,9 @@ static int futex_wait_setup(u32 __user *uaddr, u32 val, 
> unsigned int flags,
>   if (ret) {
>   queue_unlock(*hb);
>  
> + if (arch_memory_failure(uaddr))
> + return ret;
> +
>   ret = get_user(uval, uaddr);
>   if (ret)
>   return ret;


I think this is horrid; why can't we have it return something different
then -EFAULT instead?


[PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Tony Luck
futex_wait_setup() first tries to read the user value with page faults
disabled (because it holds a lock, and so cannot sleep). If that read
fails it drops the lock and tries again.

But there are now two reasons why the user space read can fail. Either:
1) legacy case of a page fault, in which case it is reasonable to retry
2) machine check on the user address, bad idea to re-read

Add some infrastructure to differentiate these cases.

Signed-off-by: Tony Luck 
---
 arch/x86/include/asm/mmu.h |  7 +++
 arch/x86/kernel/cpu/mce/core.c | 10 ++
 include/linux/mm.h |  4 
 kernel/futex.c |  3 +++
 4 files changed, 24 insertions(+)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 5d7494631ea9..a46c78381388 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -66,4 +66,11 @@ typedef struct {
 void leave_mm(int cpu);
 #define leave_mm leave_mm
 
+#if defined(CONFIG_X86_MCE) && defined(CONFIG_MEMORY_FAILURE)
+#undef arch_memory_failure
+#define arch_memory_failure x86_memory_failure
+#endif
+
+bool x86_memory_failure(u32 __user *addr);
+
 #endif /* _ASM_X86_MMU_H */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 1bf11213e093..b27aa30290bb 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1236,6 +1236,16 @@ static void __mc_scan_banks(struct mce *m, struct 
pt_regs *regs, struct mce *fin
*m = *final;
 }
 
+bool x86_memory_failure(u32 __user *addr)
+{
+   if (current->mce_busy == 0)
+   return false;
+
+   WARN_ON(current->mce_vaddr != addr);
+
+   return true;
+}
+
 static void kill_me_now(struct callback_head *ch)
 {
force_sig(SIGBUS);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8cd6ae..470708a71dd3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3177,5 +3177,9 @@ unsigned long wp_shared_mapping_range(struct 
address_space *mapping,
 
 extern int sysctl_nr_trim_pages;
 
+#ifndef arch_memory_failure
+#define arch_memory_failure(vaddr) (0)
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/kernel/futex.c b/kernel/futex.c
index c47d1015d759..8fa2fc854026 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2658,6 +2658,9 @@ static int futex_wait_setup(u32 __user *uaddr, u32 val, 
unsigned int flags,
if (ret) {
queue_unlock(*hb);
 
+   if (arch_memory_failure(uaddr))
+   return ret;
+
ret = get_user(uval, uaddr);
if (ret)
return ret;
-- 
2.21.1