Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-19 Thread Jue Wang
On Thu, 8 Apr 2021 10:08:52 -0700, Tony Luck wrote: > KVM apparently passes a machine check into the guest. Though it seems > to be missing the MCG_STATUS information to tell the guest whether this > is an "Action Required" machine check, or an "Action Optional" (i.e. > whether the poison was

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-14 Thread Borislav Petkov
On Wed, Apr 14, 2021 at 07:46:49AM -0700, Jue Wang wrote: > I can see this is useful in other types of domains, e.g., on multi-tenant > cloud > servers where many VMs are collocated on the same host, > with proper recovery + live migration, a single MCE would only affect a single > VM at most. >

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-14 Thread Jue Wang
On Wed, Apr 14, 2021 at 6:10 AM Borislav Petkov wrote: > > On Tue, Apr 13, 2021 at 10:47:21PM -0700, Jue Wang wrote: > > This path is when EPT #PF finds accesses to a hwpoisoned page and > > sends SIGBUS to user space (KVM exits into user space) with the same > > semantic as if regular #PF found

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-14 Thread Borislav Petkov
On Tue, Apr 13, 2021 at 10:47:21PM -0700, Jue Wang wrote: > This path is when EPT #PF finds accesses to a hwpoisoned page and > sends SIGBUS to user space (KVM exits into user space) with the same > semantic as if regular #PF found access to a hwpoisoned page. > > The KVM_X86_SET_MCE ioctl

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-14 Thread Borislav Petkov
On Tue, Apr 13, 2021 at 04:13:03PM +0000, Luck, Tony wrote: > Even if no applications ever do anything with it, it is still useful to avoid > crashing the whole system and just terminate one application/guest. True. > There's one more item on my long term TODO list. Add fixups so that >

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-13 Thread Jue Wang
On Tue, 13 Apr 2021 12:07:22 +0200, Petkov, Borislav wrote: >> KVM apparently passes a machine check into the guest. > Ah, there it is: > static void kvm_send_hwpoison_signal(unsigned long address, struct > task_struct *tsk) > { > send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address,

RE: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-13 Thread Luck, Tony
> So what I'm missing with all this fun is, yeah, sure, we have this > facility out there but who's using it? Is anyone even using it at all? Even if no applications ever do anything with it, it is still useful to avoid crashing the whole system and just terminate one application/guest. > If so,

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-13 Thread Borislav Petkov
On Thu, Apr 08, 2021 at 10:08:52AM -0700, Luck, Tony wrote: > Also not clear to me either ... but sending a SIGBUS to a kthread isn't > going to do anything useful. So avoiding doing that is another worthy > goal. Ack. > With these patches nothing gets killed when kernel touches user poison. >

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-08 Thread Luck, Tony
On Thu, Apr 08, 2021 at 10:49:58AM +0200, Borislav Petkov wrote: > On Wed, Apr 07, 2021 at 02:43:10PM -0700, Luck, Tony wrote: > > On Wed, Apr 07, 2021 at 11:18:16PM +0200, Borislav Petkov wrote: > > > On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote: > > > > Andy Lutomirski pointed out

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-08 Thread Borislav Petkov
On Wed, Apr 07, 2021 at 02:43:10PM -0700, Luck, Tony wrote: > On Wed, Apr 07, 2021 at 11:18:16PM +0200, Borislav Petkov wrote: > > On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote: > > > Andy Lutomirski pointed out that sending SIGBUS to tasks that > > > hit poison in the kernel copying

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-07 Thread Luck, Tony
On Wed, Apr 07, 2021 at 11:18:16PM +0200, Borislav Petkov wrote: > On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote: > > Andy Lutomirski pointed out that sending SIGBUS to tasks that > > hit poison in the kernel copying syscall parameters from user > > address space is not the right

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-07 Thread Borislav Petkov
On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote: > Andy Lutomirski pointed out that sending SIGBUS to tasks that > hit poison in the kernel copying syscall parameters from user > address space is not the right semantic. What does that mean exactly? From looking at the code, that is

[PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-03-25 Thread Tony Luck
Andy Lutomirski pointed out that sending SIGBUS to tasks that hit poison in the kernel copying syscall parameters from user address space is not the right semantic. So stop doing that. Add a new kill_me_never() call back that simply unmaps and offlines the poison page. current->mce_vaddr is no