On Fri, May 8, 2020 at 1:48 AM Peter Zijlstra <[email protected]> wrote:
>
> On Thu, May 07, 2020 at 11:02:09AM -0700, Andy Lutomirski wrote:
> > On Tue, May 5, 2020 at 7:13 AM Thomas Gleixner <[email protected]> wrote:
> > >
> > > From: Peter Zijlstra <[email protected]>
> > >
> > > Convert #MC over to using task_work_add(); it will run the same code
> > > slightly later, on the return to user path of the same exception.
> >
> > I think this patch is correct, but I think it's only one small and not
> > that obviously wrong step away from being broken:
> >
> > >         if ((m.cs & 3) == 3) {
> > >                 /* If this triggers there is no way to recover. Die hard. */
> > >                 BUG_ON(!on_thread_stack() || !user_mode(regs));
> > > -               local_irq_enable();
> > > -               preempt_enable();
> > >
> > > -               if (kill_it || do_memory_failure(&m))
> > > -                       force_sig(SIGBUS);
> > > -               preempt_disable();
> > > -               local_irq_disable();
> > > +               current->mce_addr = m.addr;
> > > +               current->mce_status = m.mcgstatus;
> > > +               current->mce_kill_me.func = kill_me_maybe;
> > > +               if (kill_it)
> > > +                       current->mce_kill_me.func = kill_me_now;
> > > +               task_work_add(current, &current->mce_kill_me, true);
> >
> > This is fine if the source was CPL3, but it's not going to work if CPL
> > was 0.  We don't *currently* do this from CPL0, but people keep
> > wanting to.  So perhaps there should be a comment like:
> >
> > /*
> >  * The #MC originated at CPL3, so we know that we will go execute the
> >  * task_work before returning to the offending user code.
> >  */
> >
> > IOW, if we want to recover from CPL0 #MC, we will need a different
> > mechanism.
>
> See part4-18's IDTRENTRY_NOIST. That will get us a clear CPL3/CPL0
> separation.

I will hold my breath.

>
> > I also confess a certain amount of sadness that my beautiful
> > haha-not-really-atomic-here mechanism isn't being used anymore. :(
>
> I think we have a subtly different interpretation of 'beautiful' here.

Beauty is in the eye of the beholder.  And sometimes in the eye of the
person who wrote the code :)
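
For anyone following the CPL3 vs. CPL0 point above, here is a toy userspace
model of why deferring to task work only helps when the exception came from
user mode. This is a sketch under stated assumptions, not kernel code: every
toy_* name is hypothetical, the list handling is deliberately simplistic, and
the real task_work_add() and exit-to-user paths are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's callback head; not the real type. */
struct toy_work {
	struct toy_work *next;
	void (*func)(struct toy_work *);
};

/* Hypothetical stand-in for the few task fields this model needs. */
struct toy_task {
	struct toy_work *pending;	/* work queued for exit-to-user */
	struct toy_work mce_kill_me;
	bool sigbus_sent;
};

static struct toy_task toy_current;

/* Queue work on the task; it does not run here. */
static void toy_task_work_add(struct toy_task *t, struct toy_work *w)
{
	w->next = t->pending;
	t->pending = w;
}

/* Models the return-to-user path, the only place queued work is run. */
static void toy_exit_to_user(struct toy_task *t)
{
	while (t->pending) {
		struct toy_work *w = t->pending;

		t->pending = w->next;
		assert(w->func != NULL);
		w->func(w);
	}
}

static void toy_kill_me_now(struct toy_work *w)
{
	(void)w;
	toy_current.sigbus_sent = true;	/* stands in for force_sig(SIGBUS) */
}

/*
 * Models the handler: queue the work, then return to whatever was
 * interrupted. Returns true if the work ran before that context resumed.
 */
static bool toy_machine_check(bool from_user)
{
	toy_current.pending = NULL;
	toy_current.sigbus_sent = false;
	toy_current.mce_kill_me.func = toy_kill_me_now;
	toy_task_work_add(&toy_current, &toy_current.mce_kill_me);

	if (from_user)
		toy_exit_to_user(&toy_current);	/* CPL3: return path runs the work */
	/* CPL0: we go straight back to kernel code; the work stays queued. */

	return toy_current.sigbus_sent;
}
```

In this model toy_machine_check(true) returns true and toy_machine_check(false)
returns false with the work still sitting on the pending list, which is the
gap the suggested comment is meant to call out.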

