On Fri, Apr 24, 2015 at 1:41 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
> On Fri, Apr 24, 2015 at 10:33 AM, Brian Gerst <brge...@gmail.com> wrote:
>>
>> To clarify, I was thinking of the CONFIG_PREEMPT case.  A nested
>> interrupt wouldn't change SS, and IST interrupts can't schedule.
>
> It has absolutely nothing to do with nested interrupts or CONFIG_PREEMPT.
>
> The problem happens simply because
>
>  - process A does a system call SS=__KERNEL_DS
>
>  - the system call sleeps for whatever reason. SS is still __KERNEL_DS
>
>  - process B runs, returns to user space, and takes an interrupt. Now SS=0
>
>  - process B is about to return to user space (where the interrupt
>    happened), but we schedule as part of that regular user-space return.
>    SS=0
>
>  - process A returns to user space using sysret, the SS selector
>    becomes __USER_DS, but the cached descriptor remains non-present
>
> Notice? No nested interrupts, no CONFIG_PREEMPT, nothing special at all.
>
> The reason Luto's patch fixes the problem is that now the scheduling
> from B back to A will reload SS, making it __KERNEL_DS, but more
> importantly, fixing the cached descriptor to be the usual present flag
> one, which is what the AMD sysret instruction needs.
>
> Or do I misunderstand what you are talking about?
>
>                Linus

Your explanation is correct.  I meant that this can happen even if
CONFIG_PREEMPT is disabled.  I just took "preemption" to mean kernel
preemption, not normal scheduling.

--
Brian Gerst
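
For anyone following along, the sequence Linus describes can be replayed
as a toy state machine.  This is only a sketch of the scenario in this
thread, not kernel code: the Cpu class, its method names, and the
reload_ss_on_switch flag are all made up for illustration, and the model
tracks nothing but the SS selector and a single "present" bit for the
cached descriptor.  Modeling B's first return to user space as a plain
SS load is an assumption as well.

```python
from dataclasses import dataclass

# Selector values as used by x86-64 Linux; only their identity matters here.
KERNEL_DS = 0x18
USER_DS = 0x2B


@dataclass
class Cpu:
    """Hypothetical model: SS selector plus the cached descriptor's present bit."""
    ss_selector: int = KERNEL_DS
    ss_present: bool = True

    def syscall_entry(self):
        # Per the thread: after a system call, SS is __KERNEL_DS.
        self.ss_selector, self.ss_present = KERNEL_DS, True

    def load_ss(self, selector):
        # An explicit SS load refreshes both the selector and the cached descriptor.
        self.ss_selector, self.ss_present = selector, True

    def interrupt_from_user(self):
        # Per the thread: an interrupt taken from user space leaves SS=0,
        # with a non-present cached descriptor.
        self.ss_selector, self.ss_present = 0, False

    def sysret_amd(self):
        # Per the thread: AMD sysret sets the selector to __USER_DS but
        # leaves the cached descriptor as it was.
        self.ss_selector = USER_DS


def run(reload_ss_on_switch):
    cpu = Cpu()
    cpu.syscall_entry()            # process A does a system call, then sleeps
    cpu.load_ss(USER_DS)           # process B runs and returns to user space (assumed SS load)
    cpu.interrupt_from_user()      # B takes an interrupt: SS=0
    if reload_ss_on_switch:        # schedule B -> A; the fix reloads SS here
        cpu.load_ss(KERNEL_DS)
    cpu.sysret_amd()               # process A returns to user space via sysret
    return cpu


broken = run(reload_ss_on_switch=False)
fixed = run(reload_ss_on_switch=True)
print(hex(broken.ss_selector), broken.ss_present)
print(hex(fixed.ss_selector), fixed.ss_present)
```

In both runs the selector ends up as __USER_DS; only the run that
reloads SS on the switch from B back to A leaves the cached descriptor
present.  Note that nothing in the model involves nested interrupts or
kernel preemption.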