On 08/18/2015, 07:12 PM, Thomas D. wrote:
>>> --- a/arch/x86/kernel/entry_64.S
>>> +++ b/arch/x86/kernel/entry_64.S
>>> @@ -1715,19 +1715,88 @@ ENTRY(nmi)
>>>      * a nested NMI that updated the copy interrupt stack frame, a
>>>      * jump will be made to the repeat_nmi code that will handle the second
>>>      * NMI.
>>> +    *
>>> +    * However, espfix prevents us from directly returning to userspace
>>> +    * with a single IRET instruction.  Similarly, IRET to user mode
>>> +    * can fault.  We therefore handle NMIs from user space like
>>> +    * other IST entries.
>>>      */
>>>  
>>>     /* Use %rdx as our temp variable throughout */
>>>     pushq_cfi %rdx
>>>     CFI_REL_OFFSET rdx, 0
>>>  
>>> +   testb   $3, CS-RIP+8(%rsp)
>>> +   jz      .Lnmi_from_kernel
>>> +
>>> +   /*
>>> +    * NMI from user mode.  We need to run on the thread stack, but we
>>> +    * can't go through the normal entry paths: NMIs are masked, and
>>> +    * we don't want to enable interrupts, because then we'll end
>>> +    * up in an awkward situation in which IRQs are on but NMIs
>>> +    * are off.
>>> +    */
>>> +
>>> +   SWAPGS
>>> +   cld
>>> +   movq    %rsp, %rdx
>>> +   movq    PER_CPU_VAR(kernel_stack), %rsp
>>
>> I think you are wasting stack space here: with kernel_stack, you should
>> add 5*8 (KERNEL_STACK_OFFSET) to the pointer, since space for 5
>> registers is already pre-reserved at kernel_stack. (Or use movq instead
>> of the 5 pushq below.)
>>
>> Why don't you re-use the 3.16's version anyway?
>>
>>> +   pushq   5*8(%rdx)       /* pt_regs->ss */
>>> +   pushq   4*8(%rdx)       /* pt_regs->rsp */
>>> +   pushq   3*8(%rdx)       /* pt_regs->flags */
>>> +   pushq   2*8(%rdx)       /* pt_regs->cs */
>>> +   pushq   1*8(%rdx)       /* pt_regs->rip */
>>> +   pushq   $-1             /* pt_regs->orig_ax */
>>> +   pushq   %rdi            /* pt_regs->di */
>>> +   pushq   %rsi            /* pt_regs->si */
>>> +   pushq   (%rdx)          /* pt_regs->dx */
>>> +   pushq   %rcx            /* pt_regs->cx */
>>> +   pushq   %rax            /* pt_regs->ax */
>>> +   pushq   %r8             /* pt_regs->r8 */
>>> +   pushq   %r9             /* pt_regs->r9 */
>>> +   pushq   %r10            /* pt_regs->r10 */
>>> +   pushq   %r11            /* pt_regs->r11 */
>>> +   pushq   %rbx            /* pt_regs->rbx */
>>> +   pushq   %rbp            /* pt_regs->rbp */
>>> +   pushq   %r12            /* pt_regs->r12 */
>>> +   pushq   %r13            /* pt_regs->r13 */
>>> +   pushq   %r14            /* pt_regs->r14 */
>>> +   pushq   %r15            /* pt_regs->r15 */
> 
> Mh, so you mean
> 
>> +    addq    $KERNEL_STACK_OFFSET, %rsp
> 
> between
> 
>> +    movq    PER_CPU_VAR(kernel_stack), %rsp
> 
> and
> 
>> +    pushq   5*8(%rdx)       /* pt_regs->ss */
> 
> is missing?

Yep, that makes sense. But I am not an x86 maintainer :P.

-- 
js
suse labs