Re: [PATCH v2 3/3] x86/entry/64: Move #BP from IST to the IRQ stack

2015-07-29 Thread Borislav Petkov
On Wed, Jul 29, 2015 at 10:57:26AM -0700, Andy Lutomirski wrote:
> OK if I do that as a follow-up?  It would probably want to be a
> separate patch anyway.

Of course.

> Hmm, I'm starting to like this new regime in which we never ever
> switch to user mode from anywhere other than the standard kernel
> stack.  It looks like even Xen may play along and do it cleanly soon
> :)  Maybe I'll even add an assertion somewhere to make sure we don't
> break it.  (I think this also means that the bad iret fixup can be
> simplified.)

Definitely sounds like a nice, logical thing. We sometimes switch stacks
to land on the kernel stack before returning to user mode (IST and all)
but I guess that's a clean enough thing to do. Oh, and only a couple of
insns so yeah.

> Also, with all this stuff applied (and the modify_ldt thing, once the
> Xen folks figure out what's wrong), I think we can reinstate the old
> LARL check for 16-bit segments and thus prevent naughty users from
> banging on espfix using only sigreturn.

Uuh, and then only check ZF. I guess this should cover all the legacy
cases, which is nice.

Yeah, sounds coolio. :-)

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/3] x86/entry/64: Move #BP from IST to the IRQ stack

2015-07-29 Thread Andy Lutomirski
On Tue, Jul 28, 2015 at 2:54 AM, Borislav Petkov  wrote:
> On Fri, Jul 24, 2015 at 10:57:06PM -0700, Andy Lutomirski wrote:
>> There's nothing IST-worthy about #BP/int3.  We don't allow kprobes
>> in the small handful of places in the kernel that run at CPL0 with
>> an invalid stack, and 32-bit kernels have used normal interrupt
>> gates for #BP forever.
>>
>> Furthermore, we don't allow kprobes in places that have usergs while
>> in kernel mode, so "paranoid" is also unnecessary.
>>
>> Signed-off-by: Andy Lutomirski 
>> ---
>>  arch/x86/entry/entry_64.S |  2 +-
>>  arch/x86/kernel/traps.c   | 26 +-
>>  2 files changed, 14 insertions(+), 14 deletions(-)
>
> ...
>
>> @@ -494,7 +494,15 @@ dotraplinkage void notrace do_int3(struct pt_regs 
>> *regs, long error_code)
>>   if (poke_int3_handler(regs))
>>   return;
>>
>> + /*
>> +  * Use ist_enter despite the fact that we don't use an IST stack.
>> +  * We can be called from a kprobe in non-CONTEXT_KERNEL kernel
>> +  * mode or even during context tracking state changes.
>> +  *
>> +  * This means that we can't schedule.  That's okay.
>> +  */
>>   ist_enter(regs);
>
> Let's rename that thing. Call it atomic_ctxt_enter or whatever...
>

OK if I do that as a follow-up?  It would probably want to be a
separate patch anyway.

Hmm, I'm starting to like this new regime in which we never ever
switch to user mode from anywhere other than the standard kernel
stack.  It looks like even Xen may play along and do it cleanly soon
:)  Maybe I'll even add an assertion somewhere to make sure we don't
break it.  (I think this also means that the bad iret fixup can be
simplified.)

Also, with all this stuff applied (and the modify_ldt thing, once the
Xen folks figure out what's wrong), I think we can reinstate the old
LARL check for 16-bit segments and thus prevent naughty users from
banging on espfix using only sigreturn.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/3] x86/entry/64: Move #BP from IST to the IRQ stack

2015-07-28 Thread Borislav Petkov
On Fri, Jul 24, 2015 at 10:57:06PM -0700, Andy Lutomirski wrote:
> There's nothing IST-worthy about #BP/int3.  We don't allow kprobes
> in the small handful of places in the kernel that run at CPL0 with
> an invalid stack, and 32-bit kernels have used normal interrupt
> gates for #BP forever.
> 
> Furthermore, we don't allow kprobes in places that have usergs while
> in kernel mode, so "paranoid" is also unnecessary.
> 
> Signed-off-by: Andy Lutomirski 
> ---
>  arch/x86/entry/entry_64.S |  2 +-
>  arch/x86/kernel/traps.c   | 26 +-
>  2 files changed, 14 insertions(+), 14 deletions(-)

...

> @@ -494,7 +494,15 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, 
> long error_code)
>   if (poke_int3_handler(regs))
>   return;
>  
> + /*
> +  * Use ist_enter despite the fact that we don't use an IST stack.
> +  * We can be called from a kprobe in non-CONTEXT_KERNEL kernel
> +  * mode or even during context tracking state changes.
> +  *
> +  * This means that we can't schedule.  That's okay.
> +  */
>   ist_enter(regs);

Let's rename that thing. Call it atomic_ctxt_enter or whatever...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/3] x86/entry/64: Move #BP from IST to the IRQ stack

2015-07-24 Thread Andy Lutomirski
There's nothing IST-worthy about #BP/int3.  We don't allow kprobes
in the small handful of places in the kernel that run at CPL0 with
an invalid stack, and 32-bit kernels have used normal interrupt
gates for #BP forever.

Furthermore, we don't allow kprobes in places that have usergs while
in kernel mode, so "paranoid" is also unnecessary.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/entry_64.S |  2 +-
 arch/x86/kernel/traps.c   | 26 +-
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index af3573e75ed4..e353709d1fcc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1007,7 +1007,7 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
 #endif /* CONFIG_HYPERV */
 
 idtentry debug do_debughas_error_code=0
paranoid=1 shift_ist=DEBUG_STACK
-idtentry int3  do_int3 has_error_code=0
paranoid=1 shift_ist=DEBUG_STACK
+idtentry int3  do_int3 has_error_code=0
irqstack=1
 idtentry stack_segment do_stack_segmenthas_error_code=1
 
 #ifdef CONFIG_XEN
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 8e65d8a9b8db..d823db70f492 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -479,7 +479,7 @@ do_general_protection(struct pt_regs *regs, long error_code)
 }
 NOKPROBE_SYMBOL(do_general_protection);
 
-/* May run on IST stack. */
+/* Runs on IRQ stack. */
 dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 {
 #ifdef CONFIG_DYNAMIC_FTRACE
@@ -494,7 +494,15 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, 
long error_code)
if (poke_int3_handler(regs))
return;
 
+   /*
+* Use ist_enter despite the fact that we don't use an IST stack.
+* We can be called from a kprobe in non-CONTEXT_KERNEL kernel
+* mode or even during context tracking state changes.
+*
+* This means that we can't schedule.  That's okay.
+*/
ist_enter(regs);
+
CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
@@ -511,15 +519,10 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, 
long error_code)
SIGTRAP) == NOTIFY_STOP)
goto exit;
 
-   /*
-* Let others (NMI) know that the debug stack is in use
-* as we may switch to the interrupt stack.
-*/
-   debug_stack_usage_inc();
preempt_conditional_sti(regs);
do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, NULL);
preempt_conditional_cli(regs);
-   debug_stack_usage_dec();
+
 exit:
ist_exit(regs);
 }
@@ -885,19 +888,16 @@ void __init trap_init(void)
cpu_init();
 
/*
-* X86_TRAP_DB and X86_TRAP_BP have been set
-* in early_trap_init(). However, ITS works only after
-* cpu_init() loads TSS. See comments in early_trap_init().
+* X86_TRAP_DB was installed in early_trap_init(). However,
+* IST works only after cpu_init() loads TSS. See comments
+* in early_trap_init().
 */
set_intr_gate_ist(X86_TRAP_DB, &debug, DEBUG_STACK);
-   /* int3 can be called from all */
-   set_system_intr_gate_ist(X86_TRAP_BP, &int3, DEBUG_STACK);
 
x86_init.irqs.trap_init();
 
 #ifdef CONFIG_X86_64
memcpy(&debug_idt_table, &idt_table, IDT_ENTRIES * 16);
set_nmi_gate(X86_TRAP_DB, &debug);
-   set_nmi_gate(X86_TRAP_BP, &int3);
 #endif
 }
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/