Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-04 Thread Willy Tarreau
On Tue, Aug 04, 2015 at 05:54:51AM +0200, Borislav Petkov wrote:
 On Mon, Aug 03, 2015 at 11:45:24AM -0700, Andy Lutomirski wrote:
  P.P.P.S.  Who thought that IRET faults unmasking NMIs made any sense
  whatsoever when NMIs run on an IST stack?  Seriously, people?
 
 What happened with asking Intel for a sane IRET-NG?
 
 Should be relatively easy - take the current IRET microcode, get rid
 of the nasty crap, allocate a new opcode and done. Validation should
 actually have *less* to do and can reuse all current test cases.

Even easier, just add a few flags (probably 2 or 3 only) that IRET can
check to adjust its behaviour. Basically don't re-enable NMIs yet,
maybe something to adjust the behaviour on bad CS/SS/SP/IP and a few
such things could possibly help. Maybe all of this could be summarized
as a single flag I'm in a fault handler.

Willy


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Willy Tarreau
On Mon, Aug 03, 2015 at 03:35:15PM -0700, Kees Cook wrote:
 Yay for perm disable! Thank you! :)

Andy would like to see this evolve towards something possibly
more complete and/or generic. I think this needs more thoughts
and that we should possibly stick to 0/1 for now and decide how
we want to make this evolve later to cover permanent disable,
various ABIs, etc...

What do you think ?

Willy


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Kees Cook
On Mon, Aug 3, 2015 at 4:19 PM, Willy Tarreau w...@1wt.eu wrote:
 On Mon, Aug 03, 2015 at 03:35:15PM -0700, Kees Cook wrote:
 Yay for perm disable! Thank you! :)

 Andy would like to see this evolve towards something possibly
 more complete and/or generic. I think this needs more thoughts
 and that we should possibly stick to 0/1 for now and decide how
 we want to make this evolve later to cover permanent disable,
 various ABIs, etc...

 What do you think ?

That's probably the best way forward. I still think a generic syscall
disabling feature would be nice. :) I won't have time to work on it
for a little while, though.

-Kees

-- 
Kees Cook
Chrome OS Security

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Kees Cook
On Mon, Aug 3, 2015 at 11:23 AM, Willy Tarreau w...@1wt.eu wrote:
 For distros who prefer not to take the risk of completely disabling the
 modify_ldt syscall using CONFIG_MODIFY_LDT_SYSCALL, this patch adds a
 sysctl to enable, temporarily disable, or permanently disable it at
 runtime, and proposes to temporarily disable it by default. This can be
 a safe alternative. A message is logged if an attempt was stopped so that
 it's easy to spot if/when it is needed.

 Cc: Andy Lutomirski l...@kernel.org
 Cc: Kees Cook keesc...@chromium.org
 Signed-off-by: Willy Tarreau w...@1wt.eu
 ---
  Documentation/sysctl/kernel.txt | 16 
  arch/x86/Kconfig| 17 +
  arch/x86/kernel/ldt.c   | 15 +++
  kernel/sysctl.c | 14 ++
  4 files changed, 62 insertions(+)

 diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
 index 6fccb69..55648b9 100644
 --- a/Documentation/sysctl/kernel.txt
 +++ b/Documentation/sysctl/kernel.txt
 @@ -41,6 +41,7 @@ show up in /proc/sys/kernel:
  - kptr_restrict
  - kstack_depth_to_print   [ X86 only ]
  - l2cr[ PPC only ]
 +- modify_ldt  [ X86 only ]
  - modprobe== Documentation/debugging-modules.txt
  - modules_disabled
  - msg_next_id[ sysv ipc ]
 @@ -391,6 +392,21 @@ This flag controls the L2 cache of G3 processor boards. 
 If

  ==

 +modify_ldt: (X86 only)
 +
 +Enables (1), disables (0) or permanently disables (-1) the modify_ldt 
 syscall.
 +Modifying the LDT (Local Descriptor Table) may be needed to run a 16-bit or
 +segmented code such as Dosemu or Wine. This is done via a system call which 
 is
 +not needed to run portable applications, and which can sometimes be abused to
 +exploit some weaknesses of the architecture, opening new vulnerabilities.
 +
 +This sysctl allows one to increase the system's security by disabling the
 +system call, or to restore compatibility with specific applications when it
 +was already disabled. When permanently disabled, it is not possible to change
 +the value anymore until the next system reboot.
 +
 +==
 +
  modules_disabled:

  A toggle value indicating if modules are allowed to be loaded
 diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
 index beabf30..88d10a0 100644
 --- a/arch/x86/Kconfig
 +++ b/arch/x86/Kconfig
 @@ -2069,6 +2069,23 @@ config MODIFY_LDT_SYSCALL
   surface.  Disabling it removes the modify_ldt(2) system call.

   Saying 'N' here may make sense for embedded or server kernels.
 + If really unsure, say 'Y', you'll be able to disable it at runtime.
 +
 +config DEFAULT_MODIFY_LDT_SYSCALL
 +   bool Allow userspace to modify the LDT by default
 +   depends on MODIFY_LDT_SYSCALL
 +   default y
 +   ---help---
 + Modifying the LDT (Local Descriptor Table) may be needed to run a
 + 16-bit or segmented code such as Dosemu or Wine. This is done via
 + a system call which is not needed to run portable applications,
 + and which can sometimes be abused to exploit some weaknesses of
 + the architecture, opening new vulnerabilities.
 +
 + For this reason this option allows one to enable or disable the
 + feature at runtime. It is recommended to say 'N' here to leave
 + the system protected, and to enable it at runtime only if needed
 + by setting the sys.kernel.modify_ldt sysctl.

  source kernel/livepatch/Kconfig

 diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
 index 2bcc052..420fc8f 100644
 --- a/arch/x86/kernel/ldt.c
 +++ b/arch/x86/kernel/ldt.c
 @@ -11,6 +11,7 @@
  #include linux/sched.h
  #include linux/string.h
  #include linux/mm.h
 +#include linux/ratelimit.h
  #include linux/smp.h
  #include linux/slab.h
  #include linux/vmalloc.h
 @@ -21,6 +22,11 @@
  #include asm/mmu_context.h
  #include asm/syscalls.h

 +#ifdef CONFIG_MODIFY_LDT_SYSCALL
 +int sysctl_modify_ldt __read_mostly =
 +   IS_ENABLED(CONFIG_DEFAULT_MODIFY_LDT_SYSCALL);
 +#endif
 +
  /* context.lock is held for us, so we don't need any locking. */
  static void flush_ldt(void *current_mm)
  {
 @@ -276,6 +282,15 @@ asmlinkage int sys_modify_ldt(int func, void __user *ptr,
  {
 int ret = -ENOSYS;

 +   if (sysctl_modify_ldt = 0) {
 +   printk_ratelimited(KERN_INFO

pr_info_ratelimited? *shrug*

 +   Denied a call to modify_ldt() from %s[%d] (uid: %d).
 +Adjust sysctl if this was not an exploit 
 attempt.\n,
 +   current-comm, task_pid_nr(current),
 +   from_kuid_munged(current_user_ns(), current_uid()));
 +   return ret;
 +   }
 +
 switch (func) {
 case 0:
 ret = read_ldt(ptr, bytecount);
 

Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Borislav Petkov
On Mon, Aug 03, 2015 at 11:45:24AM -0700, Andy Lutomirski wrote:
 P.P.P.S.  Who thought that IRET faults unmasking NMIs made any sense
 whatsoever when NMIs run on an IST stack?  Seriously, people?

What happened with asking Intel for a sane IRET-NG?

Should be relatively easy - take the current IRET microcode, get rid
of the nasty crap, allocate a new opcode and done. Validation should
actually have *less* to do and can reuse all current test cases.

:-)

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Andy Lutomirski
On Mon, Aug 3, 2015 at 11:23 AM, Willy Tarreau w...@1wt.eu wrote:
 For distros who prefer not to take the risk of completely disabling the
 modify_ldt syscall using CONFIG_MODIFY_LDT_SYSCALL, this patch adds a
 sysctl to enable, temporarily disable, or permanently disable it at
 runtime, and proposes to temporarily disable it by default. This can be
 a safe alternative. A message is logged if an attempt was stopped so that
 it's easy to spot if/when it is needed.

 Cc: Andy Lutomirski l...@kernel.org
 Cc: Kees Cook keesc...@chromium.org
 Signed-off-by: Willy Tarreau w...@1wt.eu
 ---
  Documentation/sysctl/kernel.txt | 16 
  arch/x86/Kconfig| 17 +
  arch/x86/kernel/ldt.c   | 15 +++
  kernel/sysctl.c | 14 ++
  4 files changed, 62 insertions(+)

 diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
 index 6fccb69..55648b9 100644
 --- a/Documentation/sysctl/kernel.txt
 +++ b/Documentation/sysctl/kernel.txt
 @@ -41,6 +41,7 @@ show up in /proc/sys/kernel:
  - kptr_restrict
  - kstack_depth_to_print   [ X86 only ]
  - l2cr[ PPC only ]
 +- modify_ldt  [ X86 only ]
  - modprobe== Documentation/debugging-modules.txt
  - modules_disabled
  - msg_next_id[ sysv ipc ]
 @@ -391,6 +392,21 @@ This flag controls the L2 cache of G3 processor boards. 
 If

  ==

 +modify_ldt: (X86 only)
 +
 +Enables (1), disables (0) or permanently disables (-1) the modify_ldt 
 syscall.
 +Modifying the LDT (Local Descriptor Table) may be needed to run a 16-bit or
 +segmented code such as Dosemu or Wine. This is done via a system call which 
 is
 +not needed to run portable applications, and which can sometimes be abused to
 +exploit some weaknesses of the architecture, opening new vulnerabilities.
 +

I'm not entirely convinced that the lock bit should work this way.  At
some point, we might want a setting for 32-bit only or even 32-bit,
present, not non-conforming only (like we do unconditionally for
set_thread_area).  When we do that, having -1 act like 0 might be
confusing.

I'd actually favor rigging it up to support enumerated values and/or
the word locked somewhere in the text.  So we could have 0, 1
locked, 1 or even enabled enabled locked, disabled, disabled
locked, safe 32-bit, safe 32-bit locked, etc.

I'll add an explicit 16-bit check to my infinite todo list for the asm
part.  Now that the synchronous modify_ldt code is merged, it won't be
racy, and it would make a 32-bit only mode actually be useful (except
maybe on AMD -- someone needs to test just how badly broken IRET is on
AMD systems -- I know that AMD has IRET-to-16-bit differently broken
from Intel, and that causes test-cast failures.  Grump.)

P.S. Hey CPU vendors: please consider stopping your utter suckage when
it comes to critical system instructions.  Intel and AMD both
terminally screwed up IRET in multiple ways that clearly took actual
effort.  Intel screwed up SYSRET pretty badly (AFAIK every single
64-bit OS has had at least one root hole as a result), and AMD screwed
SYSRET up differently (userspace crash bug that requires a performance
hit to mitigate because no one at AMD realized that one might preempt
a process during a syscall).

P.P.S. You know what would be *way* better than allowing IRET to
fault?  Just allow IRET to continue executing the next instruction on
failure (I'm talking about #GP, #NP, and #SS here, not page faults).

P.P.P.S.  Who thought that IRET faults unmasking NMIs made any sense
whatsoever when NMIs run on an IST stack?  Seriously, people?

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Willy Tarreau
On Mon, Aug 03, 2015 at 12:06:12PM -0700, Andy Lutomirski wrote:
 On Mon, Aug 3, 2015 at 12:01 PM, Willy Tarreau w...@1wt.eu wrote:
(...)
  I feel like it's probably part of a larger project then. Do you think
  we should step back and only support 0/1 for now ? I also have the
  patch available.
 
 Sounds good to me.

OK I'll send the other one instead once I unpack my PC.

  Well the good thing is that SYSRET reused the LOADALL opcode so at
  least this one cannot be screwed on 64-bit :-) It would have helped us
  to emulate IRET though.
 
 You sure?  I'm reasonably confident that Athlon 64 and newer support
 SYSRET in legacy and long mode.  Of course, I think that SYSCALL is
 totally worthless in legacy mode (SYSCALL_MASK isn't available, so I
 suspect that the lack of sensible TF handling would be a
 show-stopper).

I meant loadall cannot be screwed since it was replaced.

  P.P.S. You know what would be *way* better than allowing IRET to
  fault?  Just allow IRET to continue executing the next instruction on
  failure (I'm talking about #GP, #NP, and #SS here, not page faults).
 
  P.P.P.S.  Who thought that IRET faults unmasking NMIs made any sense
  whatsoever when NMIs run on an IST stack?  Seriously, people?
 
  A dedicated flag don't clear NMI yet would have been nice in EFLAGS
  so that the software stack could set it in fault handlers. It would be
  one-shot and always cleared by IRET. That would have been very handy.
 
 
 How about a dedicated NMI masked flag in EFLAGS?  That would be
 straightforward and dead simple to handle.

Sounds like an oxymoron. But such a flag should be atomically manipulated
so that you don't re-arm queued NMIs before calling iret. With two flags,
a read-only one for NMI masked and a modifiable one keep NMI masked,
you can provide an atomic behaviour where you have this latch executed
on iret :

 NMI_MASKED = KEEP_NMI_MASKED;
 KEEP_NMI_MASKED = 0;

But anyway we're discussing in the void, this CPU doesn't exist so unless
intel/AMD designers want to improve their design (and start by talking
together to reach the exact same behavior), we'll never see anything like
this :-/

Willy


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Willy Tarreau
On Mon, Aug 03, 2015 at 11:45:24AM -0700, Andy Lutomirski wrote:
 I'm not entirely convinced that the lock bit should work this way.  At
 some point, we might want a setting for 32-bit only or even 32-bit,
 present, not non-conforming only (like we do unconditionally for
 set_thread_area).  When we do that, having -1 act like 0 might be
 confusing.
 
 I'd actually favor rigging it up to support enumerated values and/or
 the word locked somewhere in the text.  So we could have 0, 1
 locked, 1 or even enabled enabled locked, disabled, disabled
 locked, safe 32-bit, safe 32-bit locked, etc.

Got it, that makes sense indeed. I asked myself whether we'd use more
than these 3 values, and estimated that locked on didn't make much
sense here, and I thought that nobody would like to manipulate such
things using bitmaps. But with words like this it can indeed make
sense.

I feel like it's probably part of a larger project then. Do you think
we should step back and only support 0/1 for now ? I also have the
patch available.

 I'll add an explicit 16-bit check to my infinite todo list for the asm
 part.  Now that the synchronous modify_ldt code is merged, it won't be
 racy, and it would make a 32-bit only mode actually be useful (except
 maybe on AMD -- someone needs to test just how badly broken IRET is on
 AMD systems -- I know that AMD has IRET-to-16-bit differently broken
 from Intel, and that causes test-cast failures.  Grump.)
 
 P.S. Hey CPU vendors: please consider stopping your utter suckage when
 it comes to critical system instructions.  Intel and AMD both
 terminally screwed up IRET in multiple ways that clearly took actual
 effort.  Intel screwed up SYSRET pretty badly (AFAIK every single
 64-bit OS has had at least one root hole as a result), and AMD screwed
 SYSRET up differently (userspace crash bug that requires a performance
 hit to mitigate because no one at AMD realized that one might preempt
 a process during a syscall).

Well the good thing is that SYSRET reused the LOADALL opcode so at
least this one cannot be screwed on 64-bit :-) It would have helped us
to emulate IRET though.

 P.P.S. You know what would be *way* better than allowing IRET to
 fault?  Just allow IRET to continue executing the next instruction on
 failure (I'm talking about #GP, #NP, and #SS here, not page faults).
 
 P.P.P.S.  Who thought that IRET faults unmasking NMIs made any sense
 whatsoever when NMIs run on an IST stack?  Seriously, people?

A dedicated flag don't clear NMI yet would have been nice in EFLAGS
so that the software stack could set it in fault handlers. It would be
one-shot and always cleared by IRET. That would have been very handy.

Willy


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] x86/ldt: allow to disable modify_ldt at runtime

2015-08-03 Thread Andy Lutomirski
On Mon, Aug 3, 2015 at 12:01 PM, Willy Tarreau w...@1wt.eu wrote:
 On Mon, Aug 03, 2015 at 11:45:24AM -0700, Andy Lutomirski wrote:
 I'm not entirely convinced that the lock bit should work this way.  At
 some point, we might want a setting for 32-bit only or even 32-bit,
 present, not non-conforming only (like we do unconditionally for
 set_thread_area).  When we do that, having -1 act like 0 might be
 confusing.

 I'd actually favor rigging it up to support enumerated values and/or
 the word locked somewhere in the text.  So we could have 0, 1
 locked, 1 or even enabled enabled locked, disabled, disabled
 locked, safe 32-bit, safe 32-bit locked, etc.

 Got it, that makes sense indeed. I asked myself whether we'd use more
 than these 3 values, and estimated that locked on didn't make much
 sense here, and I thought that nobody would like to manipulate such
 things using bitmaps. But with words like this it can indeed make
 sense.

 I feel like it's probably part of a larger project then. Do you think
 we should step back and only support 0/1 for now ? I also have the
 patch available.

Sounds good to me.


 I'll add an explicit 16-bit check to my infinite todo list for the asm
 part.  Now that the synchronous modify_ldt code is merged, it won't be
 racy, and it would make a 32-bit only mode actually be useful (except
 maybe on AMD -- someone needs to test just how badly broken IRET is on
 AMD systems -- I know that AMD has IRET-to-16-bit differently broken
 from Intel, and that causes test-cast failures.  Grump.)

 P.S. Hey CPU vendors: please consider stopping your utter suckage when
 it comes to critical system instructions.  Intel and AMD both
 terminally screwed up IRET in multiple ways that clearly took actual
 effort.  Intel screwed up SYSRET pretty badly (AFAIK every single
 64-bit OS has had at least one root hole as a result), and AMD screwed
 SYSRET up differently (userspace crash bug that requires a performance
 hit to mitigate because no one at AMD realized that one might preempt
 a process during a syscall).

 Well the good thing is that SYSRET reused the LOADALL opcode so at
 least this one cannot be screwed on 64-bit :-) It would have helped us
 to emulate IRET though.

You sure?  I'm reasonably confident that Athlon 64 and newer support
SYSRET in legacy and long mode.  Of course, I think that SYSCALL is
totally worthless in legacy mode (SYSCALL_MASK isn't available, so I
suspect that the lack of sensible TF handling would be a
show-stopper).


 P.P.S. You know what would be *way* better than allowing IRET to
 fault?  Just allow IRET to continue executing the next instruction on
 failure (I'm talking about #GP, #NP, and #SS here, not page faults).

 P.P.P.S.  Who thought that IRET faults unmasking NMIs made any sense
 whatsoever when NMIs run on an IST stack?  Seriously, people?

 A dedicated flag don't clear NMI yet would have been nice in EFLAGS
 so that the software stack could set it in fault handlers. It would be
 one-shot and always cleared by IRET. That would have been very handy.


How about a dedicated NMI masked flag in EFLAGS?  That would be
straightforward and dead simple to handle.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel