On Mon, Jan 25, 2016 at 01:34:13PM -0800, Andy Lutomirski wrote:
> Signals are always delivered to 64-bit tasks with CS set to a long
> mode segment.  In long mode, SS doesn't matter as long as it's a
> present writable segment.
> 
> If SS starts out invalid (this can happen if the signal was caused
> by an IRET fault or was delivered on the way out of set_thread_area
> or modify_ldt), then IRET to the signal handler can fail, eventually
> killing the task.
> 
> The straightforward fix would be to simply reset SS when delivering
> a signal.  That breaks DOSEMU, though: 64-bit builds of DOSEMU rely
> on SS being set to the faulting SS when signals are delivered.
> 
> As a compromise, this patch leaves SS alone so long as it's valid.
> 
> The net effect should be that the behavior of successfully delivered
> signals is unchanged.  Some signals that would previously have
> failed to be delivered will now be delivered successfully.
> 
> This has no effect for x32 or 32-bit tasks: their signal handlers
> were already called with SS == __USER_DS.
> 
> (On Xen, there's a slight hole: if a task sets SS to a writable
>  *kernel* data segment, then we will fail to identify it as invalid
>  and we'll still kill the task.  If anyone cares, this could be fixed
>  with a new paravirt hook.)
> 
> Signed-off-by: Andy Lutomirski <l...@kernel.org>
> ---
>  arch/x86/include/asm/desc_defs.h | 23 ++++++++++++++++++
>  arch/x86/kernel/signal.c         | 51 ++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 72 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/desc_defs.h b/arch/x86/include/asm/desc_defs.h
> index 278441f39856..00971705a16d 100644
> --- a/arch/x86/include/asm/desc_defs.h
> +++ b/arch/x86/include/asm/desc_defs.h
> @@ -98,4 +98,27 @@ struct desc_ptr {
>  
>  #endif /* !__ASSEMBLY__ */
>  
> +/* Access rights as returned by LAR */
> +#define AR_TYPE_RODATA               (0 * (1 << 9))
> +#define AR_TYPE_RWDATA               (1 * (1 << 9))
> +#define AR_TYPE_RODATA_EXPDOWN       (2 * (1 << 9))
> +#define AR_TYPE_RWDATA_EXPDOWN       (3 * (1 << 9))
> +#define AR_TYPE_XOCODE               (4 * (1 << 9))
> +#define AR_TYPE_XRCODE               (5 * (1 << 9))
> +#define AR_TYPE_XOCODE_CONF  (6 * (1 << 9))
> +#define AR_TYPE_XRCODE_CONF  (7 * (1 << 9))
> +#define AR_TYPE_MASK         (7 * (1 << 9))
> +
> +#define AR_DPL0                      (0 * (1 << 13))
> +#define AR_DPL3                      (3 * (1 << 13))
> +#define AR_DPL_MASK          (3 * (1 << 13))
> +
> +#define AR_A                 (1 << 8)        /* A means "accessed" */
> +#define AR_S                 (1 << 12)       /* S means "not system" */

Ah, with "not system" you want to say that S=0b makes it a system
descriptor and S=1b a code or data (user) descriptor. I think the SDM
calls it, more descriptively, the "S (descriptor type) flag", while the
APM simply calls it the S field or S bit.

I like "S (descriptor type) flag" more than "not system". :)

> +#define AR_P                 (1 << 15)       /* P means "present" */
> +#define AR_AVL                       (1 << 20)       /* AVL does nothing */

AVL = "AVaiLable to software" in the manuals, not "does nothing".

> +#define AR_L                 (1 << 21)       /* L means "long mode" */
> +#define AR_DB                        (1 << 22)       /* D or B, depending on type */
> +#define AR_G                 (1 << 23)       /* G means "limit in pages" */

Please use the names from the processor manuals. G is the Granularity
bit. "limit in pages" is only clear to the people who have already read
the Granularity bit description. :-)
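
A rough userspace sketch of how the same constants could read with
manual-style comments, with the validity test from force_valid_ss()
pulled out as a plain function so it can be checked in isolation. The
name ss_ar_is_valid() is made up for illustration, the comment wording
is only a suggestion, and only the type values the check needs are
repeated here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Access rights as returned by LAR; comments follow the manuals' names. */
#define AR_TYPE_RWDATA          (1u << 9)   /* data, read/write */
#define AR_TYPE_RWDATA_EXPDOWN  (3u << 9)   /* data, read/write, expand-down */
#define AR_TYPE_MASK            (7u << 9)

#define AR_S         (1u << 12)  /* S (descriptor type) flag: 0 = system, 1 = code/data */
#define AR_DPL3      (3u << 13)  /* descriptor privilege level 3 */
#define AR_DPL_MASK  (3u << 13)
#define AR_P         (1u << 15)  /* P: segment present */
#define AR_AVL       (1u << 20)  /* AVL: available to software */
#define AR_L         (1u << 21)  /* L: 64-bit code segment */
#define AR_DB        (1u << 22)  /* D/B: default operation size / big */
#define AR_G         (1u << 23)  /* G: granularity */

/*
 * Hypothetical helper: true if the access rights describe a present,
 * DPL 3, read/write data segment (normal or expand-down), which is the
 * condition the patch uses before keeping the old SS.  An 'ar' of 0
 * (what the patch stores when LAR fails) is reported as invalid.
 */
static bool ss_ar_is_valid(uint32_t ar)
{
	ar &= AR_DPL_MASK | AR_S | AR_P | AR_TYPE_MASK;

	return ar == (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA) ||
	       ar == (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA_EXPDOWN);
}
```

The accessed bit (bit 8) is outside AR_TYPE_MASK, so a segment passes
whether or not it has been accessed.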

>  #endif /* _ASM_X86_DESC_DEFS_H */
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index cb6282c3638f..bb3e4208d90d 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -61,6 +61,35 @@
>       regs->seg = GET_SEG(seg) | 3;                   \
>  } while (0)
>  
> +#ifdef CONFIG_X86_64

You already have an

#else /* !CONFIG_X86_32 */

block above the 64-bit version of __setup_rt_frame(). Just put
force_valid_ss() there without that additional ifdef. That file's
ifdeffery is beyond any readability anyway.

> +/*
> + * If regs->ss will cause an IRET fault, change it.  Otherwise leave it
> + * alone.  Using this generally makes no sense unless
> + * user_64bit_mode(regs) would return true.
> + */
> +static void force_valid_ss(struct pt_regs *regs)
> +{
> +     u32 ar;
> +     asm volatile ("lar %[old_ss], %[ar]\n\t"
> +                   "jz 1f\n\t"               /* If invalid: */
> +                   "xorl %[ar], %[ar]\n\t"   /* set ar = 0 */
> +                   "1:"
> +                   : [ar] "=r" (ar)
> +                   : [old_ss] "rm" ((u16)regs->ss));
> +
> +     /*
> +      * For a valid 64-bit user context, we need DPL 3, type
> +      * read-write data or read-write exp-down data, and S and P
> +      * set.  We can't use VERW because VERW doesn't check the
> +      * P bit.
> +      */
> +     ar &= AR_DPL_MASK | AR_S | AR_P | AR_TYPE_MASK;
> +     if (ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA) &&
> +         ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA_EXPDOWN))
> +             regs->ss = __USER_DS;
> +}
> +#endif
> +
>  int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
>  {
>       unsigned long buf_val;
> @@ -459,10 +488,28 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
>  
>       regs->sp = (unsigned long)frame;
>  
> -     /* Set up the CS register to run signal handlers in 64-bit mode,
> -        even if the handler happens to be interrupting 32-bit code. */
> +     /*
> +      * Set up the CS and SS registers to run signal handlers in
> +      * 64-bit mode, even if the handler happens to be interrupting
> +      * 32-bit or 16-bit code.
> +      *
> +      * SS is subtle.  In 64-bit mode, we don't need any particular
> +      * SS descriptor, but we do need SS to be valid.  It's possible
> +      * that the old SS is entirely bogus -- this can happen if the
> +      * signal we're trying to deliver is #GP or #SS caused by a bad
> +      * SS value.  We also have a compatibility issue here: DOSEMU
> +      * relies on the contents of the SS register indicating the
> +      * SS value at the time of the signal, even though that code in
> +      * DOSEMU predates sigreturn's ability to restore SS.  (DOSEMU
> +      * avoids relying on sigreturn to restore SS; instead it uses
> +      * a trampoline.)  So we do our best: if the old SS was valid,
> +      * we keep it.  Otherwise we replace it.
> +      */
>       regs->cs = __USER_CS;
>  
> +     if (unlikely(regs->ss != __USER_DS))

So this is the fast path AFAICT, also judging from a gdb breakpoint I
added here.

I guess we can't do the opt-in behavior and patch it out when users
don't want to run dosemu.

Or maybe we could add a CONFIG_CHECK_OLD_SS which is default y and
people can disable it... so an opt-out behavior :)
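
A minimal userspace sketch of what such an opt-out could look like.
Everything here is an assumption: CONFIG_CHECK_OLD_SS is only the name
floated above, USER_DS_SEL stands in for __USER_DS, pick_handler_ss()
is a made-up helper, and old_ss_valid stands in for the LAR-based test
in force_valid_ss(). It also assumes that disabling the option means
"always reset SS", i.e. the straightforward fix from the commit
message:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical switch; defaults to enabled, so turning it off is an opt-out. */
#ifndef CONFIG_CHECK_OLD_SS
#define CONFIG_CHECK_OLD_SS 1
#endif

#define USER_DS_SEL 0x2bu	/* stand-in for __USER_DS on x86-64 */

/*
 * Pick the SS value the signal handler should run with.
 * 'old_ss_valid' is supplied by the caller and stands in for the
 * LAR-based validity check.
 */
static uint16_t pick_handler_ss(uint16_t old_ss, bool old_ss_valid)
{
	/* Common case: SS is already the default user data segment. */
	if (old_ss == USER_DS_SEL)
		return old_ss;

#if CONFIG_CHECK_OLD_SS
	/* DOSEMU-friendly: keep a valid SS, replace only a bogus one. */
	return old_ss_valid ? old_ss : USER_DS_SEL;
#else
	/* Opted out: always reset, no validity check needed. */
	(void)old_ss_valid;
	return USER_DS_SEL;
#endif
}
```

Building with -DCONFIG_CHECK_OLD_SS=0 selects the unconditional reset.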

Hmmm...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.