Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-07-28 Thread Borislav Petkov
On Thu, Jul 27, 2017 at 07:04:52PM -0700, Ricardo Neri wrote:
> However using the union could be less readable than having two almost
> identical functions.

So having some small duplication for the sake of clarity and readability
is much better, if you ask me. And it's not like you're duplicating a
lot of code - it is only a handful of functions.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 24/26] x86: Enable User-Mode Instruction Prevention

2017-07-27 Thread Borislav Petkov
On Tue, Jul 25, 2017 at 05:44:08PM -0700, Ricardo Neri wrote:
> On Fri, 2017-06-09 at 18:10 +0200, Borislav Petkov wrote:
> > On Fri, May 05, 2017 at 11:17:22AM -0700, Ricardo Neri wrote:
> > > User_mode Instruction Prevention (UMIP) is enabled by setting/clearing a
> > > bit in %cr4.
> > > 
> > > It makes sense to enable UMIP at some point while booting, before user
> > > spaces come up. Like SMAP and SMEP, is not critical to have it enabled
> > > very early during boot. This is because UMIP is relevant only when there 
> > > is
> > > a userspace to be protected from. Given the similarities in relevance, it
> > > makes sense to enable UMIP along with SMAP and SMEP.
> > > 
> > > UMIP is enabled by default. It can be disabled by adding clearcpuid=514
> > > to the kernel parameters.

...

> So would this become a y when more machines have UMIP?

I guess. Stuff which proves reliable and widespread gets automatically
enabled with time, in most cases. IMHO, of course.

> Why would static_cpu_has() reply wrong if alternatives are not in place?
> Because it uses the boot CPU data? When it calls _static_cpu_has() it
> would do something equivalent to

Nevermind - I forgot that static_cpu_has() now drops to dynamic check
before alternatives application.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-07-27 Thread Borislav Petkov
On Tue, Jul 25, 2017 at 04:48:13PM -0700, Ricardo Neri wrote:
> I meant to say the 4 most significant bytes. In this case, the
> 64-address 0x1234 would lie in the kernel memory while
> 0x1234 would correctly be in the user space memory.

That explanation is better.

> Yes, perhaps the check above is not needed. I included that check as
> part of my argument validation. In a 64-bit kernel, this function could
> be called with val with non-zero most significant bytes.

So say that in the comment so that it is obvious *why*.

> I have looked into this closely and as far as I can see, the 4 least
> significant bytes will wrap around when using 64-bit signed numbers as
> they would when using 32-bit signed numbers. For instance, for two
> positive numbers we have:
> 
> 7fff: + 7000: = efff:.
> 
> The addition above overflows.

Yes, MSB changes.

> When sign-extended to 64-bit numbers we would have:
> 
> ::7fff: + ::7000: = ::efff:.
> 
> The addition above does not overflow. However, the 4 least significant
> bytes overflow as we expect.

No they don't - you are simply using 64-bit regs:

   0x46b8 <+8>: movq   $0x7fff,-0x8(%rbp)
   0x46c0 <+16>:movq   $0x7000,-0x10(%rbp)
   0x46c8 <+24>:mov-0x8(%rbp),%rdx
   0x46cc <+28>:mov-0x10(%rbp),%rax
=> 0x46d0 <+32>:add%rdx,%rax

rax0xefff   4026531839
rbx0x0  0
rcx0x0  0
rdx0x7fff   2147483647

...

eflags 0x206[ PF IF ]

(OF flag is not set).

> We can clamp the 4 most significant bytes.
> 
> For a two's complement negative numbers we can have:
> 
> : + 8000: = 7fff: with a carry flag.
> 
> The addition above overflows.

Yes.

> When sign-extending to 64-bit numbers we would have:
> 
> ::: + ::8000: = ::7fff: with a
> carry flag.
> 
> The addition above does not overflow. However, the 4 least significant
> bytes overflew and wrapped around as they would when using 32-bit signed
> numbers.

Right. Ok.

And come to think of it now, I'm wondering, whether it would be
better/easier/simpler/more straight-forward, to do the 32-bit operations
with 32-bit types and separate 32-bit functions and have the hardware do
that for you.

This way you can save yourself all that ugly and possibly error-prone
casting back and forth and have the code much more readable too.

Hmmm.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector

2017-06-19 Thread Borislav Petkov
On Thu, Jun 15, 2017 at 11:37:51AM -0700, Ricardo Neri wrote:
> Wouldn't this be ending up mixing the actual segment register and
> segment register overrides? I plan to have a function that parses the
> segment override prefixes and returns SEG_REG_CS/DS/ES/FS/GS or
> SEG_REG_IGNORE for long mode or SEG_REG_DEFAULT when the default segment
> register needs to be used. A separate function will determine what such
> default segment register is. Does this make sense?

Yes.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 24/26] x86: Enable User-Mode Instruction Prevention

2017-06-09 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:22AM -0700, Ricardo Neri wrote:
> User_mode Instruction Prevention (UMIP) is enabled by setting/clearing a
> bit in %cr4.
> 
> It makes sense to enable UMIP at some point while booting, before user
> spaces come up. Like SMAP and SMEP, is not critical to have it enabled
> very early during boot. This is because UMIP is relevant only when there is
> a userspace to be protected from. Given the similarities in relevance, it
> makes sense to enable UMIP along with SMAP and SMEP.
> 
> UMIP is enabled by default. It can be disabled by adding clearcpuid=514
> to the kernel parameters.
> 
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: H. Peter Anvin <h...@zytor.com>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Brian Gerst <brge...@gmail.com>
> Cc: Chen Yucong <sla...@gmail.com>
> Cc: Chris Metcalf <cmetc...@mellanox.com>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Fenghua Yu <fenghua...@intel.com>
> Cc: Huang Rui <ray.hu...@amd.com>
> Cc: Jiri Slaby <jsl...@suse.cz>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: Shuah Khan <sh...@kernel.org>
> Cc: Vlastimil Babka <vba...@suse.cz>
> Cc: Tony Luck <tony.l...@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Liang Z. Li <liang.z...@intel.com>
> Cc: Alexandre Julliard <julli...@winehq.org>
> Cc: Stas Sergeev <s...@list.ru>
> Cc: x...@kernel.org
> Cc: linux-msdos@vger.kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/Kconfig | 10 ++
>  arch/x86/kernel/cpu/common.c | 16 +++-
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 702002b..1b1bbeb 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1745,6 +1745,16 @@ config X86_SMAP
>  
> If unsure, say Y.
>  
> +config X86_INTEL_UMIP
> + def_bool y

That's a bit too much. It makes sense on distro kernels but how many
machines out there actually have UMIP?

> + depends on CPU_SUP_INTEL
> + prompt "Intel User Mode Instruction Prevention" if EXPERT
> + ---help---
> +   The User Mode Instruction Prevention (UMIP) is a security
> +   feature in newer Intel processors. If enabled, a general
> +   protection fault is issued if the instructions SGDT, SLDT,
> +   SIDT, SMSW and STR are executed in user mode.
> +
>  config X86_INTEL_MPX
>   prompt "Intel MPX (Memory Protection Extensions)"
>   def_bool n
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 8ee3211..66ebded 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -311,6 +311,19 @@ static __always_inline void setup_smap(struct 
> cpuinfo_x86 *c)
>   }
>  }
>  
> +static __always_inline void setup_umip(struct cpuinfo_x86 *c)
> +{
> + if (cpu_feature_enabled(X86_FEATURE_UMIP) &&
> + cpu_has(c, X86_FEATURE_UMIP))

Hmm, so if UMIP is not build-time disabled, the cpu_feature_enabled()
will call static_cpu_has().

Looks like you want to call cpu_has() too because alternatives haven't
run yet and static_cpu_has() will reply wrong. Please state that in a
comment.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 23/26] x86/traps: Fixup general protection faults caused by UMIP

2017-06-09 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:21AM -0700, Ricardo Neri wrote:
> If the User-Mode Instruction Prevention CPU feature is available and
> enabled, a general protection fault will be issued if the instructions
> sgdt, sldt, sidt, str or smsw are executed from user-mode context
> (CPL > 0). If the fault was caused by any of the instructions protected
> by UMIP, fixup_umip_exception will emulate dummy results for these

Please end function names with parentheses.

> instructions. If emulation is successful, the result is passed to the
> user space program and no SIGSEGV signal is emitted.
> 
> Please note that fixup_umip_exception also caters for the case when
> the fault originated while running in virtual-8086 mode.
> 
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: H. Peter Anvin <h...@zytor.com>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Brian Gerst <brge...@gmail.com>
> Cc: Chen Yucong <sla...@gmail.com>
> Cc: Chris Metcalf <cmetc...@mellanox.com>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Fenghua Yu <fenghua...@intel.com>
> Cc: Huang Rui <ray.hu...@amd.com>
> Cc: Jiri Slaby <jsl...@suse.cz>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: Shuah Khan <sh...@kernel.org>
> Cc: Vlastimil Babka <vba...@suse.cz>
> Cc: Tony Luck <tony.l...@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Liang Z. Li <liang.z...@intel.com>
> Cc: Alexandre Julliard <julli...@winehq.org>
> Cc: Stas Sergeev <s...@list.ru>
> Cc: x...@kernel.org
> Cc: linux-msdos@vger.kernel.org
> Reviewed-by: Andy Lutomirski <l...@kernel.org>
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/kernel/traps.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 3995d3a..cec548d 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -65,6 +65,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #ifdef CONFIG_X86_64
>  #include 
> @@ -526,6 +527,9 @@ do_general_protection(struct pt_regs *regs, long 
> error_code)
>   RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
>   cond_local_irq_enable(regs);
>  

Almost definitely:

if (static_cpu_has(X86_FEATURE_UMIP)) {
if (...

> + if (user_mode(regs) && fixup_umip_exception(regs))
> + return;

We don't want to punish !UMIP machines.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 22/26] x86/umip: Force a page fault when unable to copy emulated result to user

2017-06-09 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:20AM -0700, Ricardo Neri wrote:
> fixup_umip_exception() will be called from do_general_protection. If the
  ^
  |
Please end function names with parentheses.---+

> former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
> However, when emulation is successful but the emulated result cannot be
> copied to user space memory, it is more accurate to issue a SIGSEGV with
> SEGV_MAPERR with the offending address.
> A new function is inspired in

That reads funny.

> force_sig_info_fault is introduced to model the page fault.
> 
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: H. Peter Anvin <h...@zytor.com>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Brian Gerst <brge...@gmail.com>
> Cc: Chen Yucong <sla...@gmail.com>
> Cc: Chris Metcalf <cmetc...@mellanox.com>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Fenghua Yu <fenghua...@intel.com>
> Cc: Huang Rui <ray.hu...@amd.com>
> Cc: Jiri Slaby <jsl...@suse.cz>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: Shuah Khan <sh...@kernel.org>
> Cc: Vlastimil Babka <vba...@suse.cz>
> Cc: Tony Luck <tony.l...@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Liang Z. Li <liang.z...@intel.com>
> Cc: Alexandre Julliard <julli...@winehq.org>
> Cc: Stas Sergeev <s...@list.ru>
> Cc: x...@kernel.org
> Cc: linux-msdos@vger.kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/kernel/umip.c | 45 +++--
>  1 file changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> index c7c5795..ff7366a 100644
> --- a/arch/x86/kernel/umip.c
> +++ b/arch/x86/kernel/umip.c
> @@ -148,6 +148,41 @@ static int __emulate_umip_insn(struct insn *insn, enum 
> umip_insn umip_inst,
>  }
>  
>  /**
> + * __force_sig_info_umip_fault() - Force a SIGSEGV with SEGV_MAPERR
> + * @address: Address that caused the signal
> + * @regs:Register set containing the instruction pointer
> + *
> + * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This function 
> is
> + * intended to be used to provide a segmentation fault when the result of the
> + * UMIP emulation could not be copied to the user space memory.
> + *
> + * Return: none
> + */
> +static void __force_sig_info_umip_fault(void __user *address,
> + struct pt_regs *regs)
> +{
> + siginfo_t info;
> + struct task_struct *tsk = current;
> +
> + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)) {

Save an indentation level:

if (!(show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)))
return;

printk...



> + printk_ratelimited("%s[%d] umip emulation segfault ip:%lx 
> sp:%lx error:%x in %lx\n",
> +tsk->comm, task_pid_nr(tsk), regs->ip,
> +regs->sp, X86_PF_USER | X86_PF_WRITE,
> +regs->ip);
> + }
> +
> + tsk->thread.cr2 = (unsigned long)address;
> + tsk->thread.error_code  = X86_PF_USER | X86_PF_WRITE;
> + tsk->thread.trap_nr = X86_TRAP_PF;
> +
> + info.si_signo   = SIGSEGV;
> + info.si_errno   = 0;
> + info.si_code= SEGV_MAPERR;
> + info.si_addr= address;
> + force_sig_info(SIGSEGV, , tsk);
> +}
> +
> +/**
>   * fixup_umip_exception() - Fixup #GP faults caused by UMIP
>   * @regs:Registers as saved when entering the #GP trap
>   *
> @@ -235,8 +270,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
>   if ((unsigned long)uaddr == -1L)
>   return false;
>   nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
> - if (nr_copied  > 0)
> - return false;
> + if (nr_copied  > 0) {
> + /*
> +  * If copy fails, send a signal and tell caller that
> +  * fault was fixed up

Pls end sentences in the comments with a fullstop.

> +  */
> + __for

Re: [PATCH v7 21/26] x86: Add emulation code for UMIP instructions

2017-06-08 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:19AM -0700, Ricardo Neri wrote:
> The feature User-Mode Instruction Prevention present in recent Intel
> processor prevents a group of instructions from being executed with
> CPL > 0. Otherwise, a general protection fault is issued.

This is one of the best opening paragraphs of a commit message I've
read this year! This is how you open: short, succinct, to the point, no
marketing bullshit. Good!

> Rather than relaying this fault to the user space (in the form of a SIGSEGV
> signal), the instructions protected by UMIP can be emulated to provide
> dummy results. This allows to conserve the current kernel behavior and not
> reveal the system resources that UMIP intends to protect (the global
> descriptor and interrupt descriptor tables, the segment selectors of the
> local descriptor table and the task state and the machine status word).
> 
> This emulation is needed because certain applications (e.g., WineHQ and
> DOSEMU2) rely on this subset of instructions to function.
> 
> The instructions protected by UMIP can be split in two groups. Those who

s/who/which/

> return a kernel memory address (sgdt and sidt) and those who return a

ditto.

> value (sldt, str and smsw).
>
> For the instructions that return a kernel memory address, applications
> such as WineHQ rely on the result being located in the kernel memory space.
> The result is emulated as a hard-coded value that, lies close to the top
> of the kernel memory. The limit for the GDT and the IDT are set to zero.

Nice.

> Given that sldt and str are not used in common in programs supported by

You wanna say "in common programs" here? Or "not commonly used in programs" ?

> WineHQ and DOSEMU2, they are not emulated.
> 
> The instruction smsw is emulated to return the value that the register CR0
> has at boot time as set in the head_32.
> 
> Care is taken to appropriately emulate the results when segmentation is
> used. This is, rather than relying on USER_DS and USER_CS, the function

"That is,... "

> insn_get_addr_ref() inspects the segment descriptor pointed by the
> registers in pt_regs. This ensures that we correctly obtain the segment
> base address and the address and operand sizes even if the user space
> application uses local descriptor table.

Btw, I could very well use all that nice explanation in umip.c too so
that the high-level behavior is documented.

> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: H. Peter Anvin <h...@zytor.com>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Brian Gerst <brge...@gmail.com>
> Cc: Chen Yucong <sla...@gmail.com>
> Cc: Chris Metcalf <cmetc...@mellanox.com>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Fenghua Yu <fenghua...@intel.com>
> Cc: Huang Rui <ray.hu...@amd.com>
> Cc: Jiri Slaby <jsl...@suse.cz>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: Shuah Khan <sh...@kernel.org>
> Cc: Vlastimil Babka <vba...@suse.cz>
> Cc: Tony Luck <tony.l...@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Liang Z. Li <liang.z...@intel.com>
> Cc: Alexandre Julliard <julli...@winehq.org>
> Cc: Stas Sergeev <s...@list.ru>
> Cc: x...@kernel.org
> Cc: linux-msdos@vger.kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/umip.h |  15 +++
>  arch/x86/kernel/Makefile|   1 +
>  arch/x86/kernel/umip.c  | 245 
> 
>  3 files changed, 261 insertions(+)
>  create mode 100644 arch/x86/include/asm/umip.h
>  create mode 100644 arch/x86/kernel/umip.c
> 
> diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
> new file mode 100644
> index 000..077b236
> --- /dev/null
> +++ b/arch/x86/include/asm/umip.h
> @@ -0,0 +1,15 @@
> +#ifndef _ASM_X86_UMIP_H
> +#define _ASM_X86_UMIP_H
> +
> +#include 
> +#include 
> +
> +#ifdef CONFIG_X86_INTEL_UMIP
> +bool fixup_umip_exception(struct pt_regs *regs);
> +#else
> +static inline bool fixup_umip_exception(struct pt_regs *regs)
> +{
> + return false;
> +}

Let's save some header lines:

static inline bool fixup_umip_exception(struct pt_regs *regs)   { return false; 
}

those trunks take too much space as it is.

> +#endif  /* CONFIG_X86_INTEL_UMIP */
> +#endif  /* _ASM_X86_UMIP_H */
> diff --git a/arch/x86/kernel/Makefile b/arch/x86

Re: [PATCH v7 20/26] x86/cpufeature: Add User-Mode Instruction Prevention definitions

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:18AM -0700, Ricardo Neri wrote:
> User-Mode Instruction Prevention is a security feature present in new
> Intel processors that, when set, prevents the execution of a subset of
> instructions if such instructions are executed in user mode (CPL > 0).
> Attempting to execute such instructions causes a general protection
> exception.
> 
> The subset of instructions comprises:
> 
>  * SGDT - Store Global Descriptor Table
>  * SIDT - Store Interrupt Descriptor Table
>  * SLDT - Store Local Descriptor Table
>  * SMSW - Store Machine Status Word
>  * STR  - Store Task Register
> 
> This feature is also added to the list of disabled-features to allow
> a cleaner handling of build-time configuration.
> 
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: H. Peter Anvin <h...@zytor.com>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Brian Gerst <brge...@gmail.com>
> Cc: Chen Yucong <sla...@gmail.com>
> Cc: Chris Metcalf <cmetc...@mellanox.com>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Fenghua Yu <fenghua...@intel.com>
> Cc: Huang Rui <ray.hu...@amd.com>
> Cc: Jiri Slaby <jsl...@suse.cz>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: Shuah Khan <sh...@kernel.org>
> Cc: Vlastimil Babka <vba...@suse.cz>
> Cc: Tony Luck <tony.l...@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Liang Z. Li <liang.z...@intel.com>
> Cc: Alexandre Julliard <julli...@winehq.org>
> Cc: Stas Sergeev <s...@list.ru>
> Cc: x...@kernel.org
> Cc: linux-msdos@vger.kernel.org
> 
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/cpufeatures.h  | 1 +
>  arch/x86/include/asm/disabled-features.h| 8 +++-
>  arch/x86/include/uapi/asm/processor-flags.h | 2 ++
>  3 files changed, 10 insertions(+), 1 deletion(-)

Reviewed-by: Borislav Petkov <b...@suse.de>

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
> Tasks running in virtual-8086 mode or in protected mode with code
> segment descriptors that specify 16-bit default address sizes via the
> D bit will use 16-bit addressing form encodings as described in the Intel
> 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
> 2.1.5. 16-bit addressing encodings differ in several ways from the
> 32-bit/64-bit addressing form encodings: ModRM.rm points to different
> registers and, in some cases, effective addresses are indicated by the
> addition of the value of two registers. Also, there is no support for SIB
> bytes. Thus, a separate function is needed to parse this form of
> addressing.
> 
> A couple of functions are introduced. get_reg_offset_16() obtains the
> offset from the base of pt_regs of the registers indicated by the ModRM
> byte of the address encoding. get_addr_ref_16() computes the linear
> address indicated by the instructions using the value of the registers
> given by ModRM as well as the base address of the segment.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 155 
> +++
>  1 file changed, 155 insertions(+)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 9822061..928a662 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>  }
>  
>  /**
> + * get_reg_offset_16 - Obtain offset of register indicated by instruction

Please end function names with parentheses.

> + * @insn:Instruction structure containing ModRM and SiB bytes

s/SiB/SIB/g

> + * @regs:Structure with register values as seen when entering kernel mode
> + * @offs1:   Offset of the first operand register
> + * @offs2:   Offset of the second opeand register, if applicable.
> + *
> + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM 
> byte
> + * within insn. This function is to be used with 16-bit address encodings. 
> The
> + * offs1 and offs2 will be written with the offset of the two registers
> + * indicated by the instruction. In cases where any of the registers is not
> + * referenced by the instruction, the value will be set to -EDOM.
> + *
> + * Return: 0 on success, -EINVAL on failure.
> + */
> +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
> +  int *offs1, int *offs2)
> +{
> + /* 16-bit addressing can use one or two registers */
> + static const int regoff1[] = {
> + offsetof(struct pt_regs, bx),
> + offsetof(struct pt_regs, bx),
> + offsetof(struct pt_regs, bp),
> + offsetof(struct pt_regs, bp),
> + offsetof(struct pt_regs, si),
> + offsetof(struct pt_regs, di),
> + offsetof(struct pt_regs, bp),
> + offsetof(struct pt_regs, bx),
> + };
> +
> + static const int regoff2[] = {
> + offsetof(struct pt_regs, si),
> + offsetof(struct pt_regs, di),
> + offsetof(struct pt_regs, si),
> + offsetof(struct pt_regs, di),
> + -EDOM,
> + -EDOM,
> + -EDOM,
> + -EDOM,
> + };

You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in
the SDM, right?

Please add a comment pointing to it here because it is not trivial to
map that code to the documentation.

> +
> + if (!offs1 || !offs2)
> + return -EINVAL;
> +
> + /* operand is a register, use the generic function */
> + if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> + *offs1 = insn_get_modrm_rm_off(insn, regs);
> + *offs2 = -EDOM;
> + return 0;
> + }
> +
> + *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
> +

Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote:
> @@ -697,18 +753,21 @@ void __user *insn_get_addr_ref(struct insn *insn, 
> struct pt_regs *regs)
>  {
>   unsigned long linear_addr, seg_base_addr, seg_limit;
>   long eff_addr, base, indx;
> - int addr_offset, base_offset, indx_offset;
> + int addr_offset, base_offset, indx_offset, addr_bytes;
>   insn_byte_t sib;
>  
>   insn_get_modrm(insn);
>   insn_get_sib(insn);
>   sib = insn->sib.value;
> + addr_bytes = insn->addr_bytes;
>  
>   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
>   addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
>   if (addr_offset < 0)
>   goto out_err;
> - eff_addr = regs_get_register(regs, addr_offset);
> + eff_addr = get_mem_offset(regs, addr_offset, addr_bytes);
> + if (eff_addr == -1L)
> + goto out_err;
>   seg_base_addr = insn_get_seg_base(regs, insn, addr_offset);
>   if (seg_base_addr == -1L)
>   goto out_err;

This code here is too dense, it needs spacing for better readability.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote:
> The 32-bit and 64-bit address encodings are identical. This means that we
> can use the same function in both cases. In order to reuse the function
> for 32-bit address encodings, we must sign-extend our 32-bit signed
> operands to 64-bit signed variables (only for 64-bit builds). To decide on
> whether sign extension is needed, we rely on the address size as given by
> the instruction structure.
> 
> Once the effective address has been computed, a special verification is
> needed for 32-bit processes. If running on a 64-bit kernel, such processes
> can address up to 4GB of memory. Hence, for instance, an effective
> address of 0x1234 would be misinterpreted as 0x1234 due to
> the sign extension mentioned above. For this reason, the 4 must be

Which 4?

> truncated to obtain the true effective address.
> 
> Lastly, before computing the linear address, we verify that the effective
> address is within the limits of the segment. The check is kept for long
> mode because in such a case the limit is set to -1L. This is the largest
> unsigned number possible. This is equivalent to a limit-less segment.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 99 
> ++--
>  1 file changed, 88 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 1a5f5a6..c7c1239 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -688,6 +688,62 @@ int insn_get_modrm_rm_off(struct insn *insn, struct 
> pt_regs *regs)
>   return get_reg_offset(insn, regs, REG_TYPE_RM);
>  }
>  
> +/**
> + * _to_signed_long() - Cast an unsigned long into signed long
> + * @val  A 32-bit or 64-bit unsigned long
> + * @long_bytes   The number of bytes used to represent a long number
> + * @out  The casted signed long
> + *
> + * Return: A signed long of either 32 or 64 bits, as per the build 
> configuration
> + * of the kernel.
> + */
> +static int _to_signed_long(unsigned long val, int long_bytes, long *out)
> +{
> + if (!out)
> + return -EINVAL;
> +
> +#ifdef CONFIG_X86_64
> + if (long_bytes == 4) {
> + /* higher bytes should all be zero */
> + if (val & ~0x)
> + return -EINVAL;
> +
> + /* sign-extend to a 64-bit long */

So this is a 32-bit userspace on a 64-bit kernel, right?

If so, how can a memory offset be > 32-bits and we have to extend it to
a 64-bit long?!?

I *think* you want to say that you want to convert it to long so that
you can do the calculation in longs.

However!

If you're a 64-bit kernel running a 32-bit userspace, you need to do
the calculation in 32-bits only so that it overflows, as it would do
on 32-bit hardware. IOW, the clamping to 32-bits at the end is not
something you wanna do but actually let it wrap if it overflows.

Or am I missing something?

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 14/26] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 5

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:12AM -0700, Ricardo Neri wrote:
> Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual volume 2A states that when ModRM.mod is zero and
> ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
> that none of the registers are used in the computation of the effective
> address. A return value of -EDOM signals callers that they should not use
> the value of registers when computing the effective address for the
> instruction.
> 
> In IA-32e 64-bit mode (long mode), the effective address is given by the
> 32-bit displacement plus the value of RIP of the next instruction.
> In IA-32e compatibility mode (protected mode), only the displacement is
> used.
> 
> The instruction decoder takes care of obtaining the displacement.

...

> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 693e5a8..4f600de 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -379,6 +379,12 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>   switch (type) {
>   case REG_TYPE_RM:
>   regno = X86_MODRM_RM(insn->modrm.value);


< newline here.

> + /*
> +  * ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement
> +  * follows the ModRM byte.
> +  */
> + if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
> + return -EDOM;
>   if (X86_REX_B(insn->rex_prefix.value))
>   regno += 8;
>   break;
> @@ -730,9 +736,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct 
> pt_regs *regs)
>   eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
>   } else {
>   addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> - if (addr_offset < 0)

ditto.

> + /*
> +  * -EDOM means that we must ignore the address_offset.
> +  * In such a case, in 64-bit mode the effective address
> +  * relative to the RIP of the following instruction.
> +  */
> + if (addr_offset == -EDOM) {
> + eff_addr = 0;
> + if (user_64bit_mode(regs))
> + eff_addr = (long)regs->ip +
> +insn->length;

Let that line stick out and write it balanced:

if (addr_offset == -EDOM) {
if (user_64bit_mode(regs))
eff_addr = (long)regs->ip + 
insn->length;
else
eff_addr = 0;

should be easier parseable this way.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 13/26] x86/insn-eval: Add function to get default params of code segment

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:11AM -0700, Ricardo Neri wrote:
> This function returns the default values of the address and operand sizes
> as specified in the segment descriptor. This information is determined
> from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
> 32-bit legacy modes. For virtual-8086 mode, the default address and
> operand sizes are always 2 bytes.
> 
> The D bit is only meaningful for code segments. Thus, these functions
> always use the code segment selector contained in regs.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/insn-eval.h |  6 
>  arch/x86/lib/insn-eval.c | 65 
> 
>  2 files changed, 71 insertions(+)
> 
> diff --git a/arch/x86/include/asm/insn-eval.h 
> b/arch/x86/include/asm/insn-eval.h
> index 7f3c7fe..9ed1c88 100644
> --- a/arch/x86/include/asm/insn-eval.h
> +++ b/arch/x86/include/asm/insn-eval.h
> @@ -11,9 +11,15 @@
>  #include 
>  #include 
>  
> +struct insn_code_seg_defaults {

A whole struct for a function which gets called only once?

Bah, that's a bit too much, if you ask me.

So you're returning two small unsigned integers - i.e., you can just as
well return a single u8 and put address and operand sizes in there:

ret = oper_sz | addr_sz << 4;

No need for special structs for that.

> + unsigned char address_bytes;
> + unsigned char operand_bytes;
> +};
> +
>  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
>  int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
>  unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
>   int regoff);
> +struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs 
> *regs);
>  
>  #endif /* _ASM_X86_INSN_EVAL_H */
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index c77ed80..693e5a8 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -603,6 +603,71 @@ static unsigned long get_seg_limit(struct pt_regs *regs, 
> struct insn *insn,
>  }
>  
>  /**
> + * insn_get_code_seg_defaults() - Obtain code segment default parameters
> + * @regs:Structure with register values as seen when entering kernel mode
> + *
> + * Obtain the default parameters of the code segment: address and operand 
> sizes.
> + * The code segment is obtained from the selector contained in the CS 
> register
> + * in regs. In protected mode, the default address is determined by 
> inspecting
> + * the L and D bits of the segment descriptor. In virtual-8086 mode, the 
> default
> + * is always two bytes for both address and operand sizes.
> + *
> + * Return: A populated insn_code_seg_defaults structure on success. The
> + * structure contains only zeros on failure.

s/failure/error/

> + */
> +struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs 
> *regs)
> +{
> + struct desc_struct *desc;
> + struct insn_code_seg_defaults defs;
> + unsigned short sel;
> + /*
> +  * The most significant byte of AR_TYPE_MASK determines whether a
> +  * segment contains data or code.
> +  */
> + unsigned int type_mask = AR_TYPE_MASK & (1 << 11);
> +
> + memset(, 0, sizeof(defs));
> +
> + if (v8086_mode(regs)) {
> + defs.address_bytes = 2;
> + defs.operand_bytes = 2;
> + return defs;
> + }
> +
> + sel = (unsigned short)regs->cs;
> +
> + desc = get_desc(sel);
> + if (!desc)
> + return defs;
> +
> + /* if data segment, return */
> + if (!(desc->b & type_mask))
> + return defs;

So you can simplify that into:

/* A code segment? */
if (!(desc->b & BIT(11)))
return defs;

and remove that type_mask thing.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-06-07 Thread Borislav Petkov
On Tue, Jun 06, 2017 at 05:28:52PM -0700, Ricardo Neri wrote:
> I see. You were more concerned about the naming of the coding artifacts
> (e.g., function names, error prints, etc) than the actual filenames.

Well, I'm not sure here. We could either have a generalized prefix or
put the function name in there - __func__ - for easier debuggability,
i.e., find the origin of the error message faster.

But I'm sensing that we're already well inside the bikeshed so let's not
change anything now. :)

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 09/26] x86/insn-eval: Add utility function to identify string instructions

2017-06-06 Thread Borislav Petkov
On Mon, Jun 05, 2017 at 11:01:21PM -0700, Ricardo Neri wrote:
> If I was to leave out string instructions from this function it should
> be renamed as is_string_instruction_non_lods_outs. In my opinion this
> separation makes the code more clear and I would end up having logic to
> decide which segment register to use in two places. Does it makes sense
> to you?

Ok, sure.

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-06-06 Thread Borislav Petkov
On Mon, Jun 05, 2017 at 11:06:58PM -0700, Ricardo Neri wrote:
> I agree that insn-eval reads somewhat funny. I did not want to go with
> insn-dec.c as insn.c, in my opinion, already decodes the instruction
> (i.e., it finds prefixes, opcodes, ModRM, SIB and displacement bytes).
> In insn-eval.c I simply take those decoded parameters and evaluate them
> to obtain the values they contain (e.g., a specific memory location).
> Perhaps, insn-resolve.c could be a better name? Or maybe isnn-operands?

So actually I'm gravitating towards calling all that instruction
"massaging" code with a single prefix to denote this comes from the insn
decoder/handler/whatever...

I.e.,

"insn-decoder: x86: invalid register type"

or

"inat: x86: invalid register type"

or something to that effect.

I mean, If we're going to grow our own - as we do, apparently - maybe it
all should be a separate entity with its proper name.

Hmm.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 12/26] x86/insn-eval: Add utility functions to get segment descriptor base address and limit

2017-05-31 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:10AM -0700, Ricardo Neri wrote:
> With segmentation, the base address of the segment descriptor is needed
> to compute a linear address. The segment descriptor used in the address
> computation depends on either any segment override prefixes in the
> instruction or the default segment determined by the registers involved
> in the address computation. Thus, both the instruction as well as the
> register (specified as the offset from the base of pt_regs) are given as
> inputs, along with a boolean variable to select between override and
> default.

...

> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index f46cb31..c77ed80 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -476,6 +476,133 @@ static struct desc_struct *get_desc(unsigned short sel)
>  }
>  
>  /**
> + * insn_get_seg_base() - Obtain base address of segment descriptor.
> + * @regs:Structure with register values as seen when entering kernel mode
> + * @insn:Instruction structure with selector override prefixes
> + * @regoff:  Operand offset, in pt_regs, of which the selector is needed
> + *
> + * Obtain the base address of the segment descriptor as indicated by either
> + * any segment override prefixes contained in insn or the default segment
> + * applicable to the register indicated by regoff. regoff is specified as the
> + * offset in bytes from the base of pt_regs.
> + *
> + * Return: In protected mode, base address of the segment. Zero in for long
> + * mode, except when FS or GS are used. In virtual-8086 mode, the segment
> + * selector shifted 4 positions to the right. -1L in case of
> + * error.
> + */
> +unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> + int regoff)
> +{
> + struct desc_struct *desc;
> + unsigned short sel;
> + enum segment_register seg_reg;
> +
> + seg_reg = resolve_seg_register(insn, regs, regoff);
> + if (seg_reg == SEG_REG_INVAL)
> + return -1L;
> +
> + sel = get_segment_selector(regs, seg_reg);
> + if ((short)sel < 0)

I guess it would be better if that function returned a signed short so
you don't have to cast it here. (You're casting it to an unsigned long
below anyway.)

> + return -1L;
> +
> + if (v8086_mode(regs))
> + /*
> +  * Base is simply the segment selector shifted 4
> +  * positions to the right.
> +  */
> + return (unsigned long)(sel << 4);
> +

...

> +static unsigned long get_seg_limit(struct pt_regs *regs, struct insn *insn,
> +int regoff)
> +{
> + struct desc_struct *desc;
> + unsigned short sel;
> + unsigned long limit;
> + enum segment_register seg_reg;
> +
> + seg_reg = resolve_seg_register(insn, regs, regoff);
> + if (seg_reg == SEG_REG_INVAL)
> + return 0;
> +
> + sel = get_segment_selector(regs, seg_reg);
> + if ((short)sel < 0)

Ditto.

> + return 0;
> +
> + if (user_64bit_mode(regs) || v8086_mode(regs))
> + return -1L;
> +
> + if (!sel)
> + return 0;
> +
> + desc = get_desc(sel);
> + if (!desc)
> + return 0;
> +
> + /*
> +  * If the granularity bit is set, the limit is given in multiples
> +  * of 4096. When the granularity bit is set, the least 12 significant

 the 12 least significant 
bits

> +  * bits are not tested when checking the segment limits. In practice,
> +  * this means that the segment ends in (limit << 12) + 0xfff.
> +  */
> + limit = get_desc_limit(desc);
> + if (desc->g)
> + limit <<= 12 | 0x7;

That 0x7 doesn't look like 0xfff - it shifts limit by 15 instead. You
can simply write it like you mean it:

limit = (limit << 12) + 0xfff;


-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector

2017-05-30 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:08AM -0700, Ricardo Neri wrote:
> When computing a linear address and segmentation is used, we need to know
> the base address of the segment involved in the computation. In most of
> the cases, the segment base address will be zero as in USER_DS/USER32_DS.
> However, it may be possible that a user space program defines its own
> segments via a local descriptor table. In such a case, the segment base
> address may not be zero .Thus, the segment base address is needed to
> calculate correctly the linear address.
> 
> The segment selector to be used when computing a linear address is
> determined by either any of segment override prefixes in the
> instruction or inferred from the registers involved in the computation of
> the effective address; in that order. Also, there are cases when the
> overrides shall be ignored (code segments are always selected by the CS
> segment register; string instructions always use the ES segment register
> along with the EDI register).
> 
> For clarity, this process can be split into two steps: resolving the
> relevant segment register to use and, once known, read its value to
> obtain the segment selector.
> 
> The method to obtain the segment selector depends on several factors. In
> 32-bit builds, segment selectors are saved into the pt_regs structure
> when switching to kernel mode. The same is also true for virtual-8086
> mode. In 64-bit builds, segmentation is mostly ignored, except when
> running a program in 32-bit legacy mode. In this case, CS and SS can be
> obtained from pt_regs. DS, ES, FS and GS can be read directly from
> the respective segment registers.
> 
> Lastly, the only two segment registers that are not ignored in long mode
> are FS and GS. In these two cases, base addresses are obtained from the
> respective MSRs.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 256 
> +++
>  1 file changed, 256 insertions(+)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 1634762..0a496f4 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  enum reg_type {
>   REG_TYPE_RM = 0,
> @@ -33,6 +34,17 @@ enum string_instruction {
>   SCASW_SCASD = 0xaf,
>  };
>  
> +enum segment_register {
> + SEG_REG_INVAL = -1,
> + SEG_REG_IGNORE = 0,
> + SEG_REG_CS = 0x23,
> + SEG_REG_SS = 0x36,
> + SEG_REG_DS = 0x3e,
> + SEG_REG_ES = 0x26,
> + SEG_REG_FS = 0x64,
> + SEG_REG_GS = 0x65,
> +};

Yuck, didn't we talk about this already?

Those are segment override prefixes so call them as such.

#define SEG_OVR_PFX_CS  0x23
#define SEG_OVR_PFX_SS  0x36
...

and we already have those!

arch/x86/include/asm/inat.h:
...
#define INAT_PFX_CS 5   /* 0x2E */
#define INAT_PFX_DS 6   /* 0x3E */
#define INAT_PFX_ES 7   /* 0x26 */
#define INAT_PFX_FS 8   /* 0x64 */
#define INAT_PFX_GS 9   /* 0x65 */
#define INAT_PFX_SS 10  /* 0x36 */

well, kinda, they're numbers there and not the actual prefix values.

And then there's:

arch/x86/kernel/uprobes.c::is_prefix_bad() which looks at some of those.

Please add your defines to inat.h and make that function is_prefix_bad()
use them instead of naked numbers. We need to pay attention to all those
different things needing to look at insn opcodes and not let them go
unwieldy by each defining and duplicating stuff.

>  /**
>   * is_string_instruction - Determine if instruction is a string instruction
>   * @insn:Instruction structure containing the opcode
> @@ -83,6 +95,250 @@ static bool is_string_instruction(struct insn *insn)
>   }
>  }
>  
> +/**
> + * resolve_seg_register() - obtain segment register

That function is still returning the segment override prefix and we use
*that* to determine the segment register.

>

Re: [PATCH v7 09/26] x86/insn-eval: Add utility function to identify string instructions

2017-05-29 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:07AM -0700, Ricardo Neri wrote:
> String instructions are special because in protected mode, the linear
> address is always obtained via the ES segment register in operands that
> use the (E)DI register.

 ... and DS for rSI.

If we're going to account for both operands of string instructions with
two operands.

Btw, LODS and OUTS use only DS:rSI as a source operand. So we have to be
careful with the generalization here. So if ES:rDI is the only seg. reg
we want, then we don't need to look at those insns... (we assume DS by
default).

...

> +/**
> + * is_string_instruction - Determine if instruction is a string instruction
> + * @insn:Instruction structure containing the opcode
> + *
> + * Return: true if the instruction, determined by the opcode, is any of the
> + * string instructions as defined in the Intel Software Development manual.
> + * False otherwise.
> + */
> +static bool is_string_instruction(struct insn *insn)
> +{
> + insn_get_opcode(insn);
> +
> + /* all string instructions have a 1-byte opcode */
> + if (insn->opcode.nbytes != 1)
> + return false;
> +
> + switch (insn->opcode.bytes[0]) {
> + case INSB:
> + /* fall through */
> + case INSW_INSD:
> + /* fall through */
> + case OUTSB:
> + /* fall through */
> + case OUTSW_OUTSD:
> + /* fall through */
> + case MOVSB:
> + /* fall through */
> + case MOVSW_MOVSD:
> + /* fall through */
> + case CMPSB:
> + /* fall through */
> + case CMPSW_CMPSD:
> + /* fall through */
> + case STOSB:
> + /* fall through */
> + case STOSW_STOSD:
> + /* fall through */
> + case LODSB:
> + /* fall through */
> + case LODSW_LODSD:
> + /* fall through */
> + case SCASB:
> + /* fall through */

That "fall through" for every opcode is just too much. Also, you can use
the regularity of the x86 opcode space and do:

case 0x6c ... 0x6f: /* INS/OUTS */
case 0xa4 ... 0xa7: /* MOVS/CMPS */
case 0xaa ... 0xaf: /* STOS/LODS/SCAS */
return true;
default:
return false;
}

And voila, there's your compact is_string_insn() function! :^)

(Modulo the exact list, as I mentioned above).

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 08/26] x86/insn-eval: Add a utility function to get register offsets

2017-05-29 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:06AM -0700, Ricardo Neri wrote:
> The function get_reg_offset() returns the offset to the register the
> argument specifies as indicated in an enumeration of type offset. Callers
> of this function would need the definition of such enumeration. This is
> not needed. Instead, add helper functions for this purpose. These functions
> are useful in cases when, for instance, the caller needs to decide whether
> the operand is a register or a memory location by looking at the rm part
> of the ModRM byte. As of now, this is the only helper function that is
> needed.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/insn-eval.h |  1 +
>  arch/x86/lib/insn-eval.c | 15 +++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/arch/x86/include/asm/insn-eval.h 
> b/arch/x86/include/asm/insn-eval.h
> index 5cab1b1..7e8c963 100644
> --- a/arch/x86/include/asm/insn-eval.h
> +++ b/arch/x86/include/asm/insn-eval.h
> @@ -12,5 +12,6 @@
>  #include 
>  
>  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
> +int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
>  
>  #endif /* _ASM_X86_INSN_EVAL_H */
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 182e2ae..8b16761 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -97,6 +97,21 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>   return regoff[regno];
>  }
>  
> +/**
> + * insn_get_reg_offset_modrm_rm() - Obtain register in r/m part of ModRM byte

That name needs to be synced with the function name below.

> + * @insn:Instruction structure containing the ModRM byte
> + * @regs:Structure with register values as seen when entering kernel mode
> + *
> + * Return: The register indicated by the r/m part of the ModRM byte. The
> + * register is obtained as an offset from the base of pt_regs. In specific
> + * cases, the returned value can be -EDOM to indicate that the particular 
> value
> + * of ModRM does not refer to a register and shall be ignored.
> + */
> +int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)


-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-05-29 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:05AM -0700, Ricardo Neri wrote:
> We are not in a critical failure path. The invalid register type is caused
> when trying to decode invalid instruction bytes from a user-space program.
> Thus, simply print an error message. To prevent this warning from being
> abused from user space programs, use the rate-limited variant of printk.
> 
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index e746a6f..182e2ae 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -5,6 +5,7 @@
>   */
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -85,9 +86,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs 
> *regs,
>   break;
>  
>   default:
> - pr_err("invalid register type");
> - BUG();
> - break;
> + printk_ratelimited(KERN_ERR "insn-eval: x86: invalid register 
> type");

You can use pr_err_ratelimited() and define "insn-eval" with pr_fmt.
Look for examples in the tree.

Btw, "insn-eval" is perhaps not the right name - since we're building
an instruction decoder, maybe it should be called "insn-dec" or so. I'm
looking at those other arch/x86/lib/insn.c, arch/x86/include/asm/inat.h
things and how they're starting to morph into one decoding facility,
AFAICT.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 05/26] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0

2017-05-29 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:03AM -0700, Ricardo Neri wrote:
> Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual volume 2A states that when a SIB byte is used and the
> base of the SIB byte points is base = 101b and the mod part
> of the ModRM byte is zero, the base port on the effective address
> computation is null. In this case, a 32-bit displacement follows the SIB
> byte. This is obtained when the instruction decoder parses the operands.
> 
> To signal this scenario, a -EDOM error is returned to indicate callers that
> they should ignore the base.
> 
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Nathan Howard <liverl...@gmail.com>
> Cc: Adan Hawthorn <adanhawth...@gmail.com>
> Cc: Joe Perches <j...@perches.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/mm/mpx.c | 27 ---
>  1 file changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> index 7397b81..30aef92 100644
> --- a/arch/x86/mm/mpx.c
> +++ b/arch/x86/mm/mpx.c
> @@ -122,6 +122,15 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>  
>   case REG_TYPE_BASE:
>   regno = X86_SIB_BASE(insn->sib.value);
> + /*
> +  * If ModRM.mod is 0 and SIB.base == 5, the base of the
> +  * register-indirect addressing is 0. In this case, a
> +  * 32-bit displacement is expected in this case; the
> +  * instruction decoder finds such displacement for us.

That last sentence reads funny. Just say:

"In this case, a 32-bit displacement follows the SIB byte."

> +  */
> + if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
> + return -EDOM;
> +
>   if (X86_REX_B(insn->rex_prefix.value))
>   regno += 8;
>   break;

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 02/26] x86/mm: Relocate page fault error codes to traps.h

2017-05-27 Thread Borislav Petkov
On Fri, May 26, 2017 at 08:40:26PM -0700, Ricardo Neri wrote:
> This change was initially intended to only rename the error codes,
> without functional changes. Would making change be considered a change
> in functionality?

How?

The before-and-after asm should be the identical.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 04/26] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b

2017-05-24 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:02AM -0700, Ricardo Neri wrote:
> Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual volume 2A states that when ModRM.mod !=11b and
> ModRM.rm = 100b indexed register-indirect addressing is used. In other
> words, a SIB byte follows the ModRM byte. In the specific case of
> SIB.index = 100b, the scale*index portion of the computation of the
> effective address is null. To signal callers of this particular situation,
> get_reg_offset() can return -EDOM (-EINVAL continues to indicate that an
> error when decoding the SIB byte).
> 
> An example of this situation can be the following instruction:
> 
>8b 4c 23 80   mov -0x80(%rbx,%riz,1),%rcx
>ModRM:0x4c [mod:1b][reg:1b][rm:100b]
>SIB:  0x23 [scale:0b][index:100b][base:11b]
>Displacement: 0x80  (1-byte, as per ModRM.mod = 1b)
> 
> The %riz 'register' indicates a null index.
> 
> In long mode, a REX prefix may be used. When a REX prefix is present,
> REX.X adds a fourth bit to the register selection of SIB.index. This gives
> the ability to refer to all the 16 general purpose registers. When REX.X is
> 1b and SIB.index is 100b, the index is indicated in %r12. In our example,
> this would look like:
> 
>42 8b 4c 23 80mov -0x80(%rbx,%r12,1),%rcx
>REX:  0x42 [W:0b][R:0b][X:1b][B:0b]
>ModRM:0x4c [mod:1b][reg:1b][rm:100b]
>SIB:  0x23 [scale:0b][.X: 1b, index:100b][.B:0b, base:11b]
>Displacement: 0x80  (1-byte, as per ModRM.mod = 1b)
> 
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Nathan Howard <liverl...@gmail.com>
> Cc: Adan Hawthorn <adanhawth...@gmail.com>
> Cc: Joe Perches <j...@perches.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/mm/mpx.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> index ebdead8..7397b81 100644
> --- a/arch/x86/mm/mpx.c
> +++ b/arch/x86/mm/mpx.c
> @@ -110,6 +110,14 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>   regno = X86_SIB_INDEX(insn->sib.value);
>   if (X86_REX_X(insn->rex_prefix.value))
>   regno += 8;

<--- newline.

> + /*
> +  * If ModRM.mod !=3 and SIB.index (regno=4) the scale*index
> +  * portion of the address computation is null. This is
> +  * true only if REX.X is 0. In such a case, the SIB index
> +  * is used in the address computation.
> +  */
> + if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
> + return -EDOM;
>   break;
>  
>   case REG_TYPE_BASE:
> @@ -159,11 +167,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, 
> struct pt_regs *regs)
>   goto out_err;
>  
>   indx_offset = get_reg_offset(insn, regs, 
> REG_TYPE_INDEX);
> - if (indx_offset < 0)

<--- newline.

> + /*
> +  * A negative offset generally means a error, except

 an

> +  * -EDOM, which means that the contents of the register
> +  * should not be used as index.
> +  */
> + if (indx_offset == -EDOM)
> + indx = 0;
> + else if (indx_offset < 0)
>   goto out_err;
> + else
> + indx = regs_get_register(regs, indx_offset);
>  
>   base = regs_get_register(regs, base_offset);
> - indx = regs_get_register(regs, indx_offset);
>   eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
>   } else {
>   addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> -- 
> 2.9.3
> 

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 02/26] x86/mm: Relocate page fault error codes to traps.h

2017-05-21 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:00AM -0700, Ricardo Neri wrote:
> Up to this point, only fault.c used the definitions of the page fault error
> codes. Thus, it made sense to keep them within such file. Other portions of
> code might be interested in those definitions too. For instance, the User-
> Mode Instruction Prevention emulation code will use such definitions to
> emulate a page fault when it is unable to successfully copy the results
> of the emulated instructions to user space.
> 
> While relocating the error code enumeration, the prefix X86_ is used to
> make it consistent with the rest of the definitions in traps.h. Of course,
> code using the enumeration had to be updated as well. No functional changes
> were performed.
> 
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: "H. Peter Anvin" <h...@zytor.com>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
> Cc: Josh Poimboeuf <jpoim...@redhat.com>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> Cc: x...@kernel.org
> Reviewed-by: Andy Lutomirski <l...@kernel.org>
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/traps.h | 18 +
>  arch/x86/mm/fault.c  | 88 
> +---
>  2 files changed, 52 insertions(+), 54 deletions(-)

...

> @@ -1382,7 +1362,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long 
> error_code,
>* space check, thus avoiding the deadlock:
>*/
>   if (unlikely(!down_read_trylock(>mmap_sem))) {
> -     if ((error_code & PF_USER) == 0 &&
> + if ((error_code & X86_PF_USER) == 0 &&

if (!(error_code & X86_PF_USER))

With that fixed:

Reviewed-by: Borislav Petkov <b...@suse.de>

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 01/26] ptrace,x86: Make user_64bit_mode() available to 32-bit builds

2017-05-21 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:16:59AM -0700, Ricardo Neri wrote:
> In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
> is selected. This implies that code built with CONFIG_X86_64=n cannot use
> it. If a piece of code needs to be built for both CONFIG_X86_64=y and
> CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
> an #ifdef/#endif; potentially, in multiple places.
> 
> This can be easily avoided with a single #ifdef/#endif pair within
> user_64bit_mode() itself.
> 
> Suggested-by: Borislav Petkov <b...@suse.de>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/ptrace.h | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Borislav Petkov <b...@suse.de>

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-05-15 Thread Borislav Petkov
On Thu, May 11, 2017 at 07:13:57PM -0700, Ricardo Neri wrote:
> Sure I can. Would this trigger a v8 of my series? I was hoping v7 series
> could be merged and then start doing incremental work on top of it. Does
> it make sense?

I guess that's tip guys' call.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

2017-05-08 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 08:33:46PM -0700, Ricardo Neri wrote:
> This is the reason I check the value of long_bytes. If long_bytes is not
> 4, being the only other possible value 8 (perhaps I need to issue an
> error when the value is not any of these values),

Well, maybe I'm a bit too paranoid. Bottom line is, we should do the
address computations exactly like the hardware does them so that there
are no surprises. Doing them with longs looks ok to me.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 10/21] x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero

2017-05-07 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 06:29:59PM -0700, Ricardo Neri wrote:
> > if (X86_MODRM_MOD(insn->modrm.value) == 0 &&
> > X86_MODRM_RM(insn->modrm.value)  == 5)
> > 
> > looks more understandable to me.
> 
> Should I go with !(X86_MODRM_MOD(insn->modrm.value)) as you suggested in
> other patches?

Ah, yes pls.

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-05-05 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 03:52:41PM -0700, Ricardo Neri wrote:
> Probably insn_get_seg_base() itself can verify if there are segment
> override prefixes in the struct insn. If yes, use them except for
> specific cases such as CS.

... and depending on whether in long mode or not.

> On an unrelated note, I still have the problem of using DS vs ES for
> string instructions. Perhaps instead of a use_default_seg flag, a
> string_instruction flag that indicates how to determine the default
> segment.

... or you can look at the insn opcode directly. AFAICT, you need
to check whether the opcode is 0xa4 or 0xa5 and that the insn is a
single-byte opcode, i.e., not from the secondary map escaped with 0xf or
some of the other multi-byte opcode maps.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-05-05 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 03:37:44PM -0700, Ricardo Neri wrote:
> I need a human-readable way of identifying what segment selector (in
> pt_regs, vm86regs or directly reading the segment registers) to use.
> Since there is a segment override prefix for all of them, I thought I
> could use them.

Yes, you should...

> Perhaps I can rename enum segment to enum segment_selector and comment
> that the values in the enum are those of the override prefixes. Would
> that be reasonable?

... but you should call them what they are: "enum seg_override_pfxs" or
"enum seg_ovr_pfx" or...

Or somesuch. I suck at naming stuff.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-05-04 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 02:51:56PM -0700, Ricardo Neri wrote:
> > > +  seg >= current->active_mm->context.ldt->size)) {
> > 
> > ldt->size is the size of the descriptor table but you've shifted seg by
> > 3. That selector index is shifted by 3 (to the left) to form an offset
> > into the descriptor table because the entries there are 8 bytes.
> 
> I double-checked the ldt code and it seems to me that size refers to the
> number of entries in the table; it is always multiplied by
> LDT_ENTRY_SIZE [1], [2]. Am I missing something?

No, you're not. I fell into that wrongly named struct member trap.

So ldt_struct.size should actually be called ldt_struct.n_entries or
similar. Because what's in there is now is not "size".

And then code like

new_ldt->size * LDT_ENTRY_SIZE

would make much more sense if written like this:

new_ldt->n_entries * LDT_ENTRY_SIZE

Would you fix that in a prepatch pls?

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 06/21] x86/insn-eval: Add utility functions to get segment selector

2017-04-30 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 01:44:43PM -0700, Ricardo Neri wrote:
> I regard that the role of this function is to obtain the the segment
> selector from either of the prefixes or inferred from the operands. It
> is the role of caller to determine if the segment selector should be
> ignored.

No, this is wrong. The function is called resolve_seg_selector() and it
gives you the segment selector. CS, DS, ES, and SS in 64-bit mode are
treated as null segments and your function should return/signal exactly
that, i.e, saying that those should be ignored in that case.

> I double-checked the latest version of the Intel Software Development
> manual [2], in the table 3-5 in section 3.7.4 mentions that DS is
> default segment for all data references, except string destinations. I
> tested this code with the UMIP-protected instructions and whenever I use
> %edi the default segment is %ds.

Yes, all correct. Except that we're adding a more-or-less generic x86
insn decoder so we should make it so...

> Is this example valid? The documentation of MOVS specifies that it
> always moves DS:(E)SI to ES:(E)DI.

... that the decoder should do exactly that:

if (MOVS and rDI)
return SEG_ES;

And you're handing in struct insn * so you can easily check which insn
you're looking at.

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 05/21] x86/insn-eval: Add utility functions to get register offsets

2017-04-28 Thread Borislav Petkov
On Wed, Apr 26, 2017 at 11:13:44AM -0700, Ricardo Neri wrote:
> Masami Hiramatsu had originally requested to add the two functions. I
> suppose the unneeded functions could be added if/when needed.

Yap, exactly.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 03/21] x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0

2017-04-26 Thread Borislav Petkov
On Tue, Apr 25, 2017 at 07:04:20PM -0700, Ricardo Neri wrote:
> For the specific case of ModRM.mod being 0, I feel I need to clarify
> that REX.B is not decoded and if SIB.base is %r13 the base is also 0.

Well, that all doesn't matter. The rule is this:

ModRM.mod == 00b and ModRM.r/m == 101b -> effective address: disp32

See Table 2-2. "32-Bit Addressing Forms with the ModR/M Byte" in the SDM.

So the base register is not used. How that base register is specified
then doesn't matter (undecoded REX bits or not).

> This comment adds clarity because REX.X is decoded when determining
> SIB.index.

Well, that's a different thing. The REX bits participating in the SIB
fields don't matter about this particular case. We only want to say that
we're returning a disp32 without a base register and the comment should
keep it simple without extraneous information.

I know, you want to mention what Table 2-5. "Special Cases of REX
Encodings" says but we should avoid unnecessary content in the comment.
People who want details can stare at the manuals - the comment should
only document what that particular case is.

Btw, you could write it even better:

if (!X86_MODRM_MOD(insn->modrm.value) && 
X86_MODRM_RM(insn->modrm.value) == 5)

and then it is basically a 1:1 copy of the rule from Table 2-2.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

2017-04-25 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:45PM -0800, Ricardo Neri wrote:
> The 32-bit and 64-bit address encodings are identical. This means that we
> can use the same function in both cases. In order to reuse the function for
> 32-bit address encodings, we must sign-extend our 32-bit signed operands to
> 64-bit signed variables (only for 64-bit builds). To decide on whether sign
> extension is needed, we rely on the address size as given by the
> instruction structure.
> 
> Lastly, before computing the linear address, we must truncate our signed
> 64-bit signed effective address if the address size is 32-bit.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 44 
>  1 file changed, 32 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index edb360f..a9a1704 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -559,6 +559,15 @@ int insn_get_reg_offset_sib_index(struct insn *insn, 
> struct pt_regs *regs)
>   return get_reg_offset(insn, regs, REG_TYPE_INDEX);
>  }
>  
> +static inline long __to_signed_long(unsigned long val, int long_bytes)
> +{
> +#ifdef CONFIG_X86_64
> + return long_bytes == 4 ? (long)((int)((val) & 0x)) : (long)val;

I don't think this always works as expected:

---
typedef unsigned int u32;
typedef unsigned long u64;

int main()
{
u64 v = 0x1;

printf("v: %ld, 0x%lx, %ld\n", v, v, (long)((int)((v) & 0x)));

return 0;
}
--
...

v: 8589934591, 0x1, -1

Now, this should not happen on 32-bit because unsigned long is 32-bit
there but can that happen on 64-bit?

> +#else
> + return (long)val;
> +#endif
> +}
> +
>  /*
>   * return the address being referenced be instruction
>   * for rm=3 returning the content of the rm reg
> @@ -567,19 +576,21 @@ int insn_get_reg_offset_sib_index(struct insn *insn, 
> struct pt_regs *regs)
>  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
>  {
>   unsigned long linear_addr, seg_base_addr;
> - long eff_addr, base, indx;
> - int addr_offset, base_offset, indx_offset;
> + long eff_addr, base, indx, tmp;
> + int addr_offset, base_offset, indx_offset, addr_bytes;
>   insn_byte_t sib;
>  
>   insn_get_modrm(insn);
>   insn_get_sib(insn);
>   sib = insn->sib.value;
> + addr_bytes = insn->addr_bytes;
>  
>   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
>   addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
>   if (addr_offset < 0)
>   goto out_err;
> - eff_addr = regs_get_register(regs, addr_offset);
> + tmp = regs_get_register(regs, addr_offset);
> + eff_addr = __to_signed_long(tmp, addr_bytes);

This repeats throughout the function so it begs to be a separate:

get_mem_addr()

or so.

>   seg_base_addr = insn_get_seg_base(regs, insn, addr_offset,
> false);
>   } else {
> @@ -591,20 +602,24 @@ void __user *insn_get_addr_ref(struct insn *insn, 
> struct pt_regs *regs)
>* in the address computation.
>*/
>   base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
> - if (unlikely(base_offset == -EDOM))
> + if (unlikely(base_offset == -EDOM)) {
>   base = 0;
> - else if (unlikely(base_offset < 0))
> + } else if (unlikely(base_offset < 0)) {
>   goto out_err;
> - else
> - base = regs_get_register(regs, base_offset);
> + } else {
> + tmp = regs_get_register(regs, base_offset);
> +  

Re: [v6 PATCH 11/21] insn/eval: Incorporate segment base in address computation

2017-04-21 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:44PM -0800, Ricardo Neri wrote:
> insn_get_addr_ref returns the effective address as defined by the

Please end function names with parentheses.

> section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual. In order to compute the linear address, we must add
> to the effective address the segment base address as set in the segment
> descriptor. Furthermore, the segment descriptor to use depends on the
> register that is used as the base of the effective address. The effective
> base address varies depending on whether the operand is a register or a
> memory address and on whether a SiB byte is used.
> 
> In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
> segment is used or if segmentation is not used. However, the base address
> is not necessarily zero if a user programs defines its own segments. This
> is possible by using a local descriptor table.
> 
> Since the effective address is a signed quantity, the unsigned segment
> base address saved in a separate variable and added to the final effective

".. is saved..."

> address.
> 

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 10/21] x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero

2017-04-21 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:43PM -0800, Ricardo Neri wrote:
> Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual volume 2A states that when the mod part of the ModRM
> byte is zero and R/EBP is specified in the R/M part of such bit, the value
> of the aforementioned register should not be used in the address
> computation. Instead, a 32-bit displacement is expected. The instruction
> decoder takes care of setting the displacement to the expected value.
> Returning -EDOM signals callers that they should ignore the value of such
> register when computing the address encoded in the instruction operands.
> 
> Also, callers should exercise care to correctly interpret this particular
> case. In IA-32e 64-bit mode, the address is given by the displacement plus
> the value of the RIP. In IA-32e compatibility mode, the value of EIP is
> ignored. This correction is done for our insn_get_addr_ref.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 25 +++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index cda6c71..ea10b03 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -250,6 +250,14 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>   switch (type) {
>   case REG_TYPE_RM:
>   regno = X86_MODRM_RM(insn->modrm.value);
> + /* if mod=0, register R/EBP is not used in the address
> +  * computation. Instead, a 32-bit displacement is expected;
> +  * the instruction decoder takes care of reading such
> +  * displacement. This is true for both R/EBP and R13, as the
> +  * REX.B bit is not decoded.
> +  */

I'd simply write here: "ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit
displacement is following."

In addition, kernel comments style is:

/*
 * A sentence ending with a full-stop.
 * Another sentence. ...
 * More sentences. ...
 */

> + if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
> + return -EDOM;

if (X86_MODRM_MOD(insn->modrm.value) == 0 &&
X86_MODRM_RM(insn->modrm.value)  == 5)

looks more understandable to me.

>   if (X86_REX_B(insn->rex_prefix.value))
>   regno += 8;
>   break;
> @@ -599,9 +607,22 @@ void __user *insn_get_addr_ref(struct insn *insn, struct 
> pt_regs *regs)
>   eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
>   } else {
>   addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> - if (addr_offset < 0)
> + /* -EDOM means that we must ignore the address_offset.
> +  * The only case in which we see this value is when
> +  * R/M points to R/EBP. In such a case, in 64-bit mode
> +  * the effective address is relative to tho RIP.

s/tho//

> +  */

Kernel comments style is:

/*
 * A sentence ending with a full-stop.
 * Another sentence. ...
 * More sentences. ...
 */

> + if (addr_offset == -EDOM) {
> + eff_addr = 0;
> +#ifdef CONFIG_X86_64
> + if (user_64bit_mode(regs))
> + eff_addr = (long)regs->ip;

Is regs->ip the rIP of the *following* insn?

> +#endif

You can do this in a prepatch and then get rid of the ifdeffery here:

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 2b5d686ea9f3..f6239273c5f1 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -115,9 +115,9 @@ static inline int v8086_mode(struct pt_regs *regs)
 #endif
 }
 
-#ifdef CONFIG_

Re: [v6 PATCH 09/21] x86/insn-eval: Add functions to get default operand and address sizes

2017-04-20 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:42PM -0800, Ricardo Neri wrote:
> These functions read the default values of the address and operand sizes
> as specified in the segment descriptor. This information is determined
> from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
> 32-bit legacy modes. For virtual-8086 mode, the default address and
> operand sizes are always 2 bytes.

Yeah, we tend to call that customarily 16-bit :)

> The D bit is only meaningful for code segments. Thus, these functions
> always use the code segment selector contained in regs.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/insn-eval.h |  2 +
>  arch/x86/lib/insn-eval.c | 80 
> 
>  2 files changed, 82 insertions(+)
> 
> diff --git a/arch/x86/include/asm/insn-eval.h 
> b/arch/x86/include/asm/insn-eval.h
> index b201742..a0d81fc 100644
> --- a/arch/x86/include/asm/insn-eval.h
> +++ b/arch/x86/include/asm/insn-eval.h
> @@ -15,6 +15,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct 
> pt_regs *regs);
>  int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
>  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
>  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> +unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs);
> +unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs);
>  unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
>   int regoff, bool use_default_seg);
>  
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 383ca83..cda6c71 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -421,6 +421,86 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, 
> struct insn *insn,
>  }
>  
>  /**
> + * insn_get_seg_default_address_bytes - Obtain default address size of 
> segment
> + * @regs:Set of registers containing the segment selector
> + *
> + * Obtain the default address size as indicated in the segment descriptor
> + * selected in regs' code segment selector. In protected mode, the default
> + * address is determined by inspecting the L and D bits of the segment
> + * descriptor. In virtual-8086 mode, the default is always two bytes.
> + *
> + * Return: Default address size of segment

0 on error.

> + */
> +unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs)
> +{
> + struct desc_struct *desc;
> + unsigned short seg;
> + int ret;
> +
> + if (v8086_mode(regs))
> + return 2;
> +
> + seg = (unsigned short)regs->cs;
> +
> + ret = get_desc(seg, );
> + if (ret)
> + return 0;
> +
> + switch ((desc->l << 1) | desc->d) {
> + case 0: /* Legacy mode. 16-bit addresses. CS.L=0, CS.D=0 */
> + return 2;
> + case 1: /* Legacy mode. 32-bit addresses. CS.L=0, CS.D=1 */
> + return 4;
> + case 2: /* IA-32e 64-bit mode. 64-bit addresses. CS.L=1, CS.D=0 */
> + return 8;
> + case 3: /* Invalid setting. CS.L=1, CS.D=1 */
> + /* fall through */
> + default:
> + return 0;
> + }
> +}
> +
> +/**
> + * insn_get_seg_default_operand_bytes - Obtain default operand size of 
> segment
> + * @regs:Set of registers containing the segment selector
> + *
> + * Obtain the default operand size as indicated in the segment descriptor
> + * selected in regs' code segment selector. In protected mode, the default
> + * operand size is determined by inspecting the L and D bits of the segment
> + * descriptor. In virtual-8086 mode, the default is always two bytes.
> + *
> + * Return: Default operand size of segment
> + */
> +unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs)

Right, so default

Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-04-20 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:41PM -0800, Ricardo Neri wrote:
> With segmentation, the base address of the segment descriptor is needed
> to compute a linear address. The segment descriptor used in the address
> computation depends on either any segment override prefixes in the in the

s/in the //

> instruction or the default segment determined by the registers involved
> in the address computation. Thus, both the instruction as well as the
> register (specified as the offset from the base of pt_regs) are given as
> inputs, along with a boolean variable to select between override and
> default.
> 
> The segment selector is determined by get_seg_selector with the inputs

Please end function names with parentheses: get_seg_selector().

> described above. Once the selector is known the base address is

known, ...

> determined. In protected mode, the selector is used to obtain the segment
> descriptor and then its base address. If in 64-bit user mode, the segment =
> base address is zero except when FS or GS are used. In virtual-8086 mode,
> the base address is computed as the value of the segment selector shifted 4
> positions to the left.

Good.

> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/insn-eval.h |  2 ++
>  arch/x86/lib/insn-eval.c | 66 
> 
>  2 files changed, 68 insertions(+)
> 
> diff --git a/arch/x86/include/asm/insn-eval.h 
> b/arch/x86/include/asm/insn-eval.h
> index 754211b..b201742 100644
> --- a/arch/x86/include/asm/insn-eval.h
> +++ b/arch/x86/include/asm/insn-eval.h
> @@ -15,5 +15,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct 
> pt_regs *regs);
>  int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
>  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
>  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> +unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> + int regoff, bool use_default_seg);
>  
>  #endif /* _ASM_X86_INSN_EVAL_H */
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 8608adf..383ca83 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -355,6 +355,72 @@ static int get_desc(unsigned short seg, struct 
> desc_struct **desc)
>  }
>  
>  /**
> + * insn_get_seg_base() - Obtain base address contained in descriptor
> + * @regs:Set of registers containing the segment selector
> + * @insn:Instruction structure with selector override prefixes
> + * @regoff:  Operand offset, in pt_regs, of which the selector is needed
> + * @use_default_seg: Use the default segment instead of prefix overrides

I'm wondering whether you really need that bool or you can deduce this
from pt_regs... I guess I'll see...

> + *
> + * Obtain the base address of the segment descriptor as indicated by either
> + * any segment override prefixes contained in insn or the default segment
> + * applicable to the register indicated by regoff. regoff is specified as the
> + * offset in bytes from the base of pt_regs.
> + *
> + * Return: In protected mode, base address of the segment. It may be zero in
> + * certain cases for 64-bit builds and/or 64-bit applications. In 
> virtual-8086
> + * mode, the segment selector shifed 4 positions to the right. -1L in case of

s/shifed/shifted/

> + * error.
> + */
> +unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> + int regoff, bool use_default_seg)
> +{
> + struct desc_struct *desc;
> + unsigned short seg;
> + enum segment seg_type;
> + int ret;
> +
> + seg_type = resolve_seg_selector(insn, regoff, use_default_seg);

<--- error handling.

And that's not really a "seg_type" but simply the "sel"-ector. And that
"enum segment" is no

Re: [v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-04-19 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:40PM -0800, Ricardo Neri wrote:
> The segment descriptor contains information that is relevant to how linear
> address need to be computed. It contains the default size of addresses as
> well as the base address of the segment. Thus, given a segment selector,
> we ought look at segment descriptor to correctly calculate the linear
> address.
> 
> In protected mode, the segment selector might indicate a segment
> descriptor from either the global descriptor table or a local descriptor
> table. Both cases are considered in this function.
> 
> This function is the initial implementation for subsequent functions that
> will obtain the aforementioned attributes of the segment descriptor.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 61 
> 
>  1 file changed, 61 insertions(+)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 8d45df8..8608adf 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -5,9 +5,13 @@
>   */
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  
>  enum reg_type {
> @@ -294,6 +298,63 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>  }
>  
>  /**
> + * get_desc() - Obtain address of segment descriptor
> + * @seg: Segment selector

Maybe that should be

@sel

if it is a sel-ector. :)

And using "sel" makes more sense then when you look at:

desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);

for example:

> + * @desc:Pointer to the selected segment descriptor
> + *
> + * Given a segment selector, obtain a memory pointer to the segment

s/memory //

> + * descriptor. Both global and local descriptor tables are supported.
> + * desc will contain the address of the descriptor.
> + *
> + * Return: 0 if success, -EINVAL if failure

Why isn't this function returning the pointer or NULL on error? Maybe
the later patches have an answer and I'll discover it if I continue
reviewing :)

> + */
> +static int get_desc(unsigned short seg, struct desc_struct **desc)
> +{
> + struct desc_ptr gdt_desc = {0, 0};
> + unsigned long desc_base;
> +
> + if (!desc)
> + return -EINVAL;
> +
> + desc_base = seg & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);

That looks useless as you're doing it below again.

> +
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
> + if ((seg & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> + seg >>= 3;
> +
> + mutex_lock(>active_mm->context.lock);
> + if (unlikely(!current->active_mm->context.ldt ||

Is that really a fast path to complicate the if-test with an unlikely()?
If not, you don't really need it.

> +  seg >= current->active_mm->context.ldt->size)) {

ldt->size is the size of the descriptor table but you've shifted seg by
3. That selector index is shifted by 3 (to the left) to form an offset
into the descriptor table because the entries there are 8 bytes.

So I *think* you wanna use the "useless" desc_base above... :)

> + *desc = NULL;
> + mutex_unlock(>active_mm->context.lock);
> + return -EINVAL;
> + }
> +
> + *desc = >active_mm->context.ldt->entries[seg];

... and seg here as it is an index into the table.

> + mutex_unlock(>active_mm->context.lock);
> + return 0;
> + }
> +#endif
> + native_store_gdt(_desc);
> +
> + /*
> +  * Bits [15:3] of the segment selector contain the index. Such
> +  * index needs to be multiplied by 8.

... because .

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 06/21] x86/insn-eval: Add utility functions to get segment selector

2017-04-18 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:39PM -0800, Ricardo Neri wrote:
> When computing a linear address and segmentation is used, we need to know
> the base address of the segment involved in the computation. In most of
> the cases, the segment base address will be zero as in USER_DS/USER32_DS.
> However, it may be possible that a user space program defines its own
> segments via a local descriptor table. In such a case, the segment base
> address may not be zero .Thus, the segment base address is needed to
> calculate correctly the linear address.
> 
> The segment selector to be used when computing a linear address is
> determined by either any of segment select override prefixes in the
> instruction or inferred from the registers involved in the computation of
> the effective address; in that order. Also, there are cases when the
> overrides shall be ignored.
> 
> For clarity, this process can be split into two steps: resolving the
> relevant segment and, once known, read the applicable segment selector.
> The method to obtain the segment selector depends on several factors. In
> 32-bit builds, segment selectors are saved into the pt_regs structure
> when switching to kernel mode. The same is also true for virtual-8086
> mode. In 64-bit builds, segmentation is mostly ignored, except when
> running a program in 32-bit legacy mode. In this case, CS and SS can be
> obtained from pt_regs. DS, ES, FS and GS can be read directly from
> registers.

> Lastly, segmentation is possible in 64-bit mode via FS and GS.

I'd say "Lastly, the only two segment registers which are not ignored in
long mode are FS and GS."

> In these two cases, base addresses are obtained from the relevant MSRs.

s/relevant/respective/

> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/lib/insn-eval.c | 195 
> +++
>  1 file changed, 195 insertions(+)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 78df1c9..8d45df8 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -8,6 +8,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  enum reg_type {
>   REG_TYPE_RM = 0,
> @@ -15,6 +16,200 @@ enum reg_type {
>   REG_TYPE_BASE,
>  };
>  
> +enum segment {
> + SEG_CS = 0x23,
> + SEG_SS = 0x36,
> + SEG_DS = 0x3e,
> + SEG_ES = 0x26,
> + SEG_FS = 0x64,
> + SEG_GS = 0x65
> +};
> +
> +/**
> + * resolve_seg_selector() - obtain segment selector
> + * @regs:Set of registers containing the segment selector

That arg is gone.

> + * @insn:Instruction structure with selector override prefixes
> + * @regoff:  Operand offset, in pt_regs, of which the selector is needed
> + * @default: Resolve default segment selector (i.e., ignore overrides)
> + *
> + * The segment selector to which an effective address refers depends on
> + * a) segment selector overrides instruction prefixes or b) the operand
> + * register indicated in the ModRM or SiB byte.
> + *
> + * For case a), the function inspects any prefixes in the insn instruction;

s/insn //

> + * insn can be null to indicate that selector override prefixes shall be
> + * ignored.

This is not what the code does: it returns -EINVAL when insn is NULL.

> This is useful when the use of prefixes is forbidden (e.g.,
> + * obtaining the code selector). For case b), the operand register shall be
> + * represented as the offset from the base address of pt_regs. Also, regoff
> + * can be -EINVAL for cases in which registers are not used as operands 
> (e.g.,
> + * when the mod and r/m parts of the ModRM byte are 0 and 5, respectively).
> + *
> + * This function returns the segment selector to utilize as per the 
> conditions
> + * described above. Please note that this functin does not return the value
> + * of the segment selector. The value of the segment selector needs to be
> + * obtained using get_segment_s

Re: [v6 PATCH 05/21] x86/insn-eval: Add utility functions to get register offsets

2017-04-12 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:38PM -0800, Ricardo Neri wrote:
> The function insn_get_reg_offset takes as argument an enumeration that

Please end function names with parentheses.

And do you mean get_reg_offset(), per chance?

> indicates the type of offset that is returned: the R/M part of the ModRM
> byte, the index of the SIB byte or the base of the SIB byte.

Err, you mean, it returns the offset to the register the argument
specifies.

> Callers of
> this function would need the definition of such enumeration. This is not
> needed. Instead, helper functions can be defined for this purpose can be
> added.

"Instead, add helpers... "

> These functions are useful in cases when, for instance, the caller
> needs to decide whether the operand is a register or a memory location by
> looking at the mod part of the ModRM byte.
> 
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> Cc: Colin Ian King <colin.k...@canonical.com>
> Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> Cc: Qiaowei Ren <qiaowei@intel.com>
> Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Adrian Hunter <adrian.hun...@intel.com>
> Cc: Kees Cook <keesc...@chromium.org>
> Cc: Thomas Garnier <thgar...@google.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Dmitry Vyukov <dvyu...@google.com>
> Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> ---
>  arch/x86/include/asm/insn-eval.h |  3 +++
>  arch/x86/lib/insn-eval.c | 51 
> 
>  2 files changed, 54 insertions(+)
> 
> diff --git a/arch/x86/include/asm/insn-eval.h 
> b/arch/x86/include/asm/insn-eval.h
> index 5cab1b1..754211b 100644
> --- a/arch/x86/include/asm/insn-eval.h
> +++ b/arch/x86/include/asm/insn-eval.h
> @@ -12,5 +12,8 @@
>  #include 
>  
>  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
> +int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
> +int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> +int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);

Forgotten to edit the copy-paste?

Which means, nothing really needs insn_get_reg_offset_sib_index() and
you can get rid of it?

>  #endif /* _ASM_X86_INSN_EVAL_H */
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 23cf010..78df1c9 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -98,6 +98,57 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>   return regoff[regno];
>  }
>  
> +/**
> + * insn_get_reg_offset_modrm_rm - Obtain register in r/m part of ModRM byte
> + * @insn:Instruction structure containing the ModRM byte
> + * @regs:Set of registers indicated by the ModRM byte

That's simply struct pt_regs - not a set of registers indicated by
ModRM?!?

> + * Obtain the register indicated by the r/m part of the ModRM byte. The
> + * register is obtained as an offset from the base of pt_regs. In specific
> + * cases, the returned value can be -EDOM to indicate that the particular 
> value
> + * of ModRM does not refer to a register.

Put that sentence under the "Return: " paragraph below so that it is
immediately obvious what the retvals are.

> + *
> + * Return: Register indicated by r/m, as an offset within struct pt_regs
> + */
> +int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs)

That name is too long: insn_get_modrm_rm_off() should be enough.

> +{
> + return get_reg_offset(insn, regs, REG_TYPE_RM);
> +}
> +
> +/**
> + * insn_get_reg_offset_sib_base - Obtain register in base part of SiB byte
> + * @insn:Instruction structure containing the SiB byte
> + * @regs:Set of registers indicated by the SiB byte
> + *
> + * Obtain the register indicated by the base part of the SiB byte. The
> + * register is obtained as an offset from the base of pt_regs. In specific
> + * cases, the returned value can be -EDOM to indicate that the particular 
> value
> + * of SiB does not refer to a register.
> + *
> + * Return: Register indicated by SiB's base, as an offset within struct 
> pt_regs

Let's stick to a single spelling: SIB, all caps.

> + */
> +int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs)

insn_get_sib_base_off()

Ditto for the rest of the comments on insn_get_reg_offset_modrm_rm() above.

> +{
> + return get_reg_offset(insn, regs, REG_TYPE_BASE);

Re: [v6 PATCH 04/21] x86/mpx, x86/insn: Relocate insn util functions to a new insn-kernel

2017-04-12 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:37PM -0800, Ricardo Neri wrote:
> Other kernel submodules can benefit from using the utility functions
> defined in mpx.c to obtain the addresses and values of operands contained
> in the general purpose registers. An instance of this is the emulation code
> used for instructions protected by the Intel User-Mode Instruction
> Prevention feature.
> 
> Thus, these functions are relocated to a new insn-eval.c file. The reason
> to not relocate these utilities into insn.c is that the latter solely
> analyses instructions given by a struct insn without any knowledge of the
> meaning of the values of instruction operands. This new utility insn-
> eval.c aims to be used to resolve effective and userspace linear addresses
> based on the contents of the instruction operands as well as the contents
> of pt_regs structure.
> 
> These utilities come with a separate header. This is to avoid taking insn.c
> out of sync from the instructions decoders under tools/obj and tools/perf.
> This also avoids adding cumbersome #ifdef's for the #include'd files
> required to decode instructions in a kernel context.
> 
> Functions are simply relocated. There are not functional or indentation
> changes.

...

> + case REG_TYPE_BASE:
> + regno = X86_SIB_BASE(insn->sib.value);
> + /*
> +  * If mod is 0 and register R/EBP (regno=5) is indicated in the
> +  * base part of the SIB byte, the value of such register should
> +  * not be used in the address computation. Also, a 32-bit
> +  * displacement is expected in this case; the instruction
> +  * decoder takes care of it. This is true for both R13 and
> +  * R/EBP as REX.B will not be decoded.
> +  */
> + if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
> + return -EDOM;
> +
> + if (X86_REX_B(insn->rex_prefix.value))
> + regno += 8;
> + break;
> +
> + default:
> + pr_err("invalid register type");
> + BUG();

WARNING: Avoid crashing the kernel - try using WARN_ON & recovery code rather 
than BUG() or BUG_ON()
#211: FILE: arch/x86/lib/insn-eval.c:90:
+   BUG();

And checkpatch is kinda right. We need to warn here, not explode. Oh and
that function returns negative values on error...

Please change that with a patch ontop of the move.

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 03/21] x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0

2017-04-11 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:36PM -0800, Ricardo Neri wrote:
> Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
> Developer's Manual volume 2A states that when a SIB byte is used and the
> base of the SIB byte points to R/EBP (i.e., base = 5) and the mod part
> of the ModRM byte is zero, the value of such register will not be used
> as part of the address computation. To signal this, a -EDOM error is
> returned to indicate callers that they should ignore the value.
> 
> Also, for this particular case, a displacement of 32-bits should follow
> the SIB byte if the mod part of ModRM is equal to zero. The instruction
> decoder ensures that this is the case.
> 
> Cc: Dave Hansen 
> Cc: Adam Buchbinder 
> Cc: Colin Ian King 
> Cc: Lorenzo Stoakes 
> Cc: Qiaowei Ren 
> Cc: Peter Zijlstra 
> Cc: Nathan Howard 
> Cc: Adan Hawthorn 
> Cc: Joe Perches 
> Cc: Ravi V. Shankar 
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri 
> ---
>  arch/x86/mm/mpx.c | 29 ++---
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> index d9e92d6..ef7eb67 100644
> --- a/arch/x86/mm/mpx.c
> +++ b/arch/x86/mm/mpx.c
> @@ -121,6 +121,17 @@ static int get_reg_offset(struct insn *insn, struct 
> pt_regs *regs,
>  
>   case REG_TYPE_BASE:
>   regno = X86_SIB_BASE(insn->sib.value);
> + /*
> +  * If mod is 0 and register R/EBP (regno=5) is indicated in the
> +  * base part of the SIB byte,

you can simply say here: "if SIB.base == 5, the base of the
register-indirect addressing is 0."

> the value of such register should
> +  * not be used in the address computation. Also, a 32-bit

Not "Also" but "In this case, a 32-bit displacement..."

> +  * displacement is expected in this case; the instruction
> +  * decoder takes care of it. This is true for both R13 and
> +  * R/EBP as REX.B will not be decoded.

You don't need that sentence as the only thing that matters is ModRM.mod
being 0.

> +  */
> + if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)

The 0 test we normally do with the ! (also flip parts of if-condition):

if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)

> + return -EDOM;
> +
>   if (X86_REX_B(insn->rex_prefix.value))
>   regno += 8;
>   break;
> @@ -161,16 +172,21 @@ static void __user *mpx_get_addr_ref(struct insn *insn, 
> struct pt_regs *regs)
>   eff_addr = regs_get_register(regs, addr_offset);
>   } else {
>   if (insn->sib.nbytes) {
> + /*
> +  * Negative values in the base and index offset means
> +  * an error when decoding the SIB byte. Except -EDOM,
> +  * which means that the registers should not be used
> +  * in the address computation.
> +  */
>   base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
> - if (base_offset < 0)
> + if (unlikely(base_offset == -EDOM))
> + base = 0;
> + else if (unlikely(base_offset < 0))

Bah, unlikely's in something which is not really a hot path. They only
encumber readability, no need for them.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 01/21] x86/mpx: Use signed variables to compute effective addresses

2017-04-11 Thread Borislav Petkov
On Tue, Mar 07, 2017 at 04:32:34PM -0800, Ricardo Neri wrote:
> Even though memory addresses are unsigned. The operands used to compute the

... unsigned, the operands ...

> effective address do have a sign. This is true for the r/m part of the
> ModRM byte, the base and index parts of the SiB byte as well as the
> displacement. Thus, signed variables shall be used when computing the
> effective address from these operands. Once the signed effective address
> has been computed, it is casted to an unsigned long to determine the
> linear address.
> 
> Variables are renamed to better reflect the type of address being
> computed.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html