Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-07-27 Thread Ricardo Neri
On Thu, 2017-07-27 at 15:26 +0200, Borislav Petkov wrote:
> On Tue, Jul 25, 2017 at 04:48:13PM -0700, Ricardo Neri wrote:
> > I meant to say the 4 most significant bytes. In this case, the
> > 64-address 0x1234 would lie in the kernel memory while
> > 0x1234 would correctly be in the user space memory.
> 
> That explanation is better.
> 
> > Yes, perhaps the check above is not needed. I included that check as
> > part of my argument validation. In a 64-bit kernel, this function could
> > be called with val with non-zero most significant bytes.
> 
> So say that in the comment so that it is obvious *why*.
> 
> > I have looked into this closely and as far as I can see, the 4 least
> > significant bytes will wrap around when using 64-bit signed numbers as
> > they would when using 32-bit signed numbers. For instance, for two
> > positive numbers we have:
> > 
> > 7fff:ffff + 7000:0000 = efff:ffff.
> > 
> > The addition above overflows.
> 
> Yes, MSB changes.
> 
> > When sign-extended to 64-bit numbers we would have:
> > 
> > 0000:0000:7fff:ffff + 0000:0000:7000:0000 = 0000:0000:efff:ffff.
> > 
> > The addition above does not overflow. However, the 4 least significant
> > bytes overflow as we expect.
> 
> No they don't - you are simply using 64-bit regs:
> 
>    0x46b8 <+8>:     movq   $0x7fffffff,-0x8(%rbp)
>    0x46c0 <+16>:    movq   $0x70000000,-0x10(%rbp)
>    0x46c8 <+24>:    mov    -0x8(%rbp),%rdx
>    0x46cc <+28>:    mov    -0x10(%rbp),%rax
> => 0x46d0 <+32>:    add    %rdx,%rax
> 
> rax            0xefffffff   4026531839
> rbx            0x0          0
> rcx            0x0          0
> rdx            0x7fffffff   2147483647
> 
> ...
> 
> eflags         0x206        [ PF IF ]
> 
> (OF flag is not set).

True, OF is not set. However, the 4 least significant bytes wrapped
around, which is what I needed.
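
To make the point concrete, here is a small user-space sketch (not code
from the patch; the helper names are made up for illustration). It uses
the operands implied by the register dump quoted above (2147483647 and
a sum of 4026531839) and compares a 32-bit addition with the low 32 bits
of the same addition done on sign-extended 64-bit values. The cast back
to a signed 32-bit type assumes the usual two's complement behavior of
gcc and clang.

```c
#include <stdint.h>

/*
 * Add two signed 32-bit operands in 32-bit arithmetic. The addition is
 * done on unsigned values so that wrap-around is well defined in C.
 */
static int32_t add_wrap32(int32_t a, int32_t b)
{
	return (int32_t)((uint32_t)a + (uint32_t)b);
}

/*
 * Sign-extend both operands to 64 bits, add them, and keep only the
 * 4 least significant bytes of the 64-bit sum.
 */
static uint32_t low32_of_add64(int32_t a, int32_t b)
{
	int64_t sum = (int64_t)a + (int64_t)b;

	return (uint32_t)sum;
}
```

For both example pairs discussed in this thread, the two helpers agree
on the low 32 bits, even though the 64-bit addition itself does not
overflow.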
> 
> > We can clamp the 4 most significant bytes.
> > 
> > For a two's complement negative numbers we can have:
> > 
> > ffff:ffff + 8000:0000 = 7fff:ffff with a carry flag.
> > 
> > The addition above overflows.
> 
> Yes.
> 
> > When sign-extending to 64-bit numbers we would have:
> > 
> > ffff:ffff:ffff:ffff + ffff:ffff:8000:0000 = ffff:ffff:7fff:ffff with a
> > carry flag.
> > 
> > The addition above does not overflow. However, the 4 least significant
> > bytes overflew and wrapped around as they would when using 32-bit signed
> > numbers.
> 
> Right. Ok.
> 
> And come to think of it now, I'm wondering, whether it would be
> better/easier/simpler/more straight-forward, to do the 32-bit operations
> with 32-bit types and separate 32-bit functions and have the hardware do
> that for you.
> 
> This way you can save yourself all that ugly and possibly error-prone
> casting back and forth and have the code much more readable too.

That sounds fair. I had to explain this code a lot, and it is probably
not worth it. I can definitely use 32-bit variable types for the 32-bit
case and drop all these casts.

The 32-bit and 64-bit functions would look identical except for the
variables used to compute the effective address. Perhaps I could use a
union:

union eff_addr {
#ifdef CONFIG_X86_64
	long	addr64;
#endif
	int	addr32;
};

And use one or the other based on the address size given by the CS.L
and CS.D bits of the segment descriptor or by address size overrides.

However using the union could be less readable than having two almost
identical functions.
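
For reference, a rough user-space sketch of how the union idea could be
used. Everything here is hypothetical: compute_eff_addr() and its
parameters are invented for illustration, and fixed-width types stand in
for long and int so that the sketch behaves the same on any host.

```c
#include <stdint.h>

/* Stand-in for the union discussed above, with fixed-width members. */
union eff_addr {
	int64_t addr64;
	int32_t addr32;
};

/*
 * Hypothetical helper: compute base + displacement using the member
 * that matches the address size (4 or 8 bytes). The 32-bit case is
 * computed in 32-bit arithmetic so that it wraps as on 32-bit hardware.
 * Returns 0 on success, -1 for an unsupported address size.
 */
static int compute_eff_addr(union eff_addr *out, int addr_bytes,
			    uint64_t base, int32_t disp)
{
	switch (addr_bytes) {
	case 4:
		out->addr32 = (int32_t)((uint32_t)base + (uint32_t)disp);
		return 0;
	case 8:
		out->addr64 = (int64_t)(base + (uint64_t)(int64_t)disp);
		return 0;
	default:
		return -1;
	}
}
```

The switch on the address size is the part that would otherwise be
duplicated across two nearly identical functions.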

Thanks and BR,
Ricardo

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 24/26] x86: Enable User-Mode Instruction Prevention

2017-07-25 Thread Ricardo Neri
On Fri, 2017-06-09 at 18:10 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:22AM -0700, Ricardo Neri wrote:
> > User_mode Instruction Prevention (UMIP) is enabled by setting/clearing a
> > bit in %cr4.
> > 
> > It makes sense to enable UMIP at some point while booting, before user
> > spaces come up. Like SMAP and SMEP, is not critical to have it enabled
> > very early during boot. This is because UMIP is relevant only when there is
> > a userspace to be protected from. Given the similarities in relevance, it
> > makes sense to enable UMIP along with SMAP and SMEP.
> > 
> > UMIP is enabled by default. It can be disabled by adding clearcpuid=514
> > to the kernel parameters.
> > 
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Andrew Morton <a...@linux-foundation.org>
> > Cc: H. Peter Anvin <h...@zytor.com>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Brian Gerst <brge...@gmail.com>
> > Cc: Chen Yucong <sla...@gmail.com>
> > Cc: Chris Metcalf <cmetc...@mellanox.com>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Fenghua Yu <fenghua...@intel.com>
> > Cc: Huang Rui <ray.hu...@amd.com>
> > Cc: Jiri Slaby <jsl...@suse.cz>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Michael S. Tsirkin <m...@redhat.com>
> > Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: Shuah Khan <sh...@kernel.org>
> > Cc: Vlastimil Babka <vba...@suse.cz>
> > Cc: Tony Luck <tony.l...@intel.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Liang Z. Li <liang.z...@intel.com>
> > Cc: Alexandre Julliard <julli...@winehq.org>
> > Cc: Stas Sergeev <s...@list.ru>
> > Cc: x...@kernel.org
> > Cc: linux-msdos@vger.kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/Kconfig | 10 ++
> >  arch/x86/kernel/cpu/common.c | 16 +++-
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 702002b..1b1bbeb 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1745,6 +1745,16 @@ config X86_SMAP
> >  
> >   If unsure, say Y.
> >  
> > +config X86_INTEL_UMIP
> > +   def_bool y
> 
> That's a bit too much. It makes sense on distro kernels but how many
> machines out there actually have UMIP?

So would this become a y when more machines have UMIP?
> 
> > +   depends on CPU_SUP_INTEL
> > +   prompt "Intel User Mode Instruction Prevention" if EXPERT
> > +   ---help---
> > + The User Mode Instruction Prevention (UMIP) is a security
> > + feature in newer Intel processors. If enabled, a general
> > + protection fault is issued if the instructions SGDT, SLDT,
> > + SIDT, SMSW and STR are executed in user mode.
> > +
> >  config X86_INTEL_MPX
> > prompt "Intel MPX (Memory Protection Extensions)"
> > def_bool n
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index 8ee3211..66ebded 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -311,6 +311,19 @@ static __always_inline void setup_smap(struct 
> > cpuinfo_x86 *c)
> > }
> >  }
> >  
> > +static __always_inline void setup_umip(struct cpuinfo_x86 *c)
> > +{
> > +   if (cpu_feature_enabled(X86_FEATURE_UMIP) &&
> > +   cpu_has(c, X86_FEATURE_UMIP))
> 
> Hmm, so if UMIP is not build-time disabled, the cpu_feature_enabled()
> will call static_cpu_has().
> 
> Looks like you want to call cpu_has() too because alternatives haven't
> run yet and static_cpu_has() will reply wrong. Please state that in a
> comment.

Why would static_cpu_has() reply wrong if alternatives are not in place?
Because it uses the boot CPU data? When it calls _static_cpu_has() it
would do something equivalent to

   testb test_bit, boot_cpu_data.x86_capability[bit].

I am calling cpu_has because cpu_feature_enabled(), via
static_cpu_has(), will use the boot CPU data while cpu_has would use the
local CPU data. Is this what you meant?

I can definitely add a comment with this explanation, if it makes sense.
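
To illustrate the distinction I have in mind, here is a toy user-space
model (not kernel code). The struct, the helper names and the array size
are made up; the only value taken from the patch is feature number 514,
the one used with clearcpuid=514, which this sketch splits as word 16,
bit 2 on the assumption of 32 capability bits per word.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NCAPINTS      19              /* hypothetical array size */
#define SKETCH_FEATURE_UMIP  (16 * 32 + 2)   /* 514, as in clearcpuid=514 */

/* Toy stand-in for a per-CPU capability bitmap. */
struct sketch_cpu {
	uint32_t x86_capability[SKETCH_NCAPINTS];
};

/* Set one feature bit in the given CPU's bitmap. */
static void sketch_set_cap(struct sketch_cpu *c, unsigned int feature)
{
	c->x86_capability[feature / 32] |= UINT32_C(1) << (feature % 32);
}

/* Test one feature bit in the given CPU's bitmap. */
static bool sketch_cpu_has(const struct sketch_cpu *c, unsigned int feature)
{
	return (c->x86_capability[feature / 32] >> (feature % 32)) & 1u;
}
```

With two such bitmaps, one for the boot CPU and one for the local CPU,
a test against the boot CPU data can give a different answer than a test
against the local CPU data, which is the case the extra cpu_has() call
is meant to cover.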

Thanks and BR,
Ricardo



Re: [PATCH v7 23/26] x86/traps: Fixup general protection faults caused by UMIP

2017-07-25 Thread Ricardo Neri
I am sorry Boris, I also missed this feedback.

On Fri, 2017-06-09 at 15:02 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:21AM -0700, Ricardo Neri wrote:
> > If the User-Mode Instruction Prevention CPU feature is available and
> > enabled, a general protection fault will be issued if the instructions
> > sgdt, sldt, sidt, str or smsw are executed from user-mode context
> > (CPL > 0). If the fault was caused by any of the instructions protected
> > by UMIP, fixup_umip_exception will emulate dummy results for these
> 
> Please end function names with parentheses.

I have audited my commit messages to remove all instances of this error.
> 
> > instructions. If emulation is successful, the result is passed to the
> > user space program and no SIGSEGV signal is emitted.
> > 
> > Please note that fixup_umip_exception also caters for the case when
> > the fault originated while running in virtual-8086 mode.
> > 
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Andrew Morton <a...@linux-foundation.org>
> > Cc: H. Peter Anvin <h...@zytor.com>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Brian Gerst <brge...@gmail.com>
> > Cc: Chen Yucong <sla...@gmail.com>
> > Cc: Chris Metcalf <cmetc...@mellanox.com>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Fenghua Yu <fenghua...@intel.com>
> > Cc: Huang Rui <ray.hu...@amd.com>
> > Cc: Jiri Slaby <jsl...@suse.cz>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Michael S. Tsirkin <m...@redhat.com>
> > Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: Shuah Khan <sh...@kernel.org>
> > Cc: Vlastimil Babka <vba...@suse.cz>
> > Cc: Tony Luck <tony.l...@intel.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Liang Z. Li <liang.z...@intel.com>
> > Cc: Alexandre Julliard <julli...@winehq.org>
> > Cc: Stas Sergeev <s...@list.ru>
> > Cc: x...@kernel.org
> > Cc: linux-msdos@vger.kernel.org
> > Reviewed-by: Andy Lutomirski <l...@kernel.org>
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/kernel/traps.c | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index 3995d3a..cec548d 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -65,6 +65,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #ifdef CONFIG_X86_64
> >  #include 
> > @@ -526,6 +527,9 @@ do_general_protection(struct pt_regs *regs, long 
> > error_code)
> > RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> > cond_local_irq_enable(regs);
> >  
> 
> Almost definitely:
> 
>   if (static_cpu_has(X86_FEATURE_UMIP)) {
>   if (...

I will make this update.

> 
> > +   if (user_mode(regs) && fixup_umip_exception(regs))
> > +   return;
> 
> We don't want to punish !UMIP machines.

I will add this check.
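
A minimal user-space sketch of the resulting control flow, with stubs in
place of the kernel functions (fake_regs, fake_fixup_umip_exception()
and handle_gp_sketch() are invented names, and the boolean parameter
stands in for the static_cpu_has(X86_FEATURE_UMIP) test):

```c
#include <stdbool.h>

/* Toy stand-in for pt_regs: only records whether we came from user mode. */
struct fake_regs {
	bool user_mode;
};

/* Counts how often the (stubbed) fixup path was entered. */
static int fixup_calls;

/* Stub for fixup_umip_exception(): pretend emulation always succeeds. */
static bool fake_fixup_umip_exception(const struct fake_regs *regs)
{
	(void)regs;
	fixup_calls++;
	return true;
}

/* Returns true if the fault was handled by the emulation path. */
static bool handle_gp_sketch(bool cpu_has_umip, const struct fake_regs *regs)
{
	/* Machines without UMIP never reach the fixup code. */
	if (cpu_has_umip) {
		if (regs->user_mode && fake_fixup_umip_exception(regs))
			return true;
	}

	return false;
}
```

On a machine without the feature, the fixup stub is never called, so
the general protection fault path pays no extra cost.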

Thanks and BR,
Ricardo



Re: [PATCH v7 22/26] x86/umip: Force a page fault when unable to copy emulated result to user

2017-07-25 Thread Ricardo Neri
On Fri, 2017-06-09 at 13:02 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:20AM -0700, Ricardo Neri wrote:
> > fixup_umip_exception() will be called from do_general_protection. If the
> ^
> |
> Please end function names with parentheses.  ---+
> 
> > former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
> > However, when emulation is successful but the emulated result cannot be
> > copied to user space memory, it is more accurate to issue a SIGSEGV with
> > SEGV_MAPERR with the offending address.
> > A new function is inspired in
> 
> That reads funny.

I will correct this.
> 
> > force_sig_info_fault is introduced to model the page fault.
> > 
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Andrew Morton <a...@linux-foundation.org>
> > Cc: H. Peter Anvin <h...@zytor.com>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Brian Gerst <brge...@gmail.com>
> > Cc: Chen Yucong <sla...@gmail.com>
> > Cc: Chris Metcalf <cmetc...@mellanox.com>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Fenghua Yu <fenghua...@intel.com>
> > Cc: Huang Rui <ray.hu...@amd.com>
> > Cc: Jiri Slaby <jsl...@suse.cz>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Michael S. Tsirkin <m...@redhat.com>
> > Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: Shuah Khan <sh...@kernel.org>
> > Cc: Vlastimil Babka <vba...@suse.cz>
> > Cc: Tony Luck <tony.l...@intel.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Liang Z. Li <liang.z...@intel.com>
> > Cc: Alexandre Julliard <julli...@winehq.org>
> > Cc: Stas Sergeev <s...@list.ru>
> > Cc: x...@kernel.org
> > Cc: linux-msdos@vger.kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/kernel/umip.c | 45 +++--
> >  1 file changed, 43 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> > index c7c5795..ff7366a 100644
> > --- a/arch/x86/kernel/umip.c
> > +++ b/arch/x86/kernel/umip.c
> > @@ -148,6 +148,41 @@ static int __emulate_umip_insn(struct insn *insn, enum 
> > umip_insn umip_inst,
> >  }
> >  
> >  /**
> > + * __force_sig_info_umip_fault() - Force a SIGSEGV with SEGV_MAPERR
> > + * @address:   Address that caused the signal
> > + * @regs:  Register set containing the instruction pointer
> > + *
> > + * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This 
> > function is
> > + * intended to be used to provide a segmentation fault when the result of 
> > the
> > + * UMIP emulation could not be copied to the user space memory.
> > + *
> > + * Return: none
> > + */
> > +static void __force_sig_info_umip_fault(void __user *address,
> > +   struct pt_regs *regs)
> > +{
> > +   siginfo_t info;
> > +   struct task_struct *tsk = current;
> > +
> > +   if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)) {
> 
> Save an indentation level:
> 
>   if (!(show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)))
>   return;
> 
>   printk...
> 
I will implement it like this.
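
One thing to keep in mind: an early return must not skip the signal
delivery that follows the message in the function. A possible way to
arrange this, sketched in user space with counters standing in for
force_sig_info() and printk_ratelimited() (the function and variable
names are invented, and moving the message to the end is an assumption
of this sketch, not something settled in this thread):

```c
#include <stdbool.h>

static int signals_sent;     /* stand-in for force_sig_info() calls */
static int messages_logged;  /* stand-in for printk_ratelimited() calls */

static void deliver_then_log(bool show_unhandled, bool unhandled)
{
	/* Deliver the signal first, so the early return cannot skip it. */
	signals_sent++;

	if (!(show_unhandled && unhandled))
		return;

	/* Only reached when the message should be printed. */
	messages_logged++;
}
```

This keeps the saved indentation level while making sure the signal is
always sent.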
> 
> 
> > +   printk_ratelimited("%s[%d] umip emulation segfault ip:%lx 
> > sp:%lx error:%x in %lx\n",
> > +  tsk->comm, task_pid_nr(tsk), regs->ip,
> > +  regs->sp, X86_PF_USER | X86_PF_WRITE,
> > +  regs->ip);
> > +   }
> > +
> > +   tsk->thread.cr2 = (unsigned long)address;
> > +   tsk->thread.error_code  = X86_PF_USER | X86_PF_WRITE;
> > +   tsk->thread.trap_nr = X86_TRAP_PF;
> > +
> > +   info.si_signo   = SIGSEGV;
> > +   info.si_errno   = 0;
> > +   info.si_code= SEGV_MAPERR;
> > +   info.si_addr= address;
> > +   force_sig_info(SIGSEGV, &info, tsk);
> > +}
> > +
> > +/**
> >   * fixup_umip_exception() - Fixup #GP faults caused by UMIP
> >   * @regs:  Registers as saved when entering the #GP trap
> >   *
> > @@ -235,8 +270,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
> > if ((unsigned long)uaddr == -1L)
> > return false;
> > nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
> > -   if (nr_copied  > 0)
> > -   return false;
> > +   if (nr_copied  > 0) {
> > +   /*
> > +* If copy fails, send a signal and tell caller that
> > +* fault was fixed up
> 
> Pls end sentences in the comments with a fullstop.

I will correct this.

Thanks and BR,
Ricardo



Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-07-25 Thread Ricardo Neri
I am sorry Boris, while working on this series I missed a few of your
feedback comments.

On Wed, 2017-06-07 at 17:48 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote:
> > The 32-bit and 64-bit address encodings are identical. This means that we
> > can use the same function in both cases. In order to reuse the function
> > for 32-bit address encodings, we must sign-extend our 32-bit signed
> > operands to 64-bit signed variables (only for 64-bit builds). To decide on
> > whether sign extension is needed, we rely on the address size as given by
> > the instruction structure.
> > 
> > Once the effective address has been computed, a special verification is
> > needed for 32-bit processes. If running on a 64-bit kernel, such processes
> > can address up to 4GB of memory. Hence, for instance, an effective
> > address of 0x1234 would be misinterpreted as 0x1234 due to
> > the sign extension mentioned above. For this reason, the 4 must be
> 
> Which 4?

I meant to say the 4 most significant bytes. In this case, the
64-address 0x1234 would lie in the kernel memory while
0x1234 would correctly be in the user space memory.
> 
> > truncated to obtain the true effective address.
> > 
> > Lastly, before computing the linear address, we verify that the effective
> > address is within the limits of the segment. The check is kept for long
> > mode because in such a case the limit is set to -1L. This is the largest
> > unsigned number possible. This is equivalent to a limit-less segment.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 99 
> > ++--
> >  1 file changed, 88 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 1a5f5a6..c7c1239 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -688,6 +688,62 @@ int insn_get_modrm_rm_off(struct insn *insn, struct 
> > pt_regs *regs)
> > return get_reg_offset(insn, regs, REG_TYPE_RM);
> >  }
> >  
> > +/**
> > + * _to_signed_long() - Cast an unsigned long into signed long
> > + * @val         A 32-bit or 64-bit unsigned long
> > + * @long_bytes  The number of bytes used to represent a long number
> > + * @out         The casted signed long
> > + *
> > + * Return: A signed long of either 32 or 64 bits, as per the build 
> > configuration
> > + * of the kernel.
> > + */
> > +static int _to_signed_long(unsigned long val, int long_bytes, long *out)
> > +{
> > +   if (!out)
> > +   return -EINVAL;
> > +
> > +#ifdef CONFIG_X86_64
> > +   if (long_bytes == 4) {
> > +   /* higher bytes should all be zero */
> > +   if (val & ~0xffffffff)
> > +   return -EINVAL;
> > +
> > +   /* sign-extend to a 64-bit long */
> 
> So this is a 32-bit userspace on a 64-bit kernel, right?

Yes.
> 
> If so, how can a memory offset be > 32-bits and we have to extend it to
> a 64-bit long?!?

Yes, perhaps the check above is not needed. I included that check as
part of my argument validation. In a 64-bit kernel, this function could
be called with val with non-zero most significant bytes.
> 
> I *think* you want to say that you want to convert it to long so that
> you can do the calculation in longs.

That is exactly what I meant. More specifically, I want to convert my
32-bit variables into 64-bit signed longs; this is the reason I need the
sign extension.
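
As a user-space sketch of the conversion I am describing (this is not
the patch code; to_signed_64() is a made-up name and fixed-width types
replace long so the behavior does not depend on the host):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert val into a signed 64-bit value. For 4-byte quantities the
 * upper 32 bits must be zero and the lower 32 bits are sign-extended.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int to_signed_64(uint64_t val, int bytes, int64_t *out)
{
	if (out == NULL)
		return -1;

	if (bytes == 4) {
		/* Higher bytes should all be zero. */
		if (val >> 32)
			return -1;

		/* Sign-extend the 32-bit quantity to 64 bits. */
		*out = (int64_t)(int32_t)(uint32_t)val;
		return 0;
	}

	if (bytes == 8) {
		*out = (int64_t)val;
		return 0;
	}

	return -1;
}
```

For example, 0x80000000 as a 4-byte quantity becomes -2147483648, while
0x7fffffff is left unchanged.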
> 
> However!
> 
> If you're a 64-bit kernel running a 32-bit userspace, you need to do
> the calculation in 32-bits only so that it overflows, as it would do
> on 32-bit hardware. IO

Re: [PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-06-27 Thread Ricardo Neri
Hi Stas,

On Wed, 2017-06-07 at 21:54 +0300, Stas Sergeev wrote:
> Hi Ricardo, would you mind unsubscribing
> linux-msdos@ from all your future mails on
> this subject? Otherwise I am afraid there
> would be no subscribers left when you are
> finally done. :)

Sure! I will drop linux-msdos in the subsequent round of patches.

> I think all non-kernel-dev MLs should be
> treated with more care. Eg your initial
> questions were certainly on-topic, but the
> kernel patch-series (esp in such quantity)
> are definitely not.

Sure thing. I apologize for such a large quantity of e-mail. I just
wanted to not miss your input and the inputs of your ML. I agree that at
this point this can be handled in the kernel-specific MLs.

Thanks and BR,
Ricardo

PS. Just bear with me for a couple of extra e-mails while the discussion
on v7 is finished. I promise this is the last one!

Thanks and BR,
Ricardo




Re: [PATCH v7 21/26] x86: Add emulation code for UMIP instructions

2017-06-16 Thread Ricardo Neri
On Thu, 2017-06-08 at 20:38 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:19AM -0700, Ricardo Neri wrote:
> > The feature User-Mode Instruction Prevention present in recent Intel
> > processor prevents a group of instructions from being executed with
> > CPL > 0. Otherwise, a general protection fault is issued.
> 
> This is one of the best opening paragraphs of a commit message I've
> read this year! This is how you open: short, succinct, to the point, no
> marketing bullshit. Good!

Thank you!
> 
> > Rather than relaying this fault to the user space (in the form of a SIGSEGV
> > signal), the instructions protected by UMIP can be emulated to provide
> > dummy results. This allows to conserve the current kernel behavior and not
> > reveal the system resources that UMIP intends to protect (the global
> > descriptor and interrupt descriptor tables, the segment selectors of the
> > local descriptor table and the task state and the machine status word).
> > 
> > This emulation is needed because certain applications (e.g., WineHQ and
> > DOSEMU2) rely on this subset of instructions to function.
> > 
> > The instructions protected by UMIP can be split in two groups. Those who
> 
> s/who/which/

I will correct.
> 
> > return a kernel memory address (sgdt and sidt) and those who return a
> 
> ditto.

I will correct here also.
> 
> > value (sldt, str and smsw).
> >
> > For the instructions that return a kernel memory address, applications
> > such as WineHQ rely on the result being located in the kernel memory space.
> > The result is emulated as a hard-coded value that, lies close to the top
> > of the kernel memory. The limit for the GDT and the IDT are set to zero.
> 
> Nice.
> 
> > Given that sldt and str are not used in common in programs supported by
> 
> You wanna say "in common programs" here? Or "not commonly used in programs" ?

I will rephrase this comment.
> 
> > WineHQ and DOSEMU2, they are not emulated.
> > 
> > The instruction smsw is emulated to return the value that the register CR0
> > has at boot time as set in the head_32.
> > 
> > Care is taken to appropriately emulate the results when segmentation is
> > used. This is, rather than relying on USER_DS and USER_CS, the function
> 
>   "That is,... "

I will correct it.
> 
> > insn_get_addr_ref() inspects the segment descriptor pointed by the
> > registers in pt_regs. This ensures that we correctly obtain the segment
> > base address and the address and operand sizes even if the user space
> > application uses local descriptor table.
> 
> Btw, I could very well use all that nice explanation in umip.c too so
> that the high-level behavior is documented.

Sure, I will include a high-level description in the file itself.
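
As a starting point for that description, the behavior in the commit
message can be summarized with the following user-space sketch. It is
only an illustration: the enum, the function and both constants are
placeholders I made up here, not the values used in umip.c. It assumes
the 32-bit operand layout for sgdt/sidt (a 2-byte limit followed by a
4-byte base) and a 2-byte result for smsw.

```c
#include <stdint.h>
#include <string.h>

enum sketch_umip_insn { SK_SGDT, SK_SIDT, SK_SMSW, SK_SLDT, SK_STR };

#define SK_DUMMY_BASE  0xfffe0000u  /* placeholder base near the top of memory */
#define SK_DUMMY_CR0   0x80050033u  /* placeholder for the boot-time CR0 value */

/*
 * Fill buf (at least 6 bytes) with a dummy result for insn.
 * Returns the number of bytes written, or -1 if insn is not emulated.
 */
static int emulate_sketch(enum sketch_umip_insn insn, uint8_t *buf)
{
	uint16_t limit = 0;                    /* GDT/IDT limit is reported as zero */
	uint32_t base = SK_DUMMY_BASE;
	uint16_t msw = (uint16_t)SK_DUMMY_CR0; /* low 16 bits of CR0 */

	switch (insn) {
	case SK_SGDT:
	case SK_SIDT:
		memcpy(buf, &limit, sizeof(limit));
		memcpy(buf + sizeof(limit), &base, sizeof(base));
		return (int)(sizeof(limit) + sizeof(base));
	case SK_SMSW:
		memcpy(buf, &msw, sizeof(msw));
		return (int)sizeof(msw);
	default:
		/* sldt and str are not emulated. */
		return -1;
	}
}
```

The caller would then copy the returned number of bytes to the user
space destination, or fall back to the normal fault path on -1.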

> 
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Andrew Morton <a...@linux-foundation.org>
> > Cc: H. Peter Anvin <h...@zytor.com>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Brian Gerst <brge...@gmail.com>
> > Cc: Chen Yucong <sla...@gmail.com>
> > Cc: Chris Metcalf <cmetc...@mellanox.com>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Fenghua Yu <fenghua...@intel.com>
> > Cc: Huang Rui <ray.hu...@amd.com>
> > Cc: Jiri Slaby <jsl...@suse.cz>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Michael S. Tsirkin <m...@redhat.com>
> > Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: Shuah Khan <sh...@kernel.org>
> > Cc: Vlastimil Babka <vba...@suse.cz>
> > Cc: Tony Luck <tony.l...@intel.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Liang Z. Li <liang.z...@intel.com>
> > Cc: Alexandre Julliard <julli...@winehq.org>
> > Cc: Stas Sergeev <s...@list.ru>
> > Cc: x...@kernel.org
> > Cc: linux-msdos@vger.kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/umip.h |  15 +++
> >  arch/x86/kernel/Makefile|   1 +
> >  arch/x86/kernel/umip.c  | 245 
> > 
> >  3 files changed, 261 insertions(+)
> >  create mode 100644 arch/x86/include/asm/umip.h
> >  create mode 100644 arch/x86/kernel/umip.c
> > 
> > diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
> > new file mode 100644
> > index 000

Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-06-15 Thread Ricardo Neri
On Wed, 2017-06-07 at 18:28 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
> > Tasks running in virtual-8086 mode or in protected mode with code
> > segment descriptors that specify 16-bit default address sizes via the
> > D bit will use 16-bit addressing form encodings as described in the Intel
> > 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
> > 2.1.5. 16-bit addressing encodings differ in several ways from the
> > 32-bit/64-bit addressing form encodings: ModRM.rm points to different
> > registers and, in some cases, effective addresses are indicated by the
> > addition of the value of two registers. Also, there is no support for SIB
> > bytes. Thus, a separate function is needed to parse this form of
> > addressing.
> > 
> > A couple of functions are introduced. get_reg_offset_16() obtains the
> > offset from the base of pt_regs of the registers indicated by the ModRM
> > byte of the address encoding. get_addr_ref_16() computes the linear
> > address indicated by the instructions using the value of the registers
> > given by ModRM as well as the base address of the segment.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 155 
> > +++
> >  1 file changed, 155 insertions(+)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 9822061..928a662 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct 
> > pt_regs *regs,
> >  }
> >  
> >  /**
> > + * get_reg_offset_16 - Obtain offset of register indicated by instruction
> 
> Please end function names with parentheses.

I will correct.
> 
> > + * @insn:  Instruction structure containing ModRM and SiB bytes
> 
> s/SiB/SIB/g

I will correct.
> 
> > + * @regs:  Structure with register values as seen when entering kernel mode
> > + * @offs1: Offset of the first operand register
> > + * @offs2: Offset of the second opeand register, if applicable.
> > + *
> > + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM 
> > byte
> > + * within insn. This function is to be used with 16-bit address encodings. 
> > The
> > + * offs1 and offs2 will be written with the offset of the two registers
> > + * indicated by the instruction. In cases where any of the registers is not
> > + * referenced by the instruction, the value will be set to -EDOM.
> > + *
> > + * Return: 0 on success, -EINVAL on failure.
> > + */
> > +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
> > +int *offs1, int *offs2)
> > +{
> > +   /* 16-bit addressing can use one or two registers */
> > +   static const int regoff1[] = {
> > +   offsetof(struct pt_regs, bx),
> > +   offsetof(struct pt_regs, bx),
> > +   offsetof(struct pt_regs, bp),
> > +   offsetof(struct pt_regs, bp),
> > +   offsetof(struct pt_regs, si),
> > +   offsetof(struct pt_regs, di),
> > +   offsetof(struct pt_regs, bp),
> > +   offsetof(struct pt_regs, bx),
> > +   };
> > +
> > +   static const int regoff2[] = {
> > +   offsetof(struct pt_regs, si),
> > +   offsetof(struct pt_regs, di),
> > +   offsetof(struct pt_regs, si),
> > +   offsetof(struct pt_regs, di),
> > +   -EDOM,
> > +   -EDOM,
> > +   -EDOM,
> > +   -EDOM,
> > +   };
> 
> You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in
> the SDM, right?

Yes
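
For readers without the SDM at hand, the two arrays quoted above encode
the register combinations of the 16-bit addressing forms. A user-space
sketch of the resulting effective address computation follows; regs16
and eff_addr_16() are invented for illustration and the segment base is
left out. The caller is assumed to pass disp = 0 when the encoding
carries no displacement, and mod == 3 (register operand) is out of
scope.

```c
#include <stdint.h>

/* Toy register file with only the registers used by 16-bit addressing. */
struct regs16 {
	uint16_t bx, bp, si, di;
};

/*
 * Compute the 16-bit effective address for a ModRM.mod/ModRM.rm pair.
 * With mod == 0 and rm == 6 no register is used and the address is the
 * 16-bit displacement alone. The result wraps at 16 bits.
 */
static uint16_t eff_addr_16(const struct regs16 *r, unsigned int mod,
			    unsigned int rm, int16_t disp)
{
	uint16_t base;

	switch (rm & 7) {
	case 0: base = r->bx + r->si; break;
	case 1: base = r->bx + r->di; break;
	case 2: base = r->bp + r->si; break;
	case 3: base = r->bp + r->di; break;
	case 4: base = r->si; break;
	case 5: base = r->di; break;
	case 6: base = (mod == 0) ? 0 : r->bp; break;
	default: base = r->bx; break;	/* rm == 7 */
	}

	return (uint16_t)(base + (uint16_t)disp);
}
```

The rm values 0 to 3 are the cases where two registers are added, which
is why the patch needs a second offset table.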

Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-06-15 Thread Ricardo Neri
On Wed, 2017-06-07 at 17:49 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote:
> > @@ -697,18 +753,21 @@ void __user *insn_get_addr_ref(struct insn *insn, 
> > struct pt_regs *regs)
> >  {
> > unsigned long linear_addr, seg_base_addr, seg_limit;
> > long eff_addr, base, indx;
> > -   int addr_offset, base_offset, indx_offset;
> > +   int addr_offset, base_offset, indx_offset, addr_bytes;
> > insn_byte_t sib;
> >  
> > insn_get_modrm(insn);
> > insn_get_sib(insn);
> > sib = insn->sib.value;
> > +   addr_bytes = insn->addr_bytes;
> >  
> > if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> > if (addr_offset < 0)
> > goto out_err;
> > -   eff_addr = regs_get_register(regs, addr_offset);
> > +   eff_addr = get_mem_offset(regs, addr_offset, addr_bytes);
> > +   if (eff_addr == -1L)
> > +   goto out_err;
> > seg_base_addr = insn_get_seg_base(regs, insn, addr_offset);
> > if (seg_base_addr == -1L)
> > goto out_err;
> 
> This code here is too dense, it needs spacing for better readability.

I have spaced it out in my upcoming version.

Thanks and BR,
Ricardo



Re: [PATCH v7 14/26] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 5

2017-06-15 Thread Ricardo Neri
On Wed, 2017-06-07 at 15:15 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:12AM -0700, Ricardo Neri wrote:
> > Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual volume 2A states that when ModRM.mod is zero and
> > ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
> > that none of the registers are used in the computation of the effective
> > address. A return value of -EDOM signals callers that they should not use
> > the value of registers when computing the effective address for the
> > instruction.
> > 
> > In IA-32e 64-bit mode (long mode), the effective address is given by the
> > 32-bit displacement plus the value of RIP of the next instruction.
> > In IA-32e compatibility mode (protected mode), only the displacement is
> > used.
> > 
> > The instruction decoder takes care of obtaining the displacement.
> 
> ...
> 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 693e5a8..4f600de 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -379,6 +379,12 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> > switch (type) {
> > case REG_TYPE_RM:
> > regno = X86_MODRM_RM(insn->modrm.value);
> 
> 
> < newline here.

Will add the new line.
> 
> > +   /*
> > +* ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement
> > +* follows the ModRM byte.
> > +*/
> > +   if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
> > +   return -EDOM;
> > if (X86_REX_B(insn->rex_prefix.value))
> > regno += 8;
> > break;
> > @@ -730,9 +736,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> > eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
> > } else {
> > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> > -   if (addr_offset < 0)
> 
> ditto.

Will add the new line.
> 
> > +   /*
> > +* -EDOM means that we must ignore the address_offset.
> > +* In such a case, in 64-bit mode the effective address
> > +* relative to the RIP of the following instruction.
> > +*/
> > +   if (addr_offset == -EDOM) {
> > +   eff_addr = 0;
> > +   if (user_64bit_mode(regs))
> > +   eff_addr = (long)regs->ip +
> > +  insn->length;
> 
> Let that line stick out and write it balanced:
> 
> if (addr_offset == -EDOM) {
> if (user_64bit_mode(regs))
> eff_addr = (long)regs->ip + 
> insn->length;
> else
> eff_addr = 0;
> 
> should be easier parseable this way.

Will rewrite as you suggest.

Thanks and BR,
Ricardo
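
The ModRM.mod == 0, ModRM.rm == 5 case discussed in this message can be sketched in userspace as follows. The function names and the flat parameter list are hypothetical stand-ins for struct insn and struct pt_regs; the sketch also folds the displacement into the result, which is an assumption about where the caller adds it.

```c
#include <stdbool.h>
#include <stdint.h>

/* ModRM layout: mod = bits 7:6, reg = bits 5:3, rm = bits 2:0. */
bool sketch_modrm_is_disp32_only(uint8_t modrm)
{
	return ((modrm >> 6) & 0x3) == 0 && (modrm & 0x7) == 5;
}

/*
 * Effective address when no register takes part in the computation.
 * In 64-bit mode the displacement is relative to the RIP of the next
 * instruction (ip + insn_length); otherwise only the displacement is used.
 */
int64_t sketch_disp32_eff_addr(bool user_64bit, uint64_t ip,
			       unsigned int insn_length, int32_t disp)
{
	int64_t eff_addr;

	if (user_64bit)
		eff_addr = (int64_t)(ip + insn_length);
	else
		eff_addr = 0;

	return eff_addr + disp;
}
```

For example, a 7-byte instruction at 0x401000 with a displacement of -16 resolves to 0x400ff7 in 64-bit mode.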



Re: [PATCH v7 13/26] x86/insn-eval: Add function to get default params of code segment

2017-06-15 Thread Ricardo Neri
On Wed, 2017-06-07 at 14:59 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:11AM -0700, Ricardo Neri wrote:
> > This function returns the default values of the address and operand sizes
> > as specified in the segment descriptor. This information is determined
> > from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
> > 32-bit legacy modes. For virtual-8086 mode, the default address and
> > operand sizes are always 2 bytes.
> > 
> > The D bit is only meaningful for code segments. Thus, these functions
> > always use the code segment selector contained in regs.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/insn-eval.h |  6 
> >  arch/x86/lib/insn-eval.c | 65 
> > 
> >  2 files changed, 71 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
> > index 7f3c7fe..9ed1c88 100644
> > --- a/arch/x86/include/asm/insn-eval.h
> > +++ b/arch/x86/include/asm/insn-eval.h
> > @@ -11,9 +11,15 @@
> >  #include 
> >  #include 
> >  
> > +struct insn_code_seg_defaults {
> 
> A whole struct for a function which gets called only once?
> 
> Bah, that's a bit too much, if you ask me.
> 
> So you're returning two small unsigned integers - i.e., you can just as
> well return a single u8 and put address and operand sizes in there:
> 
>   ret = oper_sz | addr_sz << 4;
> 
> No need for special structs for that.

OK. This makes sense. Perhaps I can use a couple of #define's to set and
get the address and operand sizes in a single u8. This would make
the code more readable.
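
A possible shape for such #define's, as a userspace sketch. The macro names are hypothetical; the layout (operand size in the low nibble, address size in the high nibble) follows the "oper_sz | addr_sz << 4" expression suggested above.

```c
#include <stdint.h>

/* Hypothetical names: low nibble = operand size, high nibble = address size. */
#define SKETCH_SEG_PARAMS(oper_sz, addr_sz) \
	((uint8_t)((oper_sz) | ((addr_sz) << 4)))
#define SKETCH_SEG_OPND_SZ(params)	((params) & 0xf)
#define SKETCH_SEG_ADDR_SZ(params)	(((params) >> 4) & 0xf)

/* Virtual-8086 mode: 2 bytes for both sizes, as the commit message states. */
uint8_t sketch_v8086_defaults(void)
{
	return SKETCH_SEG_PARAMS(2, 2);
}
```

Since neither size is ever zero, a return value of 0 could still be reserved to signal an error, in the same spirit as the all-zeros structure in the original patch.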

> 
> > +   unsigned char address_bytes;
> > +   unsigned char operand_bytes;
> > +};
> > +
> >  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
> >  int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
> >  unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> > int regoff);
> > +struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs *regs);
> >  
> >  #endif /* _ASM_X86_INSN_EVAL_H */
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index c77ed80..693e5a8 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -603,6 +603,71 @@ static unsigned long get_seg_limit(struct pt_regs *regs, struct insn *insn,
> >  }
> >  
> >  /**
> > + * insn_get_code_seg_defaults() - Obtain code segment default parameters
> > + * @regs:  Structure with register values as seen when entering kernel mode
> > + *
> > + * Obtain the default parameters of the code segment: address and operand sizes.
> > + * The code segment is obtained from the selector contained in the CS register
> > + * in regs. In protected mode, the default address is determined by inspecting
> > + * the L and D bits of the segment descriptor. In virtual-8086 mode, the default
> > + * is always two bytes for both address and operand sizes.
> > + *
> > + * Return: A populated insn_code_seg_defaults structure on success. The
> > + * structure contains only zeros on failure.
> 
> s/failure/error/

Will correct.
> 
> > + */
> > +struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs *regs)
> > +{
> > +   struct desc_struct *desc;
> > +   struct insn_code_seg_defaults defs;
> > +   unsigned short sel;
> > +   /*
> > +* The most significant byte of AR_TYPE_MASK determines whether a
> > +* segment contains data or code.
> > +*/
> > +   unsigned int type_mask = AR_TYPE_

Re: [PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector

2017-06-15 Thread Ricardo Neri
On Thu, 2017-06-15 at 11:37 -0700, Ricardo Neri wrote:
> > Yuck, didn't we talk about this already?
> 
> I am sorry Borislav. I thought you agreed that I could use the values of
> the segment override prefixes to identify the segment registers [1].

This time with the reference:
[1]. https://lkml.org/lkml/2017/5/5/377




Re: [PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector

2017-06-15 Thread Ricardo Neri
On Tue, 2017-05-30 at 12:35 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:08AM -0700, Ricardo Neri wrote:
> > When computing a linear address and segmentation is used, we need to know
> > the base address of the segment involved in the computation. In most of
> > the cases, the segment base address will be zero as in USER_DS/USER32_DS.
> > However, it may be possible that a user space program defines its own
> > segments via a local descriptor table. In such a case, the segment base
> > address may not be zero. Thus, the segment base address is needed to
> > calculate correctly the linear address.
> > 
> > The segment selector to be used when computing a linear address is
> > determined by either any of segment override prefixes in the
> > instruction or inferred from the registers involved in the computation of
> > the effective address; in that order. Also, there are cases when the
> > overrides shall be ignored (code segments are always selected by the CS
> > segment register; string instructions always use the ES segment register
> > along with the EDI register).
> > 
> > For clarity, this process can be split into two steps: resolving the
> > relevant segment register to use and, once known, read its value to
> > obtain the segment selector.
> > 
> > The method to obtain the segment selector depends on several factors. In
> > 32-bit builds, segment selectors are saved into the pt_regs structure
> > when switching to kernel mode. The same is also true for virtual-8086
> > mode. In 64-bit builds, segmentation is mostly ignored, except when
> > running a program in 32-bit legacy mode. In this case, CS and SS can be
> > obtained from pt_regs. DS, ES, FS and GS can be read directly from
> > the respective segment registers.
> > 
> > Lastly, the only two segment registers that are not ignored in long mode
> > are FS and GS. In these two cases, base addresses are obtained from the
> > respective MSRs.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 256 
> > +++
> >  1 file changed, 256 insertions(+)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 1634762..0a496f4 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -9,6 +9,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  enum reg_type {
> > REG_TYPE_RM = 0,
> > @@ -33,6 +34,17 @@ enum string_instruction {
> > SCASW_SCASD = 0xaf,
> >  };
> >  
> > +enum segment_register {
> > +   SEG_REG_INVAL = -1,
> > +   SEG_REG_IGNORE = 0,
> > +   SEG_REG_CS = 0x23,
> > +   SEG_REG_SS = 0x36,
> > +   SEG_REG_DS = 0x3e,
> > +   SEG_REG_ES = 0x26,
> > +   SEG_REG_FS = 0x64,
> > +   SEG_REG_GS = 0x65,
> > +};
> 
> Yuck, didn't we talk about this already?

I am sorry Borislav. I thought you agreed that I could use the values of
the segment override prefixes to identify the segment registers [1].
> 
> Those are segment override prefixes so call them as such.
> 
> #define SEG_OVR_PFX_CS  0x23
> #define SEG_OVR_PFX_SS  0x36
> ...
> 
> and we already have those!
> 
> arch/x86/include/asm/inat.h:
> ...
> #define INAT_PFX_CS 5   /* 0x2E */
> #define INAT_PFX_DS 6   /* 0x3E */
> #define INAT_PFX_ES 7   /* 0x26 */
> #define INAT_PFX_FS 8   /* 0x64 */
> #define INAT_PFX_GS 9   /* 0x65 */
> #define INAT_PFX_SS 10  /* 0x36 */
> 
> well, kinda, they're numbers there and not the actual prefix values.
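
As a small illustration of the difference between raw prefix bytes and the INAT_PFX_* attribute numbers, the userspace sketch below maps a legacy prefix byte to a segment identifier. The byte values are the architectural segment override prefixes and match the comments in the inat.h excerpt above (note that the CS override byte is 0x2E). The enum and function names are hypothetical; they come neither from the patch nor from inat.h.

```c
#include <stdint.h>

/* Hypothetical identifiers, for illustration only. */
enum sketch_seg {
	SKETCH_SEG_NONE = 0,
	SKETCH_SEG_CS,
	SKETCH_SEG_SS,
	SKETCH_SEG_DS,
	SKETCH_SEG_ES,
	SKETCH_SEG_FS,
	SKETCH_SEG_GS,
};

/* Map one prefix byte to the segment it selects, if any. */
enum sketch_seg sketch_seg_from_prefix(uint8_t prefix)
{
	switch (prefix) {
	case 0x2e:
		return SKETCH_SEG_CS;
	case 0x36:
		return SKETCH_SEG_SS;
	case 0x3e:
		return SKETCH_SEG_DS;
	case 0x26:
		return SKETCH_SEG_ES;
	case 0x64:
		return SKETCH_SEG_FS;
	case 0x65:
		return SKETCH_SEG_GS;
	default:
		/* Not a segment override (e.g. 0x66, the operand-size prefix). */
		return SKETCH_SEG_NONE;
	}
}
```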

These numbers can be 'translated' to the actual value of the prefixes via
inat_get_opcode_attribute(). In my 

Re: [PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-06-06 Thread Ricardo Neri
On Tue, 2017-06-06 at 13:58 +0200, Borislav Petkov wrote:
> On Mon, Jun 05, 2017 at 11:06:58PM -0700, Ricardo Neri wrote:
> > I agree that insn-eval reads somewhat funny. I did not want to go with
> > insn-dec.c as insn.c, in my opinion, already decodes the instruction
> > (i.e., it finds prefixes, opcodes, ModRM, SIB and displacement bytes).
> > In insn-eval.c I simply take those decoded parameters and evaluate them
> > to obtain the values they contain (e.g., a specific memory location).
> > Perhaps, insn-resolve.c could be a better name? Or maybe insn-operands?
> 
> So actually I'm gravitating towards calling all that instruction
> "massaging" code with a single prefix to denote this comes from the insn
> decoder/handler/whatever...
> 
> I.e.,
> 
>   "insn-decoder: x86: invalid register type"
> 
> or
> 
>   "inat: x86: invalid register type"
> 
> or something to that effect.
> 
> I mean, If we're going to grow our own - as we do, apparently - maybe it
> all should be a separate entity with its proper name.

I see. You were more concerned about the naming of the coding artifacts
(e.g., function names, error prints, etc) than the actual filenames. I
think I have aligned with the function naming of insn.c in all the
functions that are exposed via header by using the insn_ prefix. For
static functions I don't use that prefix. Perhaps I can use the __
prefix as insn.c does.

Thanks and BR,
Ricardo



Re: [PATCH v7 05/26] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0

2017-06-06 Thread Ricardo Neri
On Mon, 2017-05-29 at 15:07 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:03AM -0700, Ricardo Neri wrote:
> > Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual volume 2A states that when a SIB byte is used, the
> > base field of the SIB byte is 101b, and the mod part of the ModRM byte
> > is zero, the base part of the effective address computation is null.
> > In this case, a 32-bit displacement follows the SIB byte. This is
> > obtained when the instruction decoder parses the operands.
> > 
> > To signal this scenario, a -EDOM error is returned to indicate to callers
> > that they should ignore the base.
> > 
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Nathan Howard <liverl...@gmail.com>
> > Cc: Adan Hawthorn <adanhawth...@gmail.com>
> > Cc: Joe Perches <j...@perches.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/mm/mpx.c | 27 ---
> >  1 file changed, 20 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> > index 7397b81..30aef92 100644
> > --- a/arch/x86/mm/mpx.c
> > +++ b/arch/x86/mm/mpx.c
> > @@ -122,6 +122,15 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> >  
> > case REG_TYPE_BASE:
> > regno = X86_SIB_BASE(insn->sib.value);
> > +   /*
> > +* If ModRM.mod is 0 and SIB.base == 5, the base of the
> > +* register-indirect addressing is 0. In this case, a
> > +* 32-bit displacement is expected in this case; the
> > +* instruction decoder finds such displacement for us.
> 
> That last sentence reads funny. Just say:
> 
> "In this case, a 32-bit displacement follows the SIB byte."

Agreed. I will update the comment to make more sense.

Thanks and BR,
Ricardo
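
The SIB.base rule described in this message can be sketched in userspace as shown below. The function name, the SKETCH_NO_BASE marker (standing in for -EDOM) and the convention that rex == 0 means "no REX prefix" are assumptions for illustration. The check is done before REX.B is applied because, architecturally, base = 101b with mod = 0 means "no base" regardless of REX.B.

```c
#include <stdint.h>

#define SKETCH_NO_BASE	(-1)	/* stands in for -EDOM in this sketch */

/*
 * Return the number (0..15) of the base register selected by the SIB byte,
 * or SKETCH_NO_BASE when the base must be ignored.
 */
int sketch_sib_base_regno(uint8_t modrm, uint8_t sib, uint8_t rex)
{
	int mod = (modrm >> 6) & 0x3;
	int regno = sib & 0x7;

	/* mod == 0 and base == 101b: no base; a 32-bit displacement follows. */
	if (mod == 0 && regno == 5)
		return SKETCH_NO_BASE;

	/* REX.B (bit 0) extends the base field to reach r8..r15. */
	if (rex & 0x1)
		regno += 8;

	return regno;
}
```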



Re: [PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-06-06 Thread Ricardo Neri
On Mon, 2017-05-29 at 18:37 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:05AM -0700, Ricardo Neri wrote:
> > We are not in a critical failure path. The invalid register type is caused
> > when trying to decode invalid instruction bytes from a user-space program.
> > Thus, simply print an error message. To prevent this warning from being
> > abused from user space programs, use the rate-limited variant of printk.
> > 
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index e746a6f..182e2ae 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -5,6 +5,7 @@
> >   */
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -85,9 +86,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> > break;
> >  
> > default:
> > -   pr_err("invalid register type");
> > -   BUG();
> > -   break;
> > +   printk_ratelimited(KERN_ERR "insn-eval: x86: invalid register type");
> 
> You can use pr_err_ratelimited() and define "insn-eval" with pr_fmt.
> Look for examples in the tree.

Will do. I have looked at the examples.
> 
> Btw, "insn-eval" is perhaps not the right name - since we're building
> an instruction decoder, maybe it should be called "insn-dec" or so. I'm
> looking at those other arch/x86/lib/insn.c, arch/x86/include/asm/inat.h
> things and how they're starting to morph into one decoding facility,
> AFAICT.

I agree that insn-eval reads somewhat funny. I did not want to go with
insn-dec.c as insn.c, in my opinion, already decodes the instruction
(i.e., it finds prefixes, opcodes, ModRM, SIB and displacement bytes).
In insn-eval.c I simply take those decoded parameters and evaluate them
to obtain the values they contain (e.g., a specific memory location).
Perhaps, insn-resolve.c could be a better name? Or maybe insn-operands?

Thanks and BR,
Ricardo
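
For readers unfamiliar with the pr_fmt convention mentioned above, here is a userspace imitation of how the prefix ends up in the message. In the kernel, pr_fmt is defined before the includes and the pr_*() macros apply it to every format string. The sketch_pr_err() macro and the "insn: " prefix are hypothetical; the sketch formats into a buffer and does no rate limiting.

```c
#include <stdio.h>

/* Userspace imitation of the kernel's pr_fmt() convention. */
#define pr_fmt(fmt) "insn: " fmt

/*
 * Hypothetical stand-in for pr_err_ratelimited(): the prefix is glued to
 * the format by string-literal concatenation at compile time.
 * (##__VA_ARGS__ is a GNU extension, as used by the kernel itself.)
 */
#define sketch_pr_err(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)

/* Format the error message into buf; returns the message length. */
int sketch_report_invalid_reg_type(char *buf, size_t len, int type)
{
	return sketch_pr_err(buf, len, "invalid register type: %d\n", type);
}
```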




Re: [PATCH v7 08/26] x86/insn-eval: Add a utility function to get register offsets

2017-06-06 Thread Ricardo Neri
On Mon, 2017-05-29 at 19:16 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:06AM -0700, Ricardo Neri wrote:
> > The function get_reg_offset() returns the offset to the register the
> > argument specifies as indicated in an enumeration of type offset. Callers
> > of this function would need the definition of such enumeration. This is
> > not needed. Instead, add helper functions for this purpose. These functions
> > are useful in cases when, for instance, the caller needs to decide whether
> > the operand is a register or a memory location by looking at the rm part
> > of the ModRM byte. As of now, this is the only helper function that is
> > needed.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/insn-eval.h |  1 +
> >  arch/x86/lib/insn-eval.c | 15 +++
> >  2 files changed, 16 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
> > index 5cab1b1..7e8c963 100644
> > --- a/arch/x86/include/asm/insn-eval.h
> > +++ b/arch/x86/include/asm/insn-eval.h
> > @@ -12,5 +12,6 @@
> >  #include 
> >  
> >  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
> > +int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
> >  
> >  #endif /* _ASM_X86_INSN_EVAL_H */
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 182e2ae..8b16761 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -97,6 +97,21 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> > return regoff[regno];
> >  }
> >  
> > +/**
> > + * insn_get_reg_offset_modrm_rm() - Obtain register in r/m part of ModRM byte
> 
> That name needs to be synced with the function name below.

Ugh! I missed this. I will update accordingly. Thanks for the detailed
review.

BR,
Ricardo



Re: [PATCH v7 09/26] x86/insn-eval: Add utility function to identify string instructions

2017-06-06 Thread Ricardo Neri
On Mon, 2017-05-29 at 23:48 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:07AM -0700, Ricardo Neri wrote:
> > String instructions are special because in protected mode, the linear
> > address is always obtained via the ES segment register in operands that
> > use the (E)DI register.
> 
>  ... and DS for rSI.

Right, I omitted this in the commit message.
> 
> If we're going to account for both operands of string instructions with
> two operands.
> 
> Btw, LODS and OUTS use only DS:rSI as a source operand. So we have to be
> careful with the generalization here. So if ES:rDI is the only seg. reg
> we want, then we don't need to look at those insns... (we assume DS by
> default).

My intention with this function is to write a function that does only
one thing: identify string instructions, irrespective of the operands
they use. A separate function, resolve_seg_register, will have the logic
to decide which segment register to use based on the registers used as
operands, whether we are looking at a string instruction, whether we
have segment override prefixes and whether such overrides should be
ignored.

If I were to leave some string instructions out of this function, it
should be renamed as is_string_instruction_non_lods_outs. In my opinion,
this separation makes the code clearer; otherwise I would end up having
logic to decide which segment register to use in two places. Does it
make sense to you?

> 
> ...
> 
> > +/**
> > + * is_string_instruction - Determine if instruction is a string instruction
> > + * @insn:  Instruction structure containing the opcode
> > + *
> > + * Return: true if the instruction, determined by the opcode, is any of the
> > + * string instructions as defined in the Intel Software Development manual.
> > + * False otherwise.
> > + */
> > +static bool is_string_instruction(struct insn *insn)
> > +{
> > +   insn_get_opcode(insn);
> > +
> > +   /* all string instructions have a 1-byte opcode */
> > +   if (insn->opcode.nbytes != 1)
> > +   return false;
> > +
> > +   switch (insn->opcode.bytes[0]) {
> > +   case INSB:
> > +   /* fall through */
> > +   case INSW_INSD:
> > +   /* fall through */
> > +   case OUTSB:
> > +   /* fall through */
> > +   case OUTSW_OUTSD:
> > +   /* fall through */
> > +   case MOVSB:
> > +   /* fall through */
> > +   case MOVSW_MOVSD:
> > +   /* fall through */
> > +   case CMPSB:
> > +   /* fall through */
> > +   case CMPSW_CMPSD:
> > +   /* fall through */
> > +   case STOSB:
> > +   /* fall through */
> > +   case STOSW_STOSD:
> > +   /* fall through */
> > +   case LODSB:
> > +   /* fall through */
> > +   case LODSW_LODSD:
> > +   /* fall through */
> > +   case SCASB:
> > +   /* fall through */
> 
> That "fall through" for every opcode is just too much. Also, you can use
> the regularity of the x86 opcode space and do:
> 
>   case 0x6c ... 0x6f: /* INS/OUTS */
>   case 0xa4 ... 0xa7: /* MOVS/CMPS */
>   case 0xaa ... 0xaf: /* STOS/LODS/SCAS */
>   return true;
>   default:
>   return false;
> }
> 
> And voila, there's your compact is_string_insn() function! :^)

Thanks for the suggestion! It looks really nice. I will implement
accordingly.

Thanks and BR,
Ricardo
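
For reference, the compact form suggested above can be tried out in userspace with a sketch like the following. The function name and the (opcode, nbytes) parameters are hypothetical stand-ins for struct insn. Case ranges are a GNU C extension accepted by GCC and Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if the opcode bytes encode a string instruction. */
bool sketch_is_string_insn(const uint8_t *opcode, int nbytes)
{
	/* All string instructions have a 1-byte opcode. */
	if (nbytes != 1)
		return false;

	switch (opcode[0]) {
	case 0x6c ... 0x6f:	/* INS, OUTS */
	case 0xa4 ... 0xa7:	/* MOVS, CMPS */
	case 0xaa ... 0xaf:	/* STOS, LODS, SCAS */
		return true;
	default:
		return false;
	}
}
```

Note that 0xa8 and 0xa9 (TEST with an immediate) fall in the gap between the last two ranges, which is why the ranges cannot be merged.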



Re: [PATCH v7 12/26] x86/insn-eval: Add utility functions to get segment descriptor base address and limit

2017-06-03 Thread Ricardo Neri
On Wed, 2017-05-31 at 18:58 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:10AM -0700, Ricardo Neri wrote:
> > With segmentation, the base address of the segment descriptor is needed
> > to compute a linear address. The segment descriptor used in the address
> > computation depends on either any segment override prefixes in the
> > instruction or the default segment determined by the registers involved
> > in the address computation. Thus, both the instruction as well as the
> > register (specified as the offset from the base of pt_regs) are given as
> > inputs, along with a boolean variable to select between override and
> > default.
> 
> ...
> 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index f46cb31..c77ed80 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -476,6 +476,133 @@ static struct desc_struct *get_desc(unsigned short sel)
> >  }
> >  
> >  /**
> > + * insn_get_seg_base() - Obtain base address of segment descriptor.
> > + * @regs:  Structure with register values as seen when entering kernel mode
> > + * @insn:  Instruction structure with selector override prefixes
> > + * @regoff: Operand offset, in pt_regs, of which the selector is needed
> > + *
> > + * Obtain the base address of the segment descriptor as indicated by either
> > + * any segment override prefixes contained in insn or the default segment
> > + * applicable to the register indicated by regoff. regoff is specified as the
> > + * offset in bytes from the base of pt_regs.
> > + *
> > + * Return: In protected mode, base address of the segment. Zero in for long
> > + * mode, except when FS or GS are used. In virtual-8086 mode, the segment
> > + * selector shifted 4 positions to the right. -1L in case of
> > + * error.
> > + */
> > +unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> > +   int regoff)
> > +{
> > +   struct desc_struct *desc;
> > +   unsigned short sel;
> > +   enum segment_register seg_reg;
> > +
> > +   seg_reg = resolve_seg_register(insn, regs, regoff);
> > +   if (seg_reg == SEG_REG_INVAL)
> > +   return -1L;
> > +
> > +   sel = get_segment_selector(regs, seg_reg);
> > +   if ((short)sel < 0)
> 
> I guess it would be better if that function returned a signed short so
> you don't have to cast it here. (You're casting it to an unsigned long
> below anyway.)

Yes, this makes sense. I will make this change.
> 
> > +   return -1L;
> > +
> > +   if (v8086_mode(regs))
> > +   /*
> > +* Base is simply the segment selector shifted 4
> > +* positions to the right.
> > +*/
> > +   return (unsigned long)(sel << 4);
> > +
> 
> ...
> 
> > +static unsigned long get_seg_limit(struct pt_regs *regs, struct insn *insn,
> > +  int regoff)
> > +{
> > +   struct desc_struct *desc;
> > +   unsigned short sel;
> > +   unsigned long limit;
> > +   enum segment_register seg_reg;
> > +
> > +   seg_reg = resolve_seg_register(insn, regs, regoff);
> > +   if (seg_reg == SEG_REG_INVAL)
> > +   return 0;
> > +
> > +   sel = get_segment_selector(regs, seg_reg);
> > +   if ((short)sel < 0)
> 
> Ditto.

Here as well.

> 
> > +   return 0;
> > +
> > +   if (user_64bit_mode(regs) || v8086_mode(regs))
> > +   return -1L;
> > +
> > +   if (!sel)
> > +   return 0;
> > +
> > +   desc = get_desc(sel);
> > +   if (!desc)
> > +   return 0;
> > +
> > +   /*
> > +* If the granularity bit is set, the limit is given in multiples
> > +* of 4096. When the granularity bit is set, the least 12 significant
> 
>                 the 12 least significant bits
> 
> > +* bits are not tested when checking the segment limits. In practice,
> > +* this means that the segment ends in (limit << 12) + 0xfff.
> > +*/
> > +   limit = get_desc_limit(desc);
> > +   if (desc->g)
> > +   limit <<= 12 | 0x7;
> 
> That 0x7 doesn't look like 0xfff - it shifts limit by 15 instead. You
> can simply write it like you mean it:
> 
>   limit = (limit << 12) + 0xfff;

You are right, this is wrong. I will implement it as you mention.

Thanks and BR,
Ricardo
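
The arithmetic discussed in this message can be checked with a short userspace sketch. The function names are hypothetical and the descriptor is reduced to its raw 20-bit limit and granularity bit. Because "|" binds tighter than "<<=", the original "limit <<= 12 | 0x7" shifts by (12 | 0x7) = 15. Also note that the virtual-8086 base in the quoted code is computed with a left shift of the selector by 4.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Effective segment limit. With the granularity bit set, the raw limit
 * counts 4096-byte units and the 12 least significant bits are not
 * checked, so the segment ends at (limit << 12) + 0xfff.
 */
uint32_t sketch_seg_limit(uint32_t raw_limit, bool granularity)
{
	if (granularity)
		return (raw_limit << 12) + 0xfff;

	return raw_limit;
}

/* Virtual-8086 mode: the segment base is the selector multiplied by 16. */
uint32_t sketch_v8086_seg_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}
```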



Re: [PATCH v7 02/26] x86/mm: Relocate page fault error codes to traps.h

2017-05-31 Thread Ricardo Neri
On Sat, 2017-05-27 at 12:13 +0200, Borislav Petkov wrote:
> On Fri, May 26, 2017 at 08:40:26PM -0700, Ricardo Neri wrote:
> > This change was initially intended to only rename the error codes,
> > without functional changes. Would making this change be considered a
> > change in functionality?
> 
> How?
> 
> The before-and-after asm should be identical.

Yes but it reads differently. I just wanted to double check. I will make
this change, which keeps functionality but is written differently.

Thanks and BR,
Ricardo




Re: [PATCH v7 04/26] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b

2017-05-26 Thread Ricardo Neri
On Wed, 2017-05-24 at 15:37 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:02AM -0700, Ricardo Neri wrote:
> > Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual volume 2A states that when ModRM.mod !=11b and
> > ModRM.rm = 100b indexed register-indirect addressing is used. In other
> > words, a SIB byte follows the ModRM byte. In the specific case of
> > SIB.index = 100b, the scale*index portion of the computation of the
> > effective address is null. To signal callers of this particular situation,
> > get_reg_offset() can return -EDOM (-EINVAL continues to indicate an
> > error when decoding the SIB byte).
> > 
> > An example of this situation can be the following instruction:
> > 
> >    8b 4c 23 80       mov -0x80(%rbx,%riz,1),%rcx
> >    ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
> >    SIB:              0x23 [scale:0b][index:100b][base:11b]
> >    Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)
> > 
> > The %riz 'register' indicates a null index.
> > 
> > In long mode, a REX prefix may be used. When a REX prefix is present,
> > REX.X adds a fourth bit to the register selection of SIB.index. This gives
> > the ability to refer to all the 16 general purpose registers. When REX.X is
> > 1b and SIB.index is 100b, the index is indicated in %r12. In our example,
> > this would look like:
> > 
> >    42 8b 4c 23 80    mov -0x80(%rbx,%r12,1),%rcx
> >    REX:              0x42 [W:0b][R:0b][X:1b][B:0b]
> >    ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
> >    SIB:              0x23 [scale:0b][.X: 1b, index:100b][.B:0b, base:11b]
> >    Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)
> > 
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Nathan Howard <liverl...@gmail.com>
> > Cc: Adan Hawthorn <adanhawth...@gmail.com>
> > Cc: Joe Perches <j...@perches.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/mm/mpx.c | 20 ++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> > index ebdead8..7397b81 100644
> > --- a/arch/x86/mm/mpx.c
> > +++ b/arch/x86/mm/mpx.c
> > @@ -110,6 +110,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> > regno = X86_SIB_INDEX(insn->sib.value);
> > if (X86_REX_X(insn->rex_prefix.value))
> > regno += 8;
> 
> <--- newline.
I will add a new line here.

> 
> > +   /*
> > +* If ModRM.mod !=3 and SIB.index (regno=4) the scale*index
> > +* portion of the address computation is null. This is
> > +* true only if REX.X is 0. In such a case, the SIB index
> > +* is used in the address computation.
> > +*/
> > +   if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
> > +   return -EDOM;
> > break;
> >  
> > case REG_TYPE_BASE:
> > @@ -159,11 +167,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> > goto out_err;
> >  
> > indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
> > -   if (indx_offset < 0)
> 
> <--- newline.
I will add a new line here.

> 
> > +   /*
> > +* A negative offset generally means a error, except
> 
>an
> 
> > +* -EDOM, which means that the contents of the register
> > +* should not be used as index.
> > +*/
> > +   if (indx_offset == -EDOM)
> > +   indx = 0;
> > +   else if (indx_offset < 0)
> > goto out_err;
> > +   else
> > +   indx = regs_get_register(r

Re: [PATCH v7 02/26] x86/mm: Relocate page fault error codes to traps.h

2017-05-26 Thread Ricardo Neri
On Sun, 2017-05-21 at 16:23 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:00AM -0700, Ricardo Neri wrote:
> > Up to this point, only fault.c used the definitions of the page fault error
> > codes. Thus, it made sense to keep them within such file. Other portions of
> > code might be interested in those definitions too. For instance, the User-
> > Mode Instruction Prevention emulation code will use such definitions to
> > emulate a page fault when it is unable to successfully copy the results
> > of the emulated instructions to user space.
> > 
> > While relocating the error code enumeration, the prefix X86_ is used to
> > make it consistent with the rest of the definitions in traps.h. Of course,
> > code using the enumeration had to be updated as well. No functional changes
> > were performed.
> > 
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
> > Cc: Josh Poimboeuf <jpoim...@redhat.com>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> > Cc: x...@kernel.org
> > Reviewed-by: Andy Lutomirski <l...@kernel.org>
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/traps.h | 18 +
> >  arch/x86/mm/fault.c  | 88 +---
> >  2 files changed, 52 insertions(+), 54 deletions(-)
> 
> ...
> 
> > @@ -1382,7 +1362,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
> >  * space check, thus avoiding the deadlock:
> >  */
> > if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
> > -   if ((error_code & PF_USER) == 0 &&
> > +   if ((error_code & X86_PF_USER) == 0 &&
> 
>   if (!(error_code & X86_PF_USER))

This change was initially intended to only rename the error codes,
without functional changes. Would making this change be considered a
change in functionality? The behavior would be preserved, though.
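
To make the question concrete, here is a minimal user-space sketch of the
two spellings. It is an illustration only, not code from the series: the
macro mirrors the value of X86_PF_USER in this patch, and both helper names
are hypothetical.

```c
#include <stdbool.h>

/* Mirrors X86_PF_USER from the patch: bit 2 of the page fault error code. */
#define X86_PF_USER (1UL << 2)

/* Hypothetical helper: the comparison form currently used in fault.c. */
static bool is_kernel_fault_cmp(unsigned long error_code)
{
	return (error_code & X86_PF_USER) == 0;
}

/* Hypothetical helper: the negation form suggested in the review. */
static bool is_kernel_fault_not(unsigned long error_code)
{
	return !(error_code & X86_PF_USER);
}
```

Both helpers return the same result for every error_code, so a compiler
would be expected to emit the same code for either form.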

Thanks and BR,
Ricardo


> 
> With that fixed:
> 
> Reviewed-by: Borislav Petkov <b...@suse.de>

Thank you for your review!

BR,
Ricardo
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nürnberg)


Re: [PATCH v7 00/26] x86: Enable User-Mode Instruction Prevention

2017-05-17 Thread Ricardo Neri
Hi Ingo, Thomas,

On Fri, 2017-05-05 at 11:16 -0700, Ricardo Neri wrote:
> This is v7 of this series. The six previous submissions can be found
> here [1], here [2], here [3], here [4], here [5] and here [6]. This version
> addresses the comments received in v6 plus improvements of the handling
> of exceptions unrelated to UMIP as well as corner cases in virtual-8086
> mode. Please see details in the change log.

Since there have been no more comments on this version, and if this series
looks good to you, could it be considered for merging into the tip
tree?

The only remaining item is a cleanup patch that Borislav Petkov
suggested [1]. I could work on it incrementally on top of this series.

Thanks and BR,
Ricardo

[1]. https://lkml.org/lkml/2017/5/4/244




Re: [v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-05-11 Thread Ricardo Neri
On Thu, 2017-05-04 at 13:02 +0200, Borislav Petkov wrote:
> On Wed, Apr 26, 2017 at 02:51:56PM -0700, Ricardo Neri wrote:
> > > > +seg >= current->active_mm->context.ldt->size)) {
> > > 
> > > ldt->size is the size of the descriptor table but you've shifted seg by
> > > 3. That selector index is shifted by 3 (to the left) to form an offset
> > > into the descriptor table because the entries there are 8 bytes.
> > 
> > I double-checked the ldt code and it seems to me that size refers to the
> > number of entries in the table; it is always multiplied by
> > LDT_ENTRY_SIZE [1], [2]. Am I missing something?
> 
> No, you're not. I fell into that wrongly named struct member trap.
> 
> So ldt_struct.size should actually be called ldt_struct.n_entries or
> similar. Because what's in there now is not "size".
> 
> And then code like
> 
>   new_ldt->size * LDT_ENTRY_SIZE
> 
> would make much more sense if written like this:
> 
>   new_ldt->n_entries * LDT_ENTRY_SIZE
> 
> Would you fix that in a prepatch pls?
> 

Sure I can. Would this trigger a v8 of my series? I was hoping the v7 series
could be merged so that I can then do incremental work on top of it. Does
that make sense?
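
For illustration, a minimal sketch of the bounds check with the proposed
naming. It assumes 8-byte descriptors as discussed above. The helper names
are hypothetical, and n_entries is passed in directly rather than read from
struct ldt_struct.

```c
#include <stdbool.h>
#include <stddef.h>

#define LDT_ENTRY_SIZE 8	/* each descriptor is 8 bytes */

/* Descriptor index of a selector: drop RPL (bits 1:0) and TI (bit 2). */
static unsigned int sel_index(unsigned short sel)
{
	return sel >> 3;
}

/* True if the selector's index falls inside a table of n_entries entries. */
static bool sel_in_table(unsigned int n_entries, unsigned short sel)
{
	return sel_index(sel) < n_entries;
}

/* Byte offset of the selected descriptor from the start of the table. */
static size_t sel_byte_offset(unsigned short sel)
{
	return (size_t)sel_index(sel) * LDT_ENTRY_SIZE;
}
```

The index is compared against the number of entries, while the multiplication
by LDT_ENTRY_SIZE is only needed to get a byte offset.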

Thanks and BR,
Ricardo


Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-05-11 Thread Ricardo Neri
On Fri, 2017-05-05 at 19:19 +0200, Borislav Petkov wrote:
> On Wed, Apr 26, 2017 at 03:37:44PM -0700, Ricardo Neri wrote:
> > I need a human-readable way of identifying what segment selector (in
> > pt_regs, vm86regs or directly reading the segment registers) to use.
> > Since there is a segment override prefix for all of them, I thought I
> > could use them.
> 
> Yes, you should...
> 
> > Perhaps I can rename enum segment to enum segment_selector and comment
> > that the values in the enum are those of the override prefixes. Would
> > that be reasonable?
> 
> ... but you should call them what they are: "enum seg_override_pfxs" or
> "enum seg_ovr_pfx" or...
> 
> Or somesuch. I suck at naming stuff.

In my v7, I simply named my enumeration enum segment_register, which is
what they are. Some of its entries happen to have the values of the
segment override prefixes, but it also has special entries such as
SEG_REG_INVAL for errors and SEG_REG_IGNORE for long mode [1].

Thanks and BR,
Ricardo

[1]. https://lkml.org/lkml/2017/5/5/405


Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-05-11 Thread Ricardo Neri
On Fri, 2017-05-05 at 19:28 +0200, Borislav Petkov wrote:
> On Wed, Apr 26, 2017 at 03:52:41PM -0700, Ricardo Neri wrote:
> > Probably insn_get_seg_base() itself can verify if there are segment
> > override prefixes in the struct insn. If yes, use them except for
> > specific cases such as CS.
> 
> ... and depending on whether in long mode or not.

Yes, in my v7 I ignore the segment register if we are in long mode [1].
> 
> > On an unrelated note, I still have the problem of using DS vs ES for
> > string instructions. Perhaps instead of a use_default_seg flag, a
> > string_instruction flag that indicates how to determine the default
> > segment.
> 
> ... or you can look at the insn opcode directly. AFAICT, you need
> to check whether the opcode is 0xa4 or 0xa5 and that the insn is a
> single-byte opcode, i.e., not from the secondary map escaped with 0xf or
> some of the other multi-byte opcode maps.

In my v7, I have added a section to my function resolve_seg_register() that
ignores segment overrides if it sees string instructions and the register
EDI, and defaults to ES. If the register is EIP, it defaults to CS. To
determine whether an instruction is a string instruction, I check the
size of the opcode and the opcodes that you mention, plus others based on
the Intel Software Developer's Manual [2].
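
As a rough sketch of the opcode check (not the code in v7; the helper name
and its parameters are hypothetical), covering the one-byte string opcodes
INS, OUTS, MOVS, CMPS, STOS, LODS and SCAS:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if the instruction is a string instruction.
 * @opcode_nbytes: number of opcode bytes found by the decoder
 * @opcode:        first opcode byte
 */
static bool is_string_opcode(int opcode_nbytes, unsigned char opcode)
{
	/* String instructions all have single-byte opcodes. */
	if (opcode_nbytes != 1)
		return false;

	switch (opcode) {
	case 0x6c: case 0x6d:	/* INS  */
	case 0x6e: case 0x6f:	/* OUTS */
	case 0xa4: case 0xa5:	/* MOVS */
	case 0xa6: case 0xa7:	/* CMPS */
	case 0xaa: case 0xab:	/* STOS */
	case 0xac: case 0xad:	/* LODS */
	case 0xae: case 0xaf:	/* SCAS */
		return true;
	default:
		return false;
	}
}
```

Checking the opcode length first keeps multi-byte opcodes that happen to
contain one of these byte values from being treated as string instructions.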

[1]. https://lkml.org/lkml/2017/5/5/405
[2]. https://lkml.org/lkml/2017/5/5/410

Thanks and BR,
Ricardo


> 
> -- 
> Regards/Gruss,
> Boris.
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nürnberg)


Re: [v6 PATCH 10/21] x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero

2017-05-11 Thread Ricardo Neri
On Sun, 2017-05-07 at 19:20 +0200, Borislav Petkov wrote:
> On Wed, Apr 26, 2017 at 06:29:59PM -0700, Ricardo Neri wrote:
> > >   if (X86_MODRM_MOD(insn->modrm.value) == 0 &&
> > >   X86_MODRM_RM(insn->modrm.value)  == 5)
> > > 
> > > looks more understandable to me.
> > 
> > Should I go with !(X86_MODRM_MOD(insn->modrm.value)) as you suggested in
> > other patches?
> 
> Ah, yes pls.
> 
I did this in v7 [1].

Thanks and BR,
Ricardo

[1]. https://lkml.org/lkml/2017/5/5/399


Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

2017-05-11 Thread Ricardo Neri
On Mon, 2017-05-08 at 13:42 +0200, Borislav Petkov wrote:
> On Wed, Apr 26, 2017 at 08:33:46PM -0700, Ricardo Neri wrote:
> > This is the reason I check the value of long_bytes. If long_bytes is not
> > 4, being the only other possible value 8 (perhaps I need to issue an
> > error when the value is not any of these values),
> 
> Well, maybe I'm a bit too paranoid. Bottom line is, we should do the
> address computations exactly like the hardware does them so that there
> are no surprises. Doing them with longs looks ok to me.

Using long is exactly what I intend to do. The problem that I am trying
to resolve is to sign-extend signed memory offsets of 32-bit programs
running on 64-bit kernels. For 64-bit programs running on 64-bit kernels
I can simply use longs. I added error checking in my v7 of this series
[1].
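
A minimal sketch of the sign extension I have in mind for the 32-bit case.
It is illustrative only: the helper name is hypothetical, and it assumes
the register value is held in a 64-bit variable.

```c
#include <stdint.h>

/* Hypothetical helper: signed memory offset of a 32-bit program. */
static int64_t offset_from_reg32(uint64_t reg_val)
{
	/*
	 * Keep the 4 least significant bytes, then sign-extend to 64 bits.
	 * The uint32_t to int32_t conversion is implementation-defined for
	 * values above INT32_MAX; GCC and Clang wrap as two's complement.
	 */
	return (int64_t)(int32_t)(uint32_t)reg_val;
}
```

With this, a 32-bit register holding 0xfffffff0 yields -16 rather than
4294967280. For 64-bit programs the register value can be used as a long
directly.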

Thanks and BR,
Ricardo

[1]. https://lkml.org/lkml/2017/5/5/407


Re: [PATCH v7 20/26] x86/cpufeature: Add User-Mode Instruction Prevention definitions

2017-05-10 Thread Ricardo Neri
On Sat, 2017-05-06 at 11:04 +0200, Paolo Bonzini wrote:
> 
> 
> On 05/05/2017 20:17, Ricardo Neri wrote:
> > User-Mode Instruction Prevention is a security feature present in new
> > Intel processors that, when set, prevents the execution of a subset of
> > instructions if such instructions are executed in user mode (CPL > 0).
> > Attempting to execute such instructions causes a general protection
> > exception.
> > 
> > The subset of instructions comprises:
> > 
> >  * SGDT - Store Global Descriptor Table
> >  * SIDT - Store Interrupt Descriptor Table
> >  * SLDT - Store Local Descriptor Table
> >  * SMSW - Store Machine Status Word
> >  * STR  - Store Task Register
> > 
> > This feature is also added to the list of disabled-features to allow
> > a cleaner handling of build-time configuration.
> > 
> > Cc: Andy Lutomirski <l...@kernel.org>
> > Cc: Andrew Morton <a...@linux-foundation.org>
> > Cc: H. Peter Anvin <h...@zytor.com>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Brian Gerst <brge...@gmail.com>
> > Cc: Chen Yucong <sla...@gmail.com>
> > Cc: Chris Metcalf <cmetc...@mellanox.com>
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Fenghua Yu <fenghua...@intel.com>
> > Cc: Huang Rui <ray.hu...@amd.com>
> > Cc: Jiri Slaby <jsl...@suse.cz>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Michael S. Tsirkin <m...@redhat.com>
> > Cc: Paul Gortmaker <paul.gortma...@windriver.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: Shuah Khan <sh...@kernel.org>
> > Cc: Vlastimil Babka <vba...@suse.cz>
> > Cc: Tony Luck <tony.l...@intel.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Liang Z. Li <liang.z...@intel.com>
> > Cc: Alexandre Julliard <julli...@winehq.org>
> > Cc: Stas Sergeev <s...@list.ru>
> > Cc: x...@kernel.org
> > Cc: linux-msdos@vger.kernel.org
> > 
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> 
> Would it be possible to have this patch in a topic branch for KVM's
> consumption?
> 
I have put a branch here with this single patch:

https://github.com/ricardon/tip.git rneri/umip_for_kvm

This is based on Linux v4.11. Please let me know if this works for you
or if you'd prefer it to be based on a different branch/commit/repo.

Thanks and BR,
Ricardo


[PATCH v7 02/26] x86/mm: Relocate page fault error codes to traps.h

2017-05-05 Thread Ricardo Neri
Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within such file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
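
As an illustration of how code outside fault.c might consume these
definitions, a minimal user-space sketch follows. The enum values are the
ones moved by this patch; the helper name is hypothetical.

```c
#include <stdbool.h>

/* Same values as the enumeration relocated to traps.h by this patch. */
enum x86_pf_error_code {
	X86_PF_PROT	= 1 << 0,
	X86_PF_WRITE	= 1 << 1,
	X86_PF_USER	= 1 << 2,
	X86_PF_RSVD	= 1 << 3,
	X86_PF_INSTR	= 1 << 4,
	X86_PF_PK	= 1 << 5,
};

/* Hypothetical helper: user-mode write that hit a protection fault. */
static bool is_user_write_prot_fault(unsigned long error_code)
{
	unsigned long mask = X86_PF_PROT | X86_PF_WRITE | X86_PF_USER;

	return (error_code & mask) == mask;
}
```

An emulation path could build an error code the same way, by OR-ing the
relevant bits before reporting the fault.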

Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
Cc: Josh Poimboeuf <jpoim...@redhat.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: x...@kernel.org
Reviewed-by: Andy Lutomirski <l...@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/traps.h | 18 +
 arch/x86/mm/fault.c  | 88 +---
 2 files changed, 52 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 01fd0a7..4a2e585 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -148,4 +148,22 @@ enum {
X86_TRAP_IRET = 32, /* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==   0: no page found   1: protection fault
+ *   bit 1 ==   0: read access 1: write access
+ *   bit 2 ==   0: kernel-mode access  1: user-mode access
+ *   bit 3 ==  1: use of reserved bit detected
+ *   bit 4 ==  1: fault was an instruction fetch
+ *   bit 5 ==  1: protection keys block access
+ */
+enum x86_pf_error_code {
+   X86_PF_PROT =   1 << 0,
+   X86_PF_WRITE=   1 << 1,
+   X86_PF_USER =   1 << 2,
+   X86_PF_RSVD =   1 << 3,
+   X86_PF_INSTR=   1 << 4,
+   X86_PF_PK   =   1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 8ad91a0..32f3070 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -29,26 +29,6 @@
 #include 
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==   0: no page found   1: protection fault
- *   bit 1 ==   0: read access 1: write access
- *   bit 2 ==   0: kernel-mode access  1: user-mode access
- *   bit 3 ==  1: use of reserved bit detected
- *   bit 4 ==  1: fault was an instruction fetch
- *   bit 5 ==  1: protection keys block access
- */
-enum x86_pf_error_code {
-
-   PF_PROT =   1 << 0,
-   PF_WRITE=   1 << 1,
-   PF_USER =   1 << 2,
-   PF_RSVD =   1 << 3,
-   PF_INSTR=   1 << 4,
-   PF_PK   =   1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -149,7 +129,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
 * If it was a exec (instruction fetch) fault on NX page, then
 * do not ignore the fault:
 */
-   if (error_code & PF_INSTR)
+   if (error_code & X86_PF_INSTR)
return 0;
 
instr = (void *)convert_ip_to_linear(current, regs);
@@ -179,7 +159,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -205,7 +185,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info,
/*
 * force_sig_info_fault() is called from a number of
 * contexts, some of which have a VMA and some of which
-* do not.  The PF_PK handing happens after we have a
+* do not.  The X86_PF_PK handing happens after we have a
 * valid VMA, so we should never reach this without a
 * valid VMA.
 */
@@ -695,7 +675,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
if (!oops_may_print())
  

[PATCH v7 03/26] x86/mpx: Use signed variables to compute effective addresses

2017-05-05 Thread Ricardo Neri
Even though memory addresses are unsigned, the operands used to compute the
effective address do have a sign. This is true for the ModRM.rm, SIB.base,
SIB.index as well as the displacement bytes. Thus, signed variables shall
be used when computing the effective address from these operands. Once the
signed effective address has been computed, it is cast to an unsigned
long to determine the linear address.

Variables are renamed to better reflect the type of address being
computed.
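
A minimal user-space sketch of the idea, for illustration only. The helper
name is hypothetical and the operands are passed in directly instead of
being read from pt_regs. Signed overflow is undefined in standard C, so the
sketch is only meaningful for operands whose sum fits in 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute base + index * 2^scale + displacement with
 * signed operands, then reinterpret the result as an unsigned address.
 */
static uint64_t addr_from_operands(int64_t base, int64_t indx,
				   unsigned int scale, int32_t disp)
{
	int64_t eff_addr = base + indx * ((int64_t)1 << scale) + disp;

	return (uint64_t)eff_addr;
}
```

A negative index or displacement moves the address downwards, which is the
behavior that unsigned operands would not express directly.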

Cc: Borislav Petkov <b...@suse.de>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Nathan Howard <liverl...@gmail.com>
Cc: Adan Hawthorn <adanhawth...@gmail.com>
Cc: Joe Perches <j...@perches.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/mm/mpx.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 1c34b76..ebdead8 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -138,7 +138,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  */
 static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-   unsigned long addr, base, indx;
+   unsigned long linear_addr;
+   long eff_addr, base, indx;
int addr_offset, base_offset, indx_offset;
insn_byte_t sib;
 
@@ -150,7 +151,7 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
if (addr_offset < 0)
goto out_err;
-   addr = regs_get_register(regs, addr_offset);
+   eff_addr = regs_get_register(regs, addr_offset);
} else {
if (insn->sib.nbytes) {
base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
@@ -163,16 +164,18 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 
base = regs_get_register(regs, base_offset);
indx = regs_get_register(regs, indx_offset);
-   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
if (addr_offset < 0)
goto out_err;
-   addr = regs_get_register(regs, addr_offset);
+   eff_addr = regs_get_register(regs, addr_offset);
}
-   addr += insn->displacement.value;
+   eff_addr += insn->displacement.value;
}
-   return (void __user *)addr;
+   linear_addr = (unsigned long)eff_addr;
+
+   return (void __user *)linear_addr;
 out_err:
return (void __user *)-1;
 }
-- 
2.9.3


[PATCH v7 05/26] x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0

2017-05-05 Thread Ricardo Neri
Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when a SIB byte is used, the
base field of the SIB byte is 101b, and the mod part of the ModRM byte
is zero, the base portion of the effective address computation is null.
In this case, a 32-bit displacement follows the SIB byte. This is obtained
when the instruction decoder parses the operands.

To signal this scenario, a -EDOM error is returned to indicate to callers
that they should ignore the base.
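
A minimal user-space sketch of both SIB special cases, for illustration
only. The macros and helper names are hypothetical stand-ins for the
kernel's X86_SIB_*() and X86_MODRM_MOD() helpers, and the REX bits are
passed as plain flags.

```c
#include <errno.h>

#define SIB_INDEX(sib)		(((sib) >> 3) & 0x7)
#define SIB_BASE(sib)		((sib) & 0x7)
#define MODRM_MOD(modrm)	(((modrm) >> 6) & 0x3)

/*
 * Hypothetical helper: register number used as SIB base, or -EDOM when
 * the base must be ignored (ModRM.mod == 0 and SIB.base == 101b).
 */
static int sib_base_regno(unsigned char modrm, unsigned char sib, int rex_b)
{
	int regno = SIB_BASE(sib);

	if (MODRM_MOD(modrm) == 0 && regno == 5)
		return -EDOM;

	return rex_b ? regno + 8 : regno;
}

/*
 * Hypothetical helper: register number used as SIB index, or -EDOM when
 * there is no index (SIB.index == 100b and REX.X is clear).
 */
static int sib_index_regno(unsigned char sib, int rex_x)
{
	int regno = SIB_INDEX(sib);

	if (rex_x)
		regno += 8;

	return regno == 4 ? -EDOM : regno;
}
```

A caller treats -EDOM as "use zero for this operand" and any other negative
value as a decoding error.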

Cc: Borislav Petkov <b...@suse.de>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Nathan Howard <liverl...@gmail.com>
Cc: Adan Hawthorn <adanhawth...@gmail.com>
Cc: Joe Perches <j...@perches.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/mm/mpx.c | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 7397b81..30aef92 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -122,6 +122,15 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 
case REG_TYPE_BASE:
regno = X86_SIB_BASE(insn->sib.value);
+   /*
+* If ModRM.mod is 0 and SIB.base == 5, the base of the
+* register-indirect addressing is 0. In this case, a
+* 32-bit displacement is expected in this case; the
+* instruction decoder finds such displacement for us.
+*/
+   if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+   return -EDOM;
+
if (X86_REX_B(insn->rex_prefix.value))
regno += 8;
break;
@@ -162,16 +171,21 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
eff_addr = regs_get_register(regs, addr_offset);
} else {
if (insn->sib.nbytes) {
+   /*
+* Negative values in the base and index offset means
+* an error when decoding the SIB byte. Except -EDOM,
+* which means that the registers should not be used
+* in the address computation.
+*/
base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-   if (base_offset < 0)
+   if (base_offset == -EDOM)
+   base = 0;
+   else if (base_offset < 0)
goto out_err;
+   else
+   base = regs_get_register(regs, base_offset);
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   /*
-* A negative offset generally means a error, except
-* -EDOM, which means that the contents of the register
-* should not be used as index.
-*/
if (indx_offset == -EDOM)
indx = 0;
else if (indx_offset < 0)
@@ -179,7 +193,6 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
else
indx = regs_get_register(regs, indx_offset);
 
-   base = regs_get_register(regs, base_offset);
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.9.3


[PATCH v7 08/26] x86/insn-eval: Add a utility function to get register offsets

2017-05-05 Thread Ricardo Neri
The function get_reg_offset() returns the offset to the register the
argument specifies as indicated in an enumeration of type offset. Callers
of this function would need the definition of such enumeration. This is
not needed. Instead, add helper functions for this purpose. These functions
are useful in cases when, for instance, the caller needs to decide whether
the operand is a register or a memory location by looking at the rm part
of the ModRM byte. As of now, this is the only helper function that is
needed.

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  1 +
 arch/x86/lib/insn-eval.c | 15 +++
 2 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 5cab1b1..7e8c963 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -12,5 +12,6 @@
 #include 
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 182e2ae..8b16761 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -97,6 +97,21 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
return regoff[regno];
 }
 
+/**
+ * insn_get_modrm_rm_off() - Obtain register in r/m part of ModRM byte
+ * @insn:  Instruction structure containing the ModRM byte
+ * @regs:  Structure with register values as seen when entering kernel mode
+ *
+ * Return: The register indicated by the r/m part of the ModRM byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of ModRM does not refer to a register and shall be ignored.
+ */
+int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_RM);
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.9.3


[PATCH v7 19/26] x86/insn-eval: Add wrapper function for 16-bit and 32-bit address encodings

2017-05-05 Thread Ricardo Neri
Convert the function insn_get_addr_ref() into a wrapper function that calls
the correct static address-decoding function depending on the address size.
In this way, callers do not need to worry about calling the correct
function, and the number of functions that need to be exposed decreases.

To this end, the function insn_get_addr_ref() used to obtain linear
addresses from the 32/64-bit encodings is renamed as get_addr_ref_32_64()
to reflect the type of address encodings that it handles.

Documentation is added to the new wrapper function and the documentation
for the 32/64-bit address decoding function is improved.

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 48 +++-
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 928a662..8914884 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -899,12 +899,22 @@ long get_mem_offset(struct pt_regs *regs, int reg_offset, int addr_size)
return -1L;
return offset;
 }
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
+
+/**
+ * get_addr_ref_32_64() - Obtain a 32/64-bit linear address
+ * @insn:  Instruction struct with ModRM and SIB bytes and displacement
+ * @regs:  Structure with register values as seen when entering kernel mode
+ *
+ * This function is to be used with 32-bit and 64-bit address encodings to
+ * obtain the effective memory address referred by the instruction's ModRM,
+ * SIB, and displacement bytes, as applicable. Also, the segment base is used
+ * to compute the linear address. In protected mode, segment limits are
+ * enforced.
+ *
+ * Return: linear address referenced by instruction and registers on success.
+ * -1L on failure.
  */
-void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+static void __user *get_addr_ref_32_64(struct insn *insn, struct pt_regs *regs)
 {
unsigned long linear_addr, seg_base_addr, seg_limit;
long eff_addr, base, indx;
@@ -1026,3 +1036,31 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 out_err:
return (void __user *)-1;
 }
+
+/**
+ * insn_get_addr_ref() - Obtain the linear address referred by instruction
+ * @insn:  Instruction structure containing ModRM byte and displacement
+ * @regs:  Structure with register values as seen when entering kernel mode
+ *
+ * Obtain the memory address referred by the instruction's ModRM bytes and
+ * displacement. Also, the segment used as base is determined by either any
+ * segment override prefixes in insn or the default segment of the registers
+ * involved in the address computation. In protected mode, segment limits
+ * are enforced.
+ *
+ * Return: linear address referenced by instruction and registers on success.
+ * -1L on failure.
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+   switch (insn->addr_bytes) {
+   case 2:
+   return get_addr_ref_16(insn, regs);
+   case 4:
+   /* fall through */
+   case 8:
+   return get_addr_ref_32_64(insn, regs);
+   default:
+   return (void __user *)-1;
+   }
+}
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-05-05 Thread Ricardo Neri
Tasks running in virtual-8086 mode or in protected mode with code
segment descriptors that specify 16-bit default address sizes via the
D bit will use 16-bit addressing form encodings as described in the Intel
64 and IA-32 Architectures Software Developer's Manual Volume 2A Section
2.1.5. 16-bit addressing encodings differ in several ways from the
32-bit/64-bit addressing form encodings: ModRM.rm points to different
registers and, in some cases, effective addresses are indicated by the
addition of the value of two registers. Also, there is no support for SIB
bytes. Thus, a separate function is needed to parse this form of
addressing.

A couple of functions are introduced. get_reg_offset_16() obtains the
offset from the base of pt_regs of the registers indicated by the ModRM
byte of the address encoding. get_addr_ref_16() computes the linear
address indicated by the instructions using the value of the registers
given by ModRM as well as the base address of the segment.
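The register combinations described above can be illustrated with a small
user-space sketch. It is not the code from this patch: struct regs16 and
eff_addr_16() are hypothetical names, and the caller is assumed to have
already decoded the displacement (zero when ModRM.mod is 0 and ModRM.rm is
not 6).

```c
#include <stdint.h>

/* Hypothetical register file with the registers 16-bit addressing can use. */
struct regs16 {
	uint16_t bx, bp, si, di;
};

/*
 * Compute the 16-bit effective address of a memory operand (mod != 3).
 * mod and rm are the ModRM fields; disp is the already-decoded
 * displacement (8-bit displacements sign-extended by the caller).
 */
static uint16_t eff_addr_16(const struct regs16 *r, unsigned int mod,
			    unsigned int rm, int16_t disp)
{
	uint16_t sum;

	switch (rm & 7) {
	case 0: sum = r->bx + r->si; break;
	case 1: sum = r->bx + r->di; break;
	case 2: sum = r->bp + r->si; break;
	case 3: sum = r->bp + r->di; break;
	case 4: sum = r->si; break;
	case 5: sum = r->di; break;
	/* mod 0 with rm 6 uses no register, only a 16-bit displacement */
	case 6: sum = (mod == 0) ? 0 : r->bp; break;
	default: sum = r->bx; break;
	}

	/* The sum wraps around at 16 bits. */
	return (uint16_t)(sum + (uint16_t)disp);
}
```

The segment base would still have to be added to this value to obtain the
linear address.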

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 155 +++
 1 file changed, 155 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 9822061..928a662 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_reg_offset_16 - Obtain offset of register indicated by instruction
+ * @insn:  Instruction structure containing ModRM and SiB bytes
+ * @regs:  Structure with register values as seen when entering kernel mode
+ * @offs1: Offset of the first operand register
+ * @offs2: Offset of the second operand register, if applicable.
+ *
+ * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
+ * within insn. This function is to be used with 16-bit address encodings. The
+ * offs1 and offs2 will be written with the offset of the two registers
+ * indicated by the instruction. In cases where any of the registers is not
+ * referenced by the instruction, the value will be set to -EDOM.
+ *
+ * Return: 0 on success, -EINVAL on failure.
+ */
+static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
+int *offs1, int *offs2)
+{
+   /* 16-bit addressing can use one or two registers */
+   static const int regoff1[] = {
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, bx),
+   };
+
+   static const int regoff2[] = {
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   -EDOM,
+   -EDOM,
+   -EDOM,
+   -EDOM,
+   };
+
+   if (!offs1 || !offs2)
+   return -EINVAL;
+
+   /* operand is a register, use the generic function */
+   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+   *offs1 = insn_get_modrm_rm_off(insn, regs);
+   *offs2 = -EDOM;
+   return 0;
+   }
+
+   *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
+   *offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
+
+   /*
+* If the mod part of the ModRM byte indicates no displacement (mod
+* is 0) and the r/m part of the same byte is 6, no register is used
+* to calculate the operand address. An r/m part of 6 means that
+* the second register offset is already invalid.
+*/
+   if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
+   (X86_MODRM_RM(insn->modrm.value) == 6))
+   *offs1 = -EDOM;
+
+   return 0;
+}
+
+/**
  * get_desc() - Obtain address of segment descriptor
  * @sel:   Segment selector
  *
@@ -689,6 +756,94 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 }
 
 /**
+ * get_addr_re

[PATCH v7 21/26] x86: Add emulation code for UMIP instructions

2017-05-05 Thread Ricardo Neri
The User-Mode Instruction Prevention (UMIP) feature present in recent Intel
processors prevents a group of instructions from being executed with
CPL > 0. Attempting to execute them causes a general protection fault.

Rather than relaying this fault to the user space (in the form of a SIGSEGV
signal), the instructions protected by UMIP can be emulated to provide
dummy results. This preserves the current kernel behavior without
revealing the system resources that UMIP intends to protect (the global
descriptor and interrupt descriptor tables, the segment selectors of the
local descriptor table and the task state and the machine status word).

This emulation is needed because certain applications (e.g., WineHQ and
DOSEMU2) rely on this subset of instructions to function.

The instructions protected by UMIP can be split in two groups: those that
return a kernel memory address (sgdt and sidt) and those that return a
value (sldt, str and smsw).

For the instructions that return a kernel memory address, applications
such as WineHQ rely on the result being located in the kernel memory space.
The result is emulated as a hard-coded value that lies close to the top
of the kernel memory. The limits for the GDT and the IDT are set to zero.

Given that sldt and str are not commonly used in programs supported by
WineHQ and DOSEMU2, they are not emulated.

The instruction smsw is emulated to return the value that the register CR0
has at boot time as set in the head_32.

Care is taken to appropriately emulate the results when segmentation is
used. That is, rather than relying on USER_DS and USER_CS, the function
insn_get_addr_ref() inspects the segment descriptor pointed to by the
registers in pt_regs. This ensures that we correctly obtain the segment
base address and the address and operand sizes even if the user space
application uses a local descriptor table.
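As an illustration of the dummy result for sgdt and sidt, the sketch below
fills a buffer with a zero limit followed by a base address, which is the
6-byte layout these instructions store in 32-bit protected mode. The base
value and every name here are hypothetical placeholders; the hard-coded
values actually used live in umip.c and are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder base; the patch defines its own values. */
#define EXAMPLE_DUMMY_BASE	0xfffe0000UL

/* 16-bit limit followed by a 32-bit base. */
#define EXAMPLE_DT_RESULT_SIZE	6

/*
 * Write a dummy descriptor-table result into buf: a limit of zero and
 * the placeholder base, both in little-endian byte order as on x86.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t example_fill_dummy_dt(uint8_t *buf, size_t len)
{
	uint16_t limit = 0;
	uint32_t base = (uint32_t)EXAMPLE_DUMMY_BASE;

	if (len < EXAMPLE_DT_RESULT_SIZE)
		return 0;

	buf[0] = limit & 0xff;
	buf[1] = (limit >> 8) & 0xff;
	buf[2] = base & 0xff;
	buf[3] = (base >> 8) & 0xff;
	buf[4] = (base >> 16) & 0xff;
	buf[5] = (base >> 24) & 0xff;

	return EXAMPLE_DT_RESULT_SIZE;
}
```

In the kernel, a buffer like this would be handed to copy_to_user() rather
than written to directly.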

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/umip.h |  15 +++
 arch/x86/kernel/Makefile|   1 +
 arch/x86/kernel/umip.c  | 245 
 3 files changed, 261 insertions(+)
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c

diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
new file mode 100644
index 000..077b236
--- /dev/null
+++ b/arch/x86/include/asm/umip.h
@@ -0,0 +1,15 @@
+#ifndef _ASM_X86_UMIP_H
+#define _ASM_X86_UMIP_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_INTEL_UMIP
+bool fixup_umip_exception(struct pt_regs *regs);
+#else
+static inline bool fixup_umip_exception(struct pt_regs *regs)
+{
+   return false;
+}
+#endif  /* CONFIG_X86_INTEL_UMIP */
+#endif  /* _ASM_X86_UMIP_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4b99423..cc1b7cc 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -123,6 +123,7 @@ obj-$(CONFIG_EFI)   += sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)  += perf_regs.o
 obj-$(CONFIG_TRACING)  += tracepoint.o
 obj-$(CONFIG_SCHED_MC_PRIO)+= itmt.o
+obj-$(CONFIG_X86_INTEL_UMIP)   += umip.o
 
 ifdef CONFIG_FRAME_POINTER
 obj-y  += unwind_frame.o
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
new file mode 100644
index 000..c7c5795
--- /dev/null
+++ b/arch/x86/kernel/umip.c
@@ -0,0 +1,245 @@
+/*
+ * umip.c Emulation for instruction protected by the Intel User-Mode
+ * Instruction Prevention. The instructions are:
+ *sgdt
+ *sldt
+ *sidt
+ *str
+ *smsw
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ * Ricardo Neri <ricardo.n...@linux.intel.com>
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * == Base addresses of GDT and IDT
+ * To function, some applications rely on finding the global descriptor table (GDT)
+ * and th

[PATCH v7 22/26] x86/umip: Force a page fault when unable to copy emulated result to user

2017-05-05 Thread Ricardo Neri
fixup_umip_exception() will be called from do_general_protection. If the
former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
However, when emulation is successful but the emulated result cannot be
copied to user space memory, it is more accurate to issue a SIGSEGV with
SEGV_MAPERR with the offending address. A new function, inspired by
force_sig_info_fault(), is introduced to model the page fault.

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/kernel/umip.c | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index c7c5795..ff7366a 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -148,6 +148,41 @@ static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst,
 }
 
 /**
+ * __force_sig_info_umip_fault() - Force a SIGSEGV with SEGV_MAPERR
+ * @address:   Address that caused the signal
+ * @regs:  Register set containing the instruction pointer
+ *
+ * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This function is
+ * intended to be used to provide a segmentation fault when the result of the
+ * UMIP emulation could not be copied to the user space memory.
+ *
+ * Return: none
+ */
+static void __force_sig_info_umip_fault(void __user *address,
+   struct pt_regs *regs)
+{
+   siginfo_t info;
+   struct task_struct *tsk = current;
+
+   if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)) {
+   printk_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx error:%x in %lx\n",
+  tsk->comm, task_pid_nr(tsk), regs->ip,
+  regs->sp, X86_PF_USER | X86_PF_WRITE,
+  regs->ip);
+   }
+
+   tsk->thread.cr2 = (unsigned long)address;
+   tsk->thread.error_code  = X86_PF_USER | X86_PF_WRITE;
+   tsk->thread.trap_nr = X86_TRAP_PF;
+
+   info.si_signo   = SIGSEGV;
+   info.si_errno   = 0;
+   info.si_code= SEGV_MAPERR;
+   info.si_addr= address;
+   force_sig_info(SIGSEGV, &info, tsk);
+}
+
+/**
  * fixup_umip_exception() - Fixup #GP faults caused by UMIP
  * @regs:  Registers as saved when entering the #GP trap
  *
@@ -235,8 +270,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
if ((unsigned long)uaddr == -1L)
return false;
nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
-   if (nr_copied  > 0)
-   return false;
+   if (nr_copied  > 0) {
+   /*
+* If copy fails, send a signal and tell caller that
+* fault was fixed up
+*/
+   __force_sig_info_umip_fault(uaddr, regs);
+   return true;
+   }
}
 
/* increase IP to let the program keep going */
-- 
2.9.3



[PATCH v7 20/26] x86/cpufeature: Add User-Mode Instruction Prevention definitions

2017-05-05 Thread Ricardo Neri
User-Mode Instruction Prevention is a security feature present in new
Intel processors that, when set, prevents the execution of a subset of
instructions if such instructions are executed in user mode (CPL > 0).
Attempting to execute such instructions causes a general protection
exception.

The subset of instructions comprises:

 * SGDT - Store Global Descriptor Table
 * SIDT - Store Interrupt Descriptor Table
 * SLDT - Store Local Descriptor Table
 * SMSW - Store Machine Status Word
 * STR  - Store Task Register

This feature is also added to the list of disabled-features to allow
a cleaner handling of build-time configuration.
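The bit arithmetic behind these definitions can be checked with a short
user-space sketch. The feature number (16*32+2) and the CR4 bit (11) are
taken from the diff below; the EX_ and ex_ names are hypothetical helpers
written only for this illustration.

```c
#include <stdint.h>

#define EX_FEATURE_UMIP		(16 * 32 + 2)	/* word 16, bit 2 */
#define EX_CR4_UMIP_BIT		11		/* UMIP enable bit in CR4 */

/* Index of the 32-bit capability word that holds the feature. */
static unsigned int ex_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Position of the feature inside its capability word. */
static unsigned int ex_feature_bit(unsigned int feature)
{
	return feature & 31;
}

/* Mask used in the disabled-features word when the feature is off. */
static uint32_t ex_disable_mask(unsigned int feature)
{
	return UINT32_C(1) << (feature & 31);
}

/* Mask for a single control register bit. */
static unsigned long ex_cr4_mask(unsigned int bit)
{
	return 1UL << bit;
}
```

With these helpers, UMIP lands in word 16 with a disable mask of 0x4, which
is why DISABLE_UMIP is ORed into DISABLED_MASK16.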

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org

Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h  | 1 +
 arch/x86/include/asm/disabled-features.h| 8 +++-
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 2701e5f..f1d61d2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -289,6 +289,7 @@
 
 /* Intel-defined CPU features, CPUID level 0x0007:0 (ecx), word 16 */
 #define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
+#define X86_FEATURE_UMIP   (16*32+ 2) /* User Mode Instruction Prevention */
 #define X86_FEATURE_PKU    (16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE  (16*32+ 4) /* OS Protection Keys Enable */
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 5dff775..7adaef7 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -16,6 +16,12 @@
 # define DISABLE_MPX   (1<<(X86_FEATURE_MPX & 31))
 #endif
 
+#ifdef CONFIG_X86_INTEL_UMIP
+# define DISABLE_UMIP  0
+#else
+# define DISABLE_UMIP  (1<<(X86_FEATURE_UMIP & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME   (1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR   (1<<(X86_FEATURE_K6_MTRR & 31))
@@ -61,7 +67,7 @@
 #define DISABLED_MASK13 0
 #define DISABLED_MASK14 0
 #define DISABLED_MASK15 0
-#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57)
+#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
 #define DISABLED_MASK17 0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index 567de50..d2c2af8 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -104,6 +104,8 @@
 #define X86_CR4_OSFXSR _BITUL(X86_CR4_OSFXSR_BIT)
 #define X86_CR4_OSXMMEXCPT_BIT 10 /* enable unmasked SSE exceptions */
 #define X86_CR4_OSXMMEXCPT _BITUL(X86_CR4_OSXMMEXCPT_BIT)
+#define X86_CR4_UMIP_BIT   11 /* enable UMIP support */
+#define X86_CR4_UMIP   _BITUL(X86_CR4_UMIP_BIT)
 #define X86_CR4_VMXE_BIT   13 /* enable VMX virtualization */
 #define X86_CR4_VMXE   _BITUL(X86_CR4_VMXE_BIT)
 #define X86_CR4_SMXE_BIT   14 /* enable safer mode (TXT) */
-- 
2.9.3



[PATCH v7 04/26] x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is not 11b

2017-05-05 Thread Ricardo Neri
Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod != 11b and
ModRM.rm = 100b, indexed register-indirect addressing is used. In other
words, a SIB byte follows the ModRM byte. In the specific case of
SIB.index = 100b, the scale*index portion of the computation of the
effective address is null. To signal this particular situation to callers,
get_reg_offset() can return -EDOM (-EINVAL continues to indicate an
error when decoding the SIB byte).

An example of this situation can be the following instruction:

   8b 4c 23 80       mov -0x80(%rbx,%riz,1),%rcx
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][index:100b][base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)

The %riz 'register' indicates a null index.

In long mode, a REX prefix may be used. When a REX prefix is present,
REX.X adds a fourth bit to the register selection of SIB.index. This gives
the ability to refer to all the 16 general purpose registers. When REX.X is
1b and SIB.index is 100b, the index is indicated in %r12. In our example,
this would look like:

   42 8b 4c 23 80    mov -0x80(%rbx,%r12,1),%rcx
   REX:              0x42 [W:0b][R:0b][X:1b][B:0b]
   ModRM:            0x4c [mod:1b][reg:1b][rm:100b]
   SIB:              0x23 [scale:0b][.X: 1b, index:100b][.B:0b, base:11b]
   Displacement:     0x80  (1-byte, as per ModRM.mod = 1b)
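The two encodings above can be decoded with a small user-space sketch. It
only mirrors the rule described in this message and is not the kernel
implementation; all ex_ names are hypothetical, 32-bit or 64-bit addressing
is assumed, and rex is passed as 0 when no REX prefix is present.

```c
#include <stdbool.h>
#include <stdint.h>

static unsigned int ex_modrm_mod(uint8_t modrm) { return (modrm >> 6) & 3; }
static unsigned int ex_modrm_rm(uint8_t modrm)  { return modrm & 7; }
static unsigned int ex_sib_index(uint8_t sib)   { return (sib >> 3) & 7; }
static bool ex_rex_x(uint8_t rex)               { return (rex >> 1) & 1; }

/*
 * Return the index register number (0-15), or -1 when there is no
 * scale*index term: either no SIB byte is present or the index is null.
 */
static int ex_index_regno(uint8_t rex, uint8_t modrm, uint8_t sib)
{
	unsigned int regno;

	/* A SIB byte follows ModRM only if mod != 11b and rm == 100b. */
	if (ex_modrm_mod(modrm) == 3 || ex_modrm_rm(modrm) != 4)
		return -1;

	regno = ex_sib_index(sib);
	if (ex_rex_x(rex))
		regno += 8;

	/* index 100b without REX.X means no index register. */
	if (regno == 4)
		return -1;

	return (int)regno;
}
```

For the bytes 4c 23 without a REX prefix this yields -1 (null index); with
the REX prefix 0x42 it yields 12, that is, %r12.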

Cc: Borislav Petkov <b...@suse.de>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Nathan Howard <liverl...@gmail.com>
Cc: Adan Hawthorn <adanhawth...@gmail.com>
Cc: Joe Perches <j...@perches.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/mm/mpx.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index ebdead8..7397b81 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -110,6 +110,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
regno = X86_SIB_INDEX(insn->sib.value);
if (X86_REX_X(insn->rex_prefix.value))
regno += 8;
+   /*
+* If ModRM.mod != 3 and SIB.index = 4 (regno == 4), the
+* scale*index portion of the address computation is null.
+* This is true only if REX.X is 0. If REX.X is 1, regno is
+* 12 and the SIB index is used in the address computation.
+*/
+   if (X86_MODRM_MOD(insn->modrm.value) != 3 && regno == 4)
+   return -EDOM;
break;
 
case REG_TYPE_BASE:
@@ -159,11 +167,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
goto out_err;
 
 indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   if (indx_offset < 0)
+   /*
+* A negative offset generally means an error, except
+* -EDOM, which means that the contents of the register
+* should not be used as index.
+*/
+   if (indx_offset == -EDOM)
+   indx = 0;
+   else if (indx_offset < 0)
goto out_err;
+   else
+   indx = regs_get_register(regs, indx_offset);
 
base = regs_get_register(regs, base_offset);
-   indx = regs_get_register(regs, indx_offset);
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.9.3



[PATCH v7 23/26] x86/traps: Fixup general protection faults caused by UMIP

2017-05-05 Thread Ricardo Neri
If the User-Mode Instruction Prevention CPU feature is available and
enabled, a general protection fault will be issued if the instructions
sgdt, sldt, sidt, str or smsw are executed from user-mode context
(CPL > 0). If the fault was caused by any of the instructions protected
by UMIP, fixup_umip_exception will emulate dummy results for these
instructions. If emulation is successful, the result is passed to the
user space program and no SIGSEGV signal is emitted.

Please note that fixup_umip_exception also caters for the case when
the fault originated while running in virtual-8086 mode.

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Reviewed-by: Andy Lutomirski <l...@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/kernel/traps.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 3995d3a..cec548d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -526,6 +527,9 @@ do_general_protection(struct pt_regs *regs, long error_code)
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
cond_local_irq_enable(regs);
 
+   if (user_mode(regs) && fixup_umip_exception(regs))
+   return;
+
if (v8086_mode(regs)) {
local_irq_enable();
handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-- 
2.9.3



[PATCH v7 09/26] x86/insn-eval: Add utility function to identify string instructions

2017-05-05 Thread Ricardo Neri
String instructions are special because in protected mode, the linear
address is always obtained via the ES segment register in operands that
use the (E)DI register. Segment override prefixes are ignored. Non-string
instructions use DS as the default segment register and it can
be overridden with a segment override prefix.

This function will be used in a subsequent commit that introduces a
function to determine the segment register to use given the instruction,
operands and segment override prefixes.
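The segment rule stated above can be sketched in user space as follows. This
is only an illustration under simplifying assumptions: the ex_ names are
hypothetical, the opcode ranges cover the one-byte opcodes listed in the enum
of this patch, and operands based on (E)BP or (E)SP, which default to SS, are
deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

enum ex_seg {
	EX_SEG_NONE,	/* no segment override prefix present */
	EX_SEG_CS, EX_SEG_SS, EX_SEG_DS, EX_SEG_ES, EX_SEG_FS, EX_SEG_GS,
};

/* One-byte opcodes of the string instructions listed in the patch. */
static bool ex_is_string_opcode(uint8_t opcode)
{
	return (opcode >= 0x6c && opcode <= 0x6f) ||	/* ins, outs */
	       (opcode >= 0xa4 && opcode <= 0xa7) ||	/* movs, cmps */
	       (opcode >= 0xaa && opcode <= 0xaf);	/* stos, lods, scas */
}

/*
 * Pick the segment register for an operand. uses_di is true when the
 * operand is addressed through (E)DI.
 */
static enum ex_seg ex_operand_segment(uint8_t opcode, bool uses_di,
				      enum ex_seg override)
{
	/* (E)DI operands of string instructions always use ES. */
	if (ex_is_string_opcode(opcode) && uses_di)
		return EX_SEG_ES;

	/* Otherwise an override prefix wins over the default DS. */
	if (override != EX_SEG_NONE)
		return override;

	return EX_SEG_DS;
}
```

For example, movsb (0xa4) with an FS override still uses ES for its (E)DI
operand, while its (E)SI operand follows the override.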

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 67 
 1 file changed, 67 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 8b16761..1634762 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -16,6 +16,73 @@ enum reg_type {
REG_TYPE_BASE,
 };
 
+enum string_instruction {
+   INSB        = 0x6c,
+   INSW_INSD   = 0x6d,
+   OUTSB       = 0x6e,
+   OUTSW_OUTSD = 0x6f,
+   MOVSB       = 0xa4,
+   MOVSW_MOVSD = 0xa5,
+   CMPSB       = 0xa6,
+   CMPSW_CMPSD = 0xa7,
+   STOSB       = 0xaa,
+   STOSW_STOSD = 0xab,
+   LODSB       = 0xac,
+   LODSW_LODSD = 0xad,
+   SCASB       = 0xae,
+   SCASW_SCASD = 0xaf,
+};
+
+/**
+ * is_string_instruction - Determine if instruction is a string instruction
+ * @insn:  Instruction structure containing the opcode
+ *
+ * Return: true if the instruction, determined by the opcode, is any of the
+ * string instructions as defined in the Intel Software Developer's Manual.
+ * False otherwise.
+ */
+static bool is_string_instruction(struct insn *insn)
+{
+   insn_get_opcode(insn);
+
+   /* all string instructions have a 1-byte opcode */
+   if (insn->opcode.nbytes != 1)
+   return false;
+
+   switch (insn->opcode.bytes[0]) {
+   case INSB:
+   /* fall through */
+   case INSW_INSD:
+   /* fall through */
+   case OUTSB:
+   /* fall through */
+   case OUTSW_OUTSD:
+   /* fall through */
+   case MOVSB:
+   /* fall through */
+   case MOVSW_MOVSD:
+   /* fall through */
+   case CMPSB:
+   /* fall through */
+   case CMPSW_CMPSD:
+   /* fall through */
+   case STOSB:
+   /* fall through */
+   case STOSW_STOSD:
+   /* fall through */
+   case LODSB:
+   /* fall through */
+   case LODSW_LODSD:
+   /* fall through */
+   case SCASB:
+   /* fall through */
+   case SCASW_SCASD:
+   return true;
+   default:
+   return false;
+   }
+}
+
 static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
  enum reg_type type)
 {
-- 
2.9.3



[PATCH v7 07/26] x86/insn-eval: Do not BUG on invalid register type

2017-05-05 Thread Ricardo Neri
We are not in a critical failure path. The invalid register type is caused
by trying to decode invalid instruction bytes from a user-space program.
Thus, simply print an error message. To prevent this warning from being
abused by user space programs, use the rate-limited variant of printk.

Cc: Borislav Petkov <b...@suse.de>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index e746a6f..182e2ae 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -5,6 +5,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -85,9 +86,8 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
break;
 
default:
-   pr_err("invalid register type");
-   BUG();
-   break;
+   printk_ratelimited(KERN_ERR "insn-eval: x86: invalid register type");
+   return -EINVAL;
}
 
if (regno >= nr_registers) {
-- 
2.9.3



[PATCH v7 15/26] x86/insn-eval: Incorporate segment base and limit in linear address computation

2017-05-05 Thread Ricardo Neri
insn_get_addr_ref() returns the effective address as defined by
Section 3.7.5.1, Volume 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. Furthermore, the segment descriptor to use depends on the
register that is used as the base of the effective address. The effective
base address varies depending on whether the operand is a register or a
memory address and on whether a SIB byte is used.

In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user program defines its own segments. This
is possible by using a local descriptor table.

Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final
effective address.

Before returning the linear address, we check if the computed effective
address is within the segment limit. In long mode, segment limits are not
enforced. We can keep the check because get_seg_limit() returns -1L in
this case.
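
The limit check and base addition described above can be sketched in
isolation. This is a minimal user-space illustration, not the kernel code;
the helper name linear_from_effective() is hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: combine an effective address
 * with a segment base after checking it against the segment limit.
 * Returns false when the effective address exceeds the limit.
 */
static bool linear_from_effective(long eff_addr, unsigned long seg_base,
				  unsigned long seg_limit,
				  unsigned long *linear)
{
	unsigned long ea = (unsigned long)eff_addr;

	/* A limit of -1L converts to ULONG_MAX, so the check cannot fail. */
	if (ea > seg_limit)
		return false;

	/* The unsigned base is added only after the limit check. */
	*linear = ea + seg_base;
	return true;
}
```

Keeping the base in a separate unsigned variable avoids mixing it into the
signed arithmetic used for the effective address.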

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 4f600de..1a5f5a6 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -695,7 +695,7 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-   unsigned long linear_addr;
+   unsigned long linear_addr, seg_base_addr, seg_limit;
long eff_addr, base, indx;
int addr_offset, base_offset, indx_offset;
insn_byte_t sib;
@@ -709,6 +709,10 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
if (addr_offset < 0)
goto out_err;
eff_addr = regs_get_register(regs, addr_offset);
+   seg_base_addr = insn_get_seg_base(regs, insn, addr_offset);
+   if (seg_base_addr == -1L)
+   goto out_err;
+   seg_limit = get_seg_limit(regs, insn, addr_offset);
} else {
if (insn->sib.nbytes) {
/*
@@ -734,6 +738,11 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
indx = regs_get_register(regs, indx_offset);
 
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   seg_base_addr = insn_get_seg_base(regs, insn,
+ base_offset);
+   if (seg_base_addr == -1L)
+   goto out_err;
+   seg_limit = get_seg_limit(regs, insn, base_offset);
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
/*
@@ -751,10 +760,25 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
} else {
eff_addr = regs_get_register(regs, addr_offset);
}
+   seg_base_addr = insn_get_seg_base(regs, insn,
+ addr_offset);
+   if (seg_base_addr == -1L)
+   goto out_err;
+   seg_limit = get_seg_limit(regs, insn, addr_offset);
}
eff_addr += insn->displacement.value;
}
+
linear_addr = (unsigned long)eff_addr;
+   /*
+* Make sure the effective address is within the limits of the
+* segment. In long mode, the limit is -1L. Thus, the second part
+* of the check always succeeds.
+*/
+   if (linear_addr > seg_limit)
+   goto out_err;
+
+   linear_addr += seg_base_addr;
 
return (void __user *)linear_addr;
 out_err:
-- 
2.9.3


[PATCH v7 11/26] x86/insn-eval: Add utility function to get segment descriptor

2017-05-05 Thread Ricardo Neri
The segment descriptor contains information that is relevant to how linear
addresses need to be computed. It contains the default size of addresses as
well as the base address of the segment. Thus, given a segment selector,
we ought to look at the segment descriptor to correctly calculate the linear
address.

In protected mode, the segment selector might indicate a segment
descriptor from either the global descriptor table or a local descriptor
table. Both cases are considered in this function.

This function is the initial implementation for subsequent functions that
will obtain the aforementioned attributes of the segment descriptor.
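
As an illustration of how a selector maps to a descriptor table entry, the
fields of a 16-bit selector can be decoded as below. This is only a sketch;
the structure and function names are hypothetical and do not appear in the
patch.

```c
#include <stdint.h>

/* Fields of a 16-bit x86 segment selector (names are illustrative). */
struct selector_fields {
	unsigned int rpl;	/* bits [1:0]: requested privilege level */
	unsigned int use_ldt;	/* bit 2: table indicator, 1 means LDT */
	unsigned int index;	/* bits [15:3]: descriptor index */
	unsigned int byte_off;	/* index * 8: byte offset into the table */
};

static struct selector_fields decode_selector(uint16_t sel)
{
	struct selector_fields f;

	f.rpl = sel & 0x3;
	f.use_ldt = (sel >> 2) & 0x1;
	f.index = sel >> 3;
	/* Descriptors are 8 bytes, so clearing bits [2:0] gives index * 8. */
	f.byte_off = sel & ~0x7u;
	return f;
}
```

For example, selector 0x33 decodes to index 6 in the GDT with RPL 3, at
byte offset 48 from the start of the table.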

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 55 
 1 file changed, 55 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 0a496f4..f46cb31 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -6,9 +6,13 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 enum reg_type {
@@ -421,6 +425,57 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_desc() - Obtain address of segment descriptor
+ * @sel:   Segment selector
+ *
+ * Given a segment selector, obtain a pointer to the segment descriptor.
+ * Both global and local descriptor tables are supported.
+ *
+ * Return: pointer to segment descriptor on success. NULL on failure.
+ */
+static struct desc_struct *get_desc(unsigned short sel)
+{
+   struct desc_ptr gdt_desc = {0, 0};
+   struct desc_struct *desc = NULL;
+   unsigned long desc_base;
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+   if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+   /* Bits [15:3] contain the index of the desired entry. */
+   sel >>= 3;
+
+   mutex_lock(&current->active_mm->context.lock);
+   /* The size of the LDT refers to the number of entries. */
+   if (!current->active_mm->context.ldt ||
+   sel >= current->active_mm->context.ldt->size) {
+   mutex_unlock(&current->active_mm->context.lock);
+   return NULL;
+   }
+
+   desc = &current->active_mm->context.ldt->entries[sel];
+   mutex_unlock(&current->active_mm->context.lock);
+   return desc;
+   }
+#endif
+   native_store_gdt(&gdt_desc);
+
+   /*
+* Segment descriptors have a size of 8 bytes. Thus, the index is
+* multiplied by 8 to obtain the offset of the desired descriptor from
+* the start of the GDT. As bits [15:3] of the segment selector contain
+* the index, it can be regarded as multiplied by 8 already. All that
+* remains is to clear bits [2:0].
+*/
+   desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+   if (desc_base > gdt_desc.size)
+   return NULL;
+
+   desc = (struct desc_struct *)(gdt_desc.address + desc_base);
+   return desc;
+}
+
+/**
  * insn_get_reg_offset_modrm_rm() - Obtain register in r/m part of ModRM byte
  * @insn:  Instruction structure containing the ModRM byte
  * @regs:  Structure with register values as seen when entering kernel mode
-- 
2.9.3



[PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

2017-05-05 Thread Ricardo Neri
The 32-bit and 64-bit address encodings are identical. This means that we
can use the same function in both cases. In order to reuse the function
for 32-bit address encodings, we must sign-extend our 32-bit signed
operands to 64-bit signed variables (only for 64-bit builds). To decide on
whether sign extension is needed, we rely on the address size as given by
the instruction structure.

Once the effective address has been computed, a special verification is
needed for 32-bit processes. If running on a 64-bit kernel, such processes
can address up to 4GB of memory. Hence, for instance, an effective
address of 0xffff1234 would be misinterpreted as 0xffffffffffff1234 due to
the sign extension mentioned above. For this reason, the 4 most
significant bytes must be truncated to obtain the true effective address.

Lastly, before computing the linear address, we verify that the effective
address is within the limits of the segment. The check is kept for long
mode because in such a case the limit is set to -1L. This is the largest
unsigned number possible. This is equivalent to a limit-less segment.
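
The sign extension and the later truncation can be shown with a small
stand-alone sketch. The function names are hypothetical, and 0xffff1234 is
used only as an illustrative 32-bit value with its top bit set.

```c
#include <stdint.h>

/*
 * Sign-extend a 32-bit register value into a 64-bit signed operand.
 * The cast to int32_t relies on the usual two's complement conversion.
 */
static int64_t sign_extend32(uint32_t val)
{
	return (int64_t)(int32_t)val;
}

/* Keep only the 4 least significant bytes of a 64-bit effective address. */
static uint64_t truncate_to_32(int64_t eff_addr)
{
	return (uint64_t)eff_addr & 0xffffffffULL;
}
```

With these helpers, sign_extend32(0xffff1234) yields a negative 64-bit
value, and truncate_to_32() of that value gives back 0xffff1234, which is
the address a 32-bit process actually meant.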

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 99 ++--
 1 file changed, 88 insertions(+), 11 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 1a5f5a6..c7c1239 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -688,6 +688,62 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
return get_reg_offset(insn, regs, REG_TYPE_RM);
 }
 
+/**
+ * _to_signed_long() - Cast an unsigned long into signed long
+ * @val        A 32-bit or 64-bit unsigned long
+ * @long_bytes The number of bytes used to represent a long number
+ * @out        The resulting signed long
+ *
+ * Return: 0 on success, with @out holding a signed long of either 32 or 64
+ * bits, as per the build configuration of the kernel. -EINVAL on failure.
+ */
+static int _to_signed_long(unsigned long val, int long_bytes, long *out)
+{
+   if (!out)
+   return -EINVAL;
+
+#ifdef CONFIG_X86_64
+   if (long_bytes == 4) {
+   /* higher bytes should all be zero */
+   if (val & ~0xffffffffUL)
+   return -EINVAL;
+
+   /* sign-extend to a 64-bit long */
+   *out = (long)((int)(val));
+   return 0;
+   } else if (long_bytes == 8) {
+   *out = (long)val;
+   return 0;
+   } else {
+   return -EINVAL;
+   }
+#else
+   *out = (long)val;
+   return 0;
+#endif
+}
+
+/** get_mem_offset() - Obtain the memory offset indicated in operand register
+ * @regs   Structure with register values as seen when entering kernel mode
+ * @reg_offset Offset from the base of pt_regs of the operand register
+ * @addr_size  Address size of the code segment in use
+ *
+ * Obtain the offset (a signed number with size as specified in addr_size)
+ * indicated in the register used for register-indirect memory addressing.
+ *
+ * Return: A memory offset to be used in the computation of effective address.
+ */
+long get_mem_offset(struct pt_regs *regs, int reg_offset, int addr_size)
+{
+   int ret;
+   long offset = -1L;
+   unsigned long uoffset = regs_get_register(regs, reg_offset);
+
+   ret = _to_signed_long(uoffset, addr_size, &offset);
+   if (ret)
+   return -1L;
+   return offset;
+}
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
@@ -697,18 +753,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
unsigned long linear_addr, seg_base_addr, seg_limit;
long eff_addr, base, indx;
-   int addr_offset, base_offset, indx_offset;
+   int addr_offset, base_offset, indx_offset, addr_bytes;
insn_byte_t sib;
 
insn_get_modrm(insn);
insn_get_sib(insn);
sib = insn->sib.value;
+   addr_bytes = insn->addr_bytes;
 
if (X86_MODRM_MOD(insn->modrm.value) == 3) {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
if (addr_offset < 0)
goto out_err;
-   eff_addr = reg

[PATCH v7 13/26] x86/insn-eval: Add function to get default params of code segment

2017-05-05 Thread Ricardo Neri
This function returns the default values of the address and operand sizes
as specified in the segment descriptor. This information is determined
from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
32-bit legacy modes. For virtual-8086 mode, the default address and
operand sizes are always 2 bytes.

The D bit is only meaningful for code segments. Thus, these functions
always use the code segment selector contained in regs.

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  6 
 arch/x86/lib/insn-eval.c | 65 
 2 files changed, 71 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 7f3c7fe..9ed1c88 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -11,9 +11,15 @@
 #include 
 #include 
 
+struct insn_code_seg_defaults {
+   unsigned char address_bytes;
+   unsigned char operand_bytes;
+};
+
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
 unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
int regoff);
+struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index c77ed80..693e5a8 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -603,6 +603,71 @@ static unsigned long get_seg_limit(struct pt_regs *regs, struct insn *insn,
 }
 
 /**
+ * insn_get_code_seg_defaults() - Obtain code segment default parameters
+ * @regs:  Structure with register values as seen when entering kernel mode
+ *
+ * Obtain the default parameters of the code segment: address and operand
+ * sizes. The code segment is obtained from the selector contained in the CS
+ * register in regs. In protected mode, the default address is determined by
+ * inspecting the L and D bits of the segment descriptor. In virtual-8086 mode,
+ * the default is always two bytes for both address and operand sizes.
+ *
+ * Return: A populated insn_code_seg_defaults structure on success. The
+ * structure contains only zeros on failure.
+ */
+struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs *regs)
+{
+   struct desc_struct *desc;
+   struct insn_code_seg_defaults defs;
+   unsigned short sel;
+   /*
+* The most significant bit of the type field selected by AR_TYPE_MASK
+* (bit 11) determines whether a segment contains data or code.
+*/
+   unsigned int type_mask = AR_TYPE_MASK & (1 << 11);
+
+   memset(&defs, 0, sizeof(defs));
+
+   if (v8086_mode(regs)) {
+   defs.address_bytes = 2;
+   defs.operand_bytes = 2;
+   return defs;
+   }
+
+   sel = (unsigned short)regs->cs;
+
+   desc = get_desc(sel);
+   if (!desc)
+   return defs;
+
+   /* if data segment, return */
+   if (!(desc->b & type_mask))
+   return defs;
+
+   switch ((desc->l << 1) | desc->d) {
+   case 0: /* Legacy mode. CS.L=0, CS.D=0 */
+   defs.address_bytes = 2;
+   defs.operand_bytes = 2;
+   break;
+   case 1: /* Legacy mode. CS.L=0, CS.D=1 */
+   defs.address_bytes = 4;
+   defs.operand_bytes = 4;
+   break;
+   case 2: /* IA-32e 64-bit mode. CS.L=1, CS.D=0 */
+   defs.address_bytes = 8;
+   defs.operand_bytes = 4;
+   break;
+   case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+   /* fall through */
+   default:
+   defs.address_bytes = 0;
+   defs.operand_bytes = 0;
+   }
+
+   return defs;
+}
+
+/**
  * insn_get_reg_offset_modrm_rm() - Obtain register in r/m part of ModRM byte
  * @insn:  Instruction structure containing the ModRM byte
  * @regs:  Structure with register values as seen when entering kernel mode
-- 
2.9.3



[PATCH v7 12/26] x86/insn-eval: Add utility functions to get segment descriptor base address and limit

2017-05-05 Thread Ricardo Neri
With segmentation, the base address of the segment descriptor is needed
to compute a linear address. The segment descriptor used in the address
computation depends on either any segment override prefixes in the
instruction or the default segment determined by the registers involved
in the address computation. Thus, both the instruction as well as the
register (specified as the offset from the base of pt_regs) are given as
inputs, along with a boolean variable to select between override and
default.

The segment selector is determined by get_seg_selector() with the inputs
described above. Once the selector is known, the base address is
determined. In protected mode, the selector is used to obtain the segment
descriptor and then its base address. If in 64-bit user mode, the segment
base address is zero except when FS or GS are used. In virtual-8086 mode,
the base address is computed as the value of the segment selector shifted 4
positions to the left.

In protected mode, segment limits are enforced. Thus, a function to
determine the limit of the segment is added. Segment limits are not
enforced in long mode or virtual-8086 mode. For the latter, addresses are
limited to 20 bits; address size will be handled when computing the linear
address.
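
The virtual-8086 case described above is simple enough to show on its own.
The following is a sketch with hypothetical helper names; it deliberately
leaves out any address-size handling, which the series defers to the linear
address computation.

```c
#include <stdint.h>

/* In virtual-8086 mode the base is the selector shifted left by 4 bits. */
static uint32_t v8086_segment_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}

/* Base plus a 16-bit offset, with no address-size masking applied. */
static uint32_t v8086_base_plus_offset(uint16_t sel, uint16_t offset)
{
	return v8086_segment_base(sel) + offset;
}
```

For instance, selector 0x1234 gives a base of 0x12340, so offset 0x0010
in that segment refers to 0x12350.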

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |   2 +
 arch/x86/lib/insn-eval.c | 127 +++
 2 files changed, 129 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 7e8c963..7f3c7fe 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -13,5 +13,7 @@
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
+unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
+   int regoff);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index f46cb31..c77ed80 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -476,6 +476,133 @@ static struct desc_struct *get_desc(unsigned short sel)
 }
 
 /**
+ * insn_get_seg_base() - Obtain base address of segment descriptor.
+ * @regs:  Structure with register values as seen when entering kernel mode
+ * @insn:  Instruction structure with selector override prefixes
+ * @regoff:    Operand offset, in pt_regs, of which the selector is needed
+ *
+ * Obtain the base address of the segment descriptor as indicated by either
+ * any segment override prefixes contained in insn or the default segment
+ * applicable to the register indicated by regoff. regoff is specified as the
+ * offset in bytes from the base of pt_regs.
+ *
+ * Return: In protected mode, base address of the segment. Zero in long
+ * mode, except when FS or GS are used. In virtual-8086 mode, the segment
+ * selector shifted 4 positions to the left. -1L in case of
+ * error.
+ */
+unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
+   int regoff)
+{
+   struct desc_struct *desc;
+   unsigned short sel;
+   enum segment_register seg_reg;
+
+   seg_reg = resolve_seg_register(insn, regs, regoff);
+   if (seg_reg == SEG_REG_INVAL)
+   return -1L;
+
+   sel = get_segment_selector(regs, seg_reg);
+   if ((short)sel < 0)
+   return -1L;
+
+   if (v8086_mode(regs))
+   /*
+* Base is simply the segment selector shifted 4
+* positions to the left.
+*/
+   return (unsigned long)(sel << 4);
+
+   if (user_64bit_mode(regs)) {
+   /*
+* Only FS or GS will have a base address, the rest of
+* the segments' bases are forced to 0.
+*/
+   unsigned long base;
+
+   if (seg_reg == SEG_REG_FS)
+   rdmsrl(MSR_FS_BASE, base);
+   else if (seg_reg == SEG_REG_GS)
+   /*
+* swapgs was called at the kernel entry point. Thus,
+* MSR_KERNEL_GS_BASE will have th

[PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector

2017-05-05 Thread Ricardo Neri
When computing a linear address and segmentation is used, we need to know
the base address of the segment involved in the computation. In most
cases, the segment base address will be zero as in USER_DS/USER32_DS.
However, it may be possible that a user space program defines its own
segments via a local descriptor table. In such a case, the segment base
address may not be zero. Thus, the segment base address is needed to
correctly calculate the linear address.

The segment selector to be used when computing a linear address is
determined either by any segment override prefixes in the instruction or
inferred from the registers involved in the computation of the effective
address, in that order. Also, there are cases when the overrides shall be
ignored (code segments are always selected by the CS segment register;
string instructions always use the ES segment register along with the EDI
register).

For clarity, this process can be split into two steps: resolving the
relevant segment register to use and, once known, reading its value to
obtain the segment selector.

The method to obtain the segment selector depends on several factors. In
32-bit builds, segment selectors are saved into the pt_regs structure
when switching to kernel mode. The same is also true for virtual-8086
mode. In 64-bit builds, segmentation is mostly ignored, except when
running a program in 32-bit legacy mode. In this case, CS and SS can be
obtained from pt_regs. DS, ES, FS and GS can be read directly from
the respective segment registers.

Lastly, the only two segment registers that are not ignored in long mode
are FS and GS. In these two cases, base addresses are obtained from the
respective MSRs.
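
The resolution order described above can be sketched as a small decision
function. This is an illustration only: the enums and pick_segment() are
hypothetical, and the SS default for (E)BP and (E)SP is the architectural
default rather than something taken from this patch.

```c
/* Illustrative register and segment identifiers, not the patch's enums. */
enum gp_reg { REG_AX, REG_BX, REG_CX, REG_DX, REG_SI, REG_DI, REG_BP,
	      REG_SP, REG_IP };
enum seg { SEG_NONE, SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS };

static enum seg pick_segment(enum gp_reg reg, enum seg override,
			     int is_string_insn)
{
	/* Overrides are ignored: instruction fetches always use CS. */
	if (reg == REG_IP)
		return SEG_CS;

	/* Overrides are ignored: string destinations always use ES. */
	if (reg == REG_DI && is_string_insn)
		return SEG_ES;

	/* Otherwise an override prefix, if present, takes precedence. */
	if (override != SEG_NONE)
		return override;

	/* Defaults: stack-relative registers use SS, the rest use DS. */
	if (reg == REG_BP || reg == REG_SP)
		return SEG_SS;

	return SEG_DS;
}
```

Only after this step is the chosen register read to obtain the selector.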

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 256 +++
 1 file changed, 256 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 1634762..0a496f4 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 enum reg_type {
REG_TYPE_RM = 0,
@@ -33,6 +34,17 @@ enum string_instruction {
SCASW_SCASD = 0xaf,
 };
 
+enum segment_register {
+   SEG_REG_INVAL = -1,
+   SEG_REG_IGNORE = 0,
+   SEG_REG_CS = 0x23,
+   SEG_REG_SS = 0x36,
+   SEG_REG_DS = 0x3e,
+   SEG_REG_ES = 0x26,
+   SEG_REG_FS = 0x64,
+   SEG_REG_GS = 0x65,
+};
+
 /**
  * is_string_instruction - Determine if instruction is a string instruction
  * @insn:  Instruction structure containing the opcode
@@ -83,6 +95,250 @@ static bool is_string_instruction(struct insn *insn)
}
 }
 
+/**
+ * resolve_seg_register() - obtain segment register
+ * @insn:  Instruction structure with segment override prefixes
+ * @regs:  Structure with register values as seen when entering kernel mode
+ * @regoff:    Operand offset, in pt_regs, used to determine segment register
+ *
+ * The segment register to which an effective address refers depends on
+ * a) whether segment override prefixes must be ignored: always use CS when
+ * the register is (R|E)IP; always use ES when operand register is (E)DI with
+ * string instructions as defined in the Intel documentation. b) Whether
+ * segment override prefixes are used in the instruction. c) Otherwise, the
+ * default segment register associated with the operand register is used.
+ *
+ * The operand register, regoff, is represented as the offset from the base of
+ * pt_regs. Also, regoff can be -EDOM for cases in which registers are not
+ * used as operands (e.g., displacement-only memory addressing).
+ *
+ * This function returns the segment register as value from an enumeration
+ * as per the conditions described above. Please note that this function
+ * does not return the value in the segment register (i.e., the segment
+ * selector). The segment selector needs to be obtained using
+ * get_segment_selector() and passing the segment register resolved by
+ * this function.
+ *
+ * Return: Enumerated segment register to use, among CS, SS, DS, ES, FS, GS,
+ * ignore (in 64-bit mode as applicable), or -EINVAL in case of error.
+ */
+static enum segment_register resolve_seg_register

[PATCH v7 14/26] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 5

2017-05-05 Thread Ricardo Neri
Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod is zero and
ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
that none of the registers are used in the computation of the effective
address. A return value of -EDOM signals callers that they should not use
the value of registers when computing the effective address for the
instruction.

In IA-32e 64-bit mode (long mode), the effective address is given by the
32-bit displacement plus the value of RIP of the next instruction.
In IA-32e compatibility mode (protected mode), only the displacement is
used.

The instruction decoder takes care of obtaining the displacement.
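
The displacement-only case can be illustrated outside the kernel as
follows. The helper names are hypothetical, and the sketch covers only
32-bit and 64-bit addressing.

```c
#include <stdbool.h>
#include <stdint.h>

/* ModRM layout: mod in bits [7:6], reg in bits [5:3], rm in bits [2:0]. */
static unsigned int modrm_mod(uint8_t modrm)
{
	return (modrm >> 6) & 0x3;
}

static unsigned int modrm_rm(uint8_t modrm)
{
	return modrm & 0x7;
}

/* mod == 0 and rm == 5: a 32-bit displacement follows, no base register. */
static bool is_disp32_only(uint8_t modrm)
{
	return modrm_mod(modrm) == 0 && modrm_rm(modrm) == 5;
}

/* In long mode the displacement is relative to the next instruction. */
static int64_t disp32_effective_address(bool long_mode, uint64_t ip,
					unsigned int insn_len, int32_t disp)
{
	int64_t eff_addr = 0;

	if (long_mode)
		eff_addr = (int64_t)(ip + insn_len);

	return eff_addr + disp;
}
```

With ip = 0x400000, a 7-byte instruction and a displacement of 0x100, the
long mode result is 0x400107, while outside long mode it is just 0x100.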

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 693e5a8..4f600de 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -379,6 +379,12 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
switch (type) {
case REG_TYPE_RM:
regno = X86_MODRM_RM(insn->modrm.value);
+   /*
+* ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement
+* follows the ModRM byte.
+*/
+   if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)
+   return -EDOM;
if (X86_REX_B(insn->rex_prefix.value))
regno += 8;
break;
@@ -730,9 +736,21 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-   if (addr_offset < 0)
+   /*
+* -EDOM means that we must ignore the address_offset.
+* In such a case, in 64-bit mode the effective address is
+* relative to the RIP of the following instruction.
+*/
+   if (addr_offset == -EDOM) {
+   eff_addr = 0;
+   if (user_64bit_mode(regs))
+   eff_addr = (long)regs->ip +
+  insn->length;
+   } else if (addr_offset < 0) {
goto out_err;
-   eff_addr = regs_get_register(regs, addr_offset);
+   } else {
+   eff_addr = regs_get_register(regs, addr_offset);
+   }
}
eff_addr += insn->displacement.value;
}
-- 
2.9.3



[PATCH v7 24/26] x86: Enable User-Mode Instruction Prevention

2017-05-05 Thread Ricardo Neri
User-Mode Instruction Prevention (UMIP) is enabled or disabled by setting
or clearing a bit in %cr4.

It makes sense to enable UMIP at some point while booting, before user
space comes up. As with SMAP and SMEP, it is not critical to have it enabled
very early during boot. This is because UMIP is relevant only when there is
a userspace to be protected from. Given the similarities in relevance, it
makes sense to enable UMIP along with SMAP and SMEP.

UMIP is enabled by default. It can be disabled by adding clearcpuid=514
to the kernel parameters.
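
The number 514 can be unpacked with a little arithmetic. The sketch below
assumes the kernel's usual feature numbering of word * 32 + bit and assumes
that the UMIP control is bit 11 of %cr4; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_FEATURE_WORD	32u
#define CR4_UMIP_BIT		11u	/* assumption: CR4.UMIP is bit 11 */

/* Split a feature number, as passed to clearcpuid=, into word and bit. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / BITS_PER_FEATURE_WORD;
}

static unsigned int feature_bit(unsigned int feature)
{
	return feature % BITS_PER_FEATURE_WORD;
}

/* Return the CR4 value with the UMIP bit set or cleared. */
static uint64_t cr4_apply_umip(uint64_t cr4, bool enable)
{
	uint64_t mask = UINT64_C(1) << CR4_UMIP_BIT;

	return enable ? (cr4 | mask) : (cr4 & ~mask);
}
```

Under that numbering, 514 is bit 2 of feature word 16.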

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/Kconfig | 10 ++
 arch/x86/kernel/cpu/common.c | 16 +++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 702002b..1b1bbeb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1745,6 +1745,16 @@ config X86_SMAP
 
  If unsure, say Y.
 
+config X86_INTEL_UMIP
+   def_bool y
+   depends on CPU_SUP_INTEL
+   prompt "Intel User Mode Instruction Prevention" if EXPERT
+   ---help---
+ The User Mode Instruction Prevention (UMIP) is a security
+ feature in newer Intel processors. If enabled, a general
+ protection fault is issued if the instructions SGDT, SLDT,
+ SIDT, SMSW and STR are executed in user mode.
+
 config X86_INTEL_MPX
prompt "Intel MPX (Memory Protection Extensions)"
def_bool n
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8ee3211..66ebded 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -311,6 +311,19 @@ static __always_inline void setup_smap(struct cpuinfo_x86 *c)
}
 }
 
+static __always_inline void setup_umip(struct cpuinfo_x86 *c)
+{
+   if (cpu_feature_enabled(X86_FEATURE_UMIP) &&
+   cpu_has(c, X86_FEATURE_UMIP))
+   cr4_set_bits(X86_CR4_UMIP);
+   else
+   /*
+* Make sure UMIP is disabled in case it was enabled in a
+* previous boot (e.g., via kexec).
+*/
+   cr4_clear_bits(X86_CR4_UMIP);
+}
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1121,9 +1134,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
 
-   /* Set up SMEP/SMAP */
+   /* Set up SMEP/SMAP/UMIP */
setup_smep(c);
setup_smap(c);
+   setup_umip(c);
 
/*
 * The vendor-specific functions might have changed features.
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 26/26] selftests/x86: Add tests for instruction str and sldt

2017-05-05 Thread Ricardo Neri
The instructions str and sldt are not recognized when running in
virtual-8086 mode and generate an invalid opcode exception (#UD). These
two instructions are protected by the Intel User-Mode Instruction
Prevention (UMIP) security feature. In protected mode, if UMIP is enabled,
these instructions generate a general protection fault if called from
CPL > 0. Linux traps the general protection fault and emulates the results
with dummy values.

These tests are added to verify that the emulation code does not emulate
these two instructions but issues the expected invalid opcode exception.

Tests fall back to exiting with int3 in case emulation does happen.

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index 130e8ad..b7a0c90 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -111,6 +111,11 @@ asm (
"smsw %ax\n\t"
"mov %ax, (2080)\n\t"
"int3\n\t"
+   "vmcode_umip_str:\n\t"
+   "str %eax\n\t"
+   "vmcode_umip_sldt:\n\t"
+   "sldt %eax\n\t"
+   "int3\n\t"
".size vmcode, . - vmcode\n\t"
"end_vmcode:\n\t"
".code32\n\t"
@@ -119,7 +124,8 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-   vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[];
+   vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[],
+   vmcode_umip_str[], vmcode_umip_sldt[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -226,6 +232,16 @@ void do_umip_tests(struct vm86plus_struct *vm86, unsigned char *test_mem)
printf("[FAIL]\tAll the results of SIDT should be the same.\n");
else
printf("[PASS]\tAll the results from SIDT are identical.\n");
+
+   sethandler(SIGILL, sighandler, 0);
+   do_test(vm86, vmcode_umip_str - vmcode, VM86_SIGNAL, 0,
+   "STR instruction");
+   clearhandler(SIGILL);
+
+   sethandler(SIGILL, sighandler, 0);
+   do_test(vm86, vmcode_umip_sldt - vmcode, VM86_SIGNAL, 0,
+   "SLDT instruction");
+   clearhandler(SIGILL);
 }
 
 int main(void)
-- 
2.9.3



[PATCH v7 25/26] selftests/x86: Add tests for User-Mode Instruction Prevention

2017-05-05 Thread Ricardo Neri
Certain user space programs that run in virtual-8086 mode may utilize
instructions protected by the User-Mode Instruction Prevention (UMIP)
security feature present in new Intel processors: SGDT, SIDT and SMSW. In
such a case, a general protection fault is issued if UMIP is enabled. When
such a fault happens, the kernel traps it and emulates the results of
these instructions with dummy values. The purpose of this new
test is to verify whether the impacted instructions can be executed
without causing such #GP. If no #GP exceptions occur, we expect to exit
virtual-8086 mode from INT3.

The instructions protected by UMIP are executed in representative use
cases:
 a) displacement-only memory addressing
 b) register-indirect memory addressing
 c) results stored directly in operands

Unfortunately, it is not possible to check the results against a set of
expected values because no emulation will occur in systems that do not
have the UMIP feature. Instead, results are printed for verification. A
simple verification is done to ensure that results of all tests are
identical.

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 73 ++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index d075ea0..130e8ad 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -95,6 +95,22 @@ asm (
"int3\n\t"
"vmcode_int80:\n\t"
"int $0x80\n\t"
+   "vmcode_umip:\n\t"
+   /* addressing via displacements */
+   "smsw (2052)\n\t"
+   "sidt (2054)\n\t"
+   "sgdt (2060)\n\t"
+   /* addressing via registers */
+   "mov $2066, %bx\n\t"
+   "smsw (%bx)\n\t"
+   "mov $2068, %bx\n\t"
+   "sidt (%bx)\n\t"
+   "mov $2074, %bx\n\t"
+   "sgdt (%bx)\n\t"
+   /* register operands, only for smsw */
+   "smsw %ax\n\t"
+   "mov %ax, (2080)\n\t"
+   "int3\n\t"
".size vmcode, . - vmcode\n\t"
"end_vmcode:\n\t"
".code32\n\t"
@@ -103,7 +119,7 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-   vmcode_sti[], vmcode_int3[], vmcode_int80[];
+   vmcode_sti[], vmcode_int3[], vmcode_int80[], vmcode_umip[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -160,6 +176,58 @@ static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
return true;
 }
 
+void do_umip_tests(struct vm86plus_struct *vm86, unsigned char *test_mem)
+{
+   struct table_desc {
+   unsigned short limit;
+   unsigned long base;
+   } __attribute__((packed));
+
+   /* Initialize variables with arbitrary values */
+   struct table_desc gdt1 = { .base = 0x3c3c3c3c, .limit = 0x };
+   struct table_desc gdt2 = { .base = 0x1a1a1a1a, .limit = 0xaeae };
+   struct table_desc idt1 = { .base = 0x7b7b7b7b, .limit = 0xf1f1 };
+   struct table_desc idt2 = { .base = 0x89898989, .limit = 0x1313 };
+   unsigned short msw1 = 0x1414, msw2 = 0x2525, msw3 = 3737;
+
+   /* UMIP -- exit with INT3 unless kernel emulation did not trap #GP */
+   do_test(vm86, vmcode_umip - vmcode, VM86_TRAP, 3, "UMIP tests");
+
+   /* Results from displacement-only addressing */
+   msw1 = *(unsigned short *)(test_mem + 2052);
+   memcpy(&idt1, test_mem + 2054, sizeof(idt1));
+   memcpy(&gdt1, test_mem + 2060, sizeof(gdt1));
+
+   /* Results from register-indirect addressing */
+   msw2 = *(unsigned short *)(test_mem + 2066);
+   memcpy(&idt2, test_mem + 2068, sizeof(idt2));
+   memcpy(&gdt2, test_mem + 2074, sizeof(gdt2));
+
+   /* Results when using register operands */
+   msw3 = *(unsigned short *)(test_mem + 2080);
+
+   printf("[

[PATCH v7 17/26] x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode

2017-05-05 Thread Ricardo Neri
It is possible to utilize 32-bit address encodings in virtual-8086 mode via
an address override instruction prefix. However, the address range is
still limited to [0x0000-0xffff]. If the computed address falls outside
this range, return an error.

Also, linear addresses in virtual-8086 mode are limited to 20 bits. Enforce
this limit by truncating the most significant bits of the computed linear
address.
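
The two checks described above can be sketched in isolation as follows.
This is only an illustration and not the code in the patch: the helper
name v8086_linear_addr() is hypothetical, and it assumes the usual
real-mode translation in which the segment base is the selector shifted
left by 4 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's insn_get_addr_ref(): compute a
 * virtual-8086 linear address from a segment selector and an effective
 * address. Returns false if the effective address is out of range.
 */
static bool v8086_linear_addr(uint16_t seg, uint32_t eff_addr,
                              uint32_t *linear)
{
    /* A 32-bit encoding may yield an offset beyond the 64 KiB segment. */
    if (eff_addr & ~0xffffu)
        return false;

    /* Segment base is the selector shifted by 4; keep only 20 bits. */
    *linear = (((uint32_t)seg << 4) + eff_addr) & 0xfffffu;
    return true;
}
```

With seg = 0xffff and eff_addr = 0xffff the sum exceeds 20 bits, so the
mask is what makes the result wrap around to the bottom of the address
space.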

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index c7c1239..9822061 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -848,6 +848,12 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
linear_addr &= 0x;
 
/*
+* Even though 32-bit address encodings are allowed in virtual-8086
+* mode, the address range is still limited to [0x0000-0xffff].
+*/
+   if (v8086_mode(regs) && (linear_addr & ~0xffff))
+   goto out_err;
+   /*
 * Make sure the effective address is within the limits of the
 * segment. In long mode, the limit is -1L. Thus, the second part
 * of the check always succeeds.
@@ -857,6 +863,10 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 
linear_addr += seg_base_addr;
 
+   /* Limit linear address to 20 bits */
+   if (v8086_mode(regs))
+   linear_addr &= 0xfffff;
+
return (void __user *)linear_addr;
 out_err:
return (void __user *)-1;
-- 
2.9.3



[PATCH v7 00/26] x86: Enable User-Mode Instruction Prevention

2017-05-05 Thread Ricardo Neri
tually read the segment selector.
* Fixed a bug when evaluating 32-bit effective addresses with 64-bit
  kernels.
* Split patches further for easier review.
* Use signed variables for computation of effective address.
* Fixed issue with a spurious static modifier in function insn_get_addr_ref
  found by kbuild test bot.
* Removed comparison between true and fixup_umip_exception.
* Reworked check logic when identifying erroneous vs invalid values of the
  SiB base and index.

Changes since V3:
* Limited emulation to 32-bit and 16-bit modes. For 64-bit mode, a general
  protection fault is still issued when UMIP-protected instructions are
  executed with CPL > 0.
* Expanded instruction-evaluating code to obtain segment descriptor along
  with their attributes such as base address and default address and
  operand sizes. Also, support for 16-bit encodings in protected mode was
  implemented.
* Getting a segment descriptor now includes support for obtaining those
  of a local descriptor table.
* Now the instruction-evaluating code returns -EDOM when the value of
  registers should not be used in calculating the effective address. The
  value -EINVAL is left for errors.
* Incorporate the value of the segment base address in the computation of
  linear addresses.
* Renamed new instruction evaluation library from insn-kernel.c to
  insn-eval.c
* Exported functions insn_get_reg_offset_* to obtain the register offset
  by ModRM r/m, SiB base and SiB index.
* Improved documentation of functions.
* Split patches further for easier review.

Changes since V2:
* Added new utility functions to decode the memory addresses contained in
  registers when the 16-bit addressing encodings are used. This includes
  code to obtain and compute memory addresses using segment selectors for
  real-mode address translation.
* Added support to emulate UMIP-protected instructions for virtual-8086
  tasks.
* Added self-tests for virtual-8086 mode that contains representative
  use cases: address represented as a displacement, address in registers
  and registers as operands.
* Instead of maintaining a static variable for the dummy base addresses
  of the IDT and GDT, a hard-coded value is used.
* The emulated SMSW instructions now return the value with which the CR0
  register is programmed in head_32/64.S. This is: PE | MP | ET | NE | WP
  | AM. For x86_64, PG is also enabled.
* The new file arch/x86/lib/insn-utils.c is now renamed as
  arch/x86/lib/insn-kernel.c. It also has its own header. This helps keep
  the kernel and objtool instruction decoders in sync. Also, the new
  insn-kernel.c contains utility functions that are only relevant in a
  kernel context.
* Removed printed warnings for errors that occur when decoding instructions
  with invalid operands.
* Added more comments on fixes in the instruction-decoding MPX functions.
* Now user_64bit_mode(regs) is used instead of test_thread_flag(TIF_IA32)
  to determine if the task is 32-bit or 64-bit.
* Found and fixed a bug in insn-decoder in which X86_MODRM_RM was
  incorrectly used to obtain the mod part of the ModRM byte.
* Added more explanatory comments in emulation and instruction decoding
  code. This includes a comment noting that copy_from_user() could fail if
  a memory protection key is in place.
* Tested code with CONFIG_X86_DECODER_SELFTEST=y and everything passes now.
* Prefixed get_reg_offset_rm with insn_ as this function is exposed
  via a header file. For clarity, this function was added in a separate
  patch.
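
For reference, the dummy CR0 value mentioned in the SMSW item above can be
worked out from the architectural bit positions. The sketch below is only
an illustration: the macro and function names are local to this example
(they are not the kernel's X86_CR0_* definitions), and it assumes SMSW
with a 16-bit operand, which stores the low 16 bits of CR0.

```c
#include <stdint.h>

/* CR0 bit positions as defined by the x86 architecture. */
#define CR0_PE (1u << 0)   /* Protection Enable */
#define CR0_MP (1u << 1)   /* Monitor Coprocessor */
#define CR0_ET (1u << 4)   /* Extension Type */
#define CR0_NE (1u << 5)   /* Numeric Error */
#define CR0_WP (1u << 16)  /* Write Protect */
#define CR0_AM (1u << 18)  /* Alignment Mask */
#define CR0_PG (1u << 31)  /* Paging */

/* Hypothetical helper: CR0 value listed in the changelog item. */
static uint32_t dummy_cr0(int is_64bit)
{
    uint32_t cr0 = CR0_PE | CR0_MP | CR0_ET | CR0_NE | CR0_WP | CR0_AM;

    if (is_64bit)
        cr0 |= CR0_PG;
    return cr0;
}

/* The machine status word is the low 16 bits of CR0. */
static uint16_t dummy_msw(int is_64bit)
{
    return (uint16_t)dummy_cr0(is_64bit);
}
```

Since WP, AM and PG all sit above bit 15, a 16-bit SMSW would see the same
value in both cases under these assumptions.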

Changes since V1:
* Virtual-8086 mode tasks are not treated in a special manner. All code
  for this purpose was removed.
* Instead of attempting to disable UMIP during a context switch or when
  entering virtual-8086 mode, UMIP remains enabled all the time. General
  protection faults that occur are fixed-up by returning dummy values as
  detailed above.
* Removed umip= kernel parameter in favor of using clearcpuid=514 to
  disable UMIP.
* Removed selftests designed to detect the absence of SIGSEGV signals when
  running in virtual-8086 mode.
* Reused code from MPX to decode instructions operands. For this purpose
  code was put in a common location.
* Fixed two bugs in MPX code that decodes operands.
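
As a side note on clearcpuid=514 above: kernel feature numbers are encoded
as word * 32 + bit, so the number can be split as sketched below. The
struct and function names are hypothetical and exist only for this
example. To my knowledge, word 16 corresponds to
CPUID.(EAX=7,ECX=0):ECX, where UMIP is bit 2.

```c
/* Position of a feature flag inside the kernel's capability words. */
struct feature_pos {
    unsigned int word;  /* index of the 32-bit capability word */
    unsigned int bit;   /* bit inside that word */
};

/* Hypothetical helper: split a feature number into word and bit. */
static struct feature_pos split_feature_nr(unsigned int nr)
{
    struct feature_pos pos;

    pos.word = nr / 32;
    pos.bit = nr % 32;
    return pos;
}
```

For 514 this gives word 16, bit 2, i.e. 16 * 32 + 2.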

Ricardo Neri (26):
  ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  x86/mm: Relocate page fault error codes to traps.h
  x86/mpx: Use signed variables to compute effective addresses
  x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is
not 11b
  x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval
file
  x86/insn-eval: Do not BUG on invalid register type
  x86/insn-eval: Add a utility function to get register offsets
  x86/insn-eval: Add utility function to identify string instructions
  x86/insn-eval: Add utility functions to get segment selector
  x86/insn-eval: Add utility function to get segment descriptor
  x86/insn-

Re: [v6 PATCH 03/21] x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0

2017-04-27 Thread Ricardo Neri
On Wed, 2017-04-26 at 10:05 +0200, Borislav Petkov wrote:
> On Tue, Apr 25, 2017 at 07:04:20PM -0700, Ricardo Neri wrote:
> > For the specific case of ModRM.mod being 0, I feel I need to clarify
> > that REX.B is not decoded and if SIB.base is %r13 the base is also 0.
> 
> Well, that all doesn't matter. The rule is this:
> 
> ModRM.mod == 00b and ModRM.r/m == 101b -> effective address: disp32
> 
> See Table 2-2. "32-Bit Addressing Forms with the ModR/M Byte" in the SDM.

You are right. This summarizes the rule. Then I will shorten the
comment.
> 
> So the base register is not used. How that base register is specified
> then doesn't matter (undecoded REX bits or not).
> 
> > This comment adds clarity because REX.X is decoded when determining
> > SIB.index.
> 
> Well, that's a different thing. The REX bits participating in the SIB
> fields don't matter about this particular case. We only want to say that
> we're returning a disp32 without a base register and the comment should
> keep it simple without extraneous information.
> 
> I know, you want to mention what Table 2-5. "Special Cases of REX
> Encodings" says but we should avoid unnecessary content in the comment.
> People who want details can stare at the manuals - the comment should
> only document what that particular case is.
> 
> Btw, you could write it even better:
> 
>   if (!X86_MODRM_MOD(insn->modrm.value) && X86_MODRM_RM(insn->modrm.value) == 5)
> 
> and then it is basically a 1:1 copy of the rule from Table 2-2.

It is!
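
For anyone following along, here is a minimal standalone sketch of that
rule. The MODRM_MOD()/MODRM_RM() macros are local stand-ins written for
this example rather than the kernel's X86_MODRM_* definitions, and the
function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the ModRM field accessors. */
#define MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
#define MODRM_RM(modrm)  ((modrm) & 0x07)

/*
 * ModRM.mod == 00b and ModRM.r/m == 101b: the effective address is a
 * 32-bit displacement and no base register is used.
 */
static bool modrm_is_disp32_only(uint8_t modrm)
{
    return !MODRM_MOD(modrm) && MODRM_RM(modrm) == 5;
}
```

The reg field (bits 5:3) does not take part in the test, so 0x05 and 0x3d
are both treated as the disp32 case.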

Thanks and BR,
Ricardo



Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

2017-04-26 Thread Ricardo Neri
On Tue, 2017-04-25 at 15:51 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:45PM -0800, Ricardo Neri wrote:
> > The 32-bit and 64-bit address encodings are identical. This means that we
> > can use the same function in both cases. In order to reuse the function for
> > 32-bit address encodings, we must sign-extend our 32-bit signed operands to
> > 64-bit signed variables (only for 64-bit builds). To decide on whether sign
> > extension is needed, we rely on the address size as given by the
> > instruction structure.
> > 
> > Lastly, before computing the linear address, we must truncate our signed
> > 64-bit effective address if the address size is 32-bit.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 44 
> >  1 file changed, 32 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index edb360f..a9a1704 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -559,6 +559,15 @@ int insn_get_reg_offset_sib_index(struct insn *insn, 
> > struct pt_regs *regs)
> > return get_reg_offset(insn, regs, REG_TYPE_INDEX);
> >  }
> >  
> > +static inline long __to_signed_long(unsigned long val, int long_bytes)
> > +{
> > +#ifdef CONFIG_X86_64
> > +   return long_bytes == 4 ? (long)((int)((val) & 0xffffffff)) : (long)val;
> 
> I don't think this always works as expected:
> 
> ---
> typedef unsigned int u32;
> typedef unsigned long u64;
> 
> int main()
> {
> u64 v = 0x1ffffffff;
> 
> printf("v: %ld, 0x%lx, %ld\n", v, v, (long)((int)((v) & 0xffffffff)));
> 
> return 0;
> }
> --
> ...
> 
> v: 8589934591, 0x1ffffffff, -1
> 
> Now, this should not happen on 32-bit because unsigned long is 32-bit
> there but can that happen on 64-bit?

This is the reason I check the value of long_bytes. If long_bytes is not
4, the only other possible value is 8 (perhaps I need to return an error
when the value is neither of these), and the cast is simply (long)val. I
modified your test program with:

printf("v: %ld, 0x%lx, %ld, %ld\n", v, v, (long)((int)((v) &
0xffffffff)), (long)v);

and I get:

v: 8589934591, 0x1ffffffff, -1, 8589934591.

Am I missing something?
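
To make the two cases explicit, here is a standalone sketch of the same
idea using fixed-width types. The name to_signed_64() is made up for this
example and is not the helper from the patch. Converting an out-of-range
uint32_t to int32_t is implementation-defined in C; the sketch assumes the
usual two's complement behavior of gcc and clang.

```c
#include <stdint.h>

/*
 * Hypothetical helper: interpret val as a signed quantity of addr_bytes
 * bytes and widen it to 64 bits.
 */
static int64_t to_signed_64(uint64_t val, int addr_bytes)
{
    /* 32-bit addresses: drop the upper half, then sign-extend. */
    if (addr_bytes == 4)
        return (int64_t)(int32_t)(uint32_t)val;

    /* 64-bit addresses: reinterpret the value as-is. */
    return (int64_t)val;
}
```

With val = 0x1ffffffff this returns -1 for addr_bytes == 4 and 8589934591
for addr_bytes == 8, matching the output above.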

> 
> > +#else
> > +   return (long)val;
> > +#endif
> > +}
> > +
> >  /*
> >   * return the address being referenced be instruction
> >   * for rm=3 returning the content of the rm reg
> > @@ -567,19 +576,21 @@ int insn_get_reg_offset_sib_index(struct insn *insn, 
> > struct pt_regs *regs)
> >  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> >  {
> > unsigned long linear_addr, seg_base_addr;
> > -   long eff_addr, base, indx;
> > -   int addr_offset, base_offset, indx_offset;
> > +   long eff_addr, base, indx, tmp;
> > +   int addr_offset, base_offset, indx_offset, addr_bytes;
> > insn_byte_t sib;
> >  
> > insn_get_modrm(insn);
> > insn_get_sib(insn);
> > sib = insn->sib.value;
> > +   addr_bytes = insn->addr_bytes;
> >  
> > if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> > if (addr_offset < 0)
> > goto out_err;
> > -   eff_addr = regs_get_register(regs, addr_offset);
> > +   tmp = regs_get_register(regs, addr_offset);
> > +   eff_addr = __to_signed_long(tmp, addr_bytes);
> 
> This repeats throughout the function so it begs to be a separate:
> 
>   get_mem_addr()
> 
> or so.

Yes, the same pattern is used in all places except when 

Re: [v6 PATCH 11/21] insn/eval: Incorporate segment base in address computation

2017-04-26 Thread Ricardo Neri
On Fri, 2017-04-21 at 16:55 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:44PM -0800, Ricardo Neri wrote:
> > insn_get_addr_ref returns the effective address as defined by the
> 
> Please end function names with parentheses.

Will do.
> 
> > section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual. In order to compute the linear address, we must add
> > to the effective address the segment base address as set in the segment
> > descriptor. Furthermore, the segment descriptor to use depends on the
> > register that is used as the base of the effective address. The effective
> > base address varies depending on whether the operand is a register or a
> > memory address and on whether a SiB byte is used.
> > 
> > In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
> > segment is used or if segmentation is not used. However, the base address
> > is not necessarily zero if a user program defines its own segments. This
> > is possible by using a local descriptor table.
> > 
> > Since the effective address is a signed quantity, the unsigned segment
> > base address saved in a separate variable and added to the final effective
> 
> ".. is saved..."

I will correct this.
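
For illustration, the signed/unsigned split described in the quoted
changelog could look like the sketch below. The function name is
hypothetical and this is not the code in insn_get_addr_ref(); it only
shows why keeping the segment base unsigned and the effective address
signed still gives the expected sum.

```c
#include <stdint.h>

/*
 * Hypothetical helper: add a signed effective address to an unsigned
 * segment base. Unsigned arithmetic wraps modulo 2^64, so a negative
 * effective address ends up subtracting from the base.
 */
static uint64_t linear_address(uint64_t seg_base, int64_t eff_addr)
{
    return seg_base + (uint64_t)eff_addr;
}
```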

Thanks and BR,
Ricardo



Re: [v6 PATCH 10/21] x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero

2017-04-26 Thread Ricardo Neri
On Fri, 2017-04-21 at 12:52 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:43PM -0800, Ricardo Neri wrote:
> > Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual volume 2A states that when the mod part of the ModRM
> > byte is zero and R/EBP is specified in the R/M part of such bit, the value
> > of the aforementioned register should not be used in the address
> > computation. Instead, a 32-bit displacement is expected. The instruction
> > decoder takes care of setting the displacement to the expected value.
> > Returning -EDOM signals callers that they should ignore the value of such
> > register when computing the address encoded in the instruction operands.
> > 
> > Also, callers should exercise care to correctly interpret this particular
> > case. In IA-32e 64-bit mode, the address is given by the displacement plus
> > the value of the RIP. In IA-32e compatibility mode, the value of EIP is
> > ignored. This correction is done for our insn_get_addr_ref.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 25 +++--
> >  1 file changed, 23 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index cda6c71..ea10b03 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -250,6 +250,14 @@ static int get_reg_offset(struct insn *insn, struct 
> > pt_regs *regs,
> > switch (type) {
> > case REG_TYPE_RM:
> > regno = X86_MODRM_RM(insn->modrm.value);
> > +   /* if mod=0, register R/EBP is not used in the address
> > +* computation. Instead, a 32-bit displacement is expected;
> > +* the instruction decoder takes care of reading such
> > +* displacement. This is true for both R/EBP and R13, as the
> > +* REX.B bit is not decoded.
> > +*/
> 
> I'd simply write here: "ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit
> displacement is following."

I will shorten the comment.
> 
> In addition, kernel comments style is:
> 
>   /*
>* A sentence ending with a full-stop.
>* Another sentence. ...
>* More sentences. ...
>*/

... and use the correct style. I feel bad I missed this one.
> 
> > +   if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
> > +   return -EDOM;
> 
>   if (X86_MODRM_MOD(insn->modrm.value) == 0 &&
>   X86_MODRM_RM(insn->modrm.value)  == 5)
> 
> looks more understandable to me.

Should I go with !(X86_MODRM_MOD(insn->modrm.value)) as you suggested in
other patches?

> 
> > if (X86_REX_B(insn->rex_prefix.value))
> > regno += 8;
> > break;
> > @@ -599,9 +607,22 @@ void __user *insn_get_addr_ref(struct insn *insn, 
> > struct pt_regs *regs)
> > eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
> > } else {
> > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> > -   if (addr_offset < 0)
> > +   /* -EDOM means that we must ignore the address_offset.
> > +* The only case in which we see this value is when
> > +* R/M points to R/EBP. In such a case, in 64-bit mode
> > +* the effective address is relative to tho RIP.
> 
> s/tho//

Will correct.
> 
> > +*/
> 
> Kernel comments style is:
> 
>   /*
>* A sentence ending with a full-stop.
>* Another sentence. ...
>* More sentences. ...
>*/
> 

Will correct.

Re: [v6 PATCH 09/21] x86/insn-eval: Add functions to get default operand and address sizes

2017-04-26 Thread Ricardo Neri
On Thu, 2017-04-20 at 15:06 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:42PM -0800, Ricardo Neri wrote:
> > These functions read the default values of the address and operand sizes
> > as specified in the segment descriptor. This information is determined
> > from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
> > 32-bit legacy modes. For virtual-8086 mode, the default address and
> > operand sizes are always 2 bytes.
> 
> Yeah, we tend to call that customarily 16-bit :)

I will call it like this.
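
To summarize the L/D decoding in one place, here is a small sketch. It is
not the code from the patch: the struct and function names are made up,
and the operand-size column reflects my understanding of the architecture
(in 64-bit mode the default operand size is 32 bits while the default
address size is 64 bits) rather than anything quoted in this thread.

```c
/* Default sizes, in bytes, implied by a code segment descriptor. */
struct seg_defaults {
    unsigned char address_bytes;  /* 0 means invalid L/D combination */
    unsigned char operand_bytes;
};

/* Hypothetical helper: map the CS.L and CS.D bits to default sizes. */
static struct seg_defaults get_seg_defaults(unsigned int l, unsigned int d)
{
    struct seg_defaults def = { 0, 0 };

    switch (((l & 1) << 1) | (d & 1)) {
    case 0: /* Legacy mode, CS.L=0, CS.D=0: 16-bit */
        def.address_bytes = 2;
        def.operand_bytes = 2;
        break;
    case 1: /* Legacy mode, CS.L=0, CS.D=1: 32-bit */
        def.address_bytes = 4;
        def.operand_bytes = 4;
        break;
    case 2: /* 64-bit mode, CS.L=1, CS.D=0 */
        def.address_bytes = 8;
        def.operand_bytes = 4;
        break;
    default: /* CS.L=1, CS.D=1 is an invalid setting */
        break;
    }
    return def;
}
```

Virtual-8086 mode would bypass this lookup entirely and use 16-bit
defaults.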
> 
> > The D bit is only meaningful for code segments. Thus, these functions
> > always use the code segment selector contained in regs.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/insn-eval.h |  2 +
> >  arch/x86/lib/insn-eval.c | 80 
> > 
> >  2 files changed, 82 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/insn-eval.h 
> > b/arch/x86/include/asm/insn-eval.h
> > index b201742..a0d81fc 100644
> > --- a/arch/x86/include/asm/insn-eval.h
> > +++ b/arch/x86/include/asm/insn-eval.h
> > @@ -15,6 +15,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct 
> > pt_regs *regs);
> >  int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
> >  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> >  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> > +unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs);
> > +unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs);
> >  unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> > int regoff, bool use_default_seg);
> >  
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 383ca83..cda6c71 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -421,6 +421,86 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, 
> > struct insn *insn,
> >  }
> >  
> >  /**
> > + * insn_get_seg_default_address_bytes - Obtain default address size of 
> > segment
> > + * @regs:  Set of registers containing the segment selector
> > + *
> > + * Obtain the default address size as indicated in the segment descriptor
> > + * selected in regs' code segment selector. In protected mode, the default
> > + * address is determined by inspecting the L and D bits of the segment
> > + * descriptor. In virtual-8086 mode, the default is always two bytes.
> > + *
> > + * Return: Default address size of segment
> 
>   0 on error.
> 
> > + */
> > +unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs)
> > +{
> > +   struct desc_struct *desc;
> > +   unsigned short seg;
> > +   int ret;
> > +
> > +   if (v8086_mode(regs))
> > +   return 2;
> > +
> > +   seg = (unsigned short)regs->cs;
> > +
> > +   ret = get_desc(seg, &desc);
> > +   if (ret)
> > +   return 0;
> > +
> > +   switch ((desc->l << 1) | desc->d) {
> > +   case 0: /* Legacy mode. 16-bit addresses. CS.L=0, CS.D=0 */
> > +   return 2;
> > +   case 1: /* Legacy mode. 32-bit addresses. CS.L=0, CS.D=1 */
> > +   return 4;
> > +   case 2: /* IA-32e 64-bit mode. 64-bit addresses. CS.L=1, CS.D=0 */
> > +   return 8;
> > +   case 3: /* Invalid setting. CS.L=1, CS.D=1 */
> > +   /* fall through */
> > +   default:
> > +   return 0;
> > +   }
> > +}
> > +
> > +/**
> > + * insn_get_seg_default_operand_bytes - Obtain default operand size of 
> > segment

Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-04-26 Thread Ricardo Neri
On Thu, 2017-04-20 at 10:25 +0200, Borislav Petkov wrote:
> > + * insn_get_seg_base() - Obtain base address contained in descriptor
> > + * @regs: Set of registers containing the segment selector
> > + * @insn: Instruction structure with selector override prefixes
> > + * @regoff: Operand offset, in pt_regs, of which the selector is needed
> > + * @use_default_seg: Use the default segment instead of prefix overrides
> 
> I'm wondering whether you really need that bool or you can deduce this
> from pt_regs... I guess I'll see...

insn_get_seg_base() itself can probably check whether there are segment
override prefixes in the struct insn and, if so, use them except in
specific cases such as CS.

On an unrelated note, I still have the problem of using DS vs ES for
string instructions. Perhaps, instead of a use_default_seg flag, I could
use a string_instruction flag that indicates how to determine the
default segment.

Thanks and BR,
Ricardo

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-04-26 Thread Ricardo Neri
On Thu, 2017-04-20 at 10:25 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:41PM -0800, Ricardo Neri wrote:
> > With segmentation, the base address of the segment descriptor is needed
> > to compute a linear address. The segment descriptor used in the address
> > computation depends on either any segment override prefixes in the in the
> 
> s/in the //

I will fix this typo.
> 
> > instruction or the default segment determined by the registers involved
> > in the address computation. Thus, both the instruction as well as the
> > register (specified as the offset from the base of pt_regs) are given as
> > inputs, along with a boolean variable to select between override and
> > default.
> > 
> > The segment selector is determined by get_seg_selector with the inputs
> 
> Please end function names with parentheses: get_seg_selector().

I will use parentheses.
> 
> > described above. Once the selector is known the base address is
> 
>   known, ...

Will fix.
> 
> > determined. In protected mode, the selector is used to obtain the segment
> > descriptor and then its base address. If in 64-bit user mode, the segment
> > base address is zero except when FS or GS are used. In virtual-8086 mode,
> > the base address is computed as the value of the segment selector shifted 4
> > positions to the left.
> 
> Good.
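
The three cases in the commit message above can be sketched as one small
function. This is a simplified model under stated assumptions: struct
seg_ctx, enum sreg and segment_base() are hypothetical names, the FS/GS
bases are passed in as plain fields rather than read from MSRs, and the
protected-mode descriptor lookup is reduced to a precomputed desc_base
value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical segment register identifiers for this sketch. */
enum sreg { SREG_CS, SREG_SS, SREG_DS, SREG_ES, SREG_FS, SREG_GS };

/* Hypothetical context; a real implementation would derive these fields. */
struct seg_ctx {
	bool v8086;		/* running in virtual-8086 mode */
	bool long_mode;		/* 64-bit user mode */
	uint64_t fs_base;	/* stand-in for the FS base MSR value */
	uint64_t gs_base;	/* stand-in for the GS base MSR value */
	uint64_t desc_base;	/* base taken from the segment descriptor */
};

static uint64_t segment_base(const struct seg_ctx *ctx, enum sreg reg,
			     uint16_t sel)
{
	/* Virtual-8086 mode: selector shifted 4 positions to the left. */
	if (ctx->v8086)
		return (uint64_t)sel << 4;

	/* 64-bit user mode: base is zero except for FS and GS. */
	if (ctx->long_mode) {
		if (reg == SREG_FS)
			return ctx->fs_base;
		if (reg == SREG_GS)
			return ctx->gs_base;
		return 0;
	}

	/* Protected mode: base comes from the selected descriptor. */
	return ctx->desc_base;
}
```

For instance, a virtual-8086 selector of 0x1234 yields a base of 0x12340.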
> 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/insn-eval.h |  2 ++
> >  arch/x86/lib/insn-eval.c | 66 
> > 
> >  2 files changed, 68 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
> > index 754211b..b201742 100644
> > --- a/arch/x86/include/asm/insn-eval.h
> > +++ b/arch/x86/include/asm/insn-eval.h
> > @@ -15,5 +15,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
> >  int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
> >  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> >  int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> > +unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
> > +   int regoff, bool use_default_seg);
> >  
> >  #endif /* _ASM_X86_INSN_EVAL_H */
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 8608adf..383ca83 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -355,6 +355,72 @@ static int get_desc(unsigned short seg, struct desc_struct **desc)
> >  }
> >  
> >  /**
> > + * insn_get_seg_base() - Obtain base address contained in descriptor
> > + * @regs:  Set of registers containing the segment selector
> > + * @insn:  Instruction structure with selector override prefixes
> > + * @regoff:	Operand offset, in pt_regs, of which the selector is needed
> > + * @use_default_seg: Use the default segment instead of prefix overrides
> 
> I'm wondering whether you really need that bool or you can deduce this
> from pt_regs... I guess I'll see...
> 
> > + *
> > + * Obtain the base address of the segment descriptor as indicated by either
> > + * any segment override prefixes contained in insn or the default segment
> > + * applicable to the register indicated by regoff. regoff is specified as the
> > + * offset in bytes from the base of pt_regs.
> > + *
> > + * Return: In protected mode, base address of the segment. It may be zero in
> > + * certain cases for 64-bit builds and/or 64-bit applications. In virtual-8086
> > + * mode, the segment selector shifed 4 positio

Re: [v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-04-26 Thread Ricardo Neri
On Wed, 2017-04-19 at 12:26 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:40PM -0800, Ricardo Neri wrote:
> > The segment descriptor contains information that is relevant to how linear
> > addresses need to be computed. It contains the default size of addresses as
> > well as the base address of the segment. Thus, given a segment selector,
> > we ought to look at the segment descriptor to correctly calculate the linear
> > address.
> > 
> > In protected mode, the segment selector might indicate a segment
> > descriptor from either the global descriptor table or a local descriptor
> > table. Both cases are considered in this function.
> > 
> > This function is the initial implementation for subsequent functions that
> > will obtain the aforementioned attributes of the segment descriptor.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 61 
> > 
> >  1 file changed, 61 insertions(+)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 8d45df8..8608adf 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -5,9 +5,13 @@
> >   */
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >  #include 
> >  
> >  enum reg_type {
> > @@ -294,6 +298,63 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> >  }
> >  
> >  /**
> > + * get_desc() - Obtain address of segment descriptor
> > + * @seg:   Segment selector
> 
> Maybe that should be
> 
> @sel
> 
> if it is a sel-ector. :)

It makes sense. I will rename it.
> 
> And using "sel" makes more sense then when you look at:
> 
>   desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
> 
> for example:
> 
> > + * @desc:  Pointer to the selected segment descriptor
> > + *
> > + * Given a segment selector, obtain a memory pointer to the segment
> 
> s/memory //

Will update it.
> 
> > + * descriptor. Both global and local descriptor tables are supported.
> > + * desc will contain the address of the descriptor.
> > + *
> > + * Return: 0 if success, -EINVAL if failure
> 
> Why isn't this function returning the pointer or NULL on error? Maybe
> the later patches have an answer and I'll discover it if I continue
> reviewing :)

After revisiting the code, I don't see why the function cannot return
NULL.
> 
> > + */
> > +static int get_desc(unsigned short seg, struct desc_struct **desc)
> > +{
> > +   struct desc_ptr gdt_desc = {0, 0};
> > +   unsigned long desc_base;
> > +
> > +   if (!desc)
> > +   return -EINVAL;
> > +
> > +   desc_base = seg & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
> 
> That looks useless as you're doing it below again.

Yes, it is useless. Please see my comment below.
> 
> > +
> > +#ifdef CONFIG_MODIFY_LDT_SYSCALL
> > +   if ((seg & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> > +   seg >>= 3;
> > +
> > +   mutex_lock(&current->active_mm->context.lock);
> > +   if (unlikely(!current->active_mm->context.ldt ||
> 
> Is that really a fast path to complicate the if-test with an unlikely()?
> If not, you don't really need it.

I will remove it.
> 
> > +seg >= current->active_mm->context.ldt->size)) {
> 
> ldt->size is the size of the descriptor table but you've shifted seg by
> 3. That selector index is shifted by 3 (to the left) to form an offset
> into the descriptor table because the entries there are 8 bytes.
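
The relationship between a selector, its table index and its byte offset
can be sketched as below. The macro and function names are hypothetical.
The bounds check assumes the table size is given as a number of entries
(parameter nr_entries); whether ldt->size counts entries or bytes is not
settled in the exchange above, so treat that as an assumption of the
example.

```c
#include <stdint.h>

/* Selector layout: bits 1:0 = RPL, bit 2 = TI, bits 15:3 = index. */
#define SEL_RPL_MASK	0x3u
#define SEL_TI_MASK	0x4u

/* Index of the descriptor within its table. */
static unsigned int sel_index(uint16_t sel)
{
	return sel >> 3;
}

/* Non-zero if the selector refers to the LDT rather than the GDT. */
static int sel_is_ldt(uint16_t sel)
{
	return (sel & SEL_TI_MASK) != 0;
}

/* Requested privilege level. */
static unsigned int sel_rpl(uint16_t sel)
{
	return sel & SEL_RPL_MASK;
}

/* Byte offset into the table: index * 8, since entries are 8 bytes. */
static unsigned int sel_table_offset(uint16_t sel)
{
	return sel & ~(SEL_RPL_MASK | SEL_TI_MASK);
}

/* Bounds check against a table size expressed in entries (assumption). */
static int sel_in_table(uint16_t sel, unsigned int nr_entries)
{
	return sel_index(sel) < nr_entries;
}
```

With this split, the index is compared against an entry count and the
offset against a byte size, so the two units are not mixed.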

I double-ch

Re: [v6 PATCH 06/21] x86/insn-eval: Add utility functions to get segment selector

2017-04-26 Thread Ricardo Neri
On Wed, 2017-04-26 at 13:44 -0700, Ricardo Neri wrote:
> > 
> > > +*/
> > > +   for (i = 0; i < insn->prefixes.nbytes; i++) {
> > > +   switch (insn->prefixes.bytes[i]) {
> > > +   case SEG_CS:
> > > +   return SEG_CS;
> > > +   case SEG_SS:
> > > +   return SEG_SS;
> > > +   case SEG_DS:
> > > +   return SEG_DS;
> > > +   case SEG_ES:
> > > +   return SEG_ES;
> > > +   case SEG_FS:
> > > +   return SEG_FS;
> > > +   case SEG_GS:
> > > +   return SEG_GS;
> > 
> > So what happens if you're in 64-bit mode and you have CS, DS, ES, or SS?
> > Or is this what @get_default is supposed to do? But it doesn't look like
> > it, it still returns segments ignored in 64-bit mode.
> 
> I regard the role of this function as obtaining the segment selector
> either from the prefixes or inferred from the operands. It is the role
> of the caller to determine whether the segment selector should be
> ignored. So far the only caller is insn_get_seg_base() [1]. If in long
> mode, the segment base address is regarded as 0 unless the segment
> selector is FS or GS.
> > 
> > > +   default:
> > > +   return -EINVAL;
> > > +   }
> > > +   }
> > > +
> > > +default_seg:
> > > +	/*
> > > +	 * If no overrides, use default selectors as described in the
> > > +	 * Intel documentation: SS for ESP or EBP. DS for all data references,
> > > +	 * except when relative to stack or string destination.
> > > +	 * Also, AX, CX and DX are not valid register operands in 16-bit
> > > +	 * address encodings.
> > > +	 * Callers must interpret the result correctly according to the type
> > > +	 * of instructions (e.g., use ES for string instructions).
> > > +	 * Also, some values of modrm and sib might seem to indicate the use
> > > +	 * of EBP and ESP (e.g., modrm_mod = 0, modrm_rm = 5) but actually
> > > +	 * they refer to cases in which only a displacement is used. These
> > > +	 * cases should be identified by the caller and not with this function.
> > > +	 */
> > > +   switch (regoff) {
> > > +   case offsetof(struct pt_regs, ax):
> > > +   /* fall through */
> > > +   case offsetof(struct pt_regs, cx):
> > > +   /* fall through */
> > > +   case offsetof(struct pt_regs, dx):
> > > +   if (insn && insn->addr_bytes == 2)
> > > +   return -EINVAL;
> > > +   case -EDOM: /* no register involved in address computation */
> > > +   case offsetof(struct pt_regs, bx):
> > > +   /* fall through */
> > > +   case offsetof(struct pt_regs, di):
> > > +   /* fall through */
> > 
> >   return SEG_ES;
> > 
> > ?
> 
> I double-checked the latest version of the Intel Software Development
> Manual [2]. Table 3-5 in Section 3.7.4 states that DS is the default
> segment for all data references, except string destinations. I tested
> this code with the UMIP-protected instructions and whenever I use %edi
> the default segment is %ds.
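
The default-segment rule described above can be written down as a small
sketch. The enum names and the is_string_dest flag are hypothetical and
only serve the example: SS is the default for stack-relative references
through (E)SP or (E)BP, ES for the destination of a string instruction
addressed through (E)DI, and DS for every other data reference.

```c
/* Hypothetical register identifiers for this sketch. */
enum addr_reg {
	AREG_AX, AREG_CX, AREG_DX, AREG_BX,
	AREG_SP, AREG_BP, AREG_SI, AREG_DI,
	AREG_NONE		/* displacement only, no register */
};

/* Hypothetical result type. */
enum def_seg { DEF_SEG_DS, DEF_SEG_SS, DEF_SEG_ES };

static enum def_seg default_segment(enum addr_reg reg, int is_string_dest)
{
	/* Destination operand of a string instruction: ES:(E)DI. */
	if (is_string_dest && reg == AREG_DI)
		return DEF_SEG_ES;

	/* References relative to the stack use SS. */
	if (reg == AREG_SP || reg == AREG_BP)
		return DEF_SEG_SS;

	/* All other data references, including plain %edi, use DS. */
	return DEF_SEG_DS;
}
```

This matches the observation that %edi defaults to %ds outside of string
destinations.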


I forgot my references:

[1]. https://lkml.org/lkml/2017/3/7/876
[2]. https://software.intel.com/en-us/articles/intel-sdm#combined



Re: [v6 PATCH 06/21] x86/insn-eval: Add utility functions to get segment selector

2017-04-26 Thread Ricardo Neri
On Tue, 2017-04-18 at 11:42 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:39PM -0800, Ricardo Neri wrote:
> > When computing a linear address and segmentation is used, we need to know
> > the base address of the segment involved in the computation. In most of
> > the cases, the segment base address will be zero as in USER_DS/USER32_DS.
> > However, it may be possible that a user space program defines its own
> > segments via a local descriptor table. In such a case, the segment base
> > address may not be zero. Thus, the segment base address is needed to
> > correctly calculate the linear address.
> > 
> > The segment selector to be used when computing a linear address is
> > determined by either any segment override prefixes in the
> > instruction or inferred from the registers involved in the computation of
> > the effective address; in that order. Also, there are cases when the
> > overrides shall be ignored.
> > 
> > For clarity, this process can be split into two steps: resolving the
> > relevant segment and, once known, reading the applicable segment selector.
> > The method to obtain the segment selector depends on several factors. In
> > 32-bit builds, segment selectors are saved into the pt_regs structure
> > when switching to kernel mode. The same is also true for virtual-8086
> > mode. In 64-bit builds, segmentation is mostly ignored, except when
> > running a program in 32-bit legacy mode. In this case, CS and SS can be
> > obtained from pt_regs. DS, ES, FS and GS can be read directly from
> > registers.
> 
> > Lastly, segmentation is possible in 64-bit mode via FS and GS.
> 
> I'd say "Lastly, the only two segment registers which are not ignored in
> long mode are FS and GS."

I will make this clarification.
> 
> > In these two cases, base addresses are obtained from the relevant MSRs.
> 
> s/relevant/respective/

Will clarify.
> 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 195 
> > +++
> >  1 file changed, 195 insertions(+)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 78df1c9..8d45df8 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -8,6 +8,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  enum reg_type {
> > REG_TYPE_RM = 0,
> > @@ -15,6 +16,200 @@ enum reg_type {
> > REG_TYPE_BASE,
> >  };
> >  
> > +enum segment {
> > +   SEG_CS = 0x2e,
> > +   SEG_SS = 0x36,
> > +   SEG_DS = 0x3e,
> > +   SEG_ES = 0x26,
> > +   SEG_FS = 0x64,
> > +   SEG_GS = 0x65
> > +};
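
A minimal sketch of scanning an instruction's prefix bytes for a segment
override follows. The byte values are the architectural segment override
prefixes; the names PFX_* and find_seg_override() are hypothetical.
Unlike the loop quoted elsewhere in this thread, the sketch skips
prefixes that are not segment overrides instead of returning an error;
that is a choice made for the example, not a statement about the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Architectural segment override prefix bytes. */
enum {
	PFX_ES = 0x26,
	PFX_CS = 0x2e,
	PFX_SS = 0x36,
	PFX_DS = 0x3e,
	PFX_FS = 0x64,
	PFX_GS = 0x65
};

/*
 * Return the first segment override prefix found in the given prefix
 * bytes, or 0 if there is none.
 */
static int find_seg_override(const uint8_t *prefixes, size_t nbytes)
{
	size_t i;

	for (i = 0; i < nbytes; i++) {
		switch (prefixes[i]) {
		case PFX_ES:
		case PFX_CS:
		case PFX_SS:
		case PFX_DS:
		case PFX_FS:
		case PFX_GS:
			return prefixes[i];
		default:
			/* e.g. 0x66 or 0xf3: not a segment override */
			break;
		}
	}

	return 0;
}
```

The caller would still decide whether the override applies, for example
ignoring CS, DS, ES and SS in 64-bit mode.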
> > +
> > +/**
> > + * resolve_seg_selector() - obtain segment selector
> > + * @regs:  Set of registers containing the segment selector
> 
> That arg is gone.

This came from one of my initial implementations. I will remove it.
> 
> > + * @insn:  Instruction structure with selector override prefixes
> > + * @regoff:	Operand offset, in pt_regs, of which the selector is needed
> > + * @default:	Resolve default segment selector (i.e., ignore overrides)
> > + *
> > + * The segment selector to which an effective address refers depends on
> > + * a) segment selector overrides instruction prefixes or b) the operand
> > + * register indicated in the ModRM or SiB byte.
> > + *
> > + * For case a), the function inspects any prefixes in the insn instruction;
> 
> s/insn //

In this case I meant "any prefixes in the insn structure". That wording
will probably be clearer.
> 
> > + * insn can be null to indicate that selector override prefixes shall be
> > +

Re: [v6 PATCH 05/21] x86/insn-eval: Add utility functions to get register offsets

2017-04-26 Thread Ricardo Neri
On Wed, 2017-04-12 at 18:28 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:38PM -0800, Ricardo Neri wrote:
> > The function insn_get_reg_offset takes as argument an enumeration that
> 
> Please end function names with parentheses.

Will do! 
> 
> And do you mean get_reg_offset(), per chance?

Yes, I meant that. This was a copy/paste error.
> 
> > indicates the type of offset that is returned: the R/M part of the ModRM
> > byte, the index of the SIB byte or the base of the SIB byte.
> 
> Err, you mean, it returns the offset to the register the argument
> specifies.

Yes. I will reword.
> 
> > Callers of
> > this function would need the definition of such enumeration. This is not
> > needed. Instead, helper functions can be defined for this purpose can be
> > added.
> 
> "Instead, add helpers... "

I will reword.
> 
> > These functions are useful in cases when, for instance, the caller
> > needs to decide whether the operand is a register or a memory location by
> > looking at the mod part of the ModRM byte.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Masami Hiramatsu <mhira...@kernel.org>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Kees Cook <keesc...@chromium.org>
> > Cc: Thomas Garnier <thgar...@google.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Dmitry Vyukov <dvyu...@google.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/include/asm/insn-eval.h |  3 +++
> >  arch/x86/lib/insn-eval.c | 51 
> > 
> >  2 files changed, 54 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
> > index 5cab1b1..754211b 100644
> > --- a/arch/x86/include/asm/insn-eval.h
> > +++ b/arch/x86/include/asm/insn-eval.h
> > @@ -12,5 +12,8 @@
> >  #include 
> >  
> >  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
> > +int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
> > +int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> > +int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
> 
> Forgotten to edit the copy-paste?
> 
> Which means, nothing really needs insn_get_reg_offset_sib_index() and
> you can get rid of it?

Yes, I can get rid of it.
> 
> >  #endif /* _ASM_X86_INSN_EVAL_H */
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 23cf010..78df1c9 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -98,6 +98,57 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> > return regoff[regno];
> >  }
> >  
> > +/**
> > + * insn_get_reg_offset_modrm_rm - Obtain register in r/m part of ModRM byte
> > + * @insn:  Instruction structure containing the ModRM byte
> > + * @regs:  Set of registers indicated by the ModRM byte
> 
> That's simply struct pt_regs - not a set of registers indicated by
> ModRM?!?

I will reword it to say "A struct pt_regs containing register values
indicated by the ModRM byte".
> 
> > + * Obtain the register indicated by the r/m part of the ModRM byte. The
> > + * register is obtained as an offset from the base of pt_regs. In specific
> > + * cases, the returned value can be -EDOM to indicate that the particular value
> > + * of ModRM does not refer to a register.
> 
> Put that sentence under the "Return: " paragraph below so that it is
> immediately obvious what the retvals are.

Will do.
> 
> > + *
> > + * Return: Register indicated by r/m, as an offset within struct pt_regs
> > + */
> > +int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs)
> 
> That name is too long: insn_get_modrm_rm_off() should be enough.
> 
> > +{
> > +   return get_reg_offset(insn, regs, REG_TYPE_RM);
> > +}
> > +
> > +/**
> > + * insn_get_reg_offset_sib_base - Obtain register in base part of SiB byte
> > + * @insn:  Instruction str

Re: [v6 PATCH 04/21] x86/mpx, x86/insn: Relocate insn util functions to a new insn-kernel

2017-04-25 Thread Ricardo Neri
On Wed, 2017-04-12 at 12:03 +0200, Borislav Petkov wrote:
> > +		 * If mod is 0 and register R/EBP (regno=5) is indicated in the
> > +		 * base part of the SIB byte, the value of such register should
> > +		 * not be used in the address computation. Also, a 32-bit
> > +		 * displacement is expected in this case; the instruction
> > +		 * decoder takes care of it. This is true for both R13 and
> > +		 * R/EBP as REX.B will not be decoded.
> > +		 */
> > +		if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
> > +			return -EDOM;
> > +
> > + if (X86_REX_B(insn->rex_prefix.value))
> > + regno += 8;
> > + break;
> > +
> > + default:
> > + pr_err("invalid register type");
> > + BUG();
> 
> WARNING: Avoid crashing the kernel - try using WARN_ON & recovery code
> rather than BUG() or BUG_ON()
> #211: FILE: arch/x86/lib/insn-eval.c:90:
> +   BUG();
> 
> And checkpatch is kinda right. We need to warn here, not explode. Oh and
> that function returns negative values on error...
> 
> Please change that with a patch ontop of the move.

Sure, I will change it.




Re: [v6 PATCH 03/21] x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0

2017-04-25 Thread Ricardo Neri
On Wed, 2017-04-12 at 00:08 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:36PM -0800, Ricardo Neri wrote:
> > Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
> > Developer's Manual volume 2A states that when a SIB byte is used and the
> > base of the SIB byte points to R/EBP (i.e., base = 5) and the mod part
> > of the ModRM byte is zero, the value of such register will not be used
> > as part of the address computation. To signal this, a -EDOM error is
> > returned to indicate to callers that they should ignore the value.
> > 
> > Also, for this particular case, a displacement of 32-bits should follow
> > the SIB byte if the mod part of ModRM is equal to zero. The instruction
> > decoder ensures that this is the case.
> > 
> > Cc: Dave Hansen <dave.han...@linux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
> > Cc: Colin Ian King <colin.k...@canonical.com>
> > Cc: Lorenzo Stoakes <lstoa...@gmail.com>
> > Cc: Qiaowei Ren <qiaowei@intel.com>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Nathan Howard <liverl...@gmail.com>
> > Cc: Adan Hawthorn <adanhawth...@gmail.com>
> > Cc: Joe Perches <j...@perches.com>
> > Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  arch/x86/mm/mpx.c | 29 ++---
> >  1 file changed, 22 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
> > index d9e92d6..ef7eb67 100644
> > --- a/arch/x86/mm/mpx.c
> > +++ b/arch/x86/mm/mpx.c
> > @@ -121,6 +121,17 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> >  
> > case REG_TYPE_BASE:
> > regno = X86_SIB_BASE(insn->sib.value);
> > +   /*
> > +* If mod is 0 and register R/EBP (regno=5) is indicated in the
> > +* base part of the SIB byte,
> 
> you can simply say here: "if SIB.base == 5, the base of the
> register-indirect addressing is 0."

This is better wording. I will change it.
> 
> > the value of such register should
> > +* not be used in the address computation. Also, a 32-bit
> 
> Not "Also" but "In this case, a 32-bit displacement..."

Will change.
> 
> > +* displacement is expected in this case; the instruction
> > +* decoder takes care of it. This is true for both R13 and
> > +* R/EBP as REX.B will not be decoded.
> 
> You don't need that sentence as the only thing that matters is ModRM.mod
> being 0.

For the specific case of ModRM.mod being 0, I feel I need to clarify
that REX.B is not decoded and if SIB.base is %r13 the base is also 0.
This comment adds clarity because REX.X is decoded when determining
SIB.index.
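
To make the ordering of the two checks concrete, here is a minimal
stand-alone sketch. The function name sib_base_regno() and its raw-byte
parameters are assumptions made for the example; the patch itself works
on struct insn. The point is that the mod == 0, base == 5 test happens
before REX.B is applied, so the same result covers both R/EBP and R13.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the register number encoded in SIB.base,
 * or -EDOM when no base register takes part in the address computation.
 */
static int sib_base_regno(uint8_t modrm, uint8_t sib, int rex_b)
{
	unsigned int mod = modrm >> 6;	/* ModRM.mod: bits 7:6 */
	int regno = sib & 0x7;		/* SIB.base: bits 2:0 */

	/*
	 * With ModRM.mod == 0 and SIB.base == 5 there is no base register;
	 * a 32-bit displacement follows instead. REX.B is not consulted.
	 */
	if (!mod && regno == 5)
		return -EDOM;

	if (rex_b)
		regno += 8;

	return regno;
}
```

A caller would then use a base of 0 when it sees -EDOM and treat any
other negative value as a decoding error.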
> 
> > +*/
> > +   if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
> 
> The 0 test we normally do with the ! (also flip parts of if-condition):
> 
>   if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5)

Will change it.
> 
> > +   return -EDOM;
> > +
> > if (X86_REX_B(insn->rex_prefix.value))
> > regno += 8;
> > break;
> > @@ -161,16 +172,21 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> > eff_addr = regs_get_register(regs, addr_offset);
> > } else {
> > if (insn->sib.nbytes) {
> > +   /*
> > +* Negative values in the base and index offset means
> > +* an error when decoding the SIB byte. Except -EDOM,
> > +* which means that the registers should not be used
> > +* in the address computation.
> > +*/
> > base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
> > -   if (base_offset < 0)
> > +   if (unlikely(base_offset == -EDOM))
> > +   base = 0;
> > +   else if (unlikely(base_offset < 0))
> 
> Bah, unlikely's in something which is not really a hot path. They only
> encumber readability, no need for them.

I will remove them.

Thanks and BR,
Ricardo



Re: [v6 PATCH 01/21] x86/mpx: Use signed variables to compute effective addresses

2017-04-25 Thread Ricardo Neri
On Tue, 2017-04-11 at 23:56 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:34PM -0800, Ricardo Neri wrote:
> > Even though memory addresses are unsigned. The operands used to compute the
> 
>   ... unsigned, the operands ...

Oops! I will correct.




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-04-03 Thread Ricardo Neri
On Fri, 2017-03-31 at 16:11 +0200, Alexandre Julliard wrote:
> Ricardo Neri <ricardo.neri-calde...@linux.intel.com> writes:
> 
> > On Thu, 2017-03-30 at 13:10 +0300, Stas Sergeev wrote:
> >> 30.03.2017 08:14, Ricardo Neri wrote:
> >> >>>> But at least dosemu implements it, so probably it is needed.
> >> >>> Right.
> >> >>>
> >> >>>> Of course if it is used by one of 100 DOS progs, then there
> >> >>>> is an option to just add its support to dosemu2 and pretend
> >> >>>> the compatibility problems did not exist. :)
> >> >>> Do you mean relaying the GP fault to dosemu instead of trapping it and
> >> >>> emulating it in the kernel?
> >> >> Yes, that would be optimal if this does not severely break
> >> >> the current setups. If we can find out that smsw is not in
> >> >> the real use, we can probably do exactly that.
> >> >> But other
> >> >> instructions are not in real use in v86 for sure, so I
> >> >> wouldn't be adding the explicit test-cases to the kernel
> >> >> that will make you depend on some particular behaviour
> >> >> that no one may need.
> >> >> My objection was that we shouldn't
> >> >> write tests before we know exactly how we want this to work.
> >> > OK, if only SMSW is used then I'll keep the emulation for SMSW only.
> >> In fact, smsw has an interesting property, which is that
> >> no one will ever want to disable its in-kernel emulation
> >> to provide its own.
> >> So while I'll try to estimate its usage, emulating it in kernel
> >> will not be that problematic in either case.
> >
> > Ah good to know!
> >
> >> As for protected mode, if wine only needs sgdt/sidt, then
> >> again, no one will want to disable its emulation. Not the
> >> case with sldt, but AFAICS wine doesn't need sldt, and so
> >> we can leave sldt without a fixups. Is my understanding
> >> correct?
> >
> > This is my understanding as well. I could not find any use of sldt in
> > wine. Alexandre, would you mind confirming?
> 
> Some versions of the Themida software protection are known to use sldt
> as part of the virtual machine detection code [1]. The check currently
> fails because it expects the LDT to be zero, so the app is already
> broken, but sldt segfaulting would still cause a crash where there
> wasn't one before.
> 
> However, I'm only aware of one application using this, and being able to
> catch and emulate sldt ourselves would actually give us a chance to fix
> this app in newer Wine versions, so I'm not opposed to having it
> segfault.

Great! Then this is in line with what we are aiming to do with dosemu2:
not emulate str and sldt.
> 
> In fact it would be nice to be able to make sidt/sgdt/etc. segfault
> too. I know a new syscall is a pain, but as far as Wine is concerned,
> being able to opt out from any emulation would be potentially useful.

I see. I guess for now there should not be a problem with emulating
sidt/sgdt/smsw, right? In this way we don't break current versions of
winehq and programs using it. In a phase two we can introduce the
syscall so that kernel fixups can be disabled. Does this make sense?

Thanks and BR,
Ricardo



Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-30 Thread Ricardo Neri
On Thu, 2017-03-30 at 13:10 +0300, Stas Sergeev wrote:
> 30.03.2017 08:14, Ricardo Neri wrote:
> >>>> But at least dosemu implements it, so probably it is needed.
> >>> Right.
> >>>
> >>>> Of course if it is used by one of 100 DOS progs, then there
> >>>> is an option to just add its support to dosemu2 and pretend
> >>>> the compatibility problems did not exist. :)
> >>> Do you mean relaying the GP fault to dosemu instead of trapping it and
> >>> emulating it in the kernel?
> >> Yes, that would be optimal if this does not severely break
> >> the current setups. If we can find out that smsw is not in
> >> the real use, we can probably do exactly that.
> >> But other
> >> instructions are not in real use in v86 for sure, so I
> >> wouldn't be adding the explicit test-cases to the kernel
> >> that will make you depend on some particular behaviour
> >> that no one may need.
> >> My objection was that we shouldn't
> >> write tests before we know exactly how we want this to work.
> > OK, if only SMSW is used then I'll keep the emulation for SMSW only.
> In fact, smsw has an interesting property, which is that
> no one will ever want to disable its in-kernel emulation
> to provide its own.
> So while I'll try to estimate its usage, emulating it in kernel
> will not be that problematic in either case.

Ah good to know!

> As for protected mode, if wine only needs sgdt/sidt, then
> again, no one will want to disable its emulation. Not the
> case with sldt, but AFAICS wine doesn't need sldt, and so
> we can leave sldt without a fixup. Is my understanding
> correct?

This is my understanding as well. I could not find any use of sldt in
wine. Alexandre, would you mind confirming?

> In this case, I suppose, we are very well on a way to avoid
> the extra syscalls to toggle the emulation features.

Great! Then I will keep the emulation for sgdt, sidt, and smsw but not
for str and sldt; for both vm86 and protected mode. This seems to be the
agreement.
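
The agreement above can be sketched as a small decision function. This is only an illustration of the policy discussed in this thread, not the kernel's code: the enum names and umip_should_emulate() are hypothetical, and the 64-bit case assumes, as stated earlier in the thread, that no emulation is done for 64-bit code segments.

```c
#include <stdbool.h>

/* Instructions covered by UMIP, as listed in the cover letter. */
enum umip_insn { UMIP_SGDT, UMIP_SIDT, UMIP_SLDT, UMIP_SMSW, UMIP_STR };

/* Context of the faulting code (illustrative names, not kernel ones). */
enum cpu_ctx { CTX_VM86, CTX_PROT_16_32, CTX_LONG_64 };

/*
 * Hypothetical policy helper: returns true when the fault would be fixed
 * up with dummy values, false when it would be relayed to user space.
 */
bool umip_should_emulate(enum umip_insn insn, enum cpu_ctx ctx)
{
	/* Assumption from the thread: 64-bit code segments are not emulated. */
	if (ctx == CTX_LONG_64)
		return false;

	switch (insn) {
	case UMIP_SGDT:
	case UMIP_SIDT:
	case UMIP_SMSW:
		return true;	/* kept for both vm86 and protected mode */
	case UMIP_SLDT:
	case UMIP_STR:
		return false;	/* no fixup in either mode */
	}
	return false;
}
```

For example, umip_should_emulate(UMIP_SMSW, CTX_VM86) is true, while umip_should_emulate(UMIP_SLDT, CTX_PROT_16_32) is false.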

Thanks and BR,
Ricardo




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-29 Thread Ricardo Neri
On Wed, 2017-03-29 at 23:55 +0300, Stas Sergeev wrote:
> 29.03.2017 07:38, Ricardo Neri wrote:
> >> Probably you could also remove
> >> the sldt and str emulation for protected mode, because,
> >> as I understand from this thread, wine does not
> >> need those.
> > I see. I would lean on keeping the emulation because I already
> > implemented it :), for completeness, and because it is performed in a
> > single switch. The bulk of the emulation code deals with operands.
> But this is not for free.
> As Andy said, you will then need a syscall and
> a feature mask to be able to disable this emulation.
> And AFAIK you haven't implemented that yet, so
> there is something to consider.

Right, I see your point.

> >>>> You know the wine's
> >>>> requirements now - they are very small. And
> >>>> dosemu doesn't need anything at all but smsw.
> >>>> And even smsw is very rare.
> >>> But emulation is still needed for SMSW, right?
> >> Likely so.
> >> If you want, I can enable the logging of this command
> >> and see if it is used by some of the DOS programs I have.
> > It would be great if you could do that, if you don't mind.
> OK, scheduled to the week-end.
> I'll let you know.

Thanks!

> 
> >> But at least dosemu implements it, so probably it is needed.
> > Right.
> >
> >> Of course if it is used by one of 100 DOS progs, then there
> >> is an option to just add its support to dosemu2 and pretend
> >> the compatibility problems did not exist. :)
> > Do you mean relaying the GP fault to dosemu instead of trapping it and
> > emulating it in the kernel?
> Yes, that would be optimal if this does not severely break
> the current setups. If we can find out that smsw is not in
> the real use, we can probably do exactly that. 
> But other
> instructions are not in real use in v86 for sure, so I
> wouldn't be adding the explicit test-cases to the kernel
> that will make you depend on some particular behaviour
> that no one may need.

> My objection was that we shouldn't
> write tests before we know exactly how we want this to work.
OK, if only SMSW is used then I'll keep the emulation for SMSW only.




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-28 Thread Ricardo Neri
On Tue, 2017-03-28 at 12:38 +0300, Stas Sergeev wrote:
> 28.03.2017 02:46, Ricardo Neri wrote:
> > On Tue, 2017-03-14 at 00:25 +0300, Stas Sergeev wrote:
> >> 11.03.2017 02:59, Ricardo Neri wrote:
> >>> On Fri, 2017-03-10 at 14:33 +0300, Stas Sergeev wrote:
> >>>
> >>>> Why would you need one?
> >>>> Or do you really want to allow these instructions
> >>>> in v86 by the means of emulation? If so - this wasn't
> >>>> clearly stated in the patch description, nor was it
> >>>> properly discussed, it seems.
> >>> str and sldt can be emulated in vm86 but, as Andy mentioned, the
> >>> behavior should be the same with and without emulation.
> >> Why would you do that?
> >> I looked up the dosemu2 CPU simulator code that
> >> is used under x86-64. It says this:
> > Stas, I apologize for the delayed reply; I missed your e-mail.
> >> It only implements smsw.
> >> So maybe you can make your code much
> >> simpler and remove the unneeded emulation?
> >> Same is for prot mode.
> > Do you mean the unneeded emulation for SLDT and STR?
> Not quite, I meant also sgdt and sidt in vm86.
> Yes, it will be a somewhat "incompatible" change,
> but if there is nothing to stay compatible with,
> then why worry?

My idea of compatibility was to have the emulation code behave exactly
as a processor without UMIP :)

> Probably you could also remove
> the sldt and str emulation for protected mode, because,
> as I understand from this thread, wine does not
> need those.

I see. I would lean on keeping the emulation because I already
implemented it :), for completeness, and because it is performed in a
single switch. The bulk of the emulation code deals with operands.
> 
> Note that these days dosemu2 uses v86 mode set
> up under kvm rather than vm86(). Your patches
> affect that the same way as they do for vm86()
> syscall, or can there be some differences?
My code does not touch kvm at all. I would need to assess how kvm will
behave.
> Or should
> the UMIP be enabled under kvm by hands?
There was an attempt to emulate UMIP that was submitted a while ago:
https://lkml.org/lkml/2016/7/12/644

> 
> >> You know the wine's
> >> requirements now - they are very small. And
> >> dosemu doesn't need anything at all but smsw.
> >> And even smsw is very rare.
> > But emulation is still needed for SMSW, right?
> Likely so.
> If you want, I can enable the logging of this command
> and see if it is used by some of the DOS programs I have.

It would be great if you could do that, if you don't mind.
> But at least dosemu implements it, so probably it is needed.

Right.

> Of course if it is used by one of 100 DOS progs, then there
> is an option to just add its support to dosemu2 and pretend
> the compatibility problems did not exist. :)
Do you mean relaying the GP fault to dosemu instead of trapping it and
emulating it in the kernel?

Thanks and BR,
Ricardo



Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-27 Thread Ricardo Neri
On Tue, 2017-03-14 at 00:25 +0300, Stas Sergeev wrote:
> 11.03.2017 02:59, Ricardo Neri wrote:
> > On Fri, 2017-03-10 at 14:33 +0300, Stas Sergeev wrote:
> >
> >> Why would you need one?
> >> Or do you really want to allow these instructions
> >> in v86 by the means of emulation? If so - this wasn't
> >> clearly stated in the patch description, nor was it
> >> properly discussed, it seems.
> > str and sldt can be emulated in vm86 but, as Andy mentioned, the
> > behavior should be the same with and without emulation.
> Why would you do that?
> I looked up the dosemu2 CPU simulator code that
> is used under x86-64. It says this:

Stas, I apologize for the delayed reply; I missed your e-mail. 
> ---
>  CODE_FLUSH();
>  if (REALMODE()) goto illegal_op;
>  PC += ModRMSim(PC+1, mode) + 1;
>  error("SLDT not implemented\n");
>  break;
>  case 1: /* STR */
>  /* Store Task Register */
>  CODE_FLUSH();
>  if (REALMODE()) goto illegal_op;
>  PC += ModRMSim(PC+1, mode) + 1;
>  error("STR not implemented\n");
>  break;
> ...
>  case 0: /* SGDT */
>  /* Store Global Descriptor Table Register */
>  PC++; PC += ModRM(opc, PC, mode|DATA16|MSTORE);
>  error("SGDT not implemented\n");
>  break;
>  case 1: /* SIDT */
>  /* Store Interrupt Descriptor Table Register */
>  PC++; PC += ModRM(opc, PC, mode|DATA16|MSTORE);
>  error("SIDT not implemented\n");
>  break;
> ---
> 
> It only implements smsw.
> So maybe you can make your code much
> simpler and remove the unneeded emulation?
> Same is for prot mode.

Do you mean the unneeded emulation for SLDT and STR?

> You know the wine's
> requirements now - they are very small. And
> dosemu doesn't need anything at all but smsw.
> And even smsw is very rare.
But emulation is still needed for SMSW, right?

The majority of my patches deal with computing the effective address
from the instruction operands, and the linear address from the effective
address and the segment descriptor. Only two or three patches deal with
identifying particular UMIP-protected instructions. Not having to worry
about STR and SLDT in vm86 could simplify things a bit, though.

Thanks and BR,
Ricardo




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-10 Thread Ricardo Neri
On Fri, 2017-03-10 at 06:17 -0800, Andy Lutomirski wrote:
> On Fri, Mar 10, 2017 at 3:33 AM, Stas Sergeev <s...@list.ru> wrote:
> > 10.03.2017 05:39, Andy Lutomirski wrote:
> >
> >> On Thu, Mar 9, 2017 at 2:10 PM, Stas Sergeev <s...@list.ru> wrote:
> >>>
> >>> 09.03.2017 04:15, Ricardo Neri wrote:
> >>>
> >>>> On Wed, 2017-03-08 at 08:46 -0800, Andy Lutomirski wrote:
> >>>>>
> >>>>> On Wed, Mar 8, 2017 at 8:29 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>>>>
> >>>>>> 08.03.2017 19:06, Andy Lutomirski wrote:
> >>>>>>>
> >>>>>>> On Wed, Mar 8, 2017 at 6:08 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>>>>>>
> >>>>>>>> 08.03.2017 03:32, Ricardo Neri wrote:
> >>>>>>>>>
> >>>>>>>>> These are the instructions covered by UMIP:
> >>>>>>>>> * SGDT - Store Global Descriptor Table
> >>>>>>>>> * SIDT - Store Interrupt Descriptor Table
> >>>>>>>>> * SLDT - Store Local Descriptor Table
> >>>>>>>>> * SMSW - Store Machine Status Word
> >>>>>>>>> * STR - Store Task Register
> >>>>>>>>>
> >>>>>>>>> This patchset initially treated tasks running in virtual-8086
> >>>>>
> >>>>> mode as a
> >>>>>>>>>
> >>>>>>>>> special case. However, I received clarification that DOSEMU[8]
> >>>>>
> >>>>> does not
> >>>>>>>>>
> >>>>>>>>> support applications that use these instructions.
> >>>>>>>
> >>>>>>> Can you remind me what was special about it?  It looks like you
> >>>>>
> >>>>> still
> >>>>>>>
> >>>>>>> emulate them in v8086 mode.
> >>>>>>
> >>>>>> Indeed, sorry, I meant prot mode here. :)
> >>>>>> So I wonder what was cited to be special about v86.
> >>>>
> >>>> Initially my patches disabled UMIP on virtual-8086 instructions, without
> >>>> regards of protected mode (i.e., UMIP was always enabled). I didn't have
> >>>> emulation at the time. Then, I added emulation code that now covers
> >>>> protected and virtual-8086 modes. I guess it is not special anymore.
> >>>
> >>> But doesn't SLDT just throw #UD in v86?
> >>> How does UMIP affect this? How does your patch affect
> >>> this?
> >>
> >> Er, right.  Ricardo, your code may need fixing.  But don't you have a
> >> test case for this?
> >
> > Why would you need one?
> > Or do you really want to allow these instructions
> > in v86 by the means of emulation? If so - this wasn't
> > clearly stated in the patch description, nor was it
> > properly discussed, it seems.
> 
> What I meant was: if the patches incorrectly started making these
> instructions work in vm86 mode where they used to cause a vm86 exit,
> then that's a bug that the selftest should have caught.

Yes, this is the case. I will fix this behavior... and update the test
cases.




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-10 Thread Ricardo Neri
On Sat, 2017-03-11 at 02:58 +0300, Stas Sergeev wrote:
> 11.03.2017 02:47, Ricardo Neri wrote:
> >>
> >>>> It doesn't need to be a matter of this particular
> >>>> patch set, i.e. this proposal should not trigger a
> >>>> v7 resend of all 21 patches. :) But it would be useful
> >>>> for the future development of dosemu2.
> >>> Would dosemu2 use 32-bit processes in order to keep segmentation? If it
> >>> could use 64-bit processes, emulation is not used in this case and the
> >>> SIGSEGV is delivered to user space.
> >> It does use the mix: 64bit process but some segments
> >> are 32bit for DOS code.
> > Do you mean that dosemu2 will start as a 64-bit process and will jump to
> > 32-bit code segments?
> Yes, so the offending insns are executed only in 32bit
> and 16bit segments, even if the process itself is 64bit.
> I guess you handle 16bit segments same as 32bit ones.

I have code to handle 16-bit and 32-bit address encodings differently.
Segmentation is used if !user_64bit_mode(regs). In such a case, the
emulation code will check the segment descriptor D flag and the
address-size override prefix to determine the address size and use
16-bit or 32-bit address encodings as applicable.
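
The rule described above can be sketched as follows. This is a minimal illustration under the usual x86 convention for non-64-bit code segments (the D flag selects the default address size and the 0x67 prefix switches to the other one); insn_address_bytes() is a hypothetical helper, not a function from the patch set.

```c
#include <stdbool.h>

/*
 * Effective address size, in bytes, for a non-64-bit code segment.
 *
 * d_flag:        D bit of the code segment descriptor (set = 32-bit default)
 * addr_override: true if the instruction carries the 0x67 prefix
 */
int insn_address_bytes(bool d_flag, bool addr_override)
{
	int default_bytes = d_flag ? 4 : 2;

	/* The address-size override prefix selects the non-default size. */
	if (addr_override)
		return default_bytes == 4 ? 2 : 4;

	return default_bytes;
}
```

With these assumptions, a 32-bit code segment without the prefix gives 4, and the same segment with the prefix gives 2; a 16-bit segment behaves the other way around.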

> 
> >   My emulation code should work in this case as it
> > will use segmentation in 32-bit code descriptors. Is there anything else
> > needed?
> If I understand you correctly, you are saying that SLDT
> executed in 64bit code segment, will inevitably segfault
> to userspace. 
Correct.

> If this is the case and it makes your code
> simpler, then its perfectly fine with me as dosemu does
> not do this and the 64bit DOS progs are not anticipated.

But if 32-bit or 16-bit code segments are used, emulation will take place.

Thanks and BR,
Ricardo




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-10 Thread Ricardo Neri
On Fri, 2017-03-10 at 14:33 +0300, Stas Sergeev wrote:
> 10.03.2017 05:39, Andy Lutomirski wrote:
> > On Thu, Mar 9, 2017 at 2:10 PM, Stas Sergeev <s...@list.ru> wrote:
> >> 09.03.2017 04:15, Ricardo Neri wrote:
> >>
> >>> On Wed, 2017-03-08 at 08:46 -0800, Andy Lutomirski wrote:
> >>>> On Wed, Mar 8, 2017 at 8:29 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>>> 08.03.2017 19:06, Andy Lutomirski wrote:
> >>>>>> On Wed, Mar 8, 2017 at 6:08 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>>>>> 08.03.2017 03:32, Ricardo Neri wrote:
> >>>>>>>> These are the instructions covered by UMIP:
> >>>>>>>> * SGDT - Store Global Descriptor Table
> >>>>>>>> * SIDT - Store Interrupt Descriptor Table
> >>>>>>>> * SLDT - Store Local Descriptor Table
> >>>>>>>> * SMSW - Store Machine Status Word
> >>>>>>>> * STR - Store Task Register
> >>>>>>>>
> >>>>>>>> This patchset initially treated tasks running in virtual-8086
> >>>> mode as a
> >>>>>>>> special case. However, I received clarification that DOSEMU[8]
> >>>> does not
> >>>>>>>> support applications that use these instructions.
> >>>>>> Can you remind me what was special about it?  It looks like you
> >>>> still
> >>>>>> emulate them in v8086 mode.
> >>>>> Indeed, sorry, I meant prot mode here. :)
> >>>>> So I wonder what was cited to be special about v86.
> >>> Initially my patches disabled UMIP on virtual-8086 instructions, without
> >>> regards of protected mode (i.e., UMIP was always enabled). I didn't have
> >>> emulation at the time. Then, I added emulation code that now covers
> >>> protected and virtual-8086 modes. I guess it is not special anymore.
> >> But doesn't SLDT just throw #UD in v86?
> >> How does UMIP affect this? How does your patch affect
> >> this?
> > Er, right.  Ricardo, your code may need fixing.  But don't you have a
> > test case for this?
> Why would you need one?
> Or do you really want to allow these instructions
> in v86 by the means of emulation? If so - this wasn't
> clearly stated in the patch description, nor was it
> properly discussed, it seems.

str and sldt can be emulated in vm86 but, as Andy mentioned, the
behavior should be the same with and without emulation.

Thanks and BR,
Ricardo




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-10 Thread Ricardo Neri
On Thu, 2017-03-09 at 18:39 -0800, Andy Lutomirski wrote:
> On Thu, Mar 9, 2017 at 2:10 PM, Stas Sergeev <s...@list.ru> wrote:
> > 09.03.2017 04:15, Ricardo Neri wrote:
> >
> >> On Wed, 2017-03-08 at 08:46 -0800, Andy Lutomirski wrote:
> >>>
> >>> On Wed, Mar 8, 2017 at 8:29 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>>
> >>>> 08.03.2017 19:06, Andy Lutomirski wrote:
> >>>>>
> >>>>> On Wed, Mar 8, 2017 at 6:08 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>>>>
> >>>>>> 08.03.2017 03:32, Ricardo Neri wrote:
> >>>>>>>
> >>>>>>> These are the instructions covered by UMIP:
> >>>>>>> * SGDT - Store Global Descriptor Table
> >>>>>>> * SIDT - Store Interrupt Descriptor Table
> >>>>>>> * SLDT - Store Local Descriptor Table
> >>>>>>> * SMSW - Store Machine Status Word
> >>>>>>> * STR - Store Task Register
> >>>>>>>
> >>>>>>> This patchset initially treated tasks running in virtual-8086
> >>>
> >>> mode as a
> >>>>>>>
> >>>>>>> special case. However, I received clarification that DOSEMU[8]
> >>>
> >>> does not
> >>>>>>>
> >>>>>>> support applications that use these instructions.
> >>>>>
> >>>>> Can you remind me what was special about it?  It looks like you
> >>>
> >>> still
> >>>>>
> >>>>> emulate them in v8086 mode.
> >>>>
> >>>> Indeed, sorry, I meant prot mode here. :)
> >>>> So I wonder what was cited to be special about v86.
> >>
> >> Initially my patches disabled UMIP on virtual-8086 instructions, without
> >> regards of protected mode (i.e., UMIP was always enabled). I didn't have
> >> emulation at the time. Then, I added emulation code that now covers
> >> protected and virtual-8086 modes. I guess it is not special anymore.
> >
> > But doesn't SLDT just throw #UD in v86?
> > How does UMIP affect this? How does your patch affect
> > this?
> 
> Er, right.  Ricardo, your code may need fixing.  But don't you have a
> test case for this?  The behavior should be the same with and without
> your patches applied.  The exception is #UD, not #GP, so maybe your
> code just never executes in the vm86 case.

Ouch! Yes, I am afraid my code will attempt to emulate sldt in vm86
mode. The test cases that I have for vm86 are only for the instructions
that are valid in vm86: smsw, sidt and sgdt.

I will add test cases for str and sldt and make sure that a #UD is
issued.

Would this trigger a v7 series?

Thanks and BR,
Ricardo
> 
> --Andy




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-10 Thread Ricardo Neri
On Fri, 2017-03-10 at 01:01 +0300, Stas Sergeev wrote:
> 09.03.2017 03:46, Ricardo Neri wrote:
> > On Wed, 2017-03-08 at 17:08 +0300, Stas Sergeev wrote:
> >> 08.03.2017 03:32, Ricardo Neri wrote:
> >>> These are the instructions covered by UMIP:
> >>> * SGDT - Store Global Descriptor Table
> >>> * SIDT - Store Interrupt Descriptor Table
> >>> * SLDT - Store Local Descriptor Table
> >>> * SMSW - Store Machine Status Word
> >>> * STR - Store Task Register
> >>>
> >>> This patchset initially treated tasks running in virtual-8086 mode as a
> >>> special case. However, I received clarification that DOSEMU[8] does not
> >>> support applications that use these instructions.
> >> Yes, this is the case.
> >> But at least in the past there was an attempt to
> >> support SLDT as it is used by an ancient pharlap
> >> DOS extender (currently unsupported by dosemu1/2).
> >> So how difficult would it be to add an optional
> >> possibility of delivering such SIGSEGV to userspace
> >> so that the kernel's dummy emulation can be overridden?
> > I suppose a umip=noemulation kernel parameter could be added in this
> > case.
> Why?
> It doesn't need to be global: the app should be
> able to change that on its own. Note that no app currently
> requires this, so its just for the future, and in the
> future the app can start using the new API for this,
> if you provide one.

Right, I missed this detail. Then yes, the API should allow a single app
to have the SIGSEGV relayed, without affecting others.
> 
> 
> >> It doesn't need to be a matter of this particular
> >> patch set, i.e. this proposal should not trigger a
> >> v7 resend of all 21 patches. :) But it would be useful
> >> for the future development of dosemu2.
> > Would dosemu2 use 32-bit processes in order to keep segmentation? If it
> > could use 64-bit processes, emulation is not used in this case and the
> > SIGSEGV is delivered to user space.
> It does use the mix: 64bit process but some segments
> are 32bit for DOS code.

Do you mean that dosemu2 will start as a 64-bit process and will jump to
32-bit code segments? My emulation code should work in this case as it
will use segmentation in 32-bit code descriptors. Is there anything else
needed?

Thanks and BR,
Ricardo



Re: [v6 PATCH 21/21] selftests/x86: Add tests for User-Mode Instruction Prevention

2017-03-10 Thread Ricardo Neri
On Wed, 2017-03-08 at 07:56 -0800, Andy Lutomirski wrote:
> On Tue, Mar 7, 2017 at 4:32 PM, Ricardo Neri
> <ricardo.neri-calde...@linux.intel.com> wrote:
> > Certain user space programs that run in virtual-8086 mode may utilize
> > instructions protected by the User-Mode Instruction Prevention (UMIP)
> > security feature present in new Intel processors: SGDT, SIDT and SMSW. In
> > such a case, a general protection fault is issued if UMIP is enabled. When
> > such a fault happens, the kernel catches it and emulates the results of
> > these instructions with dummy values. The purpose of this new
> > test is to verify whether the impacted instructions can be executed without
> > causing such #GP. If no #GP exceptions occur, we expect to exit virtual-
> > 8086 mode from INT 0x80.
> >
> > The instructions protected by UMIP are executed in representative use
> > cases:
> >  a) the memory address of the result is given in the form of a displacement
> > from the base of the data segment
> >  b) the memory address of the result is given in a general purpose register
> >  c) the result is stored directly in a general purpose register.
> >
> > Unfortunately, it is not possible to check the results against a set of
> > expected values because no emulation will occur in systems that do not have
> > the UMIP feature. Instead, results are printed for verification.
> 
> You could pre-initialize the result buffer to a bunch of non-matching
> values (1, 2, 3, ...) and then check that all the invocations of the
> same instruction gave the same value.

Yes, I can do this. Alternatively, I can check in the test program if
the CPU has UMIP and only run the tests in that case.
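
Andy's suggestion can be sketched as two small helpers. This is only a sketch, not code from the selftest: results_poison() and results_agree() are hypothetical names, and each "slot" stands for the bytes stored by one invocation of the same instruction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill each result slot with a distinct byte pattern (1, 2, 3, ...) so
 * that untouched slots can never compare equal to one another.
 */
void results_poison(unsigned char *slots, size_t n, size_t len)
{
	for (size_t i = 0; i < n; i++)
		memset(slots + i * len, (int)(i + 1), len);
}

/* True when every slot holds exactly the same bytes as the first one. */
bool results_agree(const unsigned char *slots, size_t n, size_t len)
{
	for (size_t i = 1; i < n; i++)
		if (memcmp(slots, slots + i * len, len) != 0)
			return false;
	return true;
}
```

A test would call results_poison() first, run the instruction once per slot, and then require results_agree() to return true. If any invocation failed to store its result, the poison pattern left in that slot makes the comparison fail.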

> 
> If you do this, maybe make it a follow-up patch -- see other email.

Great! Thank you!

Thanks and BR,
Ricardo




Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-08 Thread Ricardo Neri
On Wed, 2017-03-08 at 08:46 -0800, Andy Lutomirski wrote:
> On Wed, Mar 8, 2017 at 8:29 AM, Stas Sergeev <s...@list.ru> wrote:
> > 08.03.2017 19:06, Andy Lutomirski wrote:
> >>
> >> On Wed, Mar 8, 2017 at 6:08 AM, Stas Sergeev <s...@list.ru> wrote:
> >>>
> >>> 08.03.2017 03:32, Ricardo Neri wrote:
> >>>>
> >>>> These are the instructions covered by UMIP:
> >>>> * SGDT - Store Global Descriptor Table
> >>>> * SIDT - Store Interrupt Descriptor Table
> >>>> * SLDT - Store Local Descriptor Table
> >>>> * SMSW - Store Machine Status Word
> >>>> * STR - Store Task Register
> >>>>
> >>>> This patchset initially treated tasks running in virtual-8086
> mode as a
> >>>> special case. However, I received clarification that DOSEMU[8]
> does not
> >>>> support applications that use these instructions.
> >>
> >> Can you remind me what was special about it?  It looks like you
> still
> >> emulate them in v8086 mode.
> >
> > Indeed, sorry, I meant prot mode here. :)
> > So I wonder what was cited to be special about v86.

Initially, my patches disabled UMIP for virtual-8086 tasks, without
regard to protected mode (i.e., UMIP was always enabled there). I didn't
have emulation at the time. Then I added emulation code that now covers
protected and virtual-8086 modes. I guess it is not special anymore.

Thanks and BR,
Ricardo



Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-08 Thread Ricardo Neri
On Wed, 2017-03-08 at 19:53 +0300, Stas Sergeev wrote:
> 08.03.2017 19:46, Andy Lutomirski wrote:
> >> No no, since I meant prot mode, this is not what I need.
> >> I would never need to disable UMIP as to allow the
> >> prot mode apps to do SLDT. Instead it would be good
> >> to have an ability to provide a replacement for the dummy
> >> emulation that is currently being proposed for kernel.
> >> All is needed for this, is just to deliver a SIGSEGV.
> > That's what I meant.  Turning off FIXUP_UMIP would leave UMIP on but
> > turn off the fixup, so you'd get a SIGSEGV indicating #GP (or a vm86
> > GP exit).
> But then I am confused with the word "compat" in
> your "COMPAT_MASK0_X86_UMIP_FIXUP" and
> "sys_adjust_compat_mask(int op, int word, u32 mask);"
> 
> Leaving UMIP on and only disabling a fixup doesn't
> sound like a compat option to me. I would expect
> compat to disable it completely.

I guess that the _UMIP_FIXUP part makes it clear that emulation, not
UMIP, is disabled, allowing the SIGSEGV to be delivered to the user space
program.

Would having a COMPAT_MASK0_X86_UMIP_FIXUP to disable emulation and a
COMPAT_MASK0_X86_UMIP to disable UMIP make sense?

Also, wouldn't having a COMPAT_MASK0_X86_UMIP to disable UMIP defeat its
purpose? Applications could simply use this compat mask to bypass UMIP
and gain access to the instructions it protects.

Thanks and BR,
Ricardo



Re: [v6 PATCH 00/21] x86: Enable User-Mode Instruction Prevention

2017-03-08 Thread Ricardo Neri
On Wed, 2017-03-08 at 17:08 +0300, Stas Sergeev wrote:
> 08.03.2017 03:32, Ricardo Neri wrote:
> > These are the instructions covered by UMIP:
> > * SGDT - Store Global Descriptor Table
> > * SIDT - Store Interrupt Descriptor Table
> > * SLDT - Store Local Descriptor Table
> > * SMSW - Store Machine Status Word
> > * STR - Store Task Register
> >
> > This patchset initially treated tasks running in virtual-8086 mode as a
> > special case. However, I received clarification that DOSEMU[8] does not
> > support applications that use these instructions.
> Yes, this is the case.
> But at least in the past there was an attempt to
> support SLDT as it is used by an ancient pharlap
> DOS extender (currently unsupported by dosemu1/2).
> So how difficult would it be to add an optional
> possibility of delivering such SIGSEGV to userspace
> so that the kernel's dummy emulation can be overridden?

I suppose a umip=noemulation kernel parameter could be added in this
case.

> It doesn't need to be a matter of this particular
> patch set, i.e. this proposal should not trigger a
> v7 resend of all 21 patches. :) But it would be useful
> for the future development of dosemu2.

Would dosemu2 use 32-bit processes in order to keep segmentation? If it
could use 64-bit processes, emulation is not used in this case and the
SIGSEGV is delivered to user space.

Thanks and BR,
Ricardo




[v6 PATCH 19/21] x86/traps: Fixup general protection faults caused by UMIP

2017-03-07 Thread Ricardo Neri
If the User-Mode Instruction Prevention CPU feature is available and
enabled, a general protection fault will be issued if the instructions
sgdt, sldt, sidt, str or smsw are executed from user-mode context
(CPL > 0). If the fault was caused by any of the instructions protected
by UMIP, fixup_umip_exception will emulate dummy results for these
instructions. If emulation is successful, the result is passed to the
user space program and no SIGSEGV signal is emitted.

Please note that fixup_umip_exception also caters for the case when
the fault originated while running in virtual-8086 mode.

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/kernel/traps.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 948443e..86efbcb 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -492,6 +493,9 @@ do_general_protection(struct pt_regs *regs, long error_code)
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
cond_local_irq_enable(regs);
 
+   if (user_mode(regs) && fixup_umip_exception(regs))
+   return;
+
if (v8086_mode(regs)) {
local_irq_enable();
handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-- 
2.9.3



[v6 PATCH 02/21] x86/mpx: Do not use SIB index if index points to R/ESP

2017-03-07 Thread Ricardo Neri
Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when memory addressing is used
(i.e., mod part of ModR/M is not 3), a SIB byte is used and the index of
the SIB byte points to the R/ESP (i.e., index = 4), the index should not be
used in the computation of the memory address.

In these cases the address is simply the value present in the register
pointed to by the base part of the SIB byte plus the displacement.

An example of such an instruction could be

insn -0x80(%rbx)

This is represented as:

 [opcode] 4c 23 80

  ModR/M=0x4c: mod: 0x1, reg: 0x1, r/m: 0x4 (SIB byte follows)
  SIB=0x23: sc: 0, index: 0x4 (R/ESP, i.e., no index), base: 0x3 (R/EBX)
  Displacement: -0x80

The correct address is (base) + displacement; no index is used.

We can achieve the desired effect of not using the index by making
get_reg_offset return -EDOM in this particular case. This value indicates
to callers that they should not use the index to calculate the address.
-EINVAL continues to indicate an error when decoding the SIB byte.

Care is taken to allow R12 to be used as index, which is a valid scenario.
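
The rule above can be sketched in plain C outside the kernel. This is only
an illustration under stated assumptions: sib_effective_address() and its
regs[] array (indexed by register number, 3 = RBX, 4 = RSP, 12 = R12) are
hypothetical names, not part of the patch, and REX.B handling for the base
is left out for brevity.

```c
#include <stdint.h>

/* ModR/M and SIB share the same 2/3/3 bit layout. */
static unsigned int top2(uint8_t b) { return (b >> 6) & 0x3; } /* mod / scale */
static unsigned int mid3(uint8_t b) { return (b >> 3) & 0x7; } /* reg / index */
static unsigned int low3(uint8_t b) { return b & 0x7; }        /* r/m / base  */

/*
 * Hypothetical helper: base + index * scale + disp8 for a ModR/M + SIB
 * encoding. regs[] holds 16 register values indexed by register number;
 * rex_x is the REX.X bit (0 or 1).
 */
static uint64_t sib_effective_address(uint8_t modrm, uint8_t sib, int8_t disp,
                                      int rex_x, const uint64_t regs[16])
{
    unsigned int index = mid3(sib) + (rex_x ? 8 : 0);
    unsigned int base = low3(sib);
    uint64_t indx_val = 0;

    /*
     * index == 4 with mod != 3 means "no index register". The test is
     * done after adding REX.X, so index == 12 (R12) is still used.
     */
    if (!(index == 4 && top2(modrm) != 3))
        indx_val = regs[index];

    return regs[base] + (indx_val << top2(sib)) + (int64_t)disp;
}
```

With the bytes 4c 23 80 from the example, the value of RSP never enters the
result; setting REX.X selects R12 and the index is used again.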

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Nathan Howard <liverl...@gmail.com>
Cc: Adan Hawthorn <adanhawth...@gmail.com>
Cc: Joe Perches <j...@perches.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/mm/mpx.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index ff112e3..d9e92d6 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -110,6 +110,13 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
regno = X86_SIB_INDEX(insn->sib.value);
if (X86_REX_X(insn->rex_prefix.value))
regno += 8;
+   /*
+* If mod !=3, register R/ESP (regno=4) is not used as index in
+* the address computation. The check is done after looking at REX.X
+* because R12 (regno=12) can be used as an index.
+*/
+   if (regno == 4 && X86_MODRM_MOD(insn->modrm.value) != 3)
+   return -EDOM;
break;
 
case REG_TYPE_BASE:
@@ -159,11 +166,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
goto out_err;
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   if (indx_offset < 0)
+   /*
+* A negative offset generally means an error, except
+* -EDOM, which means that the contents of the register
+* should not be used as index.
+*/
+   if (unlikely(indx_offset == -EDOM))
+   indx = 0;
+   else if (unlikely(indx_offset < 0))
goto out_err;
+   else
+   indx = regs_get_register(regs, indx_offset);
 
base = regs_get_register(regs, base_offset);
-   indx = regs_get_register(regs, indx_offset);
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.9.3



[v6 PATCH 20/21] x86: Enable User-Mode Instruction Prevention

2017-03-07 Thread Ricardo Neri
User-Mode Instruction Prevention (UMIP) is enabled or disabled by setting
or clearing a bit in %cr4.

It makes sense to enable UMIP at some point while booting, before user
space comes up. As with SMAP and SMEP, it is not critical to have it
enabled very early during boot. This is because UMIP is relevant only when
there is a userspace to be protected from. Given the similarities in
relevance, it makes sense to enable UMIP along with SMAP and SMEP.

UMIP is enabled by default. It can be disabled by adding clearcpuid=514
to the kernel parameters.
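
For reference, the number 514 can be derived as sketched below. The
assumption here is that clearcpuid takes the kernel's feature number, which
is encoded as word * 32 + bit, and that UMIP sits in feature word 16 (the
CPUID leaf 7 ECX word) at bit 2. feature_number() is a hypothetical helper
used only for this illustration.

```c
/* Assumed encoding of a feature number: 32 feature bits per word. */
#define FEATURE_BITS_PER_WORD 32

/* Hypothetical helper: flatten (word, bit) into a single feature number. */
static int feature_number(int word, int bit)
{
    return word * FEATURE_BITS_PER_WORD + bit;
}
```

Under those assumptions feature_number(16, 2) gives 16 * 32 + 2 = 514.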

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/Kconfig | 10 ++
 arch/x86/kernel/cpu/common.c | 16 +++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a..b7f1226 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1735,6 +1735,16 @@ config X86_SMAP
 
  If unsure, say Y.
 
+config X86_INTEL_UMIP
+   def_bool y
+   depends on CPU_SUP_INTEL
+   prompt "Intel User Mode Instruction Prevention" if EXPERT
+   ---help---
+ The User Mode Instruction Prevention (UMIP) is a security
+ feature in newer Intel processors. If enabled, a general
+ protection fault is issued if the instructions SGDT, SLDT,
+ SIDT, SMSW and STR are executed in user mode.
+
 config X86_INTEL_MPX
prompt "Intel MPX (Memory Protection Extensions)"
def_bool n
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 58094a1..9f59eb5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -311,6 +311,19 @@ static __always_inline void setup_smap(struct cpuinfo_x86 *c)
}
 }
 
+static __always_inline void setup_umip(struct cpuinfo_x86 *c)
+{
+   if (cpu_feature_enabled(X86_FEATURE_UMIP) &&
+   cpu_has(c, X86_FEATURE_UMIP))
+   cr4_set_bits(X86_CR4_UMIP);
+   else
+   /*
+* Make sure UMIP is disabled in case it was enabled in a
+* previous boot (e.g., via kexec).
+*/
+   cr4_clear_bits(X86_CR4_UMIP);
+}
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1080,9 +1093,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
 
-   /* Set up SMEP/SMAP */
+   /* Set up SMEP/SMAP/UMIP */
setup_smep(c);
setup_smap(c);
+   setup_umip(c);
 
/*
 * The vendor-specific functions might have changed features.
-- 
2.9.3



[v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

2017-03-07 Thread Ricardo Neri
The 32-bit and 64-bit address encodings are identical. This means that we
can use the same function in both cases. In order to reuse the function for
32-bit address encodings, we must sign-extend our 32-bit signed operands to
64-bit signed variables (only for 64-bit builds). To decide on whether sign
extension is needed, we rely on the address size as given by the
instruction structure.

Lastly, before computing the linear address, we must truncate our signed
64-bit effective address if the address size is 32-bit.
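
The two steps described above (sign-extend the operands, then truncate the
sum) can be tried in user space with the sketch below. It is not the kernel
code: to_signed_64() and truncate_eff_addr() are hypothetical names, and
fixed-width types stand in for long so the behaviour does not depend on the
build.

```c
#include <stdint.h>

/* Sign-extend a register value when the address size is 4 bytes. */
static int64_t to_signed_64(uint64_t val, int addr_bytes)
{
    if (addr_bytes == 4)
        return (int64_t)(int32_t)(uint32_t)(val & 0xffffffffu);
    return (int64_t)val;
}

/* Keep only the low 32 bits of the result for 32-bit address sizes. */
static uint64_t truncate_eff_addr(int64_t eff_addr, int addr_bytes)
{
    if (addr_bytes == 4)
        return (uint64_t)eff_addr & 0xffffffffu;
    return (uint64_t)eff_addr;
}
```

The 64-bit addition itself does not overflow for 32-bit operands; the
truncation is what restores the 32-bit wrap-around.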

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 44 
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index edb360f..a9a1704 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -559,6 +559,15 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
return get_reg_offset(insn, regs, REG_TYPE_INDEX);
 }
 
+static inline long __to_signed_long(unsigned long val, int long_bytes)
+{
+#ifdef CONFIG_X86_64
+   return long_bytes == 4 ? (long)((int)((val) & 0xffffffff)) : (long)val;
+#else
+   return (long)val;
+#endif
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
@@ -567,19 +576,21 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
unsigned long linear_addr, seg_base_addr;
-   long eff_addr, base, indx;
-   int addr_offset, base_offset, indx_offset;
+   long eff_addr, base, indx, tmp;
+   int addr_offset, base_offset, indx_offset, addr_bytes;
insn_byte_t sib;
 
insn_get_modrm(insn);
insn_get_sib(insn);
sib = insn->sib.value;
+   addr_bytes = insn->addr_bytes;
 
if (X86_MODRM_MOD(insn->modrm.value) == 3) {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
if (addr_offset < 0)
goto out_err;
-   eff_addr = regs_get_register(regs, addr_offset);
+   tmp = regs_get_register(regs, addr_offset);
+   eff_addr = __to_signed_long(tmp, addr_bytes);
seg_base_addr = insn_get_seg_base(regs, insn, addr_offset,
  false);
} else {
@@ -591,20 +602,24 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 * in the address computation.
 */
base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-   if (unlikely(base_offset == -EDOM))
+   if (unlikely(base_offset == -EDOM)) {
base = 0;
-   else if (unlikely(base_offset < 0))
+   } else if (unlikely(base_offset < 0)) {
goto out_err;
-   else
-   base = regs_get_register(regs, base_offset);
+   } else {
+   tmp = regs_get_register(regs, base_offset);
+   base = __to_signed_long(tmp, addr_bytes);
+   }
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   if (unlikely(indx_offset == -EDOM))
+   if (unlikely(indx_offset == -EDOM)) {
indx = 0;
-   else if (unlikely(indx_offset < 0))
+   } else if (unlikely(indx_offset < 0)) {
goto out_err;
-   else
-   indx = regs_get_register(regs, indx_offset);
+   } else {
+   tmp = regs_get_register(regs, indx_offset);
+   indx = __to_signed_long(tmp, addr_bytes);
+   }
 
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
seg_base_addr =

[v6 PATCH 13/21] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-03-07 Thread Ricardo Neri
Tasks running in virtual-8086 mode or in protected mode with code
segment descriptors that specify 16-bit default address sizes via the
D bit will use 16-bit addressing form encodings as described in the Intel
64 and IA-32 Architectures Software Developer's Manual Volume 2A Section
2.1.5. 16-bit addressing encodings differ in several ways from the
32-bit/64-bit addressing form encodings: the r/m part of the ModRM byte
points to different registers and, in some cases, addresses can be
indicated by the addition of the value of two registers. Also, there is
no support for SIB bytes. Thus, a separate function is needed to parse
this form of addressing.

A couple of functions are introduced. get_reg_offset_16 obtains the
offset from the base of pt_regs of the registers indicated by the ModRM
byte of the address encoding. insn_get_addr_ref_16 computes the linear
address indicated by the instruction using the value of the registers
given by ModRM as well as the base address of the segment.
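
As a rough illustration of the 16-bit form, the sketch below computes only
the effective address (no segment base) from a ModRM byte. struct regs16
and eff_addr_16() are hypothetical stand-ins for pt_regs and the functions
in this patch. The caller is assumed to pass a displacement of 0 when the
encoding carries none, and not to call it for mod = 3, which denotes a
register operand.

```c
#include <stdint.h>

/* Hypothetical stand-in for the four registers usable in 16-bit forms. */
struct regs16 {
    uint16_t bx, bp, si, di;
};

/* Effective address for a 16-bit ModRM encoding, wrapping at 16 bits. */
static uint16_t eff_addr_16(uint8_t modrm, int16_t disp,
                            const struct regs16 *r)
{
    unsigned int mod = (modrm >> 6) & 0x3;
    unsigned int rm = modrm & 0x7;
    /* r/m selects one or two registers, in the order of the SDM table. */
    const uint16_t first[8]  = { r->bx, r->bx, r->bp, r->bp,
                                 r->si, r->di, r->bp, r->bx };
    const uint16_t second[8] = { r->si, r->di, r->si, r->di, 0, 0, 0, 0 };

    /* mod = 0 and r/m = 6: no register, only a 16-bit displacement. */
    if (mod == 0 && rm == 6)
        return (uint16_t)disp;

    return (uint16_t)(first[rm] + second[rm] + disp);
}
```

The zero entries in second[] play the role that -EDOM plays in the patch:
they mark encodings in which only one register takes part.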

Lastly, the original function insn_get_addr_ref is renamed as
insn_get_addr_ref_32_64. A new insn_get_addr_ref function decides what
type of address decoding must be done based on the number of address bytes
given by the instruction. Documentation for insn_get_addr_ref_32_64 is
also improved.

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 137 +++
 1 file changed, 137 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index a9a1704..cb1076d 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -306,6 +306,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_reg_offset_16 - Obtain offset of register indicated by instruction
+ * @insn:  Instruction structure containing ModRM and SiB bytes
+ * @regs:  Set of registers referred by the instruction
+ * @offs1: Offset of the first operand register
+ * @offs2: Offset of the second operand register, if applicable.
+ *
+ * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
+ * within insn. This function is to be used with 16-bit address encodings. The
+ * offs1 and offs2 will be written with the offset of the two registers
+ * indicated by the instruction. In cases where any of the registers is not
+ * referenced by the instruction, the value will be set to -EDOM.
+ *
+ * Return: 0 on success, -EINVAL on failure.
+ */
+static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
+int *offs1, int *offs2)
+{
+   /* 16-bit addressing can use one or two registers */
+   static const int regoff1[] = {
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, bx),
+   };
+
+   static const int regoff2[] = {
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   -EDOM,
+   -EDOM,
+   -EDOM,
+   -EDOM,
+   };
+
+   if (!offs1 || !offs2)
+   return -EINVAL;
+
+   /* operand is a register, use the generic function */
+   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+   *offs1 = insn_get_reg_offset_modrm_rm(insn, regs);
+   *offs2 = -EDOM;
+   return 0;
+   }
+
+   *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
+   *offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
+
+   /*
+* If no displacement is indicated in the mod part of the ModRM byte,
+* (mod part is 0) and the r/m part of the same byte is 6, no register
+* is used to calculate the operand address. An r/m part of 6 means that
+* the second register offset is already invalid.
+*/
+   if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
+   (X86_MODRM_RM(i

[v6 PATCH 14/21] x86/insn-eval: Add wrapper function for 16-bit and 32-bit address encodings

2017-03-07 Thread Ricardo Neri
Convert the function insn_get_addr_ref into a wrapper function that calls
the correct static address-decoding function depending on the size of the
address. In this way, callers do not need to worry about calling the
correct function, and the number of functions that need to be exposed
decreases.

To this end, the original 32/64-bit insn_get_addr_ref is renamed as
insn_get_addr_ref_32_64 to reflect the type of address encodings that it
handles.

Documentation is added to the new wrapper function and the documentation
for the 32/64-bit address decoding function is improved.

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 45 -
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index cb1076d..e633588 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -705,12 +705,21 @@ static inline long __to_signed_long(unsigned long val, int long_bytes)
 #endif
 }
 
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
+/**
+ * insn_get_addr_ref_32_64 - Obtain a 32/64-bit address referred by instruction
+ * @insn:  Instruction struct with ModRM and SiB bytes and displacement
+ * @regs:  Set of registers referred by the instruction
+ *
+ * This function is to be used with 32-bit and 64-bit address encodings. Obtain
+ * the memory address referred by the instruction's ModRM bytes and
+ * displacement. Also, the segment used as base is determined by either any
+ * segment override prefixes in insn or the default segment of the registers
+ * involved in the linear address computation.
+ *
+ * Return: linear address referenced by instruction and registers
  */
-void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+static void __user *insn_get_addr_ref_32_64(struct insn *insn,
+   struct pt_regs *regs)
 {
unsigned long linear_addr, seg_base_addr;
long eff_addr, base, indx, tmp;
@@ -795,3 +804,29 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 out_err:
return (void __user *)-1;
 }
+
+/**
+ * insn_get_addr_ref - Obtain the linear address referred by instruction
+ * @insn:  Instruction structure containing ModRM byte and displacement
+ * @regs:  Set of registers referred by the instruction
+ *
+ * Obtain the memory address referred by the instruction's ModRM bytes and
+ * displacement. Also, the segment used as base is determined by either any
+ * segment override prefixes in insn or the default segment of the registers
+ * involved in the address computation.
+ *
+ * Return: linear address referenced by instruction and registers
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+   switch (insn->addr_bytes) {
+   case 2:
+   return insn_get_addr_ref_16(insn, regs);
+   case 4:
+   /* fall through */
+   case 8:
+   return insn_get_addr_ref_32_64(insn, regs);
+   default:
+   return (void __user *)-1;
+   }
+}
-- 
2.9.3



[v6 PATCH 10/21] x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero

2017-03-07 Thread Ricardo Neri
Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when the mod part of the ModRM
byte is zero and R/EBP is specified in the R/M part of that byte, the value
of the aforementioned register should not be used in the address
computation. Instead, a 32-bit displacement is expected. The instruction
decoder takes care of setting the displacement to the expected value.
Returning -EDOM signals to callers that they should ignore the value of
such register when computing the address encoded in the instruction
operands.

Also, callers should exercise care to correctly interpret this particular
case. In IA-32e 64-bit mode, the address is given by the displacement plus
the value of the RIP. In IA-32e compatibility mode, the value of EIP is
ignored. This correction is done for our insn_get_addr_ref.
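
The mod = 0, r/m = 5 case described above can be summarized in a few lines
of C. disp32_only_eff_addr() is a hypothetical name used for illustration;
mode_64bit stands for the result of a check such as user_64bit_mode(), and
rip is whatever instruction-pointer value the caller supplies.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Effective address when mod = 0 and r/m = 5: the register is ignored
 * and only the 32-bit displacement is used, plus RIP in 64-bit mode.
 */
static int64_t disp32_only_eff_addr(int32_t disp, bool mode_64bit,
                                    uint64_t rip)
{
    int64_t eff_addr = 0;

    if (mode_64bit)
        eff_addr = (int64_t)rip;

    return eff_addr + disp;
}
```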

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index cda6c71..ea10b03 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -250,6 +250,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
switch (type) {
case REG_TYPE_RM:
regno = X86_MODRM_RM(insn->modrm.value);
+   /* if mod=0, register R/EBP is not used in the address
+* computation. Instead, a 32-bit displacement is expected;
+* the instruction decoder takes care of reading such
+* displacement. This is true for both R/EBP and R13, as the
+* REX.B bit is not decoded.
+*/
+   if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
+   return -EDOM;
if (X86_REX_B(insn->rex_prefix.value))
regno += 8;
break;
@@ -599,9 +607,22 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-   if (addr_offset < 0)
+   /* -EDOM means that we must ignore the address_offset.
+* The only case in which we see this value is when
+* R/M points to R/EBP. In such a case, in 64-bit mode
+* the effective address is relative to the RIP.
+*/
+   if (addr_offset == -EDOM) {
+   eff_addr = 0;
+#ifdef CONFIG_X86_64
+   if (user_64bit_mode(regs))
+   eff_addr = (long)regs->ip;
+#endif
+   } else if (addr_offset < 0) {
goto out_err;
-   eff_addr = regs_get_register(regs, addr_offset);
+   } else {
+   eff_addr = regs_get_register(regs, addr_offset);
+   }
}
eff_addr += insn->displacement.value;
}
-- 
2.9.3



[v6 PATCH 17/21] x86: Add emulation code for UMIP instructions

2017-03-07 Thread Ricardo Neri
The User-Mode Instruction Prevention feature present in recent Intel
processors prevents a group of instructions from being executed with
CPL > 0. If one of them is executed at such a privilege level, a general
protection fault is issued.

Rather than relaying this fault to the user space (in the form of a SIGSEGV
signal), the instructions protected by UMIP can be emulated to provide
dummy results. This preserves the current kernel behavior without
revealing the system resources that UMIP intends to protect (the global
descriptor and interrupt descriptor tables, the segment selectors of the
local descriptor table and the task state and the machine status word).

This emulation is needed because certain applications (e.g., WineHQ) rely
on this subset of instructions to function.

The instructions protected by UMIP can be split in two groups: those that
return a kernel memory address (sgdt and sidt) and those that return a
value (sldt, str and smsw).

For the instructions that return a kernel memory address, applications
such as WineHQ rely on the result being located in the kernel memory space.
The result is emulated as a hard-coded value that lies close to the top
of the kernel memory. The limits for the GDT and the IDT are set to zero.
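
To make the shape of such a dummy result concrete, here is a small
user-space sketch. It assumes the 32-bit operand layout of sgdt/sidt
(a 2-byte limit followed by a 4-byte base). DUMMY_TABLE_BASE is a
hypothetical placeholder, not necessarily the value used by this patch,
and build_dummy_table_result() is likewise an illustrative name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical placeholder for a base near the top of kernel memory. */
#define DUMMY_TABLE_BASE 0xfffe0000u

/* Fill out[] with a zero limit followed by the dummy base address. */
static size_t build_dummy_table_result(uint8_t out[6])
{
    uint16_t limit = 0;
    uint32_t base = DUMMY_TABLE_BASE;

    memcpy(out, &limit, sizeof(limit));
    memcpy(out + sizeof(limit), &base, sizeof(base));

    return sizeof(limit) + sizeof(base);
}
```

The buffer would then be copied to the user-space address computed from the
instruction operands.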

The instructions sldt and str return a segment selector relative to the
base address of the global descriptor table. Since the actual address of
such table is not revealed, it makes sense to emulate the result as zero.

The instruction smsw is emulated to return the value that the register CR0
has at boot time as set in head_32.

Care is taken to appropriately emulate the results when segmentation is
used. That is, rather than relying on USER_DS and USER_CS, the function
insn_get_addr_ref inspects the segment descriptor pointed to by the
registers in pt_regs. This ensures that we correctly obtain the segment
base address and the address and operand sizes even if the user space
application uses a local descriptor table.

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Liang Z. Li <liang.z...@intel.com>
Cc: Alexandre Julliard <julli...@winehq.org>
Cc: Stas Sergeev <s...@list.ru>
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/umip.h |  15 +++
 arch/x86/kernel/Makefile|   1 +
 arch/x86/kernel/umip.c  | 257 
 3 files changed, 273 insertions(+)
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c

diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
new file mode 100644
index 000..077b236
--- /dev/null
+++ b/arch/x86/include/asm/umip.h
@@ -0,0 +1,15 @@
+#ifndef _ASM_X86_UMIP_H
+#define _ASM_X86_UMIP_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_INTEL_UMIP
+bool fixup_umip_exception(struct pt_regs *regs);
+#else
+static inline bool fixup_umip_exception(struct pt_regs *regs)
+{
+   return false;
+}
+#endif  /* CONFIG_X86_INTEL_UMIP */
+#endif  /* _ASM_X86_UMIP_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 84c0059..0ded7b1 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -122,6 +122,7 @@ obj-$(CONFIG_EFI)   += sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)  += perf_regs.o
 obj-$(CONFIG_TRACING)  += tracepoint.o
 obj-$(CONFIG_SCHED_MC_PRIO)+= itmt.o
+obj-$(CONFIG_X86_INTEL_UMIP)   += umip.o
 
 ifdef CONFIG_FRAME_POINTER
 obj-y  += unwind_frame.o
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
new file mode 100644
index 000..e64d8e5
--- /dev/null
+++ b/arch/x86/kernel/umip.c
@@ -0,0 +1,257 @@
+/*
+ * umip.c Emulation for instruction protected by the Intel User-Mode
+ * Instruction Prevention. The instructions are:
+ *sgdt
+ *sldt
+ *sidt
+ *str
+ *smsw
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ * Ricardo Neri <ricardo.n...@linux.intel.com>
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * == Base addresses of G

[v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-03-07 Thread Ricardo Neri
With segmentation, the base address of the segment descriptor is needed
to compute a linear address. The segment descriptor used in the address
computation depends on either any segment override prefixes in the
instruction or the default segment determined by the registers involved
in the address computation. Thus, both the instruction as well as the
register (specified as the offset from the base of pt_regs) are given as
inputs, along with a boolean variable to select between override and
default.

The segment selector is determined by get_seg_selector with the inputs
described above. Once the selector is known the base address is
determined. In protected mode, the selector is used to obtain the segment
descriptor and then its base address. If in 64-bit user mode, the segment
base address is zero except when FS or GS are used. In virtual-8086 mode,
the base address is computed as the value of the segment selector shifted 4
positions to the left.
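
The virtual-8086 case is simple enough to show on its own. The helper name
v8086_seg_base() below is hypothetical and exists only for this sketch.

```c
/* In virtual-8086 mode the segment base is the selector times 16. */
static unsigned long v8086_seg_base(unsigned short sel)
{
    return (unsigned long)sel << 4;
}
```

For instance, a selector of 0x1234 yields a base of 0x12340, to which the
effective address is then added to form the linear address.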

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  2 ++
 arch/x86/lib/insn-eval.c | 66 
 2 files changed, 68 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 754211b..b201742 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -15,5 +15,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
+unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
+   int regoff, bool use_default_seg);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 8608adf..383ca83 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -355,6 +355,72 @@ static int get_desc(unsigned short seg, struct desc_struct **desc)
 }
 
 /**
+ * insn_get_seg_base() - Obtain base address contained in descriptor
+ * @regs:  Set of registers containing the segment selector
+ * @insn:  Instruction structure with selector override prefixes
+ * @regoff:Operand offset, in pt_regs, of which the selector is needed
+ * @use_default_seg: Use the default segment instead of prefix overrides
+ *
+ * Obtain the base address of the segment descriptor as indicated by either
+ * any segment override prefixes contained in insn or the default segment
+ * applicable to the register indicated by regoff. regoff is specified as the
+ * offset in bytes from the base of pt_regs.
+ *
+ * Return: In protected mode, base address of the segment. It may be zero in
+ * certain cases for 64-bit builds and/or 64-bit applications. In virtual-8086
+ * mode, the segment selector shifted 4 positions to the left. -1L in case of
+ * error.
+ */
+unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
+   int regoff, bool use_default_seg)
+{
+   struct desc_struct *desc;
+   unsigned short seg;
+   enum segment seg_type;
+   int ret;
+
+   seg_type = resolve_seg_selector(insn, regoff, use_default_seg);
+
+   seg = get_segment_selector(regs, seg_type);
+   if (seg < 0)
+   return -1L;
+
+   if (v8086_mode(regs))
+   /*
+* Base is simply the segment selector shifted 4
+* positions to the left.
+*/
+   return (unsigned long)(seg << 4);
+
+#ifdef CONFIG_X86_64
+   if (user_64bit_mode(regs)) {
+   /*
+* Only FS or GS will have a base address, the rest of
+* the segments' bases are forced to 0.
+*/
+   unsigned long base;
+
+   if (seg_type == SEG_FS)
+   rdmsrl(MSR_FS_BASE, base);
+   else if (seg_type == SEG_GS)
+   /*
+* swapgs was called at the kernel entry point. Thus,
+* MSR_KERNEL_GS_BASE will have the user

[v6 PATCH 05/21] x86/insn-eval: Add utility functions to get register offsets

2017-03-07 Thread Ricardo Neri
The function get_reg_offset takes as an argument an enumeration that
indicates the type of offset that is returned: the R/M part of the ModRM
byte, the index of the SIB byte or the base of the SIB byte. Callers of
this function would need the definition of such an enumeration, which is
not necessary. Instead, helper functions can be added for this purpose.
These functions are useful in cases when, for instance, the caller
needs to decide whether the operand is a register or a memory location by
looking at the mod part of the ModRM byte.
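
The field extraction these helpers rely on can be sketched in plain C.
This is a minimal user-space illustration, not the kernel code: the helper
names below (modrm_mod, sib_base, operand_is_register and so on) are made up
for this sketch, whereas the kernel uses its own X86_MODRM_* and X86_SIB_*
macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: split a ModRM byte into its mod, reg and r/m fields. */
static inline unsigned int modrm_mod(uint8_t modrm) { return (modrm >> 6) & 0x3; }
static inline unsigned int modrm_reg(uint8_t modrm) { return (modrm >> 3) & 0x7; }
static inline unsigned int modrm_rm(uint8_t modrm)  { return modrm & 0x7; }

/* Hypothetical helpers: split a SIB byte into its scale, index and base fields. */
static inline unsigned int sib_scale(uint8_t sib) { return (sib >> 6) & 0x3; }
static inline unsigned int sib_index(uint8_t sib) { return (sib >> 3) & 0x7; }
static inline unsigned int sib_base(uint8_t sib)  { return sib & 0x7; }

/* mod == 3 means the r/m field names a register, not a memory operand. */
static inline bool operand_is_register(uint8_t modrm)
{
	return modrm_mod(modrm) == 3;
}
```

A caller that only wants to know whether the operand is a register can
check the mod field first and look up a register offset only when needed.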

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  3 +++
 arch/x86/lib/insn-eval.c | 51 
 2 files changed, 54 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 5cab1b1..754211b 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -12,5 +12,8 @@
 #include 
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
+int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
+int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 23cf010..78df1c9 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -98,6 +98,57 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
return regoff[regno];
 }
 
+/**
+ * insn_get_reg_offset_modrm_rm - Obtain register in r/m part of ModRM byte
+ * @insn:  Instruction structure containing the ModRM byte
+ * @regs:  Set of registers indicated by the ModRM byte
+ *
+ * Obtain the register indicated by the r/m part of the ModRM byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of ModRM does not refer to a register.
+ *
+ * Return: Register indicated by r/m, as an offset within struct pt_regs
+ */
+int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_RM);
+}
+
+/**
+ * insn_get_reg_offset_sib_base - Obtain register in base part of SiB byte
+ * @insn:  Instruction structure containing the SiB byte
+ * @regs:  Set of registers indicated by the SiB byte
+ *
+ * Obtain the register indicated by the base part of the SiB byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of SiB does not refer to a register.
+ *
+ * Return: Register indicated by SiB's base, as an offset within struct pt_regs
+ */
+int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_BASE);
+}
+
+/**
+ * insn_get_reg_offset_sib_index - Obtain register in index part of SiB byte
+ * @insn:  Instruction structure containing the SiB byte
+ * @regs:  Set of registers indicated by the SiB byte
+ *
+ * Obtain the register indicated by the index part of the SiB byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of SiB does not refer to a register.
+ *
+ * Return: Register indicated by SiB's index, as an offset within struct pt_regs
+ */
+int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_INDEX);
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[v6 PATCH 18/21] x86/umip: Force a page fault when unable to copy emulated result to user

2017-03-07 Thread Ricardo Neri
fixup_umip_exception will be called from do_general_protection. If the
former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
However, when emulation is successful but the emulated result cannot be
copied to user space memory, it is more accurate to issue a SIGSEGV with
SEGV_MAPERR with the offending address. A new function, inspired by
force_sig_info_fault, is introduced to model the page fault.

Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/kernel/umip.c | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index e64d8e5..bd06e26 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -163,6 +163,41 @@ static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst,
 }
 
 /**
+ * __force_sig_info_umip_fault - Force a SIGSEGV with SEGV_MAPERR
+ * @address:   Address that caused the signal
+ * @regs:  Register set containing the instruction pointer
+ *
+ * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This function is
+ * intended to be used to provide a segmentation fault when the result of the
+ * UMIP emulation could not be copied to the user space memory.
+ *
+ * Return: none
+ */
+static void __force_sig_info_umip_fault(void __user *address,
+   struct pt_regs *regs)
+{
+   siginfo_t info;
+   struct task_struct *tsk = current;
+
+   if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)) {
+   printk_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx error:%x in %lx\n",
+  tsk->comm, task_pid_nr(tsk), regs->ip,
+  regs->sp, X86_PF_USER | X86_PF_WRITE,
+  regs->ip);
+   }
+
+   tsk->thread.cr2 = (unsigned long)address;
+   tsk->thread.error_code  = X86_PF_USER | X86_PF_WRITE;
+   tsk->thread.trap_nr = X86_TRAP_PF;
+
+   info.si_signo   = SIGSEGV;
+   info.si_errno   = 0;
+   info.si_code= SEGV_MAPERR;
+   info.si_addr= address;
+   force_sig_info(SIGSEGV, &info, tsk);
+}
+
+/**
  * fixup_umip_exception - Fixup #GP faults caused by UMIP
  * @regs:  Registers as saved when entering the #GP trap
  *
@@ -247,8 +282,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
} else {
uaddr = insn_get_addr_ref(&insn, regs);
nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
-   if (nr_copied  > 0)
-   return false;
+   if (nr_copied  > 0) {
+   /*
+* If copy fails, send a signal and tell caller that
+* fault was fixed up
+*/
+   __force_sig_info_umip_fault(uaddr, regs);
+   return true;
+   }
}
 
/* increase IP to let the program keep going */
-- 
2.9.3



[v6 PATCH 15/21] x86/mm: Relocate page fault error codes to traps.h

2017-03-07 Thread Ricardo Neri
Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within such file. Other portions of
code might be interested in those definitions too. For instance, the User-
Mode Instruction Prevention emulation code will use such definitions to
emulate a page fault when it is unable to successfully copy the results
of the emulated instructions to user space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
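
To make the bit layout concrete, here is a small user-space sketch that
decodes an error code built from the relocated enumeration. The enumeration
values are the ones in this patch; the describe_pf_error helper is
hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Same bit assignments as the enumeration moved to traps.h in this patch. */
enum x86_pf_error_code {
	X86_PF_PROT  = 1 << 0,
	X86_PF_WRITE = 1 << 1,
	X86_PF_USER  = 1 << 2,
	X86_PF_RSVD  = 1 << 3,
	X86_PF_INSTR = 1 << 4,
	X86_PF_PK    = 1 << 5,
};

/*
 * Hypothetical helper: render an error code as text. Returns the value
 * of snprintf(), i.e. the length of the full description.
 */
int describe_pf_error(unsigned long error_code, char *buf, size_t len)
{
	return snprintf(buf, len, "%s, %s access, %s mode%s%s%s",
			(error_code & X86_PF_PROT) ? "protection fault" : "no page found",
			(error_code & X86_PF_WRITE) ? "write" : "read",
			(error_code & X86_PF_USER) ? "user" : "kernel",
			(error_code & X86_PF_RSVD) ? ", reserved bit set" : "",
			(error_code & X86_PF_INSTR) ? ", instruction fetch" : "",
			(error_code & X86_PF_PK) ? ", protection key" : "");
}
```

For example, X86_PF_USER | X86_PF_WRITE, the combination the UMIP emulation
code uses, describes a user-mode write to a page that was not found.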

Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
Cc: Josh Poimboeuf <jpoim...@redhat.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/traps.h | 18 +
 arch/x86/mm/fault.c  | 88 +---
 2 files changed, 52 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 01fd0a7..4a2e585 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -148,4 +148,22 @@ enum {
X86_TRAP_IRET = 32, /* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==   0: no page found   1: protection fault
+ *   bit 1 ==   0: read access 1: write access
+ *   bit 2 ==   0: kernel-mode access  1: user-mode access
+ *   bit 3 ==  1: use of reserved bit detected
+ *   bit 4 ==  1: fault was an instruction fetch
+ *   bit 5 ==  1: protection keys block access
+ */
+enum x86_pf_error_code {
+   X86_PF_PROT =   1 << 0,
+   X86_PF_WRITE=   1 << 1,
+   X86_PF_USER =   1 << 2,
+   X86_PF_RSVD =   1 << 3,
+   X86_PF_INSTR=   1 << 4,
+   X86_PF_PK   =   1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 428e3176..e859a9c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -29,26 +29,6 @@
 #include 
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==   0: no page found   1: protection fault
- *   bit 1 ==   0: read access 1: write access
- *   bit 2 ==   0: kernel-mode access  1: user-mode access
- *   bit 3 ==  1: use of reserved bit detected
- *   bit 4 ==  1: fault was an instruction fetch
- *   bit 5 ==  1: protection keys block access
- */
-enum x86_pf_error_code {
-
-   PF_PROT =   1 << 0,
-   PF_WRITE=   1 << 1,
-   PF_USER =   1 << 2,
-   PF_RSVD =   1 << 3,
-   PF_INSTR=   1 << 4,
-   PF_PK   =   1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -149,7 +129,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
 * If it was a exec (instruction fetch) fault on NX page, then
 * do not ignore the fault:
 */
-   if (error_code & PF_INSTR)
+   if (error_code & X86_PF_INSTR)
return 0;
 
instr = (void *)convert_ip_to_linear(current, regs);
@@ -179,7 +159,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -205,7 +185,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info,
/*
 * force_sig_info_fault() is called from a number of
 * contexts, some of which have a VMA and some of which
-* do not.  The PF_PK handing happens after we have a
+* do not.  The X86_PF_PK handing happens after we have a
 * valid VMA, so we should never reach this without a
 * valid VMA.
 */
@@ -655,7 +635,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
if (!oops_may_print())
return;
 
-   if (error_code & PF_

[v6 PATCH 03/21] x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0

2017-03-07 Thread Ricardo Neri
Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when a SIB byte is used and the
base of the SIB byte points to R/EBP (i.e., base = 5) and the mod part
of the ModRM byte is zero, the value of such register will not be used
as part of the address computation. To signal this, a -EDOM error is
returned to tell callers that they should ignore the value.

Also, for this particular case, a 32-bit displacement should follow
the SIB byte if the mod part of ModRM is equal to zero. The instruction
decoder ensures that this is the case.
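
The rule can be sketched as a stand-alone C function. This is a simplified
user-space model under stated assumptions, not the mpx.c code: registers are
passed as a plain array indexed by register number (a made-up convention for
this example), REX.X is ignored, and index = 4 is treated as "no index".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the REG_TYPE_BASE case: returns the register
 * number of the SIB base, or -EDOM when mod == 0 and base == 5, in which
 * case the base register must not take part in the address computation.
 */
int sib_base_regno(uint8_t modrm, uint8_t sib, int rex_b)
{
	int regno = sib & 0x7;

	if (regno == 5 && ((modrm >> 6) & 0x3) == 0)
		return -EDOM;

	if (rex_b)
		regno += 8;

	return regno;
}

/* base + index * scale + displacement, honouring the -EDOM case above. */
long sib_effective_address(const long regs[16], uint8_t modrm, uint8_t sib,
			   int rex_b, long disp)
{
	int base_no = sib_base_regno(modrm, sib, rex_b);
	int index_no = (sib >> 3) & 0x7;
	long base = (base_no == -EDOM) ? 0 : regs[base_no];
	long indx = (index_no == 4) ? 0 : regs[index_no];

	return base + indx * (1L << ((sib >> 6) & 0x3)) + disp;
}
```

With SIB = 0x8D (scale 4, index 1, base 5), the base register contributes
only when mod is non-zero; with mod = 0 the address is index * 4 plus the
displacement.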

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Nathan Howard <liverl...@gmail.com>
Cc: Adan Hawthorn <adanhawth...@gmail.com>
Cc: Joe Perches <j...@perches.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/mm/mpx.c | 29 ++---
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index d9e92d6..ef7eb67 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -121,6 +121,17 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 
case REG_TYPE_BASE:
regno = X86_SIB_BASE(insn->sib.value);
+   /*
+* If mod is 0 and register R/EBP (regno=5) is indicated in the
+* base part of the SIB byte, the value of such register should
+* not be used in the address computation. Also, a 32-bit
+* displacement is expected in this case; the instruction
+* decoder takes care of it. This is true for both R13 and
+* R/EBP as REX.B will not be decoded.
+*/
+   if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
+   return -EDOM;
+
if (X86_REX_B(insn->rex_prefix.value))
regno += 8;
break;
@@ -161,16 +172,21 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
eff_addr = regs_get_register(regs, addr_offset);
} else {
if (insn->sib.nbytes) {
+   /*
+* Negative values in the base and index offset means
+* an error when decoding the SIB byte. Except -EDOM,
+* which means that the registers should not be used
+* in the address computation.
+*/
base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-   if (base_offset < 0)
+   if (unlikely(base_offset == -EDOM))
+   base = 0;
+   else if (unlikely(base_offset < 0))
goto out_err;
+   else
+   base = regs_get_register(regs, base_offset);
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   /*
-* A negative offset generally means a error, except
-* -EDOM, which means that the contents of the register
-* should not be used as index.
-*/
if (unlikely(indx_offset == -EDOM))
indx = 0;
else if (unlikely(indx_offset < 0))
@@ -178,7 +194,6 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
else
indx = regs_get_register(regs, indx_offset);
 
-   base = regs_get_register(regs, base_offset);
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.9.3



[v6 PATCH 21/21] selftests/x86: Add tests for User-Mode Instruction Prevention

2017-03-07 Thread Ricardo Neri
Certain user space programs that run in virtual-8086 mode may utilize
instructions protected by the User-Mode Instruction Prevention (UMIP)
security feature present in new Intel processors: SGDT, SIDT and SMSW. In
such a case, a general protection fault is issued if UMIP is enabled. When
such a fault happens, the kernel catches it and emulates the results of
these instructions with dummy values. The purpose of this new
test is to verify whether the impacted instructions can be executed without
causing such #GP. If no #GP exceptions occur, we expect to exit virtual-
8086 mode from INT 0x80.

The instructions protected by UMIP are executed in representative use
cases:
 a) the memory address of the result is given in the form of a displacement
from the base of the data segment
 b) the memory address of the result is given in a general purpose register
 c) the result is stored directly in a general purpose register.

Unfortunately, it is not possible to check the results against a set of
expected values because no emulation will occur in systems that do not have
the UMIP feature. Instead, results are printed for verification.
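
When reading the printed values, it helps to recall how the test lays out a
SIDT or SGDT result: a 2-byte limit followed by the base, which the test
prints from the next 4 bytes. The following sketch parses such a 6-byte
image byte by byte. The struct and function names are hypothetical, and the
little-endian decoding reflects x86 byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a stored descriptor-table register image. */
struct table_reg {
	uint16_t limit;
	uint32_t base;
};

/* Decode a 6-byte SIDT/SGDT image: 2-byte limit, then 4-byte base. */
struct table_reg parse_table_reg(const unsigned char *mem)
{
	struct table_reg r;

	r.limit = (uint16_t)(mem[0] | (mem[1] << 8));
	r.base = (uint32_t)mem[2] |
		 ((uint32_t)mem[3] << 8) |
		 ((uint32_t)mem[4] << 16) |
		 ((uint32_t)mem[5] << 24);
	return r;
}
```

In the test's layout, the SIDT result written at offset 2054 would be parsed
from the six bytes at 2054 through 2059, which is why the next result starts
at offset 2060.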

Cc: Andy Lutomirski <l...@kernel.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Chen Yucong <sla...@gmail.com>
Cc: Chris Metcalf <cmetc...@mellanox.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Huang Rui <ray.hu...@amd.com>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Paul Gortmaker <paul.gortma...@windriver.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: Shuah Khan <sh...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 tools/testing/selftests/x86/entry_from_vm86.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index d075ea0..377b773 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -95,6 +95,22 @@ asm (
"int3\n\t"
"vmcode_int80:\n\t"
"int $0x80\n\t"
+   "umip:\n\t"
+   /* addressing via displacements */
+   "smsw (2052)\n\t"
+   "sidt (2054)\n\t"
+   "sgdt (2060)\n\t"
+   /* addressing via registers */
+   "mov $2066, %bx\n\t"
+   "smsw (%bx)\n\t"
+   "mov $2068, %bx\n\t"
+   "sidt (%bx)\n\t"
+   "mov $2074, %bx\n\t"
+   "sgdt (%bx)\n\t"
+   /* register operands, only for smsw */
+   "smsw %ax\n\t"
+   "mov %ax, (2080)\n\t"
+   "int $0x80\n\t"
".size vmcode, . - vmcode\n\t"
"end_vmcode:\n\t"
".code32\n\t"
@@ -103,7 +119,7 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-   vmcode_sti[], vmcode_int3[], vmcode_int80[];
+   vmcode_sti[], vmcode_int3[], vmcode_int80[], umip[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -218,6 +234,27 @@ int main(void)
v86.regs.eax = (unsigned int)-1;
do_test(&v86, vmcode_int80 - vmcode, VM86_INTx, 0x80, "int80");
 
+   /* UMIP -- should exit with INTx 0x80 unless UMIP was not disabled */
+   do_test(&v86, umip - vmcode, VM86_INTx, 0x80, "UMIP tests");
+   printf("[INFO]\tResults of UMIP-protected instructions via displacements:\n");
+   printf("[INFO]\tSMSW:[0x%04x]\n", *(unsigned short *)(addr + 2052));
+   printf("[INFO]\tSIDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2054),
+  *(unsigned long  *)(addr + 2056));
+   printf("[INFO]\tSGDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2060),
+  *(unsigned long  *)(addr + 2062));
+   printf("[INFO]\tResults of UMIP-protected instructions via addressing in registers:\n");
+   printf("[INFO]\tSMSW:[0x%04x]\n", *(unsigned short *)(addr + 2066));
+   printf("[INFO]\tSIDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2068),
+  *(unsigned long  *)(addr + 2070));
+   printf("[INFO]\tSGDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2074),
+  *(unsigned long  *)(addr + 2076));
+   printf("[INFO]\tResults of SMSW via register operands:\n");
+   printf("[INFO]\tSMSW:[0x%04x]\n", *(unsigned short *)(addr + 2080));
+
/* Execute a null pointer */
v86.regs.cs = 0;
v86.regs.ss = 0;
-- 
2.9.3



[v6 PATCH 11/21] insn/eval: Incorporate segment base in address computation

2017-03-07 Thread Ricardo Neri
insn_get_addr_ref returns the effective address as defined in
section 3.7.5.1, Vol. 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. Furthermore, the segment descriptor to use depends on the
register that is used as the base of the effective address. The effective
base address varies depending on whether the operand is a register or a
memory address and on whether a SiB byte is used.

In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user program defines its own segments. This
is possible by using a local descriptor table.

Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final
effective address.
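
The arithmetic can be sketched as follows. This is a user-space
illustration, not the insn-eval.c code; the function names are made up for
this example, and the virtual-8086 base (selector shifted left by 4) follows
the insn_get_seg_base code in this series.

```c
#include <assert.h>

/* Hypothetical helper: in virtual-8086 mode the base is the selector << 4. */
unsigned long v8086_seg_base(unsigned short sel)
{
	return (unsigned long)sel << 4;
}

/*
 * The effective address is signed (displacements may be negative), while
 * the segment base is unsigned. Convert first, then add.
 */
unsigned long linear_address(long eff_addr, unsigned long seg_base)
{
	return (unsigned long)eff_addr + seg_base;
}
```

For instance, a selector of 0x1234 in virtual-8086 mode with an effective
address of 0x10 yields the linear address 0x12350.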

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ea10b03..edb360f 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -566,7 +566,7 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-   unsigned long linear_addr;
+   unsigned long linear_addr, seg_base_addr;
long eff_addr, base, indx;
int addr_offset, base_offset, indx_offset;
insn_byte_t sib;
@@ -580,6 +580,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
if (addr_offset < 0)
goto out_err;
eff_addr = regs_get_register(regs, addr_offset);
+   seg_base_addr = insn_get_seg_base(regs, insn, addr_offset,
+ false);
} else {
if (insn->sib.nbytes) {
/*
@@ -605,6 +607,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
indx = regs_get_register(regs, indx_offset);
 
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   seg_base_addr = insn_get_seg_base(regs, insn,
+ base_offset, false);
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
/* -EDOM means that we must ignore the address_offset.
@@ -623,10 +627,12 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
} else {
eff_addr = regs_get_register(regs, addr_offset);
}
+   seg_base_addr = insn_get_seg_base(regs, insn,
+ addr_offset, false);
}
eff_addr += insn->displacement.value;
}
-   linear_addr = (unsigned long)eff_addr;
+   linear_addr = (unsigned long)eff_addr + seg_base_addr;
 
return (void __user *)linear_addr;
 out_err:
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[v6 PATCH 09/21] x86/insn-eval: Add functions to get default operand and address sizes

2017-03-07 Thread Ricardo Neri
These functions read the default values of the address and operand sizes
as specified in the segment descriptor. This information is determined
from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
32-bit legacy modes. For virtual-8086 mode, the default address and
operand sizes are always 2 bytes.

The D bit is only meaningful for code segments. Thus, these functions
always use the code segment selector contained in regs.
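
The mapping from the L and D bits to sizes can be written as two small
lookup functions. This sketch mirrors the switch statements in the patch but
takes the bits as plain arguments; the function names are hypothetical.
Because the operand-size hunk is truncated in this archive, treating
L=1, D=1 as invalid there is an assumption carried over from the
address-size function.

```c
#include <assert.h>

/* Default address size in bytes for the given CS.L and CS.D bits. */
unsigned char default_address_bytes(unsigned int l, unsigned int d)
{
	switch (((l & 1) << 1) | (d & 1)) {
	case 0: /* Legacy mode, 16-bit addresses. L=0, D=0 */
		return 2;
	case 1: /* Legacy mode, 32-bit addresses. L=0, D=1 */
		return 4;
	case 2: /* IA-32e 64-bit mode, 64-bit addresses. L=1, D=0 */
		return 8;
	default: /* Invalid setting. L=1, D=1 */
		return 0;
	}
}

/* Default operand size in bytes for the given CS.L and CS.D bits. */
unsigned char default_operand_bytes(unsigned int l, unsigned int d)
{
	switch (((l & 1) << 1) | (d & 1)) {
	case 0: /* Legacy mode, 16-bit operands. L=0, D=0 */
		return 2;
	case 1: /* Legacy mode, 32-bit operands. L=0, D=1 */
	case 2: /* IA-32e 64-bit mode, 32-bit operands. L=1, D=0 */
		return 4;
	default: /* Assumed invalid, as in the address-size case */
		return 0;
	}
}
```

Note that in 64-bit mode the default address size is 8 bytes while the
default operand size stays at 4 bytes.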

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/include/asm/insn-eval.h |  2 +
 arch/x86/lib/insn-eval.c | 80 
 2 files changed, 82 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index b201742..a0d81fc 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -15,6 +15,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs);
+unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs);
+unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs);
 unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
int regoff, bool use_default_seg);
 
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 383ca83..cda6c71 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -421,6 +421,86 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
 }
 
 /**
+ * insn_get_seg_default_address_bytes - Obtain default address size of segment
+ * @regs:  Set of registers containing the segment selector
+ *
+ * Obtain the default address size as indicated in the segment descriptor
+ * selected in regs' code segment selector. In protected mode, the default
+ * address size is determined by inspecting the L and D bits of the segment
+ * descriptor. In virtual-8086 mode, the default is always two bytes.
+ *
+ * Return: Default address size of segment
+ */
+unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs)
+{
+   struct desc_struct *desc;
+   unsigned short seg;
+   int ret;
+
+   if (v8086_mode(regs))
+   return 2;
+
+   seg = (unsigned short)regs->cs;
+
+   ret = get_desc(seg, &desc);
+   if (ret)
+   return 0;
+
+   switch ((desc->l << 1) | desc->d) {
+   case 0: /* Legacy mode. 16-bit addresses. CS.L=0, CS.D=0 */
+   return 2;
+   case 1: /* Legacy mode. 32-bit addresses. CS.L=0, CS.D=1 */
+   return 4;
+   case 2: /* IA-32e 64-bit mode. 64-bit addresses. CS.L=1, CS.D=0 */
+   return 8;
+   case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+   /* fall through */
+   default:
+   return 0;
+   }
+}
+
+/**
+ * insn_get_seg_default_operand_bytes - Obtain default operand size of segment
+ * @regs:  Set of registers containing the segment selector
+ *
+ * Obtain the default operand size as indicated in the segment descriptor
+ * selected in regs' code segment selector. In protected mode, the default
+ * operand size is determined by inspecting the L and D bits of the segment
+ * descriptor. In virtual-8086 mode, the default is always two bytes.
+ *
+ * Return: Default operand size of segment
+ */
+unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs)
+{
+   struct desc_struct *desc;
+   unsigned short seg;
+   int ret;
+
+   if (v8086_mode(regs))
+   return 2;
+
+   seg = (unsigned short)regs->cs;
+
+   ret = get_desc(seg, &desc);
+   if (ret)
+   return 0;
+
+   switch ((desc->l << 1) | desc->d) {
+   case 0: /* Legacy mode. 16-bit or 8-bit operands CS.L=0, CS.D=0 */
+   return 2;
+   case 1: /* Legacy mode. 32- or 8 bit operands CS.L=0, CS.D=1 */
+   /* fall through */
+   case 2: /* IA-32e 64-bit mode. 32- or 8-bit opnds. CS.L=1, CS.D=0 */
+   return 4;
+   ca

[v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-03-07 Thread Ricardo Neri
The segment descriptor contains information that is relevant to how linear
addresses need to be computed. It contains the default size of addresses as
well as the base address of the segment. Thus, given a segment selector,
we ought to look at the segment descriptor to correctly calculate the linear
address.

In protected mode, the segment selector might indicate a segment
descriptor from either the global descriptor table or a local descriptor
table. Both cases are considered in this function.

This function is the initial implementation for subsequent functions that
will obtain the aforementioned attributes of the segment descriptor.
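
The selector decoding that get_desc depends on can be sketched in user
space. The struct, macro and function names below are hypothetical; the bit
positions (RPL in bits 1:0, table indicator in bit 2, index in bits 15:3)
are the architectural ones that the comment in the patch refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEL_RPL_MASK 0x3u /* bits 1:0, requested privilege level */
#define SEL_TI_MASK  0x4u /* bit 2, table indicator: 0 = GDT, 1 = LDT */

/* Hypothetical container for the fields of a segment selector. */
struct selector_fields {
	unsigned int rpl;          /* requested privilege level */
	bool ldt;                  /* true if the selector refers to the LDT */
	unsigned int index;        /* descriptor index within the table */
	unsigned int table_offset; /* byte offset of the 8-byte descriptor */
};

struct selector_fields decode_selector(uint16_t sel)
{
	struct selector_fields f;

	f.rpl = sel & SEL_RPL_MASK;
	f.ldt = (sel & SEL_TI_MASK) != 0;
	f.index = sel >> 3;
	/*
	 * The index already sits at bit 3, so clearing the low three bits
	 * is the same as multiplying the index by the descriptor size (8).
	 */
	f.table_offset = sel & ~(SEL_RPL_MASK | SEL_TI_MASK);
	return f;
}
```

For a selector of 0x2b this gives index 5, RPL 3, a GDT lookup, and a byte
offset of 0x28 into the table.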

Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Adam Buchbinder <adam.buchbin...@gmail.com>
Cc: Colin Ian King <colin.k...@canonical.com>
Cc: Lorenzo Stoakes <lstoa...@gmail.com>
Cc: Qiaowei Ren <qiaowei@intel.com>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Masami Hiramatsu <mhira...@kernel.org>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Thomas Garnier <thgar...@google.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Borislav Petkov <b...@suse.de>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Ravi V. Shankar <ravi.v.shan...@intel.com>
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
 arch/x86/lib/insn-eval.c | 61 
 1 file changed, 61 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 8d45df8..8608adf 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -5,9 +5,13 @@
  */
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 enum reg_type {
@@ -294,6 +298,63 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_desc() - Obtain address of segment descriptor
+ * @seg:   Segment selector
+ * @desc:  Pointer to the selected segment descriptor
+ *
+ * Given a segment selector, obtain a memory pointer to the segment
+ * descriptor. Both global and local descriptor tables are supported.
+ * desc will contain the address of the descriptor.
+ *
+ * Return: 0 if success, -EINVAL if failure
+ */
+static int get_desc(unsigned short seg, struct desc_struct **desc)
+{
+   struct desc_ptr gdt_desc = {0, 0};
+   unsigned long desc_base;
+
+   if (!desc)
+   return -EINVAL;
+
+   desc_base = seg & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+   if ((seg & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+   seg >>= 3;
+
+   mutex_lock(&current->active_mm->context.lock);
+   if (unlikely(!current->active_mm->context.ldt ||
+seg >= current->active_mm->context.ldt->size)) {
+   *desc = NULL;
+   mutex_unlock(&current->active_mm->context.lock);
+   return -EINVAL;
+   }
+
+   *desc = &current->active_mm->context.ldt->entries[seg];
+   mutex_unlock(&current->active_mm->context.lock);
+   return 0;
+   }
+#endif
+   native_store_gdt(&gdt_desc);
+
+   /*
+* Bits [15:3] of the segment selector contain the index. Such
+* index needs to be multiplied by 8. However, as the index
+* least significant bit is already in bit 3, we don't have
+* to perform the multiplication.
+*/
+   desc_base = seg & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+   if (desc_base > gdt_desc.size) {
+   *desc = NULL;
+   return -EINVAL;
+   }
+
+   *desc = (struct desc_struct *)(gdt_desc.address + desc_base);
+   return 0;
+}
+
+/**
  * insn_get_reg_offset_modrm_rm - Obtain register in r/m part of ModRM byte
  * @insn:  Instruction structure containing the ModRM byte
  * @regs:  Set of registers indicated by the ModRM byte
-- 
2.9.3



Re: [v5 17/20] x86/umip: Force a page fault when unable to copy emulated result to user

2017-03-06 Thread Ricardo Neri
On Sun, 2017-03-05 at 08:18 -0800, Andy Lutomirski wrote:
> > + */
> > +static void __force_sig_info_umip_fault(void __user *address,
> > +   struct pt_regs *regs)
> > +{
> > +   siginfo_t info;
> > +   struct task_struct *tsk = current;
> > +
> > +   if (show_unhandled_signals && unhandled_signal(tsk,
> SIGSEGV)) {
> > +   printk_ratelimited("%s[%d] umip emulation segfault
> ip:%lx sp:%lx error:%lx in %lx\n",
> > +  tsk->comm, task_pid_nr(tsk),
> regs->ip,
> > +  regs->sp, UMIP_PF_USER |
> UMIP_PF_WRITE,
> > +  regs->ip);
> > +   }
> > +
> > +   tsk->thread.cr2 = (unsigned long)address;
> > +   tsk->thread.error_code  = UMIP_PF_USER | UMIP_PF_WRITE;
> 
> Please just move enum x86_pf_error_code into a header and rename the
> fields X86_PF_USER, etc rather than duplicating it.

Thanks again for your feedback! I will do this.


