Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-06-15 Thread Ricardo Neri
On Wed, 2017-06-07 at 18:28 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
> > Tasks running in virtual-8086 mode or in protected mode with code
> > segment descriptors that specify 16-bit default address sizes via the
> > D bit will use 16-bit addressing form encodings as described in the Intel
> > 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
> > 2.1.5. 16-bit addressing encodings differ in several ways from the
> > 32-bit/64-bit addressing form encodings: ModRM.rm points to different
> > registers and, in some cases, effective addresses are indicated by the
> > addition of the value of two registers. Also, there is no support for SIB
> > bytes. Thus, a separate function is needed to parse this form of
> > addressing.
> > 
> > A couple of functions are introduced. get_reg_offset_16() obtains the
> > offset from the base of pt_regs of the registers indicated by the ModRM
> > byte of the address encoding. get_addr_ref_16() computes the linear
> > address indicated by the instructions using the value of the registers
> > given by ModRM as well as the base address of the segment.
> > 
> > Cc: Dave Hansen 
> > Cc: Adam Buchbinder 
> > Cc: Colin Ian King 
> > Cc: Lorenzo Stoakes 
> > Cc: Qiaowei Ren 
> > Cc: Arnaldo Carvalho de Melo 
> > Cc: Masami Hiramatsu 
> > Cc: Adrian Hunter 
> > Cc: Kees Cook 
> > Cc: Thomas Garnier 
> > Cc: Peter Zijlstra 
> > Cc: Borislav Petkov 
> > Cc: Dmitry Vyukov 
> > Cc: Ravi V. Shankar 
> > Cc: x...@kernel.org
> > Signed-off-by: Ricardo Neri 
> > ---
> >  arch/x86/lib/insn-eval.c | 155 +++
> >  1 file changed, 155 insertions(+)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 9822061..928a662 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
> >  }
> >  
> >  /**
> > + * get_reg_offset_16 - Obtain offset of register indicated by instruction
> 
> Please end function names with parentheses.

I will correct.
> 
> > + * @insn:  Instruction structure containing ModRM and SiB bytes
> 
> s/SiB/SIB/g

I will correct.
> 
> > + * @regs:  Structure with register values as seen when entering kernel mode
> > + * @offs1: Offset of the first operand register
> > + * @offs2: Offset of the second operand register, if applicable.
> > + *
> > + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
> > + * within insn. This function is to be used with 16-bit address encodings. The
> > + * offs1 and offs2 will be written with the offset of the two registers
> > + * indicated by the instruction. In cases where any of the registers is not
> > + * referenced by the instruction, the value will be set to -EDOM.
> > + *
> > + * Return: 0 on success, -EINVAL on failure.
> > + */
> > +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
> > +			     int *offs1, int *offs2)
> > +{
> > +	/* 16-bit addressing can use one or two registers */
> > +	static const int regoff1[] = {
> > +		offsetof(struct pt_regs, bx),
> > +		offsetof(struct pt_regs, bx),
> > +		offsetof(struct pt_regs, bp),
> > +		offsetof(struct pt_regs, bp),
> > +		offsetof(struct pt_regs, si),
> > +		offsetof(struct pt_regs, di),
> > +		offsetof(struct pt_regs, bp),
> > +		offsetof(struct pt_regs, bx),
> > +	};
> > +
> > +	static const int regoff2[] = {
> > +		offsetof(struct pt_regs, si),
> > +		offsetof(struct pt_regs, di),
> > +		offsetof(struct pt_regs, si),
> > +		offsetof(struct pt_regs, di),
> > +		-EDOM,
> > +		-EDOM,
> > +		-EDOM,
> > +		-EDOM,
> > +	};
> 
> You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in
> the SDM, right?

Yes.
> 
> Please add a comment pointing to it here because it is not trivial to
> map that code to the documentation.

Sure, I will add a comment pointing to this table.

> 
> > +
> > +	if (!offs1 || !offs2)
> > +		return -EINVAL;
> > +
> > +	/* operand is a register, use the generic function */
> > +	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > +		*offs1 = insn_get_modrm_rm_off(insn, regs);
> > +		*offs2 = -EDOM;
> > +		return 0;
> > +	}
> > +
> > +	*offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
> > +	*offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
> > +
> > +   /*
> > +* If no displacement is 

Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-06-07 Thread Borislav Petkov
On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
> Tasks running in virtual-8086 mode or in protected mode with code
> segment descriptors that specify 16-bit default address sizes via the
> D bit will use 16-bit addressing form encodings as described in the Intel
> 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
> 2.1.5. 16-bit addressing encodings differ in several ways from the
> 32-bit/64-bit addressing form encodings: ModRM.rm points to different
> registers and, in some cases, effective addresses are indicated by the
> addition of the value of two registers. Also, there is no support for SIB
> bytes. Thus, a separate function is needed to parse this form of
> addressing.
> 
> A couple of functions are introduced. get_reg_offset_16() obtains the
> offset from the base of pt_regs of the registers indicated by the ModRM
> byte of the address encoding. get_addr_ref_16() computes the linear
> address indicated by the instructions using the value of the registers
> given by ModRM as well as the base address of the segment.
> 
> Cc: Dave Hansen 
> Cc: Adam Buchbinder 
> Cc: Colin Ian King 
> Cc: Lorenzo Stoakes 
> Cc: Qiaowei Ren 
> Cc: Arnaldo Carvalho de Melo 
> Cc: Masami Hiramatsu 
> Cc: Adrian Hunter 
> Cc: Kees Cook 
> Cc: Thomas Garnier 
> Cc: Peter Zijlstra 
> Cc: Borislav Petkov 
> Cc: Dmitry Vyukov 
> Cc: Ravi V. Shankar 
> Cc: x...@kernel.org
> Signed-off-by: Ricardo Neri 
> ---
>  arch/x86/lib/insn-eval.c | 155 +++
>  1 file changed, 155 insertions(+)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 9822061..928a662 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
>  }
>  
>  /**
> + * get_reg_offset_16 - Obtain offset of register indicated by instruction

Please end function names with parentheses.

> + * @insn:Instruction structure containing ModRM and SiB bytes

s/SiB/SIB/g

> + * @regs:Structure with register values as seen when entering kernel mode
> + * @offs1:   Offset of the first operand register
> + * @offs2:   Offset of the second operand register, if applicable.
> + *
> + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
> + * within insn. This function is to be used with 16-bit address encodings. The
> + * offs1 and offs2 will be written with the offset of the two registers
> + * indicated by the instruction. In cases where any of the registers is not
> + * referenced by the instruction, the value will be set to -EDOM.
> + *
> + * Return: 0 on success, -EINVAL on failure.
> + */
> +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
> +			     int *offs1, int *offs2)
> +{
> +	/* 16-bit addressing can use one or two registers */
> +	static const int regoff1[] = {
> +		offsetof(struct pt_regs, bx),
> +		offsetof(struct pt_regs, bx),
> +		offsetof(struct pt_regs, bp),
> +		offsetof(struct pt_regs, bp),
> +		offsetof(struct pt_regs, si),
> +		offsetof(struct pt_regs, di),
> +		offsetof(struct pt_regs, bp),
> +		offsetof(struct pt_regs, bx),
> +	};
> +
> +	static const int regoff2[] = {
> +		offsetof(struct pt_regs, si),
> +		offsetof(struct pt_regs, di),
> +		offsetof(struct pt_regs, si),
> +		offsetof(struct pt_regs, di),
> +		-EDOM,
> +		-EDOM,
> +		-EDOM,
> +		-EDOM,
> +	};

You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in
the SDM, right?

Please add a comment pointing to it here because it is not trivial to
map that code to the documentation.

> +
> +	if (!offs1 || !offs2)
> +		return -EINVAL;
> +
> +	/* operand is a register, use the generic function */
> +	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> +		*offs1 = insn_get_modrm_rm_off(insn, regs);
> +		*offs2 = -EDOM;
> +		return 0;
> +	}
> +
> +	*offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
> +	*offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
> +
> + /*
> +  * If no displacement is indicated in the mod part of the ModRM byte,

s/"no "//

> +  * (mod part is 0) and the r/m part of the same byte is 6, no register
> +  * is used to calculate the operand address. An r/m part of 6 means that
> +  * the second register offset is already invalid.
> +  */
> + if 

[PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-05-05 Thread Ricardo Neri
Tasks running in virtual-8086 mode or in protected mode with code
segment descriptors that specify 16-bit default address sizes via the
D bit will use 16-bit addressing form encodings as described in the Intel
64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
2.1.5. 16-bit addressing encodings differ in several ways from the
32-bit/64-bit addressing form encodings: ModRM.rm points to different
registers and, in some cases, effective addresses are indicated by the
addition of the value of two registers. Also, there is no support for SIB
bytes. Thus, a separate function is needed to parse this form of
addressing.
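
As an illustration of the encoding differences described above, here is a
minimal user-space sketch of how a ModRM byte selects registers and a
displacement size under 16-bit addressing (SDM Table 2-1). It is not code from
this patch: the names reg16, ea16_form and decode_modrm16() are hypothetical
and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical register identifiers for this sketch (not kernel names). */
enum reg16 { REG_NONE = -1, REG_BX, REG_BP, REG_SI, REG_DI };

struct ea16_form {
	enum reg16 base;	/* first register, or REG_NONE */
	enum reg16 index;	/* second register, or REG_NONE */
	int disp_bytes;		/* displacement size in bytes: 0, 1 or 2 */
};

/*
 * Decode the memory-operand form selected by a ModRM byte under 16-bit
 * addressing. Returns 0 on success, -1 if @out is NULL or if mod == 3
 * (register operand, so there is no memory form to decode).
 */
static int decode_modrm16(uint8_t modrm, struct ea16_form *out)
{
	/* Indexed by ModRM.rm, in the row order of SDM Table 2-1. */
	static const enum reg16 base[8] = {
		REG_BX, REG_BX, REG_BP, REG_BP,
		REG_SI, REG_DI, REG_BP, REG_BX,
	};
	static const enum reg16 index[8] = {
		REG_SI, REG_DI, REG_SI, REG_DI,
		REG_NONE, REG_NONE, REG_NONE, REG_NONE,
	};
	unsigned int mod = modrm >> 6;
	unsigned int rm = modrm & 7;

	if (!out || mod == 3)
		return -1;

	out->base = base[rm];
	out->index = index[rm];
	/* mod 0: no displacement, mod 1: disp8, mod 2: disp16 */
	out->disp_bytes = (int)mod;

	/* Special case: mod == 0, rm == 6 is a bare disp16, no registers. */
	if (mod == 0 && rm == 6) {
		out->base = REG_NONE;
		out->disp_bytes = 2;
	}

	return 0;
}
```

For example, ModRM 0x00 decodes to BX + SI with no displacement, while 0x06
decodes to a bare 16-bit displacement with no registers at all.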

A couple of functions are introduced. get_reg_offset_16() obtains the
offset from the base of pt_regs of the registers indicated by the ModRM
byte of the address encoding. get_addr_ref_16() computes the linear
address indicated by the instructions using the value of the registers
given by ModRM as well as the base address of the segment.
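
The address arithmetic can be sketched as follows. This is a simplified
user-space model, not the body of get_addr_ref_16(): the helper name
linear_addr16() is hypothetical, and it assumes that the effective address
wraps at 16 bits before the segment base is added.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine register values, displacement and segment
 * base into a linear address. The effective address is truncated to 16 bits
 * (an assumption of this sketch) and then added to the segment base.
 */
static uint32_t linear_addr16(uint32_t seg_base, uint16_t base,
			      uint16_t index, int16_t disp)
{
	/* The sum is computed in int and then wrapped to 16 bits. */
	uint16_t eff_addr = (uint16_t)(base + index + (uint16_t)disp);

	return seg_base + eff_addr;
}
```

With a segment base of 0x10000, BX = 0x1000, SI = 0x0020 and a displacement of
4, this yields 0x11024. In virtual-8086 mode the segment base is the selector
shifted left by 4.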

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 155 +++
 1 file changed, 155 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 9822061..928a662 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_reg_offset_16 - Obtain offset of register indicated by instruction
+ * @insn:  Instruction structure containing ModRM and SiB bytes
+ * @regs:  Structure with register values as seen when entering kernel mode
+ * @offs1: Offset of the first operand register
+ * @offs2: Offset of the second operand register, if applicable.
+ *
+ * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
+ * within insn. This function is to be used with 16-bit address encodings. The
+ * offs1 and offs2 will be written with the offset of the two registers
+ * indicated by the instruction. In cases where any of the registers is not
+ * referenced by the instruction, the value will be set to -EDOM.
+ *
+ * Return: 0 on success, -EINVAL on failure.
+ */
+static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
+			     int *offs1, int *offs2)
+{
+	/* 16-bit addressing can use one or two registers */
+	static const int regoff1[] = {
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, bx),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		offsetof(struct pt_regs, bp),
+		offsetof(struct pt_regs, bx),
+	};
+
+	static const int regoff2[] = {
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		offsetof(struct pt_regs, si),
+		offsetof(struct pt_regs, di),
+		-EDOM,
+		-EDOM,
+		-EDOM,
+		-EDOM,
+	};
+
+	if (!offs1 || !offs2)
+		return -EINVAL;
+
+	/* operand is a register, use the generic function */
+	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+		*offs1 = insn_get_modrm_rm_off(insn, regs);
+		*offs2 = -EDOM;
+		return 0;
+	}
+
+	*offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
+	*offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
+
+	/*
+	 * If no displacement is indicated in the mod part of the ModRM byte,
+	 * (mod part is 0) and the r/m part of the same byte is 6, no register
+	 * is used to calculate the operand address. An r/m part of 6 means that
+	 * the second register offset is already invalid.
+	 */
+	if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
+	    (X86_MODRM_RM(insn->modrm.value) == 6))
+		*offs1 = -EDOM;
+
+	return 0;
+}
+
+/**
  * get_desc() - Obtain address of segment descriptor
  * @sel:   Segment selector
  *
@@ -689,6 +756,94 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
 }
 
 /**
+ * get_addr_ref_16() - Obtain the 16-bit address referred by instruction
+ * @insn:  Instruction structure containing ModRM