Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API

2009-04-23 Thread Jim Keniston
On Thu, 2009-04-23 at 13:29 -0400, Masami Hiramatsu wrote:
...
 
 Hmm, maybe, parser can handle (extra_info) as a solid keyword.
 so let's define actual format.
 
 opcode maps
 Table: table-name
 Referrer: escamed-name
 opcode: mnemonic|Grp [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 
 2nd-mnemonic ...]
 opcode: ESC # escaped-name
 
 group maps
 reg: mnemonic ...

For some instruction groups -- e.g., Groups 12, 13, 14 -- the
instruction prefix (66, f2, f3) and the reg field both affect the
instruction type.  And for some x87 instructions, the value of the modrm
byte's rm field also affects the instruction type.  (For others, rm just
selects among the st(0)..st(7) registers, as one might expect.)

Of course, that's all about floating-point instructions, which are of
more interest to uprobes than kprobes. 

Jim

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API

2009-04-22 Thread Jim Keniston
On Tue, 2009-04-21 at 20:17 -0400, Masami Hiramatsu wrote:
...
 Hi Peter and Jim,
 
 Now what I'm doing is making opcode tables like this.
 
 Table: 1-byte opcode
 Alias: none
 00: ADD Eb,Gb
 01: ADD Ev,Gv
 02: ADD Gb,Eb
 03: ADD Gv,Ev
 04: ADD AL,Ib
 05: ADD rAX,Iz
 06: PUSH ES (i64)
 07: POP ES (i64)
 08: OR Eb,Gb
 09: OR Ev,Gv
 0a: OR Gb,Eb
 0b: OR Gv,Ev
 0c: OR AL,Ib
 0d: OR rAX,Iz
 0e: PUSH CS
 0f: 2-byte escape
 ...

We want to keep this info easy to parse. (Who knows how it might be
used, and by whom?)  Your format seems to be
opcode: mnemonic [comma,separated,operands] [(extra_info)]
which is fine if you stick to it... but your entry for 0f doesn't match
that.

Also, something like
+ extra_info
would be easier to parse (using, say, awk) than
(extra_info)

 
 and a parser script which parses them into,
 
 const insn_attr_t primary_table[INAT_TABLE_SIZE] = {
   [0x04] = INAT_IMM(IMM_SIZE_BYTE)
   [0x05] = INAT_IMM(IMM_SIZE_VWORD32)
   [0x0c] = INAT_IMM(IMM_SIZE_BYTE)
   [0x0d] = INAT_IMM(IMM_SIZE_VWORD32)
   [0x0f] = INAT_ESC(IMM_ESC_2BYTE)
 ...
 
 (note, instructions which has no attributes for decoder, are just ignored)
 
 
 By the way, I'm worried about legal things of Intel's instruction
 encoding expressions. Would you think there is any problem if we
 have those tables in the kernel tree?

Good question.  Sorry, I'm not a lawyer.  Intel and AMD and sandpile.org
all seem to be using the same notation, so the notation's inventor must
not be feeling too proprietary.  Interestingly, sandpile.org asserts a
copyright.

 
 Thanks,
 

Jim

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API

2009-04-16 Thread Jim Keniston
On Thu, 2009-04-16 at 19:31 -0400, Masami Hiramatsu wrote:
...
 
 Hmm, I have an idea about instruction table. Usually, instruction tables
 are encoded with code defined by each decoder/emulator. This method
 will show their internal code directly, and is hard to maintain when
 the opcode map is updated. Instead of that, I'd like to suggest using
 the expressions in the opcode maps in a vender's genuine document (in
 this case, Intel/AMD's manual) or www.sandpile.org for instruction
 tables.
 
 e.g.
 
 const insn_attr_t onebyte_attr_table[ATTR_TABLE_SIZE] = {
 /* 0x00-0x0f */
 AT2(Eb,Gb), AT2(Ev,Gv),  AT2(Gb,Eb),  AT2(Gv,Ev),
 AT2(AL,Ib), AT2(rAX,Iz), AT2(ES,i64), AT2(ES,i64),
 AT2(Eb,Gb), AT2(Ev,Gv),  AT2(Gb,Eb),  AT2(Gv,Ev),
 AT2(AL,Ib), AT2(rAX,Iz), AT2(CS,i64), AT(ESC),
 ...
 
 Here, AT and AT2 macros are defined as follows:
 
 #define AT(a) (INAT_OMEXP_##a)
 #define AT2(a1, a2) (INAT_OMEXP_##a1 | INAT_OMEXP_##a2)
 
...

It looks like AT2(Ev,Gv) would yield the same bits as AT2(Gv,Ev).  It'd
be nice not to lose the operand-order information.  And we'd have to
make clear whether which notation we're using -- src,dest as in the gnu
assembler, or dest,src as in the AMD (and Intel?) manuals.

Jim

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API

2009-04-06 Thread Jim Keniston
On Fri, 2009-04-03 at 20:37 -0400, Masami Hiramatsu wrote:
 Hi Peter,
 
 H. Peter Anvin wrote:
  Masami Hiramatsu wrote:
  Add x86 instruction decoder to arch-specific libraries. This decoder
  can decode all x86 instructions into prefix, opcode, modrm, sib,
  displacement and immediates. This can also show the length of
  instructions.
 
...
  
  Hi Masami,
  
  On the surface the overall structure looks fine, but I have a couple of 
  concerns:
  
  1. is this meant to be able to decode userspace code or just kernel 
  code?  If it is supposed to be able to decode userspace code, is there a 
  reason you're not dealing with 16-bit or V86 mode code at all?  If not, 
  why are you including the 32-bit decoder in a 64-bit kernel (as well as 
  instructions which we're pretty much guaranteed to never use in the 
  kernel, such as ENTER.)
 
 Actually, this aims to decode both of user space and kernel code.
 At this point, it just needs to cover kernel code, because kprobes
 just want to decode kernel binary.
 However, this is just a starting point, uprobe developers want to
 use it to decode user-space code. In that case, it needs to be
 enhanced.

For user-space probing, we've been concentrating on native-built
executables.  Am I correct in thinking that we'll see 16-bit or V86 mode
only on legacy apps built elsewhere?  In any case, it only makes sense
to build on the kvm folks' work in this regard.

...
 
  
  4. you have a bunch of magic opcode constants all over the place.  This 
  means that as new instructions come in -- and they're going to be coming 
  in -- this is going to be hard to update.  It would be cleaner if we 
  could have an intermediate format that preprocesses down to all the 
  relevant tables and perhaps even some of the code rather than 
  open-coding everything in C.
  
  This matters... for example you have:
  
  +   } else if (opcode == 0xea /* jmp far seg:offs */) {
  +   __get_immptr(insn);
  
  ... but nothing similar for opcode 0x9a.  This is extremely hard to spot 
  with this kind of structure.
 
 Oops, that should be a bug. Hmm, I think we'd better bit-flags tables
 for classifying opcodes.
 Jim, can your INAT idea help this situation?
 
 http://sources.redhat.com/ml/systemtap/2009-q2/msg00109.html
 

As noted, the INAT tables follow the kvm model of one fat bitmap of
attributes per opcode, rather than the kprobes/uprobes model of one or
two 256-bit tables per attribute.  (This latter approach was due to the
gradual accumulation of tables over the years.)

I like the bitmap-per-opcode approach because it's relatively easy to
see in one place everything you're saying about a particular opcode.
But with all the potential clients for this service, it's not clear that
we'll get by with a single bitmap for every opcode.  (x86 kvm uses 32
bits per opcode, I think, and the INAT tables use 10.  Seems like we
could overrun 64 bits pretty quickly.)  So I guess that means we'll have
to get a little creative as to how we expose these attribute sets to the
client.

...
 
 Thank you for good advice!
 

Ditto.
Jim Keniston

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -tip 4/6 V4.1] x86: kprobes checks safeness of insertion address.

2009-04-03 Thread Jim Keniston
On Fri, 2009-04-03 at 12:02 -0400, Masami Hiramatsu wrote:
 Ensure safeness of inserting kprobes by checking whether the specified
 address is at the first byte of a instruction. This is done by decoding
 probed function from its head to the probe point.
 
 changes from v4:
  - change a comment according to Ananth's suggestion.
 
 Signed-off-by: Masami Hiramatsu mhira...@redhat.com
 Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com
 Cc: Jim Keniston jkeni...@us.ibm.com
 Cc: Ingo Molnar mi...@elte.hu
 ---
 
  arch/x86/kernel/kprobes.c |   51 
 +
  1 files changed, 51 insertions(+), 0 deletions(-)
 
 
 diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
...
 
 +/* Recover original instruction */

/* Recover the probed instruction at addr for further analysis. */
See below.

 +static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long 
 addr)
 +{
 + struct kprobe *kp;
 + kp = get_kprobe((void *)addr);
 + if (!kp)
 + return -EINVAL;
 +
 + /* Don't use p-ainsn.insn; which will be modified by fix_riprel */

fix_riprel doesn't affect the instruction's length, which is what
concerns this patch.  But we want this function to be useful for
unforeseen uses as well, so I like the code you have.  Just consider the
suggested comment changes.

/*
 * Don't use p-ainsn.insn, which could be modified -- e.g.,
 * by fix_riprel().
 */

 + memcpy(buf, kp-addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
 + buf[0] = kp-opcode;
 + return 0;
 +}

Jim Keniston

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

2009-04-03 Thread Jim Keniston
On Fri, 2009-04-03 at 12:55 -0400, Masami Hiramatsu wrote:
 Ingo Molnar wrote:
  * Masami Hiramatsu mhira...@redhat.com wrote:
  
  Hmm, I'd like to know actually kvm aims to emulate all kinds of 
  instructions. If so, I might find some bugs in x86_emulate.c. 
  However, I don't know all bugs. To find all of them, we have to 
  port x86_emulate.c to user-space, decode binaries with it, and 
  compare its output with another decoder, as Jim had done with 
  insn.c.
 
  https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
  
  btw., i'd suggest we put a build time check for this into the kernel 
  version as well. For example to decode the vmlinux via objdump, run 
  it through your decoder as well and compare the results. Put under a 
  CONFIG_DEBUG_X86_DECODER_TEST kind of (deault-off) build-time 
  self-test.
  
  This would ensure that the kernel we are running is fully supported 
  by the decoder - even as GCC/GAS starts using new instructions, etc. 
  
  How does this sound to you?
 
 Thanks! That is a good idea.
 Jim, would you think you can port your script into kernel tree?
...

I'd be happy to do what's needed to make it happen, and maintain it in
the face of x86 changes.  The script itself is practically nothing (~100
lines of awk and C), but what I don't know about the kernel build is a
lot, so I'd need some help from a kernel-build expert.

Jim

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html