Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API
On Thu, 2009-04-23 at 13:29 -0400, Masami Hiramatsu wrote: ... Hmm, maybe, parser can handle (extra_info) as a solid keyword. so let's define actual format. opcode maps Table: table-name Referrer: escamed-name opcode: mnemonic|Grp [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] opcode: ESC # escaped-name group maps reg: mnemonic ... For some instruction groups -- e.g., Groups 12, 13, 14 -- the instruction prefix (66, f2, f3) and the reg field both affect the instruction type. And for some x87 instructions, the value of the modrm byte's rm field also affects the instruction type. (For others, rm just selects among the st(0)..st(7) registers, as one might expect.) Of course, that's all about floating-point instructions, which are of more interest to uprobes than kprobes. Jim -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API
On Tue, 2009-04-21 at 20:17 -0400, Masami Hiramatsu wrote: ... Hi Peter and Jim, Now what I'm doing is making opcode tables like this. Table: 1-byte opcode Alias: none 00: ADD Eb,Gb 01: ADD Ev,Gv 02: ADD Gb,Eb 03: ADD Gv,Ev 04: ADD AL,Ib 05: ADD rAX,Iz 06: PUSH ES (i64) 07: POP ES (i64) 08: OR Eb,Gb 09: OR Ev,Gv 0a: OR Gb,Eb 0b: OR Gv,Ev 0c: OR AL,Ib 0d: OR rAX,Iz 0e: PUSH CS 0f: 2-byte escape ... We want to keep this info easy to parse. (Who knows how it might be used, and by whom?) Your format seems to be opcode: mnemonic [comma,separated,operands] [(extra_info)] which is fine if you stick to it... but your entry for 0f doesn't match that. Also, something like + extra_info would be easier to parse (using, say, awk) than (extra_info) and a parser script which parses them into, const insn_attr_t primary_table[INAT_TABLE_SIZE] = { [0x04] = INAT_IMM(IMM_SIZE_BYTE) [0x05] = INAT_IMM(IMM_SIZE_VWORD32) [0x0c] = INAT_IMM(IMM_SIZE_BYTE) [0x0d] = INAT_IMM(IMM_SIZE_VWORD32) [0x0f] = INAT_ESC(IMM_ESC_2BYTE) ... (note, instructions which has no attributes for decoder, are just ignored) By the way, I'm worried about legal things of Intel's instruction encoding expressions. Would you think there is any problem if we have those tables in the kernel tree? Good question. Sorry, I'm not a lawyer. Intel and AMD and sandpile.org all seem to be using the same notation, so the notation's inventor must not be feeling too proprietary. Interestingly, sandpile.org asserts a copyright. Thanks, Jim -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API
On Thu, 2009-04-16 at 19:31 -0400, Masami Hiramatsu wrote: ... Hmm, I have an idea about instruction table. Usually, instruction tables are encoded with code defined by each decoder/emulator. This method will show their internal code directly, and is hard to maintain when the opcode map is updated. Instead of that, I'd like to suggest using the expressions in the opcode maps in a vender's genuine document (in this case, Intel/AMD's manual) or www.sandpile.org for instruction tables. e.g. const insn_attr_t onebyte_attr_table[ATTR_TABLE_SIZE] = { /* 0x00-0x0f */ AT2(Eb,Gb), AT2(Ev,Gv), AT2(Gb,Eb), AT2(Gv,Ev), AT2(AL,Ib), AT2(rAX,Iz), AT2(ES,i64), AT2(ES,i64), AT2(Eb,Gb), AT2(Ev,Gv), AT2(Gb,Eb), AT2(Gv,Ev), AT2(AL,Ib), AT2(rAX,Iz), AT2(CS,i64), AT(ESC), ... Here, AT and AT2 macros are defined as follows: #define AT(a) (INAT_OMEXP_##a) #define AT2(a1, a2) (INAT_OMEXP_##a1 | INAT_OMEXP_##a2) ... It looks like AT2(Ev,Gv) would yield the same bits as AT2(Gv,Ev). It'd be nice not to lose the operand-order information. And we'd have to make clear whether which notation we're using -- src,dest as in the gnu assembler, or dest,src as in the AMD (and Intel?) manuals. Jim -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API
On Fri, 2009-04-03 at 20:37 -0400, Masami Hiramatsu wrote: Hi Peter, H. Peter Anvin wrote: Masami Hiramatsu wrote: Add x86 instruction decoder to arch-specific libraries. This decoder can decode all x86 instructions into prefix, opcode, modrm, sib, displacement and immediates. This can also show the length of instructions. ... Hi Masami, On the surface the overall structure looks fine, but I have a couple of concerns: 1. is this meant to be able to decode userspace code or just kernel code? If it is supposed to be able to decode userspace code, is there a reason you're not dealing with 16-bit or V86 mode code at all? If not, why are you including the 32-bit decoder in a 64-bit kernel (as well as instructions which we're pretty much guaranteed to never use in the kernel, such as ENTER.) Actually, this aims to decode both of user space and kernel code. At this point, it just needs to cover kernel code, because kprobes just want to decode kernel binary. However, this is just a starting point, uprobe developers want to use it to decode user-space code. In that case, it needs to be enhanced. For user-space probing, we've been concentrating on native-built executables. Am I correct in thinking that we'll see 16-bit or V86 mode only on legacy apps built elsewhere? In any case, it only makes sense to build on the kvm folks' work in this regard. ... 4. you have a bunch of magic opcode constants all over the place. This means that as new instructions come in -- and they're going to be coming in -- this is going to be hard to update. It would be cleaner if we could have an intermediate format that preprocesses down to all the relevant tables and perhaps even some of the code rather than open-coding everything in C. This matters... for example you have: + } else if (opcode == 0xea /* jmp far seg:offs */) { + __get_immptr(insn); ... but nothing similar for opcode 0x9a. This is extremely hard to spot with this kind of structure. Oops, that should be a bug. Hmm, I think we'd better bit-flags tables for classifying opcodes. Jim, can your INAT idea help this situation? http://sources.redhat.com/ml/systemtap/2009-q2/msg00109.html As noted, the INAT tables follow the kvm model of one fat bitmap of attributes per opcode, rather than the kprobes/uprobes model of one or two 256-bit tables per attribute. (This latter approach was due to the gradual accumulation of tables over the years.) I like the bitmap-per-opcode approach because it's relatively easy to see in one place everything you're saying about a particular opcode. But with all the potential clients for this service, it's not clear that we'll get by with a single bitmap for every opcode. (x86 kvm uses 32 bits per opcode, I think, and the INAT tables use 10. Seems like we could overrun 64 bits pretty quickly.) So I guess that means we'll have to get a little creative as to how we expose these attribute sets to the client. ... Thank you for good advice! Ditto. Jim Keniston -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -tip 4/6 V4.1] x86: kprobes checks safeness of insertion address.
On Fri, 2009-04-03 at 12:02 -0400, Masami Hiramatsu wrote: Ensure safeness of inserting kprobes by checking whether the specified address is at the first byte of a instruction. This is done by decoding probed function from its head to the probe point. changes from v4: - change a comment according to Ananth's suggestion. Signed-off-by: Masami Hiramatsu mhira...@redhat.com Cc: Ananth N Mavinakayanahalli ana...@in.ibm.com Cc: Jim Keniston jkeni...@us.ibm.com Cc: Ingo Molnar mi...@elte.hu --- arch/x86/kernel/kprobes.c | 51 + 1 files changed, 51 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c ... +/* Recover original instruction */ /* Recover the probed instruction at addr for further analysis. */ See below. +static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long addr) +{ + struct kprobe *kp; + kp = get_kprobe((void *)addr); + if (!kp) + return -EINVAL; + + /* Don't use p-ainsn.insn; which will be modified by fix_riprel */ fix_riprel doesn't affect the instruction's length, which is what concerns this patch. But we want this function to be useful for unforeseen uses as well, so I like the code you have. Just consider the suggested comment changes. /* * Don't use p-ainsn.insn, which could be modified -- e.g., * by fix_riprel(). */ + memcpy(buf, kp-addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t)); + buf[0] = kp-opcode; + return 0; +} Jim Keniston -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
On Fri, 2009-04-03 at 12:55 -0400, Masami Hiramatsu wrote: Ingo Molnar wrote: * Masami Hiramatsu mhira...@redhat.com wrote: Hmm, I'd like to know actually kvm aims to emulate all kinds of instructions. If so, I might find some bugs in x86_emulate.c. However, I don't know all bugs. To find all of them, we have to port x86_emulate.c to user-space, decode binaries with it, and compare its output with another decoder, as Jim had done with insn.c. https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html btw., i'd suggest we put a build time check for this into the kernel version as well. For example to decode the vmlinux via objdump, run it through your decoder as well and compare the results. Put under a CONFIG_DEBUG_X86_DECODER_TEST kind of (deault-off) build-time self-test. This would ensure that the kernel we are running is fully supported by the decoder - even as GCC/GAS starts using new instructions, etc. How does this sound to you? Thanks! That is a good idea. Jim, would you think you can port your script into kernel tree? ... I'd be happy to do what's needed to make it happen, and maintain it in the face of x86 changes. The script itself is practically nothing (~100 lines of awk and C), but what I don't know about the kernel build is a lot, so I'd need some help from a kernel-build expert. Jim -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html