Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
Pekka Paalanen wrote:
> Not just emulation but address diversion, i.e. modifying the
> operation (not the text) before executing it. Mmiotrace could do
> something like this:
>
> 1. a blob calls ioremap
> 2. mmiotrace maps the MMIO area privately
> 3. the blob receives a dummy map from ioremap, that will generate
>    page fault
> 4. the blob accesses the dummy map and raises a page fault
> 5. pf handler detects the dummy map
> 6. mmiotrace pf handler emulates the instruction and replaces the
>    dummy address with the real MMIO address.
> 7. mmiotrace records the operation and the datum
> 8. go to step 4, or whatever
>
> This means mmiotrace would not have to fiddle with the page tables
> and page presence bits like it does now. As said, this would make
> mmiotrace SMP-proof, and also eliminate the die notifier (used for
> the instruction single stepping trap). IMO a big step from a hack
> to a tool.
>
> Getting rid of the custom instruction parser in mmiotrace would be
> a good step in itself.
>
> Avi Kivity noted that the KVM emulator does almost everything.
> Does it allow also address diversion?

Operand access is by means of a callback, so yes. In kvm's use, it's
used to access guest memory, so it modifies the addresses before
reading or writing.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
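To make the "operand access by callback" point concrete, here is a minimal user-space sketch of address diversion. It is only an illustration under stated assumptions: `struct mem_ops`, `struct divert_ctx` and every function name below are hypothetical and are not taken from the KVM emulator or from mmiotrace. The "emulator" never dereferences the address it is given; the callbacks translate the dummy address to the real backing store and count the access, which stands in for recording it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operand-access callbacks (not the KVM emulator's real API). */
struct mem_ops {
	uint32_t (*read32)(uintptr_t addr, void *ctx);
	void (*write32)(uintptr_t addr, uint32_t val, void *ctx);
};

/* One diverted window: [dummy_base, dummy_base + len) is backed by real[]. */
struct divert_ctx {
	uintptr_t dummy_base;	/* address the traced code believes it uses */
	size_t len;		/* window size in bytes, a multiple of 4 */
	uint32_t *real;		/* stands in for the private MMIO mapping */
	unsigned int nr_reads;	/* crude stand-in for the trace record */
	unsigned int nr_writes;
};

/* Translate a dummy address to its backing word, or NULL if outside. */
uint32_t *divert(struct divert_ctx *d, uintptr_t addr)
{
	if (addr < d->dummy_base || addr - d->dummy_base >= d->len)
		return NULL;
	return &d->real[(addr - d->dummy_base) / sizeof(uint32_t)];
}

uint32_t divert_read32(uintptr_t addr, void *ctx)
{
	struct divert_ctx *d = ctx;
	uint32_t *p = divert(d, addr);

	d->nr_reads++;
	return p ? *p : 0;
}

void divert_write32(uintptr_t addr, uint32_t val, void *ctx)
{
	struct divert_ctx *d = ctx;
	uint32_t *p = divert(d, addr);

	d->nr_writes++;
	if (p)
		*p = val;
}

/* The "emulator" never touches memory itself; it only calls the ops. */
uint32_t emulate_load32(const struct mem_ops *ops, void *ctx, uintptr_t addr)
{
	return ops->read32(addr, ctx);
}

void emulate_store32(const struct mem_ops *ops, void *ctx,
		     uintptr_t addr, uint32_t val)
{
	ops->write32(addr, val, ctx);
}
```

A caller would fill a `struct mem_ops` with `divert_read32` and `divert_write32`, then pass dummy addresses to `emulate_load32()` and `emulate_store32()`; the backing array is what actually changes.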
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
On Fri, 03 Apr 2009 09:52:09 -0400
Masami Hiramatsu <mhira...@redhat.com> wrote:

> Vegard Nossum wrote:
>> 2009/4/3 Ingo Molnar <mi...@elte.hu>:
>>> * Avi Kivity <a...@redhat.com> wrote:
>>>> Ingo Molnar wrote:
>>>>>> kvm has three requirements not needed by kprobes:
>>>>>>
>>>>>> - it wants to execute instructions, not just decode them,
>>>>>>   including generating faults where appropriate
>>>>>> - it is performance critical
>>>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions
>>>>>>   simultaneously
>>>>>>
>>>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly
>>>>>> switch to it. x86_emulate.c is high on my list of most
>>>>>> disliked code.
>>>>>
>>>>> Well, this has to be driven from the KVM side as the kprobes
>>>>> use will only be for decoding so if it's modified from the
>>>>> kprobes side the KVM-only functionality might regress.
>>>>>
>>>>> So ... we can do the library decoder for kprobes purposes, and
>>>>> someone versed in the KVM emulator can then combine the two.
>>>>
>>>> Problem is, anyone versed in the kvm emulator will want to run
>>>> as far away from this work as possible.
>>>
>>> Are you suggesting that the KVM emulator should never have been
>>> merged in the first place? ;-)
>>>
>>> Anyway, we'll make sure the kprobes/library decoder is as clean
>>> as possible - so it ought to be hackable and extensible without
>>> the risk of permanent brain damage.
>>>
>>> Mmiotrace and kmemcheck have decoding smarts too, and i think the
>>> sw-breakpoint injection code of KGDB could use it as well - so
>>> there's broader utility in all this.
>>
>> (Sorry in advance for jumping in -- my post may be irrelevant)
>
> Thank you for clarifying your needs :-)
>
>> For the record, kmemcheck requirements for an instruction decoder
>> are these:
>>
>> For any instruction with memory operands, we need to know which
>> are the operands (so for movl %eax, (%ebx) we need to combine the
>> instruction with a struct pt_regs to get the actual address
>> dereferenced, i.e. the contents of %ebx), and their sizes (for
>> movzbl, the source operand is 8 bits, destination operand is 32
>> bits). For things like movsb, we need to be able to get both %esi
>> and %edi.
>
> New decoder can give you the value of mod/rm (insn.modrm), operand
> size (insn.opnd_bytes), and immediate size (insn.immediate.nbytes).
> To get which register is used, you can decode modrm with MODRM_*()
> macros.
>
>> mmiotrace additionally needs to know what the actual values
>> read/written were, for instructions that read/write to memory
>> (again, combined with a struct pt_regs).
>
> The decoder doesn't use any locks/shared memory, so you can use it
> in interrupt context, with pt_regs.
>
>> Maybe this doesn't really say much, since this is what a generic
>> instruction decoder would be able to do anyway. But kmemcheck and
>> mmiotrace both have very special-purpose decoders. I don't really
>> know what other decoders look like, but what I would wish for is
>> this: Some macros for iterating the operands, where each operand
>> has a type (e.g. input (for reads), output (for writes), target
>> (for jumps), immediate address, immediate value, etc.), a size (in
>> bits), and a way to evaluate the operand. So eval(op, regs) for
>> op=%eax, it will return regs->eax; for op=4(%eax), it will return
>> regs->eax + 4; for op=4 it will return 4, etc.
>
> Hmm, it's an interesting idea. I think operand classifying can be
> done by evaluating opcode and mod/rm.
>
>> Both kmemcheck and mmiotrace could gain SMP support with
>> instruction emulation, though it is strictly not necessary. In
>> that case, though, we would not want to emulate fault handling,
>> etc. (i.e. the fault should always be generated by the CPU itself).

Not just emulation but address diversion, i.e. modifying the
operation (not the text) before executing it. Mmiotrace could do
something like this:

1. a blob calls ioremap
2. mmiotrace maps the MMIO area privately
3. the blob receives a dummy map from ioremap, that will generate
   page fault
4. the blob accesses the dummy map and raises a page fault
5. pf handler detects the dummy map
6. mmiotrace pf handler emulates the instruction and replaces the
   dummy address with the real MMIO address.
7. mmiotrace records the operation and the datum
8. go to step 4, or whatever

This means mmiotrace would not have to fiddle with the page tables
and page presence bits like it does now. As said, this would make
mmiotrace SMP-proof, and also eliminate the die notifier (used for
the instruction single stepping trap). IMO a big step from a hack
to a tool.

Getting rid of the custom instruction parser in mmiotrace would be
a good step in itself.

Avi Kivity noted that the KVM emulator does almost everything.
Does it allow also address diversion?

I haven't looked at the KVM emulator since something like 2.6.25 or
so, and I probably don't have time to work with it anyway, but I am
very interested to hear how things evolve.

Thanks.

--
Pekka Paalanen
http://www.iki.fi/pq/
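As an aside on "decode modrm with MODRM_*() macros": the ModRM byte packs three fields (mod in bits 7:6, reg in bits 5:3, r/m in bits 2:0), so finding out which registers an instruction names is a matter of shifting and masking. The sketch below shows only that field split. The helper names are hypothetical, the patch's own macros may be spelled differently, and prefixes, SIB bytes and 64-bit register extensions are ignored.

```c
#include <stdint.h>

/* Hypothetical helpers; the patch's MODRM_*() macros may differ. */
static inline unsigned int modrm_mod(uint8_t modrm)
{
	return (modrm >> 6) & 0x3;	/* addressing mode */
}

static inline unsigned int modrm_reg(uint8_t modrm)
{
	return (modrm >> 3) & 0x7;	/* register operand or opcode extension */
}

static inline unsigned int modrm_rm(uint8_t modrm)
{
	return modrm & 0x7;		/* register or base of the memory operand */
}

/* mod == 3 means r/m names a register, so there is no memory operand. */
static inline int modrm_has_memory_operand(uint8_t modrm)
{
	return modrm_mod(modrm) != 3;
}

/* 32-bit general-purpose register names in hardware encoding order. */
const char *reg32_name(unsigned int n)
{
	static const char *const names[8] = {
		"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
	};

	return names[n & 0x7];
}
```

For example, `movl %eax, (%ebx)` is encoded as `89 03`: the ModRM byte 0x03 splits into mod=0 (memory operand, no displacement), reg=0 (%eax) and r/m=3 (%ebx), which is the register a caller would then look up in pt_regs.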
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
> I'm wondering about something i suggested many moons ago: to look
> into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).

Hi Ingo,

Me and Masami just discussed this a few emails ago in this thread :)

-Andi

--
a...@linux.intel.com -- Speaking for myself only.
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
2009/4/3 Ingo Molnar <mi...@elte.hu>:
> * Avi Kivity <a...@redhat.com> wrote:
>> Ingo Molnar wrote:
>>>> kvm has three requirements not needed by kprobes:
>>>>
>>>> - it wants to execute instructions, not just decode them,
>>>>   including generating faults where appropriate
>>>> - it is performance critical
>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions
>>>>   simultaneously
>>>>
>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly
>>>> switch to it. x86_emulate.c is high on my list of most disliked
>>>> code.
>>>
>>> Well, this has to be driven from the KVM side as the kprobes use
>>> will only be for decoding so if it's modified from the kprobes
>>> side the KVM-only functionality might regress.
>>>
>>> So ... we can do the library decoder for kprobes purposes, and
>>> someone versed in the KVM emulator can then combine the two.
>>
>> Problem is, anyone versed in the kvm emulator will want to run as
>> far away from this work as possible.
>
> Are you suggesting that the KVM emulator should never have been
> merged in the first place? ;-)
>
> Anyway, we'll make sure the kprobes/library decoder is as clean as
> possible - so it ought to be hackable and extensible without the
> risk of permanent brain damage.
>
> Mmiotrace and kmemcheck have decoding smarts too, and i think the
> sw-breakpoint injection code of KGDB could use it as well - so
> there's broader utility in all this.

(Sorry in advance for jumping in -- my post may be irrelevant)

For the record, kmemcheck requirements for an instruction decoder
are these:

For any instruction with memory operands, we need to know which are
the operands (so for movl %eax, (%ebx) we need to combine the
instruction with a struct pt_regs to get the actual address
dereferenced, i.e. the contents of %ebx), and their sizes (for
movzbl, the source operand is 8 bits, destination operand is 32
bits). For things like movsb, we need to be able to get both %esi
and %edi.

mmiotrace additionally needs to know what the actual values
read/written were, for instructions that read/write to memory
(again, combined with a struct pt_regs).

Maybe this doesn't really say much, since this is what a generic
instruction decoder would be able to do anyway. But kmemcheck and
mmiotrace both have very special-purpose decoders. I don't really
know what other decoders look like, but what I would wish for is
this: Some macros for iterating the operands, where each operand has
a type (e.g. input (for reads), output (for writes), target (for
jumps), immediate address, immediate value, etc.), a size (in bits),
and a way to evaluate the operand. So eval(op, regs) for op=%eax, it
will return regs->eax; for op=4(%eax), it will return regs->eax + 4;
for op=4 it will return 4, etc.

Both kmemcheck and mmiotrace could gain SMP support with instruction
emulation, though it is strictly not necessary. In that case,
though, we would not want to emulate fault handling, etc. (i.e. the
fault should always be generated by the CPU itself).

Please do put me on Cc for future discussions, though.

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
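The eval(op, regs) wish above can be sketched in a few lines. This is a hypothetical illustration only: `struct operand`, `struct toy_regs` and `eval_operand()` are invented here and belong to neither the proposed decoder nor kmemcheck, and the register file is a small stand-in for struct pt_regs. It covers just the three cases named in the mail: a register, a base register plus displacement, and an immediate.

```c
/* Hypothetical register file; a real user would read struct pt_regs. */
enum reg_id { REG_AX, REG_BX, REG_CX, REG_DX, REG_SI, REG_DI, REG_NR };

struct toy_regs {
	unsigned long r[REG_NR];
};

enum op_kind {
	OP_REG,		/* %eax      -> value of the register */
	OP_MEM,		/* 4(%eax)   -> effective address, base + displacement */
	OP_IMM		/* 4         -> the immediate itself */
};

struct operand {
	enum op_kind kind;
	enum reg_id base;	/* used by OP_REG and OP_MEM */
	long disp;		/* displacement (OP_MEM) or value (OP_IMM) */
	unsigned int size_bits;	/* operand size, e.g. 8 or 32 */
};

/* Evaluate one operand against a register snapshot. */
unsigned long eval_operand(const struct operand *op,
			   const struct toy_regs *regs)
{
	switch (op->kind) {
	case OP_REG:
		return regs->r[op->base];
	case OP_MEM:
		return regs->r[op->base] + (unsigned long)op->disp;
	case OP_IMM:
		return (unsigned long)op->disp;
	}
	return 0;
}
```

Note that for a memory operand this returns the address, not the datum; a tracer that also wants the value read or written, as mmiotrace does, would need a further step on top of it.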
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
Avi Kivity wrote:
> Ingo Molnar wrote:
>> ok, the structure and concept looks quite good now, really nice!
>>
>> I'm wondering about something i suggested many moons ago: to look
>> into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>>
>> I remember there were some issues with that (one problem being
>> that the KVM decoder is a special-purpose thing covering specific
>> range of execution environments - not a near-full integer-ops
>> decoder like the one we are aiming for here) - are there any
>> other fundamental problems beyond 'it has to be done' ?
>>
>> Conceptually we want just a single piece of decoder logic in
>> arch/x86/. If the KVM folks are cool with it we could factor out
>> the KVM one into arch/x86/lib/.
>>
>> But ... if there are compelling reasons to leave the KVM one alone
>> in its limited environment we can do that too.
>
> kvm has three requirements not needed by kprobes:
>
> - it wants to execute instructions, not just decode them,
>   including generating faults where appropriate
> - it is performance critical
> - it needs to support 16-bit, 32-bit, and 64-bit instructions
>   simultaneously

Hmm, I'd like to know whether kvm actually aims to emulate all kinds
of instructions. If so, I might find some bugs in x86_emulate.c.

However, I don't know all bugs. To find all of them, we have to port
x86_emulate.c to user-space, decode binaries with it, and compare
its output with another decoder, as Jim had done with insn.c.

https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html

Thank you,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhira...@redhat.com
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
* Masami Hiramatsu <mhira...@redhat.com> wrote:
> Hmm, I'd like to know whether kvm actually aims to emulate all
> kinds of instructions. If so, I might find some bugs in
> x86_emulate.c.
>
> However, I don't know all bugs. To find all of them, we have to
> port x86_emulate.c to user-space, decode binaries with it, and
> compare its output with another decoder, as Jim had done with
> insn.c.
>
> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html

btw., i'd suggest we put a build time check for this into the kernel
version as well. For example to decode the vmlinux via objdump, run
it through your decoder as well and compare the results. Put under a
CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time
self-test.

This would ensure that the kernel we are running is fully supported
by the decoder - even as GCC/GAS starts using new instructions, etc.

How does this sound to you?

	Ingo
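The comparison proposed above boils down to walking a code buffer with the decoder under test and checking each instruction length against a reference list, such as the lengths one could extract from objdump output. Below is a minimal sketch of that loop. Everything in it is an assumption made for illustration: `toy_insn_length()` is a stand-in that knows only a handful of one-byte opcodes in 32-bit mode, not the decoder from the patch, and the reference lengths are supplied by the caller rather than parsed from objdump.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy length decoder (hypothetical, 32-bit mode, no prefixes).
 * Returns the instruction length, or 0 if the opcode is unknown or
 * the instruction would run past the end of the buffer.
 */
int toy_insn_length(const uint8_t *p, size_t avail)
{
	size_t len;

	if (avail == 0)
		return 0;

	if (p[0] == 0x90 || p[0] == 0xc3 || p[0] == 0xcc)
		len = 1;		/* nop, ret, int3 */
	else if (p[0] == 0x6a)
		len = 2;		/* push imm8 */
	else if (p[0] == 0xe8 || (p[0] >= 0xb8 && p[0] <= 0xbf))
		len = 5;		/* call rel32, mov imm32 to r32 */
	else
		return 0;

	return len <= avail ? (int)len : 0;
}

/*
 * Compare decoded lengths with reference lengths. Returns -1 if every
 * instruction matches and the buffer is fully consumed, otherwise the
 * index of the first instruction that disagrees (or nr_ref if bytes
 * are left over after the last reference entry).
 */
long first_mismatch(const uint8_t *code, size_t size,
		    const int *ref_len, size_t nr_ref)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < nr_ref; i++) {
		int len = toy_insn_length(code + off, size - off);

		if (len == 0 || len != ref_len[i])
			return (long)i;
		off += (size_t)len;
	}
	return off == size ? -1 : (long)nr_ref;
}
```

A build-time self-test along these lines would report the first disagreeing instruction, which is usually enough to spot an opcode the decoder does not know yet.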
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
Masami Hiramatsu wrote:
> Hmm, I'd like to know whether kvm actually aims to emulate all
> kinds of instructions.

We're less interested in fpu/sse. The interesting instructions are
those used for page table management, mmio, and real mode execution.

> If so, I might find some bugs in x86_emulate.c.
>
> However, I don't know all bugs. To find all of them, we have to
> port x86_emulate.c to user-space, decode binaries with it, and
> compare its output with another decoder, as Jim had done with
> insn.c.

That would be very useful.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
Ingo Molnar wrote:
> * Masami Hiramatsu <mhira...@redhat.com> wrote:
>> Hmm, I'd like to know whether kvm actually aims to emulate all
>> kinds of instructions. If so, I might find some bugs in
>> x86_emulate.c.
>>
>> However, I don't know all bugs. To find all of them, we have to
>> port x86_emulate.c to user-space, decode binaries with it, and
>> compare its output with another decoder, as Jim had done with
>> insn.c.
>>
>> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
>
> btw., i'd suggest we put a build time check for this into the
> kernel version as well. For example to decode the vmlinux via
> objdump, run it through your decoder as well and compare the
> results. Put under a CONFIG_DEBUG_X86_DECODER_TEST kind of
> (default-off) build-time self-test.
>
> This would ensure that the kernel we are running is fully
> supported by the decoder - even as GCC/GAS starts using new
> instructions, etc.
>
> How does this sound to you?

Thanks! That is a good idea.

Jim, do you think you could port your script into the kernel tree?

Thank you,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhira...@redhat.com
Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
On Fri, 2009-04-03 at 12:55 -0400, Masami Hiramatsu wrote:
> Ingo Molnar wrote:
>> * Masami Hiramatsu <mhira...@redhat.com> wrote:
>>> Hmm, I'd like to know whether kvm actually aims to emulate all
>>> kinds of instructions. If so, I might find some bugs in
>>> x86_emulate.c.
>>>
>>> However, I don't know all bugs. To find all of them, we have to
>>> port x86_emulate.c to user-space, decode binaries with it, and
>>> compare its output with another decoder, as Jim had done with
>>> insn.c.
>>>
>>> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
>>
>> btw., i'd suggest we put a build time check for this into the
>> kernel version as well. For example to decode the vmlinux via
>> objdump, run it through your decoder as well and compare the
>> results. Put under a CONFIG_DEBUG_X86_DECODER_TEST kind of
>> (default-off) build-time self-test.
>>
>> This would ensure that the kernel we are running is fully
>> supported by the decoder - even as GCC/GAS starts using new
>> instructions, etc.
>>
>> How does this sound to you?
>
> Thanks! That is a good idea.
>
> Jim, do you think you could port your script into the kernel tree?
...

I'd be happy to do what's needed to make it happen, and maintain it
in the face of x86 changes. The script itself is practically nothing
(~100 lines of awk and C), but what I don't know about the kernel
build is a lot, so I'd need some help from a kernel-build expert.

Jim
[PATCH -tip 0/6 V4] tracing: kprobe-based event tracer
Hi,

Here are the patches of the kprobe-based event tracer for x86,
version 4. This version supports only x86(-32/-64). (If someone is
interested in porting this to other architectures, he just needs to
port kprobes/kretprobes and the ptrace enhancement [PATCH 2/6].)

I added an x86 instruction decoder in this version. It might be
better integrated with KVM's decoder, and the kprobes x86 code
should be rewritten with it.

This can be applied on the linux-2.6-tip tree.

This patchset includes the following changes:
- Fix kernel_trap_sp() on x86 according to systemtap runtime. [1/6]
- Add arch-dep register and stack fetching functions [2/6]
- Add x86 instruction decoder [3/6]
- Check insertion point safety in kprobe [4/6]
- Add kprobe-tracer plugin [5/6]
- Support fetching various status (register/stack/memory/etc.) [6/6]

Done items:
- Add kernel_trap_sp() and fetch_*() on other archs.
- Support name-based register fetching (ax, bx, and so on)
- Support indirect memory fetch from registers etc.
- Check insertion point safety by using instruction decoder.

Future items:
- .init function tracing support.
- Support primitive types (long, ulong, int, uint, etc) for args.

kprobe-based event tracer
---

This tracer is similar to the events tracer which is based on the
Tracepoint infrastructure. Instead of Tracepoint, this tracer is
based on kprobes (kprobe and kretprobe). It can probe anywhere
kprobes can probe (this means all function bodies except for
__kprobes functions).

Unlike the function tracer, this tracer can probe instructions
inside of kernel functions. It allows you to check which instruction
has been executed.

Unlike the Tracepoint based events tracer, this tracer can add new
probe points on the fly.

Similar to the events tracer, this tracer doesn't need to be
activated via current_tracer; instead, just set probe points via
/debug/tracing/kprobe_probes.
Synopsis of kprobe_probes:

  p SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS]  : set a probe
  r SYMBOL[+0] [FETCHARGS]                   : set a return probe

 FETCHARGS:
  %REG               : Fetch register REG
  sN                 : Fetch Nth entry of stack (N >= 0)
  @ADDR              : Fetch memory at ADDR (ADDR should be in kernel)
  @SYM[+|-offs]      : Fetch memory at SYM +|- offs (SYM should be a
                       data symbol)
  aN                 : Fetch function argument. (N >= 0)(*)
  rv                 : Fetch return value.(**)
  ra                 : Fetch return address.(**)
  +|-offs(FETCHARG)  : Fetch memory at FETCHARG +|- offs address.(***)

  (*) aN may not be correct for asmlinkage functions or in the middle
      of a function body.
  (**) only for return probe.
  (***) this is useful for fetching a field of data structures.

E.g.

  echo p do_sys_open a0 a1 a2 a3 > /debug/tracing/kprobe_probes

This sets a kprobe on the top of the do_sys_open() function,
recording the 1st to 4th arguments.

  echo r do_sys_open rv ra > /debug/tracing/kprobe_probes

This sets a kretprobe on the return point of the do_sys_open()
function, recording the return value and return address.

  echo > /debug/tracing/kprobe_probes

This clears all probe points. You can see the traced information
via /debug/tracing/trace.

  cat /debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-2376  [001]   262.389131: do_sys_open: @do_sys_open+0 0xff9c 0x98db83e 0x8880 0x0
           <...>-2376  [001]   262.391166: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
           <...>-2376  [001]   264.384876: do_sys_open: @do_sys_open+0 0xff9c 0x98db83e 0x8880 0x0
           <...>-2376  [001]   264.386880: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
           <...>-2084  [001]   265.380330: do_sys_open: @do_sys_open+0 0xff9c 0x804be3e 0x0 0x1b6
           <...>-2084  [001]   265.380399: sys_open: <-do_sys_open+0 0x3 0xc06e8ebb

@SYMBOL means that the kernel hits a probe, and <-SYMBOL means the
kernel returns from SYMBOL (e.g. "sys_open: <-do_sys_open+0" means
the kernel returns from do_sys_open to sys_open).
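To illustrate how a line in the kprobe_probes syntax breaks down, here is a small user-space sketch that splits a definition into probe type, symbol, offset and fetch arguments. It is not the parser from trace_kprobe.c: `struct probe_spec` and `parse_probe_spec()` are hypothetical names, the MEMADDR form is simply stored as if it were a symbol, and FETCHARGS are kept as opaque strings rather than validated.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 8

/* Hypothetical parsed form of one kprobe_probes line. */
struct probe_spec {
	int is_return;			/* 1 for "r", 0 for "p" */
	char symbol[64];		/* SYMBOL (or MEMADDR, kept as text) */
	long offset;			/* +offs / -offs, 0 if absent */
	int nr_args;
	char args[MAX_ARGS][32];	/* FETCHARGS, not interpreted here */
};

/* Returns 0 on success, -1 on a malformed line. */
int parse_probe_spec(const char *line, struct probe_spec *out)
{
	char buf[256];
	char *tok, *sign, *end;

	if (strlen(line) >= sizeof(buf))
		return -1;
	strcpy(buf, line);
	memset(out, 0, sizeof(*out));

	/* strtok() is not reentrant; good enough for a sketch. */
	tok = strtok(buf, " \t\n");
	if (!tok || tok[1] != '\0' || (tok[0] != 'p' && tok[0] != 'r'))
		return -1;
	out->is_return = (tok[0] == 'r');

	tok = strtok(NULL, " \t\n");
	if (!tok)
		return -1;

	/* Split an optional +offs / -offs suffix off the symbol. */
	sign = strpbrk(tok, "+-");
	if (sign) {
		out->offset = strtol(sign, &end, 0);
		if (end == sign || *end != '\0')
			return -1;
		*sign = '\0';
	}
	if (tok[0] == '\0' || strlen(tok) >= sizeof(out->symbol))
		return -1;
	strcpy(out->symbol, tok);

	/* The synopsis only allows +0 on a return probe. */
	if (out->is_return && out->offset != 0)
		return -1;

	while ((tok = strtok(NULL, " \t\n")) != NULL) {
		if (out->nr_args == MAX_ARGS ||
		    strlen(tok) >= sizeof(out->args[0]))
			return -1;
		strcpy(out->args[out->nr_args++], tok);
	}
	return 0;
}
```

Feeding it the first example line, `p do_sys_open a0 a1 a2 a3`, yields a non-return probe on do_sys_open at offset 0 with four fetch arguments.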
 Documentation/ftrace.txt      |   70
 arch/x86/include/asm/insn.h   |  130 +++
 arch/x86/include/asm/ptrace.h |   70 -
 arch/x86/kernel/kprobes.c     |   51 +++
 arch/x86/kernel/ptrace.c      |   59 +++
 arch/x86/lib/Makefile         |    1 +
 arch/x86/lib/insn.c           |  627
 kernel/trace/Kconfig          |    9 +
 kernel/trace/Makefile         |    1 +
 kernel/trace/trace_kprobe.c   |  789 +
 10 files changed, 1805 insertions(+), 2 deletions(-)

Thank you,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhira...@redhat.com