Re: [PATCH 2/2] Kprobes: Move kprobes examples to samples/
On 2/5/08, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> + * Build and insert the kernel module as done in the kprobe example.
> + * You will see the trace data in /var/log/messages and on the console
> + * whenever sys_open() returns a negative value.

A passing observation: "sys_open" should be replaced with "do_fork", whose return value is not checked at all.

--
Regards,
Abhishek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH] ktime: Allow ktime_t comparison using ktime_compare
Add a timespec style comparison function. Allows two ktime types to be
compared without having to convert to timespec/timeval. Useful for
modules doing ktime based math, especially the ones using ktime_get
heavily.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff --git a/include/linux/ktime.h b/include/linux/ktime.h
index a6ddec1..7f9d321 100644
--- a/include/linux/ktime.h
+++ b/include/linux/ktime.h
@@ -95,6 +95,23 @@ static inline ktime_t ktime_set(const long secs, const unsigned long nsecs)
 #define ktime_add(lhs, rhs) \
 	({ (ktime_t){ .tv64 = (lhs).tv64 + (rhs).tv64 }; })
 
+/**
+ * ktime_compare - Compares two ktime_t variables
+ *
+ * Return val:
+ *	lhs  < rhs:	< 0
+ *	lhs == rhs:	  0
+ *	lhs  > rhs:	> 0
+ */
+static inline int ktime_compare(const ktime_t lhs, const ktime_t rhs)
+{
+	if (lhs.tv64 < rhs.tv64)
+		return -1;
+	if (lhs.tv64 > rhs.tv64)
+		return 1;
+	return 0;
+}
+
 /*
  * Add a ktime_t variable and a scalar nanosecond value.
  * res = kt + nsval:
@@ -198,6 +215,23 @@ static inline ktime_t ktime_add(const ktime_t add1, const ktime_t add2)
 }
 
 /**
+ * ktime_compare - Compares two ktime_t variables
+ *
+ * Return val:
+ *	lhs  < rhs:	< 0
+ *	lhs == rhs:	  0
+ *	lhs  > rhs:	> 0
+ */
+static inline int ktime_compare(const ktime_t lhs, const ktime_t rhs)
+{
+	if (lhs.tv.sec < rhs.tv.sec)
+		return -1;
+	if (lhs.tv.sec > rhs.tv.sec)
+		return 1;
+	return lhs.tv.nsec - rhs.tv.nsec;
+}
+
+/**
  * ktime_add_ns - Add a scalar nanoseconds value to a ktime_t variable
  * @kt:		addend
  * @nsec:	the scalar nsec value to add
Re: [PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints
On 1/29/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> In that case, why don't you just reduce the priority of kprobe_exceptions_nb?
> Then, the execution path becomes very simple.

Ananth mentioned that the kprobe notifier has to be the first to run. It still wouldn't allow us to notice breakpoints on places like do_int3 etc.

> I also like to use a debugger for debugging kprobes. That will help us.

Hmm... it would increase the code path leading up to kprobe_handler. That's more territory to be guarded from kprobes.

> Best Regards,
>
> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]

--
Thanks,
Abhishek
Re: [PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints
On 1/29/08, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> > Maybe I'm completely off the mark here, but shouldn't a small subset
> > of this section simply be 'breakpoint-free' rather than 'kprobe-free'?
> > Placing a breakpoint on kprobe_handler (say) can loop into a recursive
> > trap without allowing the debugger's notifier chain to be invoked.
>
> A well-heeled debugger will necessarily take care of saving contexts
> (using techniques like setjmp/longjmp, etc) to help it recover from such
> nested cases (see xmon for example).

Ok, but the protection/warning is not just for xmon.

> > I'm assuming that non-kprobe exception notifiers may (or even should) run
> > after kprobe's notifier callback (kprobe_exceptions_notify).
>
> Yes, any such notifier is invoked after kprobe's callback as the kprobe
> notifier is always registered with the highest priority.

Ok.

> > The WARN_ON (and not a BUG_ON) will be hit iff:
> > (in_kprobes_functions(addr) && !is_jprobe_bkpt(addr))
>
> But that still is unneeded dmesg clutter, IMHO.

Ok. A warning, in my opinion, would have been prudent, since I think we cannot protect non-kprobe breakpoint users (debuggers or anything else) from the recursive trap handling case.

> Ananth

--
Thanks,
Abhishek
Re: [PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints
On 1/29/08, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> > Non-kprobe breakpoints in the kernel might lie inside the .kprobes.text
> > section. Such breakpoints can easily be identified by in_kprobes_functions
> > and can be caught early. These are problematic and a warning should be
> > emitted to discourage them (in any rare case, if they actually occur).
>
> Why? As Masami indicated in an earlier reply, the annotation is to
> prevent *only* kprobes.

Maybe I'm completely off the mark here, but shouldn't a small subset of this section simply be 'breakpoint-free' rather than 'kprobe-free'? Placing a breakpoint on kprobe_handler (say) can loop into a recursive trap without allowing the debugger's notifier chain to be invoked. I'm assuming that non-kprobe exception notifiers may (or even should) run after kprobe's notifier callback (kprobe_exceptions_notify).

> > For this, a check can route the trap handling of such breakpoints away from
> > kprobe_handler (which ends up calling even more functions marked as
> > __kprobes) from inside kprobe_exceptions_notify.
>
> Well.. we pass on control of a !kprobe breakpoint to the kernel. This is
> exactly what permits debuggers like xmon to work fine now.

This will still happen. It doesn't stop non-kprobe breakpoints from being handled, wherever they may be.

> I don't see any harm in such breakpoints being handled autonomously
> without any sort of kprobe influence.

Here's what seems to be happening currently:

int3 (non-kprobe) -> do_int3 -> kprobe_exceptions_notify -> kprobe_handler (passes the buck to the kernel) -> non-kprobe/debugger exception handler.

Here's what the patch will do:

int3 (non-kprobe) -> do_int3 -> kprobe_exceptions_notify -> WARN_ON/kprobe_handler -> non-kprobe/debugger exception handler.

The WARN_ON (and not a BUG_ON) will be hit iff:
(in_kprobes_functions(addr) && !is_jprobe_bkpt(addr))

> Ananth

I hope I've understood the point you were making, or at least came close :-).

--
Thanks,
Abhishek
Re: [PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section
On 1/28/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> Thank you for explanation, I hope I can understand it.
> Even if it causes a trap recursively, it could be checked (and ignored) by
> longjump_break_handler(), and passed to the debugger correctly.

Yes, all non-kprobe breakpoints are left to the kernel to be handled. The objective here is to intercept the trap handling of a certain category of such breakpoints and emit a warning. The premise being that the .kprobes.text section is a logical breakpoint-free zone.

> Please consider that someone expands jprobe (jprobe2) which uses
> jprobe_return2() instead of jprobe_return(), how would you handle it?

By a simple modification of is_jprobe_bkpt() (defined in patch #2 of this series).

> Current kprobes provides an opportunity to those external probe frameworks
> for checking it by themselves.

Could you clarify this with some example? For now I'm assuming that by external probe frameworks you mean kernel modules using kprobes. If they embed breakpoints in their handlers, then they will simply not be caught by this check, because they cannot lie in the .kprobes.text section.

> By the way, external kernel debugger knows how kprobe (and exception notifier)
> works, so I think it can fetch his exception before kprobes does (by tweaking
> notifier chain, etc).
> (I hope all external kernel debuggers take care of it. :-))

I would imagine that from a code correctness point of view it shouldn't matter. In any case, nothing can be done if the kprobe exception notifier is circumvented.

> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]

--
Thanks,
Abhishek Sagar
Re: [PATCH -mm] kprobes: kretprobe user entry-handler (updated)
On 1/28/08, Andrew Morton <[EMAIL PROTECTED]> wrote:
> Neither the changelog nor the newly-added documentation explain why Linux
> needs this feature. What will it be used for?

There's a detailed discussion along with an example on this thread:
http://lkml.org/lkml/2007/11/13/58
and a bit on:
http://sourceware.org/ml/systemtap/2005-q3/msg00593.html

The real advantage of this patch is in scenarios where some data (like function parameters, system time etc.) needs to be shared between function entry and exit. E.g., viewing the change in system time across a call to profile it. Here, we have a need to share data between the entry and exit events of a function call. Also, the correct association needs to be maintained between the corresponding function entry-exit pairs. This is already done using a 'return-instance' in kretprobes. This patch allows these instances to pouch some data as well.

The patch also has a module example which does trivial function time-duration profiling using entry-handlers. It makes writing function profilers simpler using kretprobes. Currently, doing such a thing would require an extra kprobe to be planted at the function entry point, whose pre-handler must carry all the complexity of doing the function entry-exit data association. Also, using an entry-handler is optional, and is completely backward compatible.

> > +1.3.2 Kretprobe entry-handler
> > +
> > +Kretprobes also provides an optional user-specified handler which runs
>
> I think "caller-specified" would be a better term here. Generally "user"
> refers to Aunt Tillie sitting at the keyboard.

Just followed suit from existing kprobes.txt, which had quite a few references to such a 'user'. Cheerfully Acked-by -> Aunt Tillie :-)

--
Thanks,
Abhishek
Re: [PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section
On 1/27/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> Sorry, I can not understand what issue these patches can solve.
> The breakpoint which is inserted by external debugger will be checked by
> kprobe_handler() and be passed to other exception_notify_handlers.
> In that case, we don't need to warn it.
> I think current code is enough simple.

kprobe_handler has a blanket check for all non-kprobe breakpoints. They're all left to the kernel to handle. This is fine. Although external debuggers are free to plant breakpoints anywhere, they should be discouraged from doing so inside the .kprobes.text region. Placing them there may lead to recursive page-fault/trap handling. It's a defensive check. I hope I've been able to clarify.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]

Thanks,
Abhishek Sagar
Re: [PATCH 2/3] x86: Macrofy reusable code
Sam Ravnborg wrote:
> Small static functions are preferred over macros.
> Any particular reason to use a macro here?
>
> 	Sam

These macros have very limited (two) instantiations. But here's an alternative patch inlined.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a99e764..b7c2d20 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -159,6 +159,16 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {
 };
 const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);
 
+static inline unsigned long kprobe_bkpt_addr(struct pt_regs *regs)
+{
+	return (instruction_pointer(regs) - sizeof(kprobe_opcode_t));
+}
+
+static inline int is_jprobe_bkpt(u8 *ptr)
+{
+	return (ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end);
+}
+
 /* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
 static void __kprobes set_jmp_op(void *from, void *to)
 {
@@ -519,7 +529,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
-	addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+	addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs);
 	if (*addr != BREAKPOINT_INSTRUCTION) {
 		/*
 		 * The breakpoint instruction was removed right
@@ -1032,8 +1042,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 	u8 *addr = (u8 *) (regs->ip - 1);
 	struct jprobe *jp = container_of(p, struct jprobe, kp);
 
-	if ((addr > (u8 *) jprobe_return) &&
-	    (addr < (u8 *) jprobe_return_end)) {
+	if (is_jprobe_bkpt(addr)) {
 		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
 			struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
 			printk(KERN_ERR
[PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section
Identify breakpoints in the .kprobes.text section. These certainly aren't kprobe traps. However, we make an exception for the breakpoint hardcoded into jprobe_return.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 45f2949..f3d13d0 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -961,6 +961,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 				       unsigned long val, void *data)
 {
 	struct die_args *args = data;
+	unsigned long addr = kprobe_bkpt_addr(args->regs);
 	int ret = NOTIFY_DONE;
 
 	if (args->regs && user_mode_vm(args->regs))
@@ -968,7 +969,14 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 
 	switch (val) {
 	case DIE_INT3:
-		if (kprobe_handler(args->regs))
+		if (in_kprobes_functions(addr) &&
+		    !is_jprobe_bkpt((u8 *)addr)) {
+			/* A breakpoint has made its way to the .kprobes.text
+			 * section (excluding jprobe_return). This could be
+			 * due to an external debugger. */
+			WARN_ON(1);
+
+		} else if (kprobe_handler(args->regs))
 			ret = NOTIFY_STOP;
 		break;
 	case DIE_DEBUG:
[PATCH 2/3] x86: Macrofy reusable code
Encapsulate reusable code.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a99e764..45f2949 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -74,6 +74,13 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
 #define stack_addr(regs) ((unsigned long *)&regs->sp)
 #endif
 
+#define kprobe_bkpt_addr(regs) \
+	((unsigned long)(regs->ip - sizeof(kprobe_opcode_t)))
+
+#define is_jprobe_bkpt(ptr) \
+	((ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end))
+
+
 #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\
 	(((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) |   \
 	  (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) |   \
@@ -519,7 +526,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
-	addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+	addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs);
 	if (*addr != BREAKPOINT_INSTRUCTION) {
 		/*
 		 * The breakpoint instruction was removed right
@@ -1032,8 +1039,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 	u8 *addr = (u8 *) (regs->ip - 1);
 	struct jprobe *jp = container_of(p, struct jprobe, kp);
 
-	if ((addr > (u8 *) jprobe_return) &&
-	    (addr < (u8 *) jprobe_return_end)) {
+	if (is_jprobe_bkpt(addr)) {
 		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
 			struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
 			printk(KERN_ERR
[PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints
Greetings,

Non-kprobe breakpoints in the kernel might lie inside the .kprobes.text section. Such breakpoints can easily be identified by in_kprobes_functions and can be caught early. These are problematic, and a warning should be emitted to discourage them (in any rare case, if they actually occur).

For this, a check can route the trap handling of such breakpoints away from kprobe_handler (which ends up calling even more functions marked as __kprobes) from inside kprobe_exceptions_notify.

All comments/suggestions are welcome.

--
Thanks & Regards,
Abhishek Sagar
[PATCH 1/3] x86: Move in_kprobes_functions to linux/kprobes.h
Move in_kprobes_functions to linux/kprobes.h to share it with arch/x86/kernel/kprobes.c.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 6168c0a..2762145 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -39,6 +39,7 @@
 #ifdef CONFIG_KPROBES
 #include
+#include
 
 /* kprobe_status settings */
 #define KPROBE_HIT_ACTIVE	0x0001
@@ -182,6 +183,14 @@ static inline void kretprobe_assert(struct kretprobe_instance *ri,
 	}
 }
 
+static inline int in_kprobes_functions(unsigned long addr)
+{
+	if (addr >= (unsigned long)__kprobes_text_start &&
+	    addr < (unsigned long)__kprobes_text_end)
+		return -EINVAL;
+	return 0;
+}
+
 #ifdef CONFIG_KPROBES_SANITY_TEST
 extern int init_test_probes(void);
 #else
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d0493ea..0b74dfb 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -490,14 +490,6 @@ static int __kprobes register_aggr_kprobe(struct kprobe *old_p,
 	return ret;
 }
 
-static int __kprobes in_kprobes_functions(unsigned long addr)
-{
-	if (addr >= (unsigned long)__kprobes_text_start &&
-	    addr < (unsigned long)__kprobes_text_end)
-		return -EINVAL;
-	return 0;
-}
-
 static int __kprobes __register_kprobe(struct kprobe *p,
 	unsigned long called_from)
 {
[PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section
Identify breakpoints in .kprobes.text section. These certainly aren't kprobe traps. However, we make an exception for the breakpoint hardcoded into jprobe_return. Signed-off-by: Abhishek Sagar [EMAIL PROTECTED] --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index 45f2949..f3d13d0 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -961,6 +961,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data) { struct die_args *args = data; + unsigned long addr = kprobe_bkpt_addr(args-regs); int ret = NOTIFY_DONE; if (args-regs user_mode_vm(args-regs)) @@ -968,7 +969,14 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self, switch (val) { case DIE_INT3: - if (kprobe_handler(args-regs)) + if (in_kprobes_functions(addr) + !is_jprobe_bkpt((u8 *)addr)) { + /* A breakpoint has made it's way to the .kprobes.text +* section (excluding jprobe_return). This could be +* due to an external debugger. */ + WARN_ON(1); + + } else if (kprobe_handler(args-regs)) ret = NOTIFY_STOP; break; case DIE_DEBUG: -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/3] x86: Macrofy reusable code
Encapsulate reusable code. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index a99e764..45f2949 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -74,6 +74,13 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk); #define stack_addr(regs) ((unsigned long *)regs->sp) #endif +#define kprobe_bkpt_addr(regs) \ + ((unsigned long)(regs->ip - sizeof(kprobe_opcode_t))) + +#define is_jprobe_bkpt(ptr) \ + ((ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end)) + + #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\ (((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) | \ (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) | \ @@ -519,7 +526,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs) struct kprobe *p; struct kprobe_ctlblk *kcb; - addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t)); + addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs); if (*addr != BREAKPOINT_INSTRUCTION) { /* * The breakpoint instruction was removed right @@ -1032,8 +1039,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) u8 *addr = (u8 *) (regs->ip - 1); struct jprobe *jp = container_of(p, struct jprobe, kp); - if ((addr > (u8 *) jprobe_return) && - (addr < (u8 *) jprobe_return_end)) { + if (is_jprobe_bkpt(addr)) { if (stack_addr(regs) != kcb->jprobe_saved_sp) { struct pt_regs *saved_regs = kcb->jprobe_saved_regs; printk(KERN_ERR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints
Greetings, Non-kprobe breakpoints in the kernel might lie inside the .kprobes.text section. Such breakpoints can easily be identified by in_kprobes_functions and can be caught early. These are problematic and a warning should be emitted to discourage them (in any rare case, if they actually occur). For this, a check can route the trap handling of such breakpoints away from kprobe_handler (which ends up calling even more functions marked as __kprobes) from inside kprobe_exceptions_notify. All comments/suggestions are welcome. -- Thanks & Regards, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] x86: Macrofy reusable code
Sam Ravnborg wrote: Small static functions are preferred over macros. Any particular reason to use a macro here? Sam These macros have very limited (two) instantiations. But here's an alternative patch inlined. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index a99e764..b7c2d20 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -159,6 +159,16 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = { }; const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist); +static inline unsigned long kprobe_bkpt_addr(struct pt_regs *regs) +{ + return (instruction_pointer(regs) - sizeof(kprobe_opcode_t)); +} + +static inline int is_jprobe_bkpt(u8 *ptr) +{ + return (ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end); +} + /* Insert a jump instruction at address 'from', which jumps to address 'to'.*/ static void __kprobes set_jmp_op(void *from, void *to) { @@ -519,7 +529,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs) struct kprobe *p; struct kprobe_ctlblk *kcb; - addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t)); + addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs); if (*addr != BREAKPOINT_INSTRUCTION) { /* * The breakpoint instruction was removed right @@ -1032,8 +1042,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) u8 *addr = (u8 *) (regs->ip - 1); struct jprobe *jp = container_of(p, struct jprobe, kp); - if ((addr > (u8 *) jprobe_return) && - (addr < (u8 *) jprobe_return_end)) { + if (is_jprobe_bkpt(addr)) { if (stack_addr(regs) != kcb->jprobe_saved_sp) { struct pt_regs *saved_regs = kcb->jprobe_saved_regs; printk(KERN_ERR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section
On 1/27/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote: Sorry, I can not understand what issue these patches can solve. The breakpoint which is inserted by external debugger will be checked by kprobe_handler() and be passed to other exception_notify_handlers. In that case, we don't need to warn it. I think current code is enough simple. kprobe_handler has a blanket check for all non-kprobe breakpoints. They're all left to the kernel to handle. This is fine. Although external debuggers are free to plant breakpoints anywhere, they should be discouraged from doing so inside the .kprobes.text region. Placing them there may lead to recursive page-fault/trap handling. It's a defensive check. I hope I've been able to clarify. -- Masami Hiramatsu Software Engineer Hitachi Computer Products (America) Inc. Software Solutions Division e-mail: [EMAIL PROTECTED] Thanks, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm] kprobes: kretprobe user entry-handler (updated)
This is a repost of a patch which was reviewed earlier at: http://lkml.org/lkml/2007/11/13/58 (thanks to Jim Keniston and Srinivasa for their review comments). This provides support to add an optional user defined callback to be run at function entry of a kretprobe'd function. It also modifies the kprobe smoke tests to include an entry-handler during the kretprobe sanity test. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> --- diff -upNr -X /home/guest/patches/dontdiff.txt linux-2.6.24-rc8-mm1/Documentation/kprobes.txt linux-2.6.24-rc8-mm1_kp/Documentation/kprobes.txt --- linux-2.6.24-rc8-mm1/Documentation/kprobes.txt 2008-01-16 09:52:48.0 +0530 +++ linux-2.6.24-rc8-mm1_kp/Documentation/kprobes.txt 2008-01-26 17:12:43.0 +0530 @@ -96,7 +96,9 @@ or in registers (e.g., for x86_64 or for The jprobe will work in either case, so long as the handler's prototype matches that of the probed function. -1.3 How Does a Return Probe Work? +1.3 Return Probes + +1.3.1 How Does a Return Probe Work? When you call register_kretprobe(), Kprobes establishes a kprobe at the entry to the function. When the probed function is called and this @@ -107,9 +109,9 @@ At boot time, Kprobes registers a kprobe When the probed function executes its return instruction, control passes to the trampoline and that probe is hit. Kprobes' trampoline -handler calls the user-specified handler associated with the kretprobe, -then sets the saved instruction pointer to the saved return address, -and that's where execution resumes upon return from the trap. +handler calls the user-specified return handler associated with the +kretprobe, then sets the saved instruction pointer to the saved return +address, and that's where execution resumes upon return from the trap. While the probed function is executing, its return address is stored in an object of type kretprobe_instance. 
Before calling @@ -131,6 +133,30 @@ zero when the return probe is registered time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe. +1.3.2 Kretprobe entry-handler + +Kretprobes also provides an optional user-specified handler which runs +on function entry. This handler is specified by setting the entry_handler +field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the +function entry is hit, the user-defined entry_handler, if any, is invoked. +If the entry_handler returns 0 (success) then a corresponding return handler +is guaranteed to be called upon function return. If the entry_handler +returns a non-zero error then Kprobes leaves the return address as is, and +the kretprobe has no further effect for that particular function instance. + +Multiple entry and return handler invocations are matched using the unique +kretprobe_instance object associated with them. Additionally, a user +may also specify per return-instance private data to be part of each +kretprobe_instance object. This is especially useful when sharing private +data between corresponding user entry and return handlers. The size of each +private data object can be specified at kretprobe registration time by +setting the data_size field of the kretprobe struct. This data can be +accessed through the data field of each kretprobe_instance object. + +In case probed function is entered but there is no kretprobe_instance +object available, then in addition to incrementing the nmissed count, +the user entry_handler invocation is also skipped. + 2. Architectures Supported Kprobes, jprobes, and return probes are implemented on the following @@ -273,6 +299,8 @@ of interest: - ret_addr: the return address - rp: points to the corresponding kretprobe object - task: points to the corresponding task struct +- data: points to per return-instance private data; see "Kretprobe + entry-handler" for details. 
The regs_return_value(regs) macro provides a simple abstraction to extract the return value from the appropriate register as defined by @@ -555,23 +583,52 @@ report failed calls to sys_open(). #include #include #include +#include + +/* per-instance private data */ +struct my_data { + ktime_t entry_stamp; +}; static const char *probed_func = "sys_open"; -/* Return-probe handler: If the probed function fails, log the return value. */ -static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs) +/* Timestamp function entry. */ +static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs) +{ + struct my_data *data; + + if(!current->mm) + return 1; /* skip kernel threads */ + + data = (struct my_data *)ri->data; + data->entry_stamp = ktime_get(); + return 0; +} + +/* If the probed function failed, log the return value and duration. + * Duration may turn out to be zero consistently, depending upon the + * granularity of time accounting on the plat
[PATCH] x86: Fix singlestep handling in reenter_kprobe
Highlight peculiar cases in single-step kprobe handling. In reenter_kprobe(), a breakpoint in KPROBE_HIT_SS case can only occur when single-stepping a breakpoint on which a probe was installed. Since such probes are single-stepped inline, identifying these cases is unambiguous. All other cases leading up to KPROBE_HIT_SS are possible bugs. Identify and WARN_ON such cases. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index 93aff49..afb5c96 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -443,17 +443,6 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, *sara = (unsigned long) &kretprobe_trampoline; } -static void __kprobes recursive_singlestep(struct kprobe *p, - struct pt_regs *regs, - struct kprobe_ctlblk *kcb) -{ - save_previous_kprobe(kcb); - set_current_kprobe(p, regs, kcb); - kprobes_inc_nmissed_count(p); - prepare_singlestep(p, regs); - kcb->kprobe_status = KPROBE_REENTER; -} - static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb) { @@ -492,20 +481,29 @@ static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, break; #endif case KPROBE_HIT_ACTIVE: - recursive_singlestep(p, regs, kcb); + save_previous_kprobe(kcb); + set_current_kprobe(p, regs, kcb); + kprobes_inc_nmissed_count(p); + prepare_singlestep(p, regs); + kcb->kprobe_status = KPROBE_REENTER; break; case KPROBE_HIT_SS: - if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) { + if (p == kprobe_running()) { regs->flags &= ~TF_MASK; regs->flags |= kcb->kprobe_saved_flags; return 0; } else { - recursive_singlestep(p, regs, kcb); + /* A probe has been hit in the codepath leading up +* to, or just after, single-stepping of a probed +* instruction. This entire codepath should strictly +* reside in .kprobes.text section. Raise a warning +* to highlight this peculiar case. 
+*/ } - break; default: /* impossible cases */ WARN_ON(1); + return 0; } return 1; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote: > > + case KPROBE_HIT_SS: > > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) { > > + regs->flags &= ~TF_MASK; > > + regs->flags |= kcb->kprobe_saved_flags; > > + return 0; > > + } else { > > + recursive_singlestep(p, regs, kcb); > > + } > > + break; > > + default: > > + /* impossible cases */ > > + WARN_ON(1); > > WARN_ON() does not call panic(). Since the kernel doesn't stop, > we need to prepare singlestep after that. We shouldn't panic in 'default'. First, we'll never hit this case, and if we do then we can be sure that the fault is not due to a probe. So instead of singlestepping we'll let the kernel handle it. If it can't, it'll panic/die() for us. I'll try to clean up this case and incorporate these suggestions in a separate patch, as you suggested. > Masami Hiramatsu > > Software Engineer > Hitachi Computer Products (America) Inc. > Software Solutions Division > > e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED] -- Thanks, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote: > I could understand what the original code did at last. > If a kprobe is inserted on a breakpoint which other debugger inserts, > it single step inline instead of out-of-line.(this is done in > prepare_singlestep) > In this case, (p && kprobe_running() && kcb->kprobe_status==KPROBE_HIT_SS) > is true and we need pass the control to the debugger. > And if (*p->ainsn.insn != BREAKPOINT_INSTRUCTION) (or (p != > kprobe_running())) in > that case, there may be some bugs. Yes, we can only fault while singlestepping for a unique case, which is when we're singlestepping (in-line) a breakpoint because a probe was installed on it. All other scenarios are a BUG. That's also assuming that no exception will preempt singlestepping, whose codepath has a probe on it. > Now I think your original suggestion is correct. > Please fix it in another patch. Ok. > -- > Masami Hiramatsu > > Software Engineer > Hitachi Computer Products (America) Inc. > Software Solutions Division > > e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED] Thanks, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
Masami Hiramatsu wrote: >>> As a result, this function just setups re-entrance. >> As you've also pointed out in your previous reply, this case is >> peculiar and therefore I believe it should be marked as a BUG(). I've >> left the original case, if (kcb->kprobe_status==KPROBE_HIT_SS) && >> (*p->ainsn.insn == BREAKPOINT_INSTRUCTION), untouched and is handled >> as it was before. However, if (kcb->kprobe_status==KPROBE_HIT_SS) && >> !(*p->ainsn.insn == BREAKPOINT_INSTRUCTION), then instead of >> incrementing nmissed count like before, it should cry out a BUG. This >> is not an ordinary recursive probe handling case which should update >> the nmissed count. > > Hmm, I can not agree, because it is possible to insert a kprobe > into kprobe's instruction buffer. If it should be a bug, we must > check it when registering the kprobe. > (And also, in *p->ainsn.insn == BREAKPOINT_INSTRUCTION case, I doubt > that the kernel can handle this "orphaned" breakpoint, because the > breakpoint address has been changed.) > > Anyway, if you would like to change the logic, please separate it from > cleanup patch. Okay. For now, I've restored the original behavior. >>>> -out: >>>> + if (!ret) >>>> + preempt_enable_no_resched(); >>> I think this is a bit ugly... >>> Why would you avoid using mutiple "return"s in this function? >>> I think you can remove "ret" from this function. >> Hmm...there are no deeply nested if-elses nor overlapping paths >> which need to be avoided. All the nested checks unroll cleanly too. > > Oh, I just mentioned about this "if (!ret)" condition. > Could you try to remove this "ret" variable? > I think some "ret=1" could be replaced by "return 1" easily. Done. You should find the desired changes in this patch. 
Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]> --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index a72e02b..06f8d08 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -441,6 +441,34 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, /* Replace the return addr with trampoline addr */ *sara = (unsigned long) &kretprobe_trampoline; } + +static void __kprobes recursive_singlestep(struct kprobe *p, + struct pt_regs *regs, + struct kprobe_ctlblk *kcb) +{ + save_previous_kprobe(kcb); + set_current_kprobe(p, regs, kcb); + kprobes_inc_nmissed_count(p); + prepare_singlestep(p, regs); + kcb->kprobe_status = KPROBE_REENTER; +} + +static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb) +{ +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) + if (p->ainsn.boostable == 1 && !p->post_handler) { + /* Boost up -- we can execute copied instructions directly */ + reset_current_kprobe(); + regs->ip = (unsigned long)p->ainsn.insn; + preempt_enable_no_resched(); + return; + } +#endif + prepare_singlestep(p, regs); + kcb->kprobe_status = KPROBE_HIT_SS; +} + /* * We have reentered the kprobe_handler(), since another probe was hit while * within the handler. 
We save the original kprobes variables and just single @@ -449,13 +477,9 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb) { - if (kcb->kprobe_status == KPROBE_HIT_SS && - *p->ainsn.insn == BREAKPOINT_INSTRUCTION) { - regs->flags &= ~X86_EFLAGS_TF; - regs->flags |= kcb->kprobe_saved_flags; - return 0; + switch (kcb->kprobe_status) { + case KPROBE_HIT_SSDONE: #ifdef CONFIG_X86_64 - } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) { /* TODO: Provide re-entrancy from post_kprobes_handler() and * avoid exception stack corruption while single-stepping on * the instruction of the new probe. @@ -463,14 +487,26 @@ static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, arch_disarm_kprobe(p); regs->ip = (unsigned long)p->addr; reset_current_kprobe(); - return 1; + preempt_enable_no_resched(); + break; #endif + case KPROBE_HIT_ACTIVE: + recursive_singles
Re: [PATCH] x86: kprobes change kprobe_handler flow
Masami Hiramatsu wrote: As a result, this function just setups re-entrance. As you've also pointed out in your previous reply, this case is peculiar and therefore I believe it should be marked as a BUG(). I've left the original case, if (kcb-kprobe_status==KPROBE_HIT_SS) (*p-ainsn.insn == BREAKPOINT_INSTRUCTION), untouched and is handled as it was before. However, if (kcb-kprobe_status==KPROBE_HIT_SS) !(*p-ainsn.insn == BREAKPOINT_INSTRUCTION), then instead of incrementing nmissed count like before, it should cry out a BUG. This is not an ordinary recursive probe handling case which should update the nmissed count. Hmm, I can not agree, because it is possible to insert a kprobe into kprobe's instruction buffer. If it should be a bug, we must check it when registering the kprobe. (And also, in *p-ainsn.insn == BREAKPOINT_INSTRUCTION case, I doubt that the kernel can handle this orphaned breakpoint, because the breakpoint address has been changed.) Anyway, if you would like to change the logic, please separate it from cleanup patch. Okay. For now, I've restored the original behavior. -out: + if (!ret) + preempt_enable_no_resched(); I think this is a bit ugly... Why would you avoid using mutiple returns in this function? I think you can remove ret from this function. Hmm...there the are no deeply nested if-elses nor overlapping paths which need to be avoided. All the nested checks unroll cleanly too. Oh, I just mentioned about this if (!ret) condition. Could you try to remove this ret variable? I think some ret=1 could be replaced by return 1 easily. Done. You should find the desired changed in this patch. 
Signed-off-by: Abhishek Sagar [EMAIL PROTECTED] Signed-off-by: Quentin Barnes [EMAIL PROTECTED] --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index a72e02b..06f8d08 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -441,6 +441,34 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, /* Replace the return addr with trampoline addr */ *sara = (unsigned long) kretprobe_trampoline; } + +static void __kprobes recursive_singlestep(struct kprobe *p, + struct pt_regs *regs, + struct kprobe_ctlblk *kcb) +{ + save_previous_kprobe(kcb); + set_current_kprobe(p, regs, kcb); + kprobes_inc_nmissed_count(p); + prepare_singlestep(p, regs); + kcb-kprobe_status = KPROBE_REENTER; +} + +static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs, + struct kprobe_ctlblk *kcb) +{ +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) + if (p-ainsn.boostable == 1 !p-post_handler) { + /* Boost up -- we can execute copied instructions directly */ + reset_current_kprobe(); + regs-ip = (unsigned long)p-ainsn.insn; + preempt_enable_no_resched(); + return; + } +#endif + prepare_singlestep(p, regs); + kcb-kprobe_status = KPROBE_HIT_SS; +} + /* * We have reentered the kprobe_handler(), since another probe was hit while * within the handler. 
We save the original kprobes variables and just single @@ -449,13 +477,9 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb) { - if (kcb-kprobe_status == KPROBE_HIT_SS - *p-ainsn.insn == BREAKPOINT_INSTRUCTION) { - regs-flags = ~X86_EFLAGS_TF; - regs-flags |= kcb-kprobe_saved_flags; - return 0; + switch (kcb-kprobe_status) { + case KPROBE_HIT_SSDONE: #ifdef CONFIG_X86_64 - } else if (kcb-kprobe_status == KPROBE_HIT_SSDONE) { /* TODO: Provide re-entrancy from post_kprobes_handler() and * avoid exception stack corruption while single-stepping on * the instruction of the new probe. @@ -463,14 +487,26 @@ static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, arch_disarm_kprobe(p); regs-ip = (unsigned long)p-addr; reset_current_kprobe(); - return 1; + preempt_enable_no_resched(); + break; #endif + case KPROBE_HIT_ACTIVE: + recursive_singlestep(p, regs, kcb); + break; + case KPROBE_HIT_SS: + if (*p-ainsn.insn == BREAKPOINT_INSTRUCTION) { + regs-flags = ~TF_MASK; + regs-flags |= kcb-kprobe_saved_flags; + return 0; + } else { + recursive_singlestep(p, regs, kcb); + } + break; + default: + /* impossible cases
Re: [PATCH] x86: kprobes change kprobe_handler flow
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote: I could understand what the original code did at last. If a kprobe is inserted on a breakpoint which another debugger inserted, it is single-stepped inline instead of out-of-line (this is done in prepare_singlestep). In this case, (p && kprobe_running() && kcb->kprobe_status == KPROBE_HIT_SS) is true and we need to pass control to the debugger. And if (*p->ainsn.insn != BREAKPOINT_INSTRUCTION) (or (p != kprobe_running())) in that case, there may be some bugs. Yes, we can only fault while single-stepping in one unique case, which is when we're single-stepping (in-line) a breakpoint because a probe was installed on it. All other scenarios are a BUG. That's also assuming that no exception will preempt single-stepping whose codepath has a probe on it. Now I think your original suggestion is correct. Please fix it in another patch. Ok. -- Masami Hiramatsu Software Engineer Hitachi Computer Products (America) Inc. Software Solutions Division e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED] Thanks, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote: + case KPROBE_HIT_SS: + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) { + regs->flags &= ~TF_MASK; + regs->flags |= kcb->kprobe_saved_flags; + return 0; + } else { + recursive_singlestep(p, regs, kcb); + } + break; + default: + /* impossible cases */ + WARN_ON(1); WARN_ON() does not call panic(). Since the kernel doesn't stop, we need to prepare singlestep after that. We shouldn't panic in 'default'. First, we'll never hit this case, and if we do then we can be sure that the fault is not due to a probe. So instead of single-stepping we'll let the kernel handle it. If it can't, it'll panic/die() for us. I'll try to clean up this case and incorporate these suggestions in a separate patch, as you suggested. Masami Hiramatsu Software Engineer Hitachi Computer Products (America) Inc. Software Solutions Division e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED] -- Thanks, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
On 1/2/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote: > I think setup_singlestep() in your first patch is better, because it avoided > code duplication(*). Will retain it then. > > static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, > > struct kprobe_ctlblk *kcb) > > { > > - if (kcb->kprobe_status == KPROBE_HIT_SS && > > - *p->ainsn.insn == BREAKPOINT_INSTRUCTION) { > > - regs->flags &= ~X86_EFLAGS_TF; > > - regs->flags |= kcb->kprobe_saved_flags; > > - return 0; > > + int ret = 0; > > + > > + switch (kcb->kprobe_status) { > > + case KPROBE_HIT_SSDONE: > > #ifdef CONFIG_X86_64 > > - } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) { > > - /* TODO: Provide re-entrancy from post_kprobes_handler() and > > - * avoid exception stack corruption while single-stepping on > > + /* TODO: Provide re-entrancy from > > + * post_kprobes_handler() and avoid exception > > + * stack corruption while single-stepping on > > Why would you modify it? Do you mean the comment? I had this code in kprobe_handler earlier and it consistently exceeded 80 columns in my case. Will fix it anyways. > > + ret = 1; > > + break; > > #endif > > + case KPROBE_HIT_ACTIVE: > > + /* a probe has been hit inside a > > + * user handler */ > > + save_previous_kprobe(kcb); > > + set_current_kprobe(p, regs, kcb); > > + kprobes_inc_nmissed_count(p); > > + prepare_singlestep(p, regs); > > + kcb->kprobe_status = KPROBE_REENTER; > > + ret = 1; > > + break; > > + case KPROBE_HIT_SS: > > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) { > > + regs->flags &= ~TF_MASK; > > + regs->flags |= kcb->kprobe_saved_flags; > > + } else { > > + /* BUG? */ > > + } > > + break; > > If my thought is correct, we don't need to use switch-case here, because there are only 2 cases, KPROBE_HIT_SSDONE (x86-64 only) > or others. > As a result, this function just sets up re-entrance. As you've also pointed out in your previous reply, this case is peculiar and therefore I believe it should be marked as a BUG(). 
I've left the original case, (kcb->kprobe_status == KPROBE_HIT_SS) && (*p->ainsn.insn == BREAKPOINT_INSTRUCTION), untouched; it is handled as it was before. However, if (kcb->kprobe_status == KPROBE_HIT_SS) && !(*p->ainsn.insn == BREAKPOINT_INSTRUCTION), then instead of incrementing the nmissed count like before, it should cry out a BUG. This is not an ordinary recursive probe handling case which should update the nmissed count. > > + default: > > + /* impossible cases */ > > + BUG(); > > I think no need to check that here. It may not get hit at runtime but is quite informative. > > static int __kprobes kprobe_handler(struct pt_regs *regs) > > { > > - struct kprobe *p; > > int ret = 0; > > kprobe_opcode_t *addr; > > + struct kprobe *p, *cur; > > struct kprobe_ctlblk *kcb; > > > > addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t)); > > + if (*addr != BREAKPOINT_INSTRUCTION) { > > + /* > > + * The breakpoint instruction was removed right > > + * after we hit it. Another cpu has removed > > + * either a probepoint or a debugger breakpoint > > + * at this address. In either case, no further > > + * handling of this interrupt is appropriate. > > + * Back up over the (now missing) int3 and run > > + * the original instruction. > > + */ > > + regs->ip = (unsigned long)addr; > > + return 1; > > + } > > I think this block change would better be separated from this patch, > because it changes the code flow logically. Agreed. I'll address this in another thread, but for now it is safest to check it for all cases right at the entry of kprobe_handler(). > > > > - /* > > - * We don't want to be preempted for the entire > > - * duration of kprobe processing > > - */ > > preempt_disable(); > > kcb = get_kprobe_ctlblk(); > > - > > + cur = kprobe_running(); > > I think you don't need "cur", because kprobe_running() is called > just once on each path. Will get rid of that. 
> > - regs->ip = (unsigned long)addr; > > + if (!p->pre_handler || !p->pre_handler(p, regs)) { > > + if (setup_boost(p, regs)) { > > + prepare_singlestep(p, regs); > > + kcb->kprobe_status = KPROBE_HIT_SS; > > (*) duplication 1 > > > + } > > + }
Re: [PATCH] x86: kprobes change kprobe_handler flow
On 1/2/08, Harvey Harrison <[EMAIL PROTECTED]> wrote: > > +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) > > +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs > > *regs) > > +{ > > + if (p->ainsn.boostable == 1 && !p->post_handler) { > > + /* Boost up -- we can execute copied instructions directly */ > > + reset_current_kprobe(); > > + regs->ip = (unsigned long)p->ainsn.insn; > > + preempt_enable_no_resched(); > > + return 0; > > + } > > + return 1; > > +} > > +#else > > +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs > > *regs) > > +{ > > + return 1; > > +} > > +#endif > > + > In the kernel __always_inline == inline, also I think it's nicer to only > have one function declaration, and then ifdef the body of the function. > > Something like: > > static inline int setup_boost(struct kprobe *p, struct pt_regs *regs) > { > #if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) > if (p->ainsn.boostable == 1 && !p->post_handler) { > /* Boost up -- we can execute copied instructions directly */ > reset_current_kprobe(); > regs->ip = (unsigned long)p->ainsn.insn; > preempt_enable_no_resched(); > return 0; > } > #endif > return 1; > } Ok...will include this after I pick up some more comments. > > static int __kprobes kprobe_handler(struct pt_regs *regs) > > { > > - struct kprobe *p; > > int ret = 0; > > kprobe_opcode_t *addr; > > + struct kprobe *p, *cur; > > struct kprobe_ctlblk *kcb; > > > > addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t)); > > + if (*addr != BREAKPOINT_INSTRUCTION) { > > + /* > > + * The breakpoint instruction was removed right > > + * after we hit it. Another cpu has removed > > + * either a probepoint or a debugger breakpoint > > + * at this address. In either case, no further > > + * handling of this interrupt is appropriate. > > + * Back up over the (now missing) int3 and run > > + * the original instruction. 
> > + */ > > + regs->ip = (unsigned long)addr; > > + return 1; > > + } > > This return is fine I guess, but after the preempt_disable() I like > the goto approach as it will be easier to see what paths enable > preemption again and which don't... bonus points if we can move this > to the caller or make sure we reenable in all cases before returning > and pull in the code in the caller that does this for us. > > But I guess your approach of using ret to test whether we need to > reenable preemption or not would work as a signal to the caller that > they need to reenable preemption. Hmm...since enabling preemption is tied to 'ret', anyone reading kprobe_handler will have to follow around all calls which modify it. There are some checks in the current kprobe_handler definition made just to do what you're saying, i.e., to push all preemption enables/disables into kprobe_handler. Like this one (from the current x86 kprobe_handler): ret = reenter_kprobe(p, regs, kcb); if (kcb->kprobe_status == KPROBE_REENTER) { ret = 1; goto out; } goto preempt_out; - This is just confusing because we're not actually making any exceptions here for the KPROBE_REENTER case (which has been partially handled in reenter_kprobe), rather just tricking our way out of preemption enabling for a couple of cases in reenter_kprobe. > Cheers, > > Harvey Thanks, Abhishek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
t; > + /* a probe has been hit inside a > > + * user handler */ > > + save_previous_kprobe(kcb); > > + set_current_kprobe(p, regs, kcb); > > + kprobes_inc_nmissed_count(p); > > + prepare_singlestep(p, regs); > > + kcb->kprobe_status = KPROBE_REENTER; > > ret = 1; > > - goto no_kprobe; > > - } > > - p = __get_cpu_var(current_kprobe); > > - if (p->break_handler && p->break_handler(p, regs)) { > > - goto ss_probe; > > + break; > > + case KPROBE_HIT_SS: > > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) > > { > > + regs->eflags &= ~TF_MASK; > > + regs->eflags |= > > + kcb->kprobe_saved_eflags; > > + } else { > > + /* BUG? */ > > + } > > + break; > > + default: > > + /* impossible cases */ > > + BUG(); > > } > > - } Replaced deeply nested if-elses with a switch. Please let me know if there are any changes on which you would like further clarification. > -- > Masami Hiramatsu > > Software Engineer > Hitachi Computer Products (America) Inc. > Software Solutions Division > > e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED] -- Thanks, Abhishek Sagar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] x86: kprobes change kprobe_handler flow
Ingo Molnar wrote: > hm, this patch does not apply to x86.git#mm, due to the fixes, > unifications and cleanups done there. Could you send a patch against -mm > or against x86.git? (see the tree-fetching instructions below) Thanks, > > Ingo > > --{ x86.git instructions }--> > > git-clone > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git > linux-2.6.git > cd linux-2.6.git > git-branch x86 > git-checkout x86 > git-pull git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git > mm > > (do subsequent pulls via "git-pull --force", as we frequently rebase the > git tree. NOTE: this might override your own local changes, so do this > only if you don't mind losing these changes in that tree.) > Thanks for pointing me to the right tree. I have made some code re-arrangements in this one. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]> --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index a72e02b..45adc8e 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -441,6 +441,26 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, /* Replace the return addr with trampoline addr */ *sara = (unsigned long) &kretprobe_trampoline; } + +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs *regs) +{ + if (p->ainsn.boostable == 1 && !p->post_handler) { + /* Boost up -- we can execute copied instructions directly */ + reset_current_kprobe(); + regs->ip = (unsigned long)p->ainsn.insn; + preempt_enable_no_resched(); + return 0; + } + return 1; +} +#else +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs *regs) +{ + return 1; +} +#endif + /* * We have reentered the kprobe_handler(), since another probe was hit while * within the handler. 
We save the original kprobes variables and just single @@ -449,29 +469,47 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb) { - if (kcb->kprobe_status == KPROBE_HIT_SS && - *p->ainsn.insn == BREAKPOINT_INSTRUCTION) { - regs->flags &= ~X86_EFLAGS_TF; - regs->flags |= kcb->kprobe_saved_flags; - return 0; + int ret = 0; + + switch (kcb->kprobe_status) { + case KPROBE_HIT_SSDONE: #ifdef CONFIG_X86_64 - } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) { - /* TODO: Provide re-entrancy from post_kprobes_handler() and -* avoid exception stack corruption while single-stepping on + /* TODO: Provide re-entrancy from +* post_kprobes_handler() and avoid exception +* stack corruption while single-stepping on * the instruction of the new probe. */ arch_disarm_kprobe(p); regs->ip = (unsigned long)p->addr; reset_current_kprobe(); - return 1; + preempt_enable_no_resched(); + ret = 1; + break; #endif + case KPROBE_HIT_ACTIVE: + /* a probe has been hit inside a +* user handler */ + save_previous_kprobe(kcb); + set_current_kprobe(p, regs, kcb); + kprobes_inc_nmissed_count(p); + prepare_singlestep(p, regs); + kcb->kprobe_status = KPROBE_REENTER; + ret = 1; + break; + case KPROBE_HIT_SS: + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) { + regs->flags &= ~TF_MASK; + regs->flags |= kcb->kprobe_saved_flags; + } else { + /* BUG? 
*/ + } + break; + default: + /* impossible cases */ + BUG(); } - save_previous_kprobe(kcb); - set_current_kprobe(p, regs, kcb); - kprobes_inc_nmissed_count(p); - prepare_singlestep(p, regs); - kcb->kprobe_status = KPROBE_REENTER; - return 1; + + return ret; } /* @@ -480,82 +518,67 @@ static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs, */ static int __kprobes kprobe_handler(struct pt_regs *regs) { - struct kprobe *p; int ret = 0; kprobe_opcode_t *addr; + struct kprobe *p, *cur; struct kprobe_ctlblk *kcb; addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t)); +
Re: [PATCH] x86: kprobes change kprobe_handler flow
Harvey Harrison wrote: > Make the control flow of kprobe_handler more obvious. > > Collapse the separate if blocks/gotos with if/else blocks > this unifies the duplication of the check for a breakpoint > instruction race with another cpu. This is a patch derived from kprobe_handler of the ARM kprobes port. This further simplifies the current x86 kprobe_handler. The resulting definition is smaller, more readable, has no goto's and contains only a single call to get_kprobe. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]> --- diff --git a/arch/x86/kernel/kprobes_32.c b/arch/x86/kernel/kprobes_32.c index 3a020f7..5a626fd 100644 --- a/arch/x86/kernel/kprobes_32.c +++ b/arch/x86/kernel/kprobes_32.c @@ -240,108 +240,110 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, *sara = (unsigned long) &kretprobe_trampoline; } +static __always_inline void setup_singlestep(struct kprobe *p, +struct pt_regs *regs, +struct kprobe_ctlblk *kcb) +{ +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) + if (p->ainsn.boostable == 1 && !p->post_handler) { + /* Boost up -- we can execute copied instructions directly */ + reset_current_kprobe(); + regs->eip = (unsigned long)p->ainsn.insn; + preempt_enable_no_resched(); + } else { + prepare_singlestep(p, regs); + kcb->kprobe_status = KPROBE_HIT_SS; + } +#else + prepare_singlestep(p, regs); + kcb->kprobe_status = KPROBE_HIT_SS; +#endif +} + /* * Interrupts are disabled on entry as trap3 is an interrupt gate and they * remain disabled thorough out this function. */ static int __kprobes kprobe_handler(struct pt_regs *regs) { - struct kprobe *p; int ret = 0; kprobe_opcode_t *addr; + struct kprobe *p, *cur; struct kprobe_ctlblk *kcb; addr = (kprobe_opcode_t *)(regs->eip - sizeof(kprobe_opcode_t)); + if (*addr != BREAKPOINT_INSTRUCTION) { + /* +* The breakpoint instruction was removed right +* after we hit it. 
Another cpu has removed +* either a probepoint or a debugger breakpoint +* at this address. In either case, no further +* handling of this interrupt is appropriate. +* Back up over the (now missing) int3 and run +* the original instruction. +*/ + regs->eip -= sizeof(kprobe_opcode_t); + return 1; + } - /* -* We don't want to be preempted for the entire -* duration of kprobe processing -*/ preempt_disable(); kcb = get_kprobe_ctlblk(); + cur = kprobe_running(); + p = get_kprobe(addr); - /* Check we're not actually recursing */ - if (kprobe_running()) { - p = get_kprobe(addr); - if (p) { - if (kcb->kprobe_status == KPROBE_HIT_SS && - *p->ainsn.insn == BREAKPOINT_INSTRUCTION) { - regs->eflags &= ~TF_MASK; - regs->eflags |= kcb->kprobe_saved_eflags; - goto no_kprobe; - } - /* We have reentered the kprobe_handler(), since -* another probe was hit while within the handler. -* We here save the original kprobes variables and -* just single step on the instruction of the new probe -* without calling any user handlers. -*/ - save_previous_kprobe(kcb); - set_current_kprobe(p, regs, kcb); - kprobes_inc_nmissed_count(p); - prepare_singlestep(p, regs); - kcb->kprobe_status = KPROBE_REENTER; - return 1; - } else { - if (*addr != BREAKPOINT_INSTRUCTION) { - /* The breakpoint instruction was removed by -* another cpu right after we hit, no further -* handling of this interrupt is appropriate -*/ - regs->eip -= sizeof(kprobe_opcode_t); + if (p) { + if (cur) { + switch (kcb->kprobe_status) { + case KPROBE_HIT_ACTIVE: + case KPROBE_HIT_SSDONE: + /* a probe has been hit inside a +* user handler */ +
Re: [PATCH] x86: kprobes change kprobe_handler flow
Harvey Harrison wrote: Make the control flow of kprobe_handler more obvious. Collapse the separate if blocks/gotos with if/else blocks this unifies the duplication of the check for a breakpoint instruction race with another cpu. This is a patch derived from kprobe_handler of the ARM kprobes port. This further simplifies the current x86 kprobe_handler. The resulting definition is smaller, more readable, has no goto's and contains only a single call to get_kprobe. Signed-off-by: Abhishek Sagar [EMAIL PROTECTED] Signed-off-by: Quentin Barnes [EMAIL PROTECTED] --- diff --git a/arch/x86/kernel/kprobes_32.c b/arch/x86/kernel/kprobes_32.c index 3a020f7..5a626fd 100644 --- a/arch/x86/kernel/kprobes_32.c +++ b/arch/x86/kernel/kprobes_32.c @@ -240,108 +240,110 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, *sara = (unsigned long) kretprobe_trampoline; } +static __always_inline void setup_singlestep(struct kprobe *p, +struct pt_regs *regs, +struct kprobe_ctlblk *kcb) +{ +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM) + if (p-ainsn.boostable == 1 !p-post_handler) { + /* Boost up -- we can execute copied instructions directly */ + reset_current_kprobe(); + regs-eip = (unsigned long)p-ainsn.insn; + preempt_enable_no_resched(); + } else { + prepare_singlestep(p, regs); + kcb-kprobe_status = KPROBE_HIT_SS; + } +#else + prepare_singlestep(p, regs); + kcb-kprobe_status = KPROBE_HIT_SS; +#endif +} + /* * Interrupts are disabled on entry as trap3 is an interrupt gate and they * remain disabled thorough out this function. */ static int __kprobes kprobe_handler(struct pt_regs *regs) { - struct kprobe *p; int ret = 0; kprobe_opcode_t *addr; + struct kprobe *p, *cur; struct kprobe_ctlblk *kcb; addr = (kprobe_opcode_t *)(regs-eip - sizeof(kprobe_opcode_t)); + if (*addr != BREAKPOINT_INSTRUCTION) { + /* +* The breakpoint instruction was removed right +* after we hit it. Another cpu has removed +* either a probepoint or a debugger breakpoint +* at this address. 
In either case, no further +* handling of this interrupt is appropriate. +* Back up over the (now missing) int3 and run +* the original instruction. +*/ + regs->eip -= sizeof(kprobe_opcode_t); + return 1; + } - /* -* We don't want to be preempted for the entire -* duration of kprobe processing -*/ preempt_disable(); kcb = get_kprobe_ctlblk(); + cur = kprobe_running(); + p = get_kprobe(addr); - /* Check we're not actually recursing */ - if (kprobe_running()) { - p = get_kprobe(addr); - if (p) { - if (kcb->kprobe_status == KPROBE_HIT_SS && - *p->ainsn.insn == BREAKPOINT_INSTRUCTION) { - regs->eflags &= ~TF_MASK; - regs->eflags |= kcb->kprobe_saved_eflags; - goto no_kprobe; - } - /* We have reentered the kprobe_handler(), since -* another probe was hit while within the handler. -* We here save the original kprobes variables and -* just single step on the instruction of the new probe -* without calling any user handlers. -*/ - save_previous_kprobe(kcb); - set_current_kprobe(p, regs, kcb); - kprobes_inc_nmissed_count(p); - prepare_singlestep(p, regs); - kcb->kprobe_status = KPROBE_REENTER; - return 1; - } else { - if (*addr != BREAKPOINT_INSTRUCTION) { - /* The breakpoint instruction was removed by -* another cpu right after we hit, no further -* handling of this interrupt is appropriate -*/ - regs->eip -= sizeof(kprobe_opcode_t); + if (p) { + if (cur) { + switch (kcb->kprobe_status) { + case KPROBE_HIT_ACTIVE: + case KPROBE_HIT_SSDONE: + /* a probe has been hit inside a +* user handler */ + save_previous_kprobe(kcb); + set_current_kprobe(p, regs, kcb); + kprobes_inc_nmissed_count(p
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On 11/21/07, Jim Keniston <[EMAIL PROTECTED]> wrote: > I have one minor suggestion on the code -- see below -- but I'm willing > to ack that with or without the suggested change. Please also see > suggestions on kprobes.txt and the demo program. Thanks...I've made the necessary changes. More comments below. > It would be more consistent with the existing text in kprobes.txt to add > a subsection labeled "User's entry handler (rp->entry_handler):" and > document the entry_handler there. I've moved almost all of the entry_handler discussion in a separate section. > > static struct kretprobe my_kretprobe = { > > - .handler = ret_handler, > > - /* Probe up to 20 instances concurrently. */ > > - .maxactive = 20 > > + .handler = return_handler, > > + .entry_handler = entry_handler, > > + .data_size = sizeof(struct my_data), > > + .maxactive = 1, /* profile one invocation at a time */ > > I don't like the idea of setting maxactive = 1 here. That's not normal > kretprobes usage, which is what we're trying to illustrate here. This > is no place for splitting hairs about profiling recursive functions. I've reverted back to ".maxactive = 20". In any case, there is no need to protect against recursive instances of sys_open. > > @@ -699,6 +699,14 @@ static int __kprobes pre_handler_kretpro > >struct kretprobe_instance, uflist); > > ri->rp = rp; > > ri->task = current; > > + > > + if (rp->entry_handler) { > > + if (rp->entry_handler(ri, regs)) { > > Could also be >if (rp->entry_handler && rp->entry_handler(ri, regs)) { Done. 
--- Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> diff -upNr linux-2.6.24-rc3/Documentation/kprobes.txt linux-2.6.24-rc3_kp/Documentation/kprobes.txt --- linux-2.6.24-rc3/Documentation/kprobes.txt 2007-11-17 10:46:36.0 +0530 +++ linux-2.6.24-rc3_kp/Documentation/kprobes.txt 2007-11-21 15:20:53.0 +0530 @@ -96,7 +96,9 @@ or in registers (e.g., for x86_64 or for The jprobe will work in either case, so long as the handler's prototype matches that of the probed function. -1.3 How Does a Return Probe Work? +1.3 Return Probes + +1.3.1 How Does a Return Probe Work? When you call register_kretprobe(), Kprobes establishes a kprobe at the entry to the function. When the probed function is called and this @@ -107,7 +109,7 @@ At boot time, Kprobes registers a kprobe When the probed function executes its return instruction, control passes to the trampoline and that probe is hit. Kprobes' trampoline -handler calls the user-specified handler associated with the kretprobe, +handler calls the user-specified return handler associated with the kretprobe, then sets the saved instruction pointer to the saved return address, and that's where execution resumes upon return from the trap. @@ -131,6 +133,30 @@ zero when the return probe is registered time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe. +1.3.2 Kretprobe entry-handler + +Kretprobes also provides an optional user-specified handler which runs +on function entry. This handler is specified by setting the entry_handler +field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the +function entry is hit, the user-defined entry_handler, if any, is invoked. +If the entry_handler returns 0 (success) then a corresponding return handler +is guaranteed to be called upon function return. 
If the entry_handler +returns a non-zero error then Kprobes leaves the return address as is, and +the kretprobe has no further effect for that particular function instance. + +Multiple entry and return handler invocations are matched using the unique +kretprobe_instance object associated with them. Additionally, a user +may also specify per return-instance private data to be part of each +kretprobe_instance object. This is especially useful when sharing private +data between corresponding user entry and return handlers. The size of each +private data object can be specified at kretprobe registration time by +setting the data_size field of the kretprobe struct. This data can be +accessed through the data field of each kretprobe_instance object. + +In case the probed function is entered but there is no kretprobe_instance +object available, then in addition to incrementing the nmissed count, +the user entry_handler invocation is also skipped. + 2. Architectures Supported Kprobes, jprobes, and return probes are implemented on the following @@ -273,6 +299,8 @@ of interest: - ret_addr: the return address - rp: points to the corresponding kretprobe object - task: points to the corresponding task struct +- data: points to per return-instance private data; see "Kretprobe entry- + handler" for details. The regs_return_value(regs) macro provides a simple abstraction to extract the return value from the appropriate register as defined by @@ -555,23 +583,46 @@ report failed calls
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 17, 2007 6:24 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > > True, some kind of data pointer/pouch is essential. > > Yes. If the pouch idea is too weird, then the data pointer is a good > compromise. > > With the above reservations, your enclosed patch looks OK. > > You should provide a patch #2 that updates Documentation/kprobes.txt. > Maybe that will yield a little more review from other folks. The inlined patch provides support for an optional user entry-handler in kretprobes. It also adds provision for private data to be held in each return instance based on Kevin Stafford's "data pouch" approach. Kretprobe usage example in Documentation/kprobes.txt has also been updated to demonstrate the usage of entry-handlers. Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> --- diff -upNr linux-2.6.24-rc2/Documentation/kprobes.txt linux-2.6.24-rc2_kp/Documentation/kprobes.txt --- linux-2.6.24-rc2/Documentation/kprobes.txt 2007-11-07 03:27:46.0 +0530 +++ linux-2.6.24-rc2_kp/Documentation/kprobes.txt 2007-11-19 17:41:27.0 +0530 @@ -100,16 +100,21 @@ prototype matches that of the probed fun When you call register_kretprobe(), Kprobes establishes a kprobe at the entry to the function. When the probed function is called and this -probe is hit, Kprobes saves a copy of the return address, and replaces -the return address with the address of a "trampoline." The trampoline -is an arbitrary piece of code -- typically just a nop instruction. -At boot time, Kprobes registers a kprobe at the trampoline. - -When the probed function executes its return instruction, control -passes to the trampoline and that probe is hit. Kprobes' trampoline -handler calls the user-specified handler associated with the kretprobe, -then sets the saved instruction pointer to the saved return address, -and that's where execution resumes upon return from the trap. +probe is hit, the user defined entry_handler is invoked (optional). 
If +the entry_handler returns 0 (success) or is not present, then Kprobes +saves a copy of the return address, and replaces the return address +with the address of a "trampoline." If the entry_handler returns a +non-zero error, the function executes as normal, as if no probe was +installed on it. The trampoline is an arbitrary piece of code -- +typically just a nop instruction. At boot time, Kprobes registers a +kprobe at the trampoline. + +After the trampoline return address is planted, when the probed function +executes its return instruction, control passes to the trampoline and +that probe is hit. Kprobes' trampoline handler calls the user-specified +return handler associated with the kretprobe, then sets the saved +instruction pointer to the saved return address, and that's where +execution resumes upon return from the trap. While the probed function is executing, its return address is stored in an object of type kretprobe_instance. Before calling @@ -117,6 +122,9 @@ register_kretprobe(), the user sets the kretprobe struct to specify how many instances of the specified function can be probed simultaneously. register_kretprobe() pre-allocates the indicated number of kretprobe_instance objects. +Additionally, a user may also specify per-instance private data to +be part of each return instance. This is useful when using kretprobes +with a user entry_handler (see "register_kretprobe" for details). For example, if the function is non-recursive and is called with a spinlock held, maxactive = 1 should be enough. If the function is @@ -129,7 +137,8 @@ It's not a disaster if you set maxactive some probes. In the kretprobe struct, the nmissed field is set to zero when the return probe is registered, and is incremented every time the probed function is entered but there is no kretprobe_instance -object available for establishing the return probe. +object available for establishing the return probe. A miss also prevents +user entry_handler from being invoked. 2. 
Architectures Supported @@ -258,6 +267,16 @@ Establishes a return probe for the funct rp->kp.addr. When that function returns, Kprobes calls rp->handler. You must set rp->maxactive appropriately before you call register_kretprobe(); see "How Does a Return Probe Work?" for details. +An optional entry-handler can also be specified by initializing +rp->entry_handler. This handler is called at the beginning of the +probed function (except for instances exceeding rp->maxactive). On +success the entry_handler returns 0, which guarantees invocation of +a corresponding return handler. Corresponding entry and return handler +invocations can be matched using the return instance (ri) passed to them. +Also, each ri can hold per-instance private data (ri->data), whose size +is determined by rp->data_size. If the entry_handler returns a non-zero +error, the current return instance is skipped and no return handler is +called for the current instance. register_kretprobe() returns 0 on success, or a negative errno otherwise. @@ -555,23 +574,46
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 17, 2007 6:24 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > It'd be helpful to see others (especially kprobes maintainers) chime in > on this. In particular, if doing kmalloc/kfree of GFP_ATOMIC data at > kretprobe-hit time is OK, as in Abhishek's approach, then we could also > use GFP_ATOMIC (or at least GFP_NOWAIT) allocations to make up the > difference when we run low on kretprobe_instances. It might cause a problem with return instances having a large value of entry_info_sz, being allocated in the page frame reclamation code path. > > > entry_info is, by default, a zero-length array, which adds nothing to > > > the size of a uretprobe_instance -- at least on the 3 architectures I've > > > tested on (i386, x86_64, powerpc). > > > > Strange, because from what I could gather, the data pouch patch had > > the following in the kretprobe registration routine: > > > > > > for (i = 0; i < rp->maxactive; i++) { > > - inst = kmalloc(sizeof(struct kretprobe_instance), GFP_KERNEL); > > + inst = kmalloc((sizeof(struct kretprobe_instance) > > + + rp->entry_info_sz), GFP_KERNEL); > > > > > > rp->entry_info_sz is presumably the size of the private data structure > > of the registering module. > > ... which is zero for kretprobes that don't use the data pouch. > > > This is the bloat I was referring to. But > > this difference should've showed up in your tests...? > > What bloat? On my 32-bit system, the pouch to hold struct prof_data in > your test_module example would be 20 bytes. (For comparison, > sizeof(struct kretprobe_instance) = 28, btw.) Except for functions like > schedule(), where a lot of tasks can be sleeping at the same time, an > rp->maxactive value of 5 or 10 is typically plenty. That's 100-200 > bytes of "bloat" spent at registration time (GFP_KERNEL), at least some > of which will be saved at probe-hit time (GFP_ATOMIC). 
(And if somebody > says, "I always use a much higher value of rp->maxactive," then he/she's > probably not really worried about bloat.) Ok. Will make the necessary transition to registration time allocation of private data. > Yes. If the pouch idea is too weird, then the data pointer is a good > compromise. > > With the above reservations, your enclosed patch looks OK. > > You should provide a patch #2 that updates Documentation/kprobes.txt. > Maybe that will yield a little more review from other folks. Will incorporate changes to kprobes.txt as well. - Abhishek - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 17, 2007 4:39 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > First of all, as promised, here's what would be different if it were > implemented using the data-pouch approach: > > --- abhishek1.c 2007-11-16 13:57:13.0 -0800 > +++ jim1.c 2007-11-16 14:20:39.0 -0800 > @@ -50,15 +50,12 @@ > if (stats) > return 1; /* recursive/nested call */ > > - stats = kmalloc(sizeof(struct prof_data), GFP_ATOMIC); > - if (!stats) > - return 1; > + stats = (struct prof_data *) ri->entry_info; > > stats->entry_stamp = sched_clock(); > stats->task = current; > INIT_LIST_HEAD(>list); > list_add(>list, _nodes); > - ri->data = stats; > return 0; > } > > @@ -66,10 +63,9 @@ > static int return_handler(struct kretprobe_instance *ri, struct pt_regs > *regs) > { > unsigned long flags; > - struct prof_data *stats = (struct prof_data *)ri->data; > + struct prof_data *stats = (struct prof_data *)ri->entry_info; > u64 elapsed; > > - BUG_ON(ri->data == NULL); > elapsed = (long long)sched_clock() - (long long)stats->entry_stamp; > > /* update stats */ > @@ -79,13 +75,13 @@ > spin_unlock_irqrestore(_lock, flags); > > list_del(>list); > - kfree(stats); > return 0; > } > > static struct kretprobe my_kretprobe = { > .handler = return_handler, > .entry_handler = entry_handler, > + .entry_info_sz = sizeof(struct prof_data) > }; > > /* called after every PRINT_DELAY seconds */ > > So the data-pouch approach saves you a little code and a kmalloc/kfree > round trip on each kretprobe hit. A kmalloc/kfree round trip is about > 80 ns on my system, or about 20% of the base cost of a kretprobe hit. I > don't know how important that is to people. > > I also note that this particular example maintains a per-task list of > prof_data objects to avoid overcounting the time spent in a recursive > function. That adds about 30% to the size of your module source (136 > lines vs. 106, by my count). I suspect that many instrumentation > modules wouldn't need such a list. 
However, without your ri->data > pointer (or Kevin's ri->entry_info pouch), every instrumentation module > that uses your enhancement would need such a list in order to map the ri > to the per-instance data. Those are interesting numbers. Will incorporate pouching in the next patch. Even with a data pointer or pouch, the mapping of ri (or ri->data) would sometimes be necessary. It's required to catch recursive/nested invocation cases. In the case of the time measurement test module, these invocations needed to be weeded out and therefore such a list was required. Other scenarios might not care for it. E.g., a module which measures the change in some global system state across every call. Thanks for the comments. > Jim - Abhishek - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 16, 2007 5:37 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > On Thu, 2007-11-15 at 20:30 +0530, Abhishek Sagar wrote: > > On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > > > 2. Simplify the task of correlating data (e.g., timestamps) between > > > function entry and function return. > > > > Would adding of data and len fields in ri help? Instead of "pouching" > > data in one go at registration time, this would let user handlers do > > the allocation > > Yes and no. Adding just a data field -- void*, or maybe unsigned long > long so it's big enough to accommodate big timestamps -- would be a big > improvement on your current proposal. That would save the user the > drudgery of mapping the ri pointer to his/her per-instance data. > There's plenty of precedent for passing "private_data" values to > callbacks. > > I don't think a len field would help much. If such info were needed, it > could be stored in the data structure pointed to by the data field. > > I still don't think "letting [i.e., requiring that] user handlers do the > allocation" is a win. I'm still interested to see how this plays out in > real examples. > > > and allow them to use different kinds of data > > structures per-instance. > > I haven't been able to think of any scenarios where this would be > useful. A "data pouch" could always contain a union, FWIW. I'm inlining a sample module which uses a data pointer in ri. I didn't go for a timestamp because it's not reliable. Some platforms might simply not have any h/w timestamp counters. For the same reason a lot of platforms (on ARM, say) have their sched_clock() mapped on jiffies. This may prevent timestamps from being distinct across function entry and exit. Plus a data pointer looks pretty harmless.

--- test module ---

#include <linux/kernel.h>
#include <linux/version.h>
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/ktime.h>

#define PRINT_DELAY (10 * HZ)

/* This module calculates the total time and instances of func being called
 * across all cpu's. 
An average is calculated every 10 seconds and displayed.
 * Only one function instance per-task is monitored. This cuts out the
 * possibility of measuring time for recursive and nested function
 * invocations.
 *
 * Note: If compiling as a standalone module, make sure sched_clock() is
 * exported in the kernel.
 */

/* per-task data */
struct prof_data {
	struct task_struct *task;
	struct list_head list;
	unsigned long long entry_stamp;
};

static const char *func = "sys_open";
static spinlock_t time_lock;
static ktime_t total_time;
static unsigned long hits;
static LIST_HEAD(data_nodes); /* list of per-task data */
static struct delayed_work work_print;

static struct prof_data *get_per_task_data(struct task_struct *tsk)
{
	struct prof_data *p;

	/* lookup prof_data corresponding to tsk */
	list_for_each_entry(p, &data_nodes, list) {
		if (p->task == tsk)
			return p;
	}
	return NULL;
}

/* called with kretprobe_lock held */
static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	struct prof_data *stats;

	stats = get_per_task_data(current);
	if (stats)
		return 1; /* recursive/nested call */
	stats = kmalloc(sizeof(struct prof_data), GFP_ATOMIC);
	if (!stats)
		return 1;
	stats->entry_stamp = sched_clock();
	stats->task = current;
	INIT_LIST_HEAD(&stats->list);
	list_add(&stats->list, &data_nodes);
	ri->data = stats;
	return 0;
}

/* called with kretprobe_lock held */
static int return_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	unsigned long flags;
	struct prof_data *stats = (struct prof_data *)ri->data;
	u64 elapsed;

	BUG_ON(ri->data == NULL);
	elapsed = (long long)sched_clock() - (long long)stats->entry_stamp;
	/* update stats */
	spin_lock_irqsave(&time_lock, flags);
	++hits;
	total_time = ktime_add_ns(total_time, elapsed);
	spin_unlock_irqrestore(&time_lock, flags);
	list_del(&stats->list);
	kfree(stats);
	return 0;
}

static struct kretprobe my_kretprobe = {
	.handler = return_handler,
	.entry_handler = entry_handler,
};

/* called after every PRINT_DELAY seconds */
static void print_time(struct work_struct 
*work)
{
	unsigned long flags;
	s64 time_ns;
	struct timespec ts;

	BUG_ON(work != &work_print.work);
	spin_lock_irqsave(&time_lock, flags);
	time_ns = ktime_to_ns(total_time);
	do_div(time_ns, hits);
	spin_unlock_irqrestore(&time_lock, flags);
	ts = ns_to_timespec(time_ns);
	printk(KERN_DEBUG "Avg. running time of %s = %ld sec, %ld nsec\n",
	       func, ts.tv_sec, ts.tv_nsec);
	schedule_delayed_work(&work_print, PRINT_DELAY);
}

static int __init test_module_init(void)
{
	int ret;

	my_kretprobe.kp.symbol_name
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
> > No, a copy is required because if the entry_handler() returns error (a > > voluntary miss), then there has to be a way to roll-back all the > > changes that arch_prepare_kretprobe() and entry_handler() made to > > *regs. Such an instance is considered "aborted". > > No. Handlers shouldn't be writing to the pt_regs struct unless they > want to change the operation of the probed code. If an entry handler > scribbles on *regs and then decides to "abort", it's up to that handler > to restore the contents of *regs. It's that way with all kprobe and > kretprobe handlers, and the proposed entry handler is no different, as > far as I can see. I had the copy mainly to undo the effect of arch_prepare_kretprobe() on regs if entry_handler decided to abort. But I think it didn't serve that purpose as well since arch_prepare_kretprobe() ends up modifying the stack on x86. So I've gotten rid of the copy now. The entry_handler now gets called before arch_prepare_kretprobe() and therefore shouldn't expect to use ri->ret_addr. In effect the entry_handler is an early decision maker on whether the return instance setup should even take place for current function hit or not. > > Wouldn't the return addresses be different for recursive/nested calls? > > Not for recursive calls. Think about it. Or write a little program > that calls a recursive function like factorial(), and have factorial() > report the value of __builtin_return_address(0) each time it's called. Oh yea got it...I was stuck with the nested call case and forgot about direct recursion :) > > > > The fact that ri is passed to both handlers should allow any user > > > > handler to identify each of these cases and take appropriate > > > > synchronization action pertaining to its private data, if needed. > > > > > > I don't think Abhishek has made his case here. See below. > > > > > > > > > > > > (Hence I feel sol a) would be nice). 
> > > > > > > > With an entry-handler, any module aiming to profile running time of a > > > > function (say) can simply do the following without being "return > > > > instance" conscious. Note however that I'm not trying to address just > > > > this scenario but trying to provide a general way to use > > > > entry-handlers in kretprobes: > > > > > > > > static unsigned long flag = 0; /* use bit 0 as a global flag */ > > > > unsigned long long entry, exit; > > > > > > > > int my_entry_handler(struct kretprobe_instance *ri, struct pt_regs > > > > *regs) > > > > { > > > > if (!test_and_set_bit(0, &flag)) { > > > > /* this instance claims the first entry to kretprobe'd function */ > > > > entry = sched_clock(); > > > > /* do other stuff */ > > > > return 0; /* right on! */ > > > > } > > > > return 1; /* error: no return instance to be allocated for this > > > > function entry */ > > > > } > > > > > > > > /* will only be called iff flag == 1 */ > > > > int my_return_handler(struct kretprobe_instance *ri, struct pt_regs > > > > *regs) > > > > { > > > > BUG_ON(!flag); > > > > exit = sched_clock(); > > > > set_bit(0, &flag); > > > > } > > > > > > > > I think something like this should do the trick for you. > > > > > > Since flag is static, it seems to me that if there were instances of the > > > probed function active concurrently in multiple tasks, only the > > > first-called instance would be profiled. > > > > Oh that's right, or we could use a per-cpu flag which would restrict > > us to only one profiling instance per processor. > If the probed function can sleep, then it could return on a different > CPU; a per-cpu flag wouldn't work in such cases. Hmmm...since disabling preemption from entry_handler won't work, a more practical approach would be to associate all return instances to tasks. I'll try to come up with an example to exploit the per-task return instance associativity. 
> > > > > Jim Keniston > > > > -- > > Thanks & Regards > > Abhishek Sagar > > > > --- > > Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]> > > > > diff -upNr linux-2.6.24-rc2/include/linux/kprobes.h > > linux-2.6.24-rc2_kp/include/linux/kprobes.h > > --- linux-2.6.24-rc2/include/linux/kprobes.h 2007-11-07 03:27:46.0 > > +0530 > > +++ linux-2.6.24-rc2_kp/include/linux/kprobes.h 2007-1
@@ -152,6 +152,7 @@ static inline int arch_trampoline_kprobe
 struct kretprobe {
 	struct kprobe kp;
 	kretprobe_handler_t handler;
+	kretprobe_handler_t entry_handler;
 	int maxactive;
 	int nmissed;
 	struct hlist_head free_instances;
diff -upNr linux-2.6.24-rc2/kernel/kprobes.c linux-2.6.24-rc2_kp/kernel/kprobes.c
--- linux-2.6.24-rc2/kernel/kprobes.c 2007-11-07 03:27:46.0 +0530
+++ linux-2.6.24-rc2_kp/kernel/kprobes.c 2007-11-15 16:00:57.0 +0530
@@ -694,12 +694,24 @@ static int __kprobes pre_handler_kretpro
 	spin_lock_irqsave(&kretprobe_lock, flags);
 	if (!hlist_empty(&rp->free_instances)) {
 		struct kretprobe_instance *ri;
+		struct pt_regs copy;

NAK. Saving and restoring regs is expensive and inconsistent with existing kprobes usage. Revising the patch with the suggested change:

---
Signed-off-by: Abhishek Sagar [EMAIL PROTECTED]

diff -upNr linux-2.6.24-rc2/include/linux/kprobes.h linux-2.6.24-rc2_kp/include/linux/kprobes.h
--- linux-2.6.24-rc2/include/linux/kprobes.h 2007-11-07 03:27:46.0 +0530
+++ linux-2.6.24-rc2_kp/include/linux/kprobes.h 2007-11-16 22:50:24.0 +0530
@@ -152,6 +152,7 @@ static inline int arch_trampoline_kprobe
 struct kretprobe {
 	struct kprobe kp;
 	kretprobe_handler_t handler;
+	kretprobe_handler_t entry_handler;
 	int maxactive;
 	int
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > 2. Simplify the task of correlating data (e.g., timestamps) between > function entry and function return. Would adding of data and len fields in ri help? Instead of "pouching" data in one go at registration time, this would let user handlers do the allocation and allow them to use different kinds of data structures per-instance. - Abhishek Sagar - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote: > On Wed, 2007-11-14 at 19:00 +0530, Abhishek Sagar wrote: > > First of all, some general comments. We seem to be trying to solve two > problems here: > 1. Prevent the asymmetry in entry- vs. return-handler calls that can > develop when we temporarily run out of kretprobe_instances. E.g., if we > have m kretprobe "misses", we may report n calls but only (n-m) returns. That has already been taken care of. The entry-handler is called iff 'hlist_empty(&rp->free_instances)' is false. Additionally, if entry_handler() returns an error then the corresponding return handler is also not called because that particular "return instance" is aborted/voluntarily-missed. Hence the following guarantee is implied: No. of return handler calls = No. of entry_handler calls which returned 0 (success). The number of failed entry_handler calls doesn't update any kind of ri->voluntary_nmissed count since the user handlers are internally aware of them (unlike ri->nmissed). > 2. Simplify the task of correlating data (e.g., timestamps) between > function entry and function return. Right. Srinivasa and you have been hinting at the use of per-instance private data for the same. I think ri should be enough. > Problem #1 wouldn't exist if we could solve problem #1a: > 1a. Ensure that we never run out of kretprobe_instances (for some > appropriate value of "never"). > > I've thought of various approaches to #1a -- e.g., allocate > kretprobe_instances from GFP_ATOMIC memory when we're out of > preallocated instances -- but I've never found time to pursue them. > This might be a good time to brainstorm solutions to that problem. > > Lacking a solution to #1a, I think Abhishek's approach provides a > reasonable solution to problem #1. If you're not convinced that problem #1 isn't appropriately handled, we should look for something like that I guess. > I don't think it takes us very far toward solving #2, though. 
I agree > with Srinivasa that it would be more helpful if we could store the data > collected at function-entry time right in the kretprobe_instance. Kevin > Stafford prototyped this "data pouch" idea a couple of years ago -- > http://sourceware.org/ml/systemtap/2005-q3/msg00593.html > -- but an analogous feature was implemented at a higher level in > SystemTap. Either approach -- in kprobes or SystemTap -- would benefit > from a fix to #1 (or #1a). There are three problems in the "data pouch" approach, which is a generalized case of Srinivasa's timestamp case: 1. It bloats the size of each return instance to a run-time defined value. I'm certain that quite a few memory starved ARM kprobes users would certainly be wishing they could install more probes on their system without taking away too much from the precious memory pools which can impact their system performance. This is not a deal breaker though, just an annoyance. 2. Forces user entry/return handlers to use ri->entry_info (see Kevin's patch) even if their design only wanted such private data to be used in certain instances. Therefore ideally, any per-instance data allocation should be left to user entry handlers, IMO. Even if we allow a pouch for private data in a return instance, the user handlers would still need be aware of "return instances" to actually use them globally. 3. It's redundant. 'ri' can uniquely identify any entry-return handler pair. This itself solves your problem #2. It only moves the onus of private data allocation to user handlers. > Review of Abhishek's patch: > > I see no reason to save a copy of *regs and pass that to the entry > handler. Passing the real regs pointer is good enough for other kprobe > handlers. No, a copy is required because if the entry_handler() returns error (a voluntary miss), then there has to be a way to roll-back all the changes that arch_prepare_kretprobe() and entry_handler() made to *regs. Such an instance is considered "aborted". 
> And if a handler on i386 uses regs->esp as the value of the > stack pointer (which is correct -- long story), it'll get the wrong > value if its regs arg points at the copy. That's a catch! I've made the fix (see inlined patch below). It now passes regs instead of the copy to both the entry_handler() and arch_prepare_kretprobe(). > More comments below. [snip] > > 1. Multiple function entries from various tasks (the one you've just > > pointed out). > > 2. Multiple kretprobe registration on the same function. > > 3. Nested calls of kretprobe'd function. > > > > In cases 1 and 3, the following information can be used to match > > corresponding entry and return handlers inside a user handler (if > > needed): > > > > (ri->task, ri->ret_addr) > > where ri is struct kretprobe_instance * > > This tuple should uniquely identify a return address (right?). > But if it's a recursive function, there could be multiple instances in the same task with the same return address. The stack pointer would be different, FWIW. Wouldn't the return addresses be different for recursive/nested calls? I think the only
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 14, 2007 3:53 PM, Srinivasa Ds <[EMAIL PROTECTED]> wrote: > No, eventhough return instances are chained in an order, order of execution of > return handler entirely depends on which process returns first Right...the LIFO chain analogy holds true for return instances for the same task only. As you've pointed out, kretprobe_instance is the only thing that can bind corresponding entry and return handlers together, which has been taken care of. > So entry_handler() which gets executed last doesn't guarantee > that its return handler will be executed first(because it took a lot time > to return). Only if there are return instances pending belonging to different tasks. > So only thing to match the entry_handler() with its return_handler() is > return probe instance(ri)'s address, which user has to take care explicitly Lets see how entry and return handlers can be matched up in three different scenarios:- 1. Multiple function entries from various tasks (the one you've just pointed out). 2. Multiple kretprobe registration on the same function. 3. Nested calls of kretprobe'd function. In cases 1 and 3, the following information can be used to match corresponding entry and return handlers inside a user handler (if needed): (ri->task, ri->ret_addr) where ri is struct kretprobe_instance * This tuple should uniquely identify a return address (right?). In case 2, entry and return handlers are anyways called in the right order (taken care of by trampoline_handler() due to LIFO insertion in ri->hlist). The fact that ri is passed to both handlers should allow any user handler to identify each of these cases and take appropriate synchronization action pertaining to its private data, if needed. > (Hence I feel sol a) would be nice). With an entry-handler, any module aiming to profile running time of a function (say) can simply do the following without being "return instance" conscious. 
Note however that I'm not trying to address just this scenario but trying to provide a general way to use entry-handlers in kretprobes:

static unsigned long flag = 0; /* use bit 0 as a global flag */
unsigned long long entry, exit;

int my_entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	if (!test_and_set_bit(0, &flag)) {
		/* this instance claims the first entry to kretprobe'd function */
		entry = sched_clock();
		/* do other stuff */
		return 0; /* right on! */
	}
	return 1; /* error: no return instance to be allocated for this
		     function entry */
}

/* will only be called iff flag == 1 */
int my_return_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	BUG_ON(!flag);
	exit = sched_clock();
	set_bit(0, &flag);
}

I think something like this should do the trick for you. > Thanks > Srinivasa DS -- Thanks & Regards Abhishek Sagar - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 14, 2007 1:27 PM, Srinivasa Ds <[EMAIL PROTECTED]> wrote:
> 1) How do you map the entry_handler(which gets executed when a process enters
> the function) with each instance of return probe handler.
> I accept that entry_handler() will execute each time process enters the
> function, but to calculate time, one needs to know corresponding instance of
> return probe handler(there should be a map for each return handler).

Yes, a one-to-one correspondence between the entry and return handlers is
important. I've tried to address this by passing the same kretprobe_instance
to both the entry and return handlers.

> Let me explain briefly.
> Suppose in a SMP system, 4 processes enter the same function at almost
> sametime(by executing entry_hanlder()) and returns from 4 different locations
> by executing the return handler. Now how do I match entry_handler() with
> corresponding instance of return handler for calculating time.

Right, and this scenario would also occur on UP systems where the
kretprobe'd function has nested calls. This has been taken care of (see
below).

> Now What I think is, there could be 2 solutions to these problem.
>
> a) Collect the entry time and exit time and put it in that kretprobe_instance
> structure and fetch it before freeing that instance.

In case someone wants to calculate the entry and exit timestamps of a
function using kretprobes, the appropriate place for it is not the entry
and return handlers. That's because the path from when a function is
called, or from when it returns, to the point of invocation of the entry or
return handler is not O(1). Looking at trampoline_handler(), it actually
traverses a list of all pending return instances to reach the correct one.
So any kind of exit timestamp must be taken just before that and passed to
the return handler via kretprobe_instance (as you just suggested).

> b) Or pass ri(kretprobe_instance address to entry_handler) and match it with
> return probe handler.

This is what I'm trying to do with this patch. I hope I've not misread what
you meant here, but as you'll notice from the patch:

+	if (rp->entry_handler) {
+		copy = *regs;
+		arch_prepare_kretprobe(ri, &copy);
+		if (rp->entry_handler(ri, &copy))   <-- (entry-handler with ri)
+			goto out; /* skip current kretprobe instance */
+		*regs = copy;
+	} else {
+		arch_prepare_kretprobe(ri, regs);
+	}

the entry handler is called with the appropriate return instance. I haven't
put any explicit "match" test here for ri. The reason is that the correct
ri would be passed to both the entry and return handlers, as
trampoline_handler() explicitly matches them to the correct task. Note that
all pending return instances of a function are chained in LIFO order, so
the entry-handler which gets called last should have its return handler
called first (in case of multiple pending return instances).

Another cool thing here is that if the entry handler returns a non-zero
value, the current return instance is aborted altogether. So if the
entry-handler fails, no return handler gets called.

Does this address your concerns?

--
Regards
Abhishek Sagar
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 13, 2007 12:09 AM, Abhishek Sagar <[EMAIL PROTECTED]> wrote:
> Whoops...sry for the repeated email..emailer trouble.

Expecting this one to make it to the list. Summary again:

This patch introduces a provision to specify a user-defined callback to run
at function entry, to complement the return handler in kretprobes.
Currently, whenever a kretprobe is registered, a user can specify a
callback (return handler) to be run each time the target function returns.
This is not guaranteed, however, and is limited by the number of
concurrently pending return instances of the target function in the current
process's context.

This patch allows registration of another user-defined handler which is
guaranteed to run each time the current return instance is allocated and
the return handler is set up. Conversely, if the entry-handler returns an
error, the current return instance is dropped and the return handler will
not run either.

The purpose is to provide the flexibility to do certain kinds of
function-level profiling using kretprobes. By being able to register both
function entry and return handlers, kretprobes can avoid an extra probe
registration (and the associated race) in scenarios where an entry handler
is required to capture the function call/entry event along with the
corresponding function exit event.

Hope these simple changes suffice; all suggestions/corrections are welcome.
Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff -X /home/sagara/kprobes/dontdiff -upNr linux-2.6.24-rc2/include/linux/kprobes.h linux-2.6.24-rc2_kp/include/linux/kprobes.h
--- linux-2.6.24-rc2/include/linux/kprobes.h	2007-11-07 03:27:46.0 +0530
+++ linux-2.6.24-rc2_kp/include/linux/kprobes.h	2007-11-13 16:04:35.0 +0530
@@ -152,6 +152,7 @@ static inline int arch_trampoline_kprobe
 struct kretprobe {
 	struct kprobe kp;
 	kretprobe_handler_t handler;
+	kretprobe_handler_t entry_handler;
 	int maxactive;
 	int nmissed;
 	struct hlist_head free_instances;
diff -X /home/sagara/kprobes/dontdiff -upNr linux-2.6.24-rc2/kernel/kprobes.c linux-2.6.24-rc2_kp/kernel/kprobes.c
--- linux-2.6.24-rc2/kernel/kprobes.c	2007-11-07 03:27:46.0 +0530
+++ linux-2.6.24-rc2_kp/kernel/kprobes.c	2007-11-13 16:04:17.0 +0530
@@ -694,12 +694,22 @@ static int __kprobes pre_handler_kretpro
 	spin_lock_irqsave(&kretprobe_lock, flags);
 	if (!hlist_empty(&rp->free_instances)) {
 		struct kretprobe_instance *ri;
+		struct pt_regs copy;

 		ri = hlist_entry(rp->free_instances.first,
 				 struct kretprobe_instance, uflist);
 		ri->rp = rp;
 		ri->task = current;
-		arch_prepare_kretprobe(ri, regs);
+
+		if (rp->entry_handler) {
+			copy = *regs;
+			arch_prepare_kretprobe(ri, &copy);
+			if (rp->entry_handler(ri, &copy))
+				goto out; /* skip current kretprobe instance */
+			*regs = copy;
+		} else {
+			arch_prepare_kretprobe(ri, regs);
+		}

 		/* XXX(hch): why is there no hlist_move_head? */
 		hlist_del(&ri->uflist);
@@ -707,6 +717,7 @@ static int __kprobes pre_handler_kretpro
 		hlist_add_head(&ri->hlist, kretprobe_inst_table_head(ri->task));
 	} else
 		rp->nmissed++;
+out:
 	spin_unlock_irqrestore(&kretprobe_lock, flags);
 	return 0;
 }
Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes
On Nov 13, 2007 12:01 AM, Abhishek Sagar <[EMAIL PROTECTED]> wrote:
> Hope these simple changes would suffice; all suggestions/corrections are
> welcome.

Whoops...sry for the repeated email..emailer trouble.

--
Regards,
Abhishek
Re: Out of memory management in embedded systems
On 9/29/07, Daniel Spång <[EMAIL PROTECTED]> wrote:
> > An embedded system is NOT an ordinary system that happens to
> > boot from flash. An embedded system requires intelligent design.
>
> We might be talking about slightly different systems. I agree that
> systems that are really embedded, in the classic meaning often with
> real time constraints, should be designed as you suggests. But there
> are a lot of other systems that almost actually are ordinary systems
> but with limited memory and often without demand paging.

[snip]

> For those systems I think we need a method to dynamically decrease the
> working set of a process when memory is scarce, and not just accept
> that we "are screwed" and let the OOM killer solve the problem.

In certain cases, I guess it could be a problem in the embedded
environment, especially while running general-purpose applications
alongside carefully designed real-time tasks. An OOM in such a case is
unacceptable.

The whole problem looks like an extension of page-frame reclamation into
user space. If a user application's cache were owned by the kernel
(something like vmsplice with SPLICE_F_GIFT?), and the application managed
it accordingly, then it could probably be brought under the purview of the
kernel's memory reclamation. This would mean that applications wouldn't
need to be triggered on low memory, leaving memory freeing to the kernel
(simpler and more uniform). Perhaps it is even possible to do this in the
kernel currently somehow...?

--
Abhishek Sagar
Re: KPROBES: Instrumenting a function's call site
On 9/26/07, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> PS: There was a thought of providing a facility to run a handler at
> function entry even when just a kretprobe is used. Maybe we need to
> relook at that; it'd have been useful in this case.

That would be really useful. I was writing a tool to dump some function
profiling info through /proc. This would help save an extra probe on each
function's entry.
Re: KPROBES: Instrumenting a function's call site
On 9/26/07, Avishay Traeger <[EMAIL PROTECTED]> wrote:
> So to measure the latency of foo(), I basically want kprobes to do this:
> pre_handler();
> foo();
> post_handler();
>
> The problem is that the latencies that I am getting are consistently low
> (~10,000 cycles). When I manually instrument the functions, the latency
> is about 20,000,000 cycles. Clearly something is not right here.

Single-stepping is done with preemption (and on some archs, like ARM,
interrupts) disabled. Maybe that is contributing to such a skew.

> Is this a known issue? Instead of using the post-handler, I can try to
> add a kprobe to the following instruction with a pre-handler. I was
> just curious if there was something fundamentally wrong with the
> approach I took, or maybe a bug that you should be made aware of.
>
> Please CC me on any replies (not subscribed to LKML).
>
> Thanks,
> Avishay

--
Regards
Abhishek Sagar
Re: [PATCH 0/4] build system: section garbage collection for vmlinux
On 9/14/07, Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> > With some tweaks it worked for me.
> Could you post your tweaked version - against an older kernel is OK.

The inlined patch should apply cleanly on top of the patch posted on the
link I mentioned before. The *.S files are the ones I chose to bring under
the purview of -ffunction-sections. My observation remains that if a
fine-grained function/data/exported-symbol level garbage collection can be
incorporated into the build environment, it will be something really
useful.

--
Abhishek Sagar
---
diff -upNr linux_orig-2.6.12/arch/arm/kernel/armksyms.c linux-2.6.12/arch/arm/kernel/armksyms.c
--- linux_orig-2.6.12/arch/arm/kernel/armksyms.c	2005-06-18 01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/armksyms.c	2007-09-14 09:00:03.0 +0530
@@ -44,10 +44,17 @@ extern void fp_enter(void);
  * This has a special calling convention; it doesn't
  * modify any of the usual registers, except for LR.
  */
+#ifndef CONFIG_GCSECTIONS
 #define EXPORT_SYMBOL_ALIAS(sym,orig)		\
  const struct kernel_symbol __ksymtab_##sym	\
   __attribute__((section("__ksymtab"))) =	\
  { (unsigned long)&orig, #sym };
+#else
+#define EXPORT_SYMBOL_ALIAS(sym,orig)		\
+ const struct kernel_symbol __ksymtab_##sym	\
+  __attribute__((section("___ksymtab_" #sym))) = \
+ { (unsigned long)&orig, #sym };
+#endif /* CONFIG_GCSECTIONS */

 /*
  * floating point math emulator support.
diff -upNr linux_orig-2.6.12/arch/arm/kernel/iwmmxt.S linux-2.6.12/arch/arm/kernel/iwmmxt.S
--- linux_orig-2.6.12/arch/arm/kernel/iwmmxt.S	2005-06-18 01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/iwmmxt.S	2007-09-14 09:21:01.0 +0530
@@ -55,7 +55,11 @@
  *
  * called from prefetch exception handler with interrupts disabled
  */
-
+#ifdef CONFIG_GCSECTIONS
+	.section ".text.iwmmxt_task_enable"
+#else
+	.text
+#endif
 ENTRY(iwmmxt_task_enable)

 	mrc	p15, 0, r2, c15, c1, 0
diff -upNr linux_orig-2.6.12/arch/arm/kernel/vmlinux.lds.S linux-2.6.12/arch/arm/kernel/vmlinux.lds.S
--- linux_orig-2.6.12/arch/arm/kernel/vmlinux.lds.S	2005-06-18 01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/vmlinux.lds.S	2007-09-14 08:58:30.0 +0530
@@ -20,50 +20,50 @@ SECTIONS
 	.init : {			/* Init code and data */
 		_stext = .;
 		_sinittext = .;
-			*(.init.text)
+			KEEP(*(.init.text))
 		_einittext = .;
 		__proc_info_begin = .;
-			*(.proc.info)
+			KEEP(*(.proc.info))
 		__proc_info_end = .;
 		__arch_info_begin = .;
-			*(.arch.info)
+			KEEP(*(.arch.info))
 		__arch_info_end = .;
 		__tagtable_begin = .;
-			*(.taglist)
+			KEEP(*(.taglist))
 		__tagtable_end = .;
 		. = ALIGN(16);
 		__setup_start = .;
-			*(.init.setup)
+			KEEP(*(.init.setup))
 		__setup_end = .;
 		__early_begin = .;
-			*(__early_param)
+			KEEP(*(__early_param))
 		__early_end = .;
 		__initcall_start = .;
-			*(.initcall1.init)
-			*(.initcall2.init)
-			*(.initcall3.init)
-			*(.initcall4.init)
-			*(.initcall5.init)
-			*(.initcall6.init)
-			*(.initcall7.init)
+			KEEP(*(.initcall1.init))
+			KEEP(*(.initcall2.init))
+			KEEP(*(.initcall3.init))
+			KEEP(*(.initcall4.init))
+			KEEP(*(.initcall5.init))
+			KEEP(*(.initcall6.init))
+			KEEP(*(.initcall7.init))
 		__initcall_end = .;
 		__con_initcall_start = .;
-			*(.con_initcall.init)
+			KEEP(*(.con_initcall.init))
 		__con_initcall_end = .;
 		__security_initcall_start = .;
-			*(.security_initcall.init)
+			KEEP(*(.security_initcall.init))
 		__security_initcall_end = .;
 		. = ALIGN(32);
 		__initramfs_start = .;
-			usr/built-in.o(.init.ramfs)
+			KEEP(usr/built-in.o(.init.ramfs))
 		__initramfs_end = .;
 		. = ALIGN(64);
 		__per_cpu_start = .;
-			*(.data.percpu)
+			KEEP(*(.data.percpu))
 		__per_cpu_end = .;
 #ifndef CONFIG_XIP_KERNEL
 		__init_begin = _stext;
-			*(.init.data
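For context, the general mechanism the patch relies on can be summarized as a build fragment: compile each function and data object into its own section, let the linker discard unreferenced ones, and protect linker-visible-only tables with KEEP() (as the vmlinux.lds.S hunks above do). The flag names below are standard GCC/binutils; the fragment itself is illustrative, not part of the patch.

```make
# Per-function/per-datum sections at compile time...
CFLAGS  += -ffunction-sections -fdata-sections
# ...then garbage-collect unreferenced sections at link time;
# --print-gc-sections lists what was dropped, useful for auditing.
LDFLAGS += -Wl,--gc-sections -Wl,--print-gc-sections
```

Anything reached only through a table the linker cannot trace (initcalls, __ksymtab entries, setup params) must sit inside a KEEP() directive in the linker script, or --gc-sections will discard it.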
Re: [PATCH 0/4] build system: section garbage collection for vmlinux
On 9/14/07, Sam Ravnborg [EMAIL PROTECTED] wrote: With some tweaks it worked for me. Could you post your tweaked version - against an older kernel is OK. The inlined patch should apply cleanly on top of the patch posted on the link I mentioned before. The *.S files are the ones I chose to bring them under the purview of --ffunction-sections. My observation remains that if a fine-grained function/data/exported-symbol level garbage collection can be incorporated into the build environment, then it'll be something really useful. -- Abhishek Sagar --- diff -upNr linux_orig-2.6.12/arch/arm/kernel/armksyms.c linux-2.6.12/arch/arm/kernel/armksyms.c --- linux_orig-2.6.12/arch/arm/kernel/armksyms.c2005-06-18 01:18:29.0 +0530 +++ linux-2.6.12/arch/arm/kernel/armksyms.c 2007-09-14 09:00:03.0 +0530 @@ -44,10 +44,17 @@ extern void fp_enter(void); * This has a special calling convention; it doesn't * modify any of the usual registers, except for LR. */ +#ifndef CONFIG_GCSECTIONS #define EXPORT_SYMBOL_ALIAS(sym,orig) \ const struct kernel_symbol __ksymtab_##sym\ __attribute__((section(__ksymtab))) = \ { (unsigned long)orig, #sym }; +#else +#define EXPORT_SYMBOL_ALIAS(sym,orig)\ + const struct kernel_symbol __ksymtab_##sym \ + __attribute__((section(___ksymtab_ #sym))) = \ +{ (unsigned long)orig, #sym }; +#endif /* CONFIG_GCSECTIONS */ /* * floating point math emulator support. 
diff -upNr linux_orig-2.6.12/arch/arm/kernel/iwmmxt.S linux-2.6.12/arch/arm/kernel/iwmmxt.S
--- linux_orig-2.6.12/arch/arm/kernel/iwmmxt.S	2005-06-18 01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/iwmmxt.S	2007-09-14 09:21:01.0 +0530
@@ -55,7 +55,11 @@
  *
  * called from prefetch exception handler with interrupts disabled
  */
-
+#ifdef CONFIG_GCSECTIONS
+	.section ".text.iwmmxt_task_enable"
+#else
+	.text
+#endif
 ENTRY(iwmmxt_task_enable)
 	mrc	p15, 0, r2, c15, c1, 0
diff -upNr linux_orig-2.6.12/arch/arm/kernel/vmlinux.lds.S linux-2.6.12/arch/arm/kernel/vmlinux.lds.S
--- linux_orig-2.6.12/arch/arm/kernel/vmlinux.lds.S	2005-06-18 01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/vmlinux.lds.S	2007-09-14 08:58:30.0 +0530
@@ -20,50 +20,50 @@ SECTIONS
 	.init : {			/* Init code and data	*/
 		_stext = .;
 		_sinittext = .;
-			*(.init.text)
+			KEEP(*(.init.text))
 		_einittext = .;
 		__proc_info_begin = .;
-			*(.proc.info)
+			KEEP(*(.proc.info))
 		__proc_info_end = .;
 		__arch_info_begin = .;
-			*(.arch.info)
+			KEEP(*(.arch.info))
 		__arch_info_end = .;
 		__tagtable_begin = .;
-			*(.taglist)
+			KEEP(*(.taglist))
 		__tagtable_end = .;
 		. = ALIGN(16);
 		__setup_start = .;
-			*(.init.setup)
+			KEEP(*(.init.setup))
 		__setup_end = .;
 		__early_begin = .;
-			*(__early_param)
+			KEEP(*(__early_param))
 		__early_end = .;
 		__initcall_start = .;
-			*(.initcall1.init)
-			*(.initcall2.init)
-			*(.initcall3.init)
-			*(.initcall4.init)
-			*(.initcall5.init)
-			*(.initcall6.init)
-			*(.initcall7.init)
+			KEEP(*(.initcall1.init))
+			KEEP(*(.initcall2.init))
+			KEEP(*(.initcall3.init))
+			KEEP(*(.initcall4.init))
+			KEEP(*(.initcall5.init))
+			KEEP(*(.initcall6.init))
+			KEEP(*(.initcall7.init))
 		__initcall_end = .;
 		__con_initcall_start = .;
-			*(.con_initcall.init)
+			KEEP(*(.con_initcall.init))
 		__con_initcall_end = .;
 		__security_initcall_start = .;
-			*(.security_initcall.init)
+			KEEP(*(.security_initcall.init))
 		__security_initcall_end = .;
 		. = ALIGN(32);
 		__initramfs_start = .;
-			usr/built-in.o(.init.ramfs)
+			KEEP(usr/built-in.o(.init.ramfs))
 		__initramfs_end = .;
 		. = ALIGN(64);
 		__per_cpu_start = .;
-			*(.data.percpu)
+			KEEP(*(.data.percpu))
 		__per_cpu_end = .;
 #ifndef CONFIG_XIP_KERNEL
 		__init_begin = _stext;
-			*(.init.data
Re: [PATCH 0/4] build system: section garbage collection for vmlinux
On 9/12/07, Denys Vlasenko <[EMAIL PROTECTED]> wrote:
> Patches were run-tested on x86_64, and likely do not work on any other
> arch (need to add KEEP() to arch/*/kernel/vmlinux.lds.S for each arch).

This is good stuff. I had been using a ported variant of this optimization for ARM on quite an old 2.6 kernel for a while now. I derived that port from:

http://lkml.org/lkml/2006/6/4/169

With some tweaks it worked for me. Could you also have a look at the mentioned link and see if that's a superset of what you're trying to achieve?

--
Abhishek Sagar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
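For the per-arch vmlinux.lds.S conversions Denys mentions, the KEEP() idiom looks roughly like this. This is a hypothetical fragment for illustration, not any arch's actual linker script:

```
SECTIONS
{
	.init : {
		__setup_start = .;
		/* This table is reached at runtime only through the
		 * boundary symbols around it, so ld sees no direct
		 * reference to the section; without KEEP(),
		 * --gc-sections would discard it. */
		KEEP(*(.init.setup))
		__setup_end = .;
	}
	/DISCARD/ : { *(.exit.text) }
}
```

The rule of thumb: any section that is walked via start/end symbols (initcalls, setup parameters, per-cpu data, exception tables) must be wrapped in KEEP(), while genuinely unreferenced code can be left eligible for garbage collection.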
Re: instrumentation and kprobes really still "EXPERIMENTAL"?
On 7/2/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> consider this random Kconfig file arch/x86_64/oprofile/Kconfig:
>
> config PROFILING
> 	bool "Profiling support (EXPERIMENTAL)"
> 	help
> 	  Say Y here to enable the extended profiling support mechanisms used
> 	  by profilers such as OProfile.
>
> config OPROFILE
> 	tristate "OProfile system profiling (EXPERIMENTAL)"
> 	depends on PROFILING
> 	help
> 	  OProfile is a profiling system capable of profiling the
> 	  whole system, include the kernel, kernel modules, libraries,
> 	  and applications.
>
> 	  If unsure, say N.
>
> the above is a bit silly.  note that both prompts advertise themselves
> as "EXPERIMENTAL" even though neither of them has such a dependency.
> they *are*, however, part of a submenu that *is* dependent on
> EXPERIMENTAL (although you'd never know that just by looking at that
> Kconfig file).  in short, it's just kind of ugly hackery at the moment.

Yes, but pushing the EXPERIMENTAL check down to individual menu members shouldn't be done in a way which leads to an empty menu for some kernel configuration. I suspect that this is why the EXPERIMENTAL check is on the instrumentation menu in the first place.

> my original point was simply that, based on its acceptance, it would
> seem kprobes has progressed beyond the EXPERIMENTAL phase, that's all.

It's worth a look on some archs.

--
Abhishek Sagar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
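One shape the cleanup could take, sketched as hypothetical Kconfig rather than a tested patch: drop EXPERIMENTAL from the menu itself and state it per entry, keeping at least one unconditional entry so the menu never ends up empty for any configuration:

```
menu "Instrumentation Support"

config PROFILING
	bool "Profiling support (EXPERIMENTAL)"
	depends on EXPERIMENTAL
	help
	  Say Y here to enable the extended profiling support mechanisms
	  used by profilers such as OProfile.

# kprobes deemed mature on this arch: no EXPERIMENTAL dependency,
# so the menu is non-empty even with EXPERIMENTAL=n
config KPROBES
	bool "Kprobes"
	depends on KALLSYMS && MODULES
	help
	  Kprobes allows you to trap at almost any kernel address.

endmenu
```

The per-arch judgment about which entries keep the EXPERIMENTAL dependency would still have to be made explicitly, which is the open question in this thread.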
Re: instrumentation and kprobes really still "EXPERIMENTAL"?
On 7/2/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> On Mon, 2 Jul 2007, Abhishek Sagar wrote:
> > On 7/1/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> > > isn't kprobes mature enough to not be considered experimental anymore?
> >
> > That would vary from arch to arch.
>
> fair enough.  however, at the very least, i'm thinking that the entire
> "Instrumentation support" submenu can be made non-EXPERIMENTAL, while
> individual entries therein can be EXPERIMENTAL on a choice-by-choice
> basis or something like that.

There's not a lot in that menu. It holds oprofile on every arch I grepped. Oprofile might come across as being marked non-experimental, but that may very well be due to the assumption that the menu holding the option enforces the EXPERIMENTAL check. How many archs do you see where oprofile and kprobes differ in their dependency on EXPERIMENTAL?

Removing the EXPERIMENTAL check from the instrumentation menu will also require you to fix/double-check the dependency of oprofile/kprobes on EXPERIMENTAL, provided you know that for sure. Which would bring you back to the same question you began with, namely, whether option xyz is EXPERIMENTAL or not.

--
Abhishek Sagar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: instrumentation and kprobes really still "EXPERIMENTAL"?
On 7/1/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> isn't kprobes mature enough to not be considered experimental anymore?

That would vary from arch to arch.

> in addition, while most of the KPROBES config options depend on
> KALLSYMS && EXPERIMENTAL && MODULES, the s390 architecture depends
> only on EXPERIMENTAL && MODULES, and its instrumentation support is
> *not* listed as experimental.  also, the avr32 entry is in the file
> Kconfig.debug, and depends only on DEBUG_KERNEL.  just an observation.

This probably stems from the episodic growth of kprobes across different archs. It can be cleaned up, I suppose.

--
Abhishek Sagar

P.S: Adding systemtap in CC.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] Make jprobes a little safer for users
On 6/26/07, Michael Ellerman <[EMAIL PROTECTED]> wrote:
> It did occur to me that someone might be doing something crazy like
> branching to code that's not in the kernel/module text - but I was
> hoping that wouldn't be the case. I'm not sure what ITCM is?

The reference to instruction tightly coupled memory (ITCM) was just to have you consider the possibility of the jprobe handler being outside the kernel text region. Totally paranoid, really.

> > > int __kprobes register_jprobe(struct jprobe *jp)
> > > {
> > > +	unsigned long addr = arch_deref_entry_point(jp->entry);
> > > +
> > > +	if (!kernel_text_address(addr))
> > > +		return -EINVAL;
> >
> > Seems like you're checking for the jprobe handler to be within the
> > kernel/module range. Why not narrow this down to just the module range
> > (!module_text_address(addr), say)? Core kernel functions would not be
> > ending with a 'jprobe_return()' anyway.
>
> There's jprobe code in net/ipv4/tcp_probe.c and net/dccp/probe.c that
> can be builtin or modular, so I think kernel_text_address() is right.

Ok, thanks for that clarification.

--
Abhishek Sagar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] Make jprobes a little safer for users
On 6/26/07, Michael Ellerman <[EMAIL PROTECTED]> wrote:
> We can then use that in register_jprobe() to check that the entry
> point we're passed is actually in the kernel text, rather than just
> some random value.

A similar cleanup is possible even for return probes, then. I wonder if there are any kprobe-related scenarios where the executable code may be located outside the core kernel text region (e.g., ITCM?). In that case, would it also be wrong to assume that the jprobe handler may be situated outside the kernel core text / module region? Would it then make sense to move this check from register_jprobe() into the arch-dependent code?

> int __kprobes register_jprobe(struct jprobe *jp)
> {
> +	unsigned long addr = arch_deref_entry_point(jp->entry);
> +
> +	if (!kernel_text_address(addr))
> +		return -EINVAL;

Seems like you're checking for the jprobe handler to be within the kernel/module range. Why not narrow this down to just the module range (!module_text_address(addr), say)? Core kernel functions would not be ending with a 'jprobe_return()' anyway.

--
Abhishek Sagar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/