Re: [PATCH 2/2] Kprobes: Move kprobes examples to samples/

2008-02-04 Thread Abhishek Sagar
On 2/5/08, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:

> + * Build and insert the kernel module as done in the kprobe example.
> + * You will see the trace data in /var/log/messages and on the console
> + * whenever sys_open() returns a negative value.

A passing observation: "sys_open" should be replaced with "do_fork",
whose return value is not checked at all.
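
For reference, a minimal kretprobe module along those lines might look like this (an illustrative sketch only, using the kretprobe helpers roughly as they existed around 2.6.25; names and details are not the actual samples/ code):

```c
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>

/* Return-handler: runs when the probed function returns; trace
 * every negative return value of do_fork. */
static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	long retval = regs_return_value(regs);

	if (retval < 0)
		printk(KERN_INFO "do_fork returned %ld\n", retval);
	return 0;
}

static struct kretprobe my_kretprobe = {
	.handler	= ret_handler,
	.kp.symbol_name	= "do_fork",
};

static int __init kretprobe_init(void)
{
	return register_kretprobe(&my_kretprobe);
}

static void __exit kretprobe_exit(void)
{
	unregister_kretprobe(&my_kretprobe);
}

module_init(kretprobe_init);
module_exit(kretprobe_exit);
MODULE_LICENSE("GPL");
```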

--
Regards,
Abhishek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ktime: Allow ktime_t comparison using ktime_compare

2008-02-02 Thread Abhishek Sagar
Add a timespec-style comparison function, allowing two ktime_t values to be
compared without first converting them to timespec/timeval. Useful for modules
doing ktime-based math, especially ones that use ktime_get heavily.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/include/linux/ktime.h b/include/linux/ktime.h
index a6ddec1..7f9d321 100644
--- a/include/linux/ktime.h
+++ b/include/linux/ktime.h
@@ -95,6 +95,23 @@ static inline ktime_t ktime_set(const long secs, const unsigned long nsecs)
 #define ktime_add(lhs, rhs) \
({ (ktime_t){ .tv64 = (lhs).tv64 + (rhs).tv64 }; })
 
+/**
+ * ktime_compare - Compares two ktime_t variables
+ *
+ * Return val:
+ * lhs < rhs:  < 0
+ * lhs == rhs: 0
+ * lhs > rhs:  > 0
+ */
+static inline int ktime_compare(const ktime_t lhs, const ktime_t rhs)
+{
+   if (lhs.tv64 < rhs.tv64)
+   return -1;
+   if (lhs.tv64 > rhs.tv64)
+   return 1;
+   return 0;
+}
+
 /*
  * Add a ktime_t variable and a scalar nanosecond value.
  * res = kt + nsval:
@@ -198,6 +215,23 @@ static inline ktime_t ktime_add(const ktime_t add1, const ktime_t add2)
 }
 
 /**
+ * ktime_compare - Compares two ktime_t variables
+ *
+ * Return val:
+ * lhs < rhs:  < 0
+ * lhs == rhs: 0
+ * lhs > rhs:  > 0
+ */
+static inline int ktime_compare(const ktime_t lhs, const ktime_t rhs)
+{
+   if (lhs.tv.sec < rhs.tv.sec)
+   return -1;
+   if (lhs.tv.sec > rhs.tv.sec)
+   return 1;
+   return lhs.tv.nsec - rhs.tv.nsec;
+}
+
+/**
  * ktime_add_ns - Add a scalar nanoseconds value to a ktime_t variable
  * @kt:addend
  * @nsec:  the scalar nsec value to add



Re: [PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints

2008-01-29 Thread Abhishek Sagar
On 1/29/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> In that case, why don't you just reduce the priority of kprobe_exceptions_nb?
> Then, the execution path becomes very simple.

Ananth mentioned that the kprobe notifier has to be the first to run.
It still wouldn't allow us to notice breakpoints at places like do_int3,
etc.

> I also like to use a debugger for debugging kprobes. that will help us.

Hmm... it would increase the code path leading up to kprobe_handler.
That's more territory to be guarded from kprobes.

> Best Regards,
>
> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]
--
Thanks,
Abhishek


Re: [PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints

2008-01-29 Thread Abhishek Sagar
On 1/29/08, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> > May be I'm completely off the mark here, but shouldn't a small subset
> > of this section simply be 'breakpoint-free' rather than 'kprobe-free'?
> > Placing a breakpoint on kprobe_handler (say) can loop into a recursive
> > trap without allowing the debugger's notifier chain to be invoked.
>
> A well heeled debugger will necessarily take care of saving contexts
> (using techniques like setjmp/longjmp, etc) to help it recover from such
> nested cases (See xmon for example).

Ok, but the protection/warning is not just for xmon.

> > I'm assuming that non-kprobe exception notifiers may (or even should) run
> > after kprobe's notifier callback (kprobe_exceptions_notify).
>
> Yes, any such notifier is invoked after kprobe's callback as the kprobe
> notifier is always registered with the highest priority.

Ok.

> > The WARN_ON (and not a BUG_ON) will be hit iff:
> > (in_kprobes_functions(addr) && !is_jprobe_bkpt(addr))
>
> But that still is unneeded dmesg clutter, IMHO.

Ok, a warning would in my opinion have been prudent, since I don't
think we can protect non-kprobe breakpoint users (debuggers or anything
else) from the recursive trap-handling case.

> Ananth

--
Thanks,
Abhishek


Re: [PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints

2008-01-29 Thread Abhishek Sagar
On 1/29/08, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> > Non kprobe breakpoints in the kernel might lie inside the .kprobes.text 
> > section. Such breakpoints can easily be identified by in_kprobes_functions 
> > and can be caught early. These are problematic and a warning should be 
> > emitted to discourage them (in any rare case, if they actually occur).
>
> Why? As Masami indicated in an earlier reply, the annotation is to
> prevent *only* kprobes.

Maybe I'm completely off the mark here, but shouldn't a small subset
of this section simply be 'breakpoint-free' rather than 'kprobe-free'?
Placing a breakpoint on kprobe_handler (say) can loop into a recursive
trap without allowing the debugger's notifier chain to be invoked. I'm
assuming that non-kprobe exception notifiers may (or even should) run
after kprobe's notifier callback (kprobe_exceptions_notify).

> > For this, a check can route the trap handling of such breakpoints away from 
> > kprobe_handler (which ends up calling even more functions marked as 
> > __kprobes) from inside kprobe_exceptions_notify.
>
> Well.. we pass on control of a !kprobe breakpoint to the kernel. This is
> exactly what permits debuggers like xmon to work fine now.

This will still happen. It doesn't stop non-kprobe breakpoints from
being handled, wherever they may be.

> I don't see any harm in such breakpoints being handled autonomously
> without any sort of kprobe influence.

Here's what seems to be happening currently:

int3 (non-kprobe) -> do_int3 -> kprobe_exceptions_notify ->
kprobe_handler (passes the buck to the kernel) -> non-kprobe/debugger
exception handler.

Here's what the patch will do:

int3 (non-kprobe) -> do_int3 -> kprobe_exceptions_notify ->
WARN_ON/kprobe_handler -> non-kprobe/debugger exception handler.

The WARN_ON (and not a BUG_ON) will be hit iff:
(in_kprobes_functions(addr) && !is_jprobe_bkpt(addr))

> Ananth

I hope I've understood the point you were making, or at least came close :-).

--
Thanks,
Abhishek




Re: [PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section

2008-01-28 Thread Abhishek Sagar
On 1/28/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> Thank you for explanation, I hope I can understand it.
> Even if it causes a trap recursively, it could be checked (and ignored) by
> longjump_break_handler(), and passed to the debugger correctly.

Yes, all non-kprobe breakpoints are left to the kernel to be handled.
The objective here is to intercept the trap handling of a certain
category of such breakpoints and emit a warning, the premise being that
the .kprobes.text section is a logical breakpoint-free zone.

> Please consider that someone expands jprobe(jprobe2) which uses
> jprobe_return2() instead of jprobe_return(), how would you handle it?

By a simple modification of is_jprobe_bkpt() (defined in patch #2 of
this series).

> Current kprobes provides an opportunity to those external probe frameworks
> for checking it by themselves.

Could you clarify this with an example? For now I'm assuming that by
external probe frameworks you mean kernel modules using kprobes. If
they embed breakpoints in their handlers, then they will simply not be
caught by this check, because they cannot lie in the .kprobes.text
section.

> By the way, external kernel debugger knows how kprobe (and exception notifier)
> works, so I think it can fetch his exception before kprobes does (by tweaking
> notifier chain, etc).
> (I hope all external kernel debuggers take care of it. :-))

I would imagine that from a code-correctness point of view it
shouldn't matter. In any case, nothing can be done if the kprobe
exception notifier is circumvented.

> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]

--
Thanks,
Abhishek Sagar


Re: [PATCH -mm] kprobes: kretprobe user entry-handler (updated)

2008-01-28 Thread Abhishek Sagar
On 1/28/08, Andrew Morton <[EMAIL PROTECTED]> wrote:
> Neither the changelog nor the newly-added documentation explain why Linux
> needs this feature.  What will it be used for??

There's a detailed discussion along with an example on this thread:
http://lkml.org/lkml/2007/11/13/58

and a bit on:
http://sourceware.org/ml/systemtap/2005-q3/msg00593.html

The real advantage of this patch is in scenarios where some data (like
function parameters, system time, etc.) needs to be shared between
function entry and exit. E.g., viewing the change in system time across
a call in order to profile it. Here, we need to share data between the
entry and exit events of a function call, and the correct association
needs to be maintained between the corresponding function entry-exit
pairs. This is already done using a 'return-instance' in kretprobes.
This patch allows these instances to stash some data as well. The patch
also includes an example module which does trivial function
time-duration profiling using entry-handlers. It makes writing function
profilers with kretprobes simpler.

Currently, doing such a thing would require an extra kprobe to be
planted at the function entry point, whose pre-handler would have to
carry all the complexity of the function entry-exit data association.
Also, using an entry-handler is optional and completely backward
compatible.

> > +1.3.2 Kretprobe entry-handler
> > +
> > +Kretprobes also provides an optional user-specified handler which runs
>
> I think "caller-specified" would be a better term here.  Generally "user"
> refers to Aunt Tillie sitting at the keyboard.

Just followed suit from the existing kprobes.txt, which has quite a few
references to such a 'user'.

Cheerfully Acked by -> Aunt Tillie :-)

--
Thanks,
Abhishek




Re: [PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section

2008-01-27 Thread Abhishek Sagar
On 1/27/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> Sorry, I can not understand what issue these patches can solve.
> The breakpoint which is inserted by external debugger will be checked by
> kprobe_handler() and be passed to other exception_notify_handlers.
> In that case, we don't need to warn it.
> I think current code is enough simple.

kprobe_handler has a blanket check for all non-kprobe breakpoints.
They're all left to the kernel to handle. This is fine. Although
external debuggers are free to plant breakpoints anywhere, they should
be discouraged from doing so inside .kprobes.text region. Placing them
there may lead to recursive page-fault/trap handling. It's a defensive
check. I hope I've been able to clarify.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]

Thanks,
Abhishek Sagar


Re: [PATCH 2/3] x86: Macrofy reusable code

2008-01-27 Thread Abhishek Sagar
Sam Ravnborg wrote:

> Small static functions are preferred over macros.
> Any particular reason to use a macro here?
> 
>   Sam

These macros have very limited (two) instantiations, but here's an alternative
patch using inline functions.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a99e764..b7c2d20 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -159,6 +159,16 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {
 };
 const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);
 
+static inline unsigned long kprobe_bkpt_addr(struct pt_regs *regs)
+{
+   return (instruction_pointer(regs) - sizeof(kprobe_opcode_t));
+}
+
+static inline int is_jprobe_bkpt(u8 *ptr)
+{
+   return (ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end);
+}
+
 /* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
 static void __kprobes set_jmp_op(void *from, void *to)
 {
@@ -519,7 +529,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
-   addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+   addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs);
if (*addr != BREAKPOINT_INSTRUCTION) {
/*
 * The breakpoint instruction was removed right
@@ -1032,8 +1042,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
u8 *addr = (u8 *) (regs->ip - 1);
struct jprobe *jp = container_of(p, struct jprobe, kp);
 
-   if ((addr > (u8 *) jprobe_return) &&
-   (addr < (u8 *) jprobe_return_end)) {
+   if (is_jprobe_bkpt(addr)) {
if (stack_addr(regs) != kcb->jprobe_saved_sp) {
struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
printk(KERN_ERR


[PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section

2008-01-27 Thread Abhishek Sagar
Identify breakpoints in .kprobes.text section. These certainly aren't kprobe 
traps. However, we make an exception for the breakpoint hardcoded into 
jprobe_return.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 45f2949..f3d13d0 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -961,6 +961,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
   unsigned long val, void *data)
 {
struct die_args *args = data;
+   unsigned long addr = kprobe_bkpt_addr(args->regs);
int ret = NOTIFY_DONE;
 
if (args->regs && user_mode_vm(args->regs))
@@ -968,7 +969,14 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 
switch (val) {
case DIE_INT3:
-   if (kprobe_handler(args->regs))
+   if (in_kprobes_functions(addr) &&
+   !is_jprobe_bkpt((u8 *)addr)) {
+   /* A breakpoint has made its way into the .kprobes.text
+    * section (excluding jprobe_return). This could be
+    * due to an external debugger. */
+   WARN_ON(1);
+   
+   } else if (kprobe_handler(args->regs))
ret = NOTIFY_STOP;
break;
case DIE_DEBUG:


[PATCH 2/3] x86: Macrofy reusable code

2008-01-27 Thread Abhishek Sagar
Encapsulate reusable code.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a99e764..45f2949 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -74,6 +74,13 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
 #define stack_addr(regs) ((unsigned long *)&regs->sp)
 #endif
 
+#define kprobe_bkpt_addr(regs) \
+   ((unsigned long)(regs->ip - sizeof(kprobe_opcode_t)))
+
+#define is_jprobe_bkpt(ptr) \
+   ((ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end))
+
+
 #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\
(((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) |   \
  (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) |   \
@@ -519,7 +526,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
-   addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+   addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs);
if (*addr != BREAKPOINT_INSTRUCTION) {
/*
 * The breakpoint instruction was removed right
@@ -1032,8 +1039,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
u8 *addr = (u8 *) (regs->ip - 1);
struct jprobe *jp = container_of(p, struct jprobe, kp);
 
-   if ((addr > (u8 *) jprobe_return) &&
-   (addr < (u8 *) jprobe_return_end)) {
+   if (is_jprobe_bkpt(addr)) {
if (stack_addr(regs) != kcb->jprobe_saved_sp) {
struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
printk(KERN_ERR


[PATCH 0/3][RFC] x86: Catch stray non-kprobe breakpoints

2008-01-27 Thread Abhishek Sagar
Greetings,

Non-kprobe breakpoints in the kernel might lie inside the .kprobes.text
section. Such breakpoints can easily be identified by in_kprobes_functions
and can be caught early. They are problematic, and a warning should be
emitted to discourage them (in the rare case that they actually occur).

For this, a check inside kprobe_exceptions_notify can route the trap
handling of such breakpoints away from kprobe_handler (which ends up
calling even more functions marked as __kprobes).

All comments/suggestions are welcome.

--
Thanks & Regards,
Abhishek Sagar


[PATCH 1/3] x86: Move in_kprobes_functions to linux/kprobes.h

2008-01-27 Thread Abhishek Sagar
Move in_kprobes_functions to linux/kprobes.h to share it with
arch/x86/kernel/kprobes.c.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 6168c0a..2762145 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -39,6 +39,7 @@
 
 #ifdef CONFIG_KPROBES
 #include <asm/kprobes.h>
+#include <asm-generic/sections.h>
 
 /* kprobe_status settings */
 #define KPROBE_HIT_ACTIVE  0x0001
@@ -182,6 +183,14 @@ static inline void kretprobe_assert(struct kretprobe_instance *ri,
}
 }
 
+static inline int in_kprobes_functions(unsigned long addr)
+{
+   if (addr >= (unsigned long)__kprobes_text_start &&
+   addr < (unsigned long)__kprobes_text_end)
+   return -EINVAL;
+   return 0;
+}
+
 #ifdef CONFIG_KPROBES_SANITY_TEST
 extern int init_test_probes(void);
 #else
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d0493ea..0b74dfb 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -490,14 +490,6 @@ static int __kprobes register_aggr_kprobe(struct kprobe 
*old_p,
return ret;
 }
 
-static int __kprobes in_kprobes_functions(unsigned long addr)
-{
-   if (addr >= (unsigned long)__kprobes_text_start &&
-   addr < (unsigned long)__kprobes_text_end)
-   return -EINVAL;
-   return 0;
-}
-
 static int __kprobes __register_kprobe(struct kprobe *p,
unsigned long called_from)
 {



[PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section

2008-01-27 Thread Abhishek Sagar
Identify breakpoints in .kprobes.text section. These certainly aren't kprobe 
traps. However, we make an exception for the breakpoint hardcoded into 
jprobe_return.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 45f2949..f3d13d0 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -961,6 +961,7 @@ int __kprobes kprobe_exceptions_notify(struct 
notifier_block *self,
   unsigned long val, void *data)
 {
	struct die_args *args = data;
+	unsigned long addr = kprobe_bkpt_addr(args->regs);
	int ret = NOTIFY_DONE;
 
	if (args->regs && user_mode_vm(args->regs))
@@ -968,7 +969,14 @@ int __kprobes kprobe_exceptions_notify(struct 
notifier_block *self,
 
	switch (val) {
	case DIE_INT3:
-		if (kprobe_handler(args->regs))
+		if (in_kprobes_functions(addr) &&
+		    !is_jprobe_bkpt((u8 *)addr)) {
+			/* A breakpoint has made its way to the .kprobes.text
+			 * section (excluding jprobe_return). This could be
+			 * due to an external debugger. */
+			WARN_ON(1);
+
+		} else if (kprobe_handler(args->regs))
			ret = NOTIFY_STOP;
		break;
case DIE_DEBUG:


[PATCH 2/3] x86: Macrofy reusable code

2008-01-27 Thread Abhishek Sagar
Encapsulate reusable code.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a99e764..45f2949 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -74,6 +74,13 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
 #define stack_addr(regs) ((unsigned long *)regs->sp)
 #endif
 
+#define kprobe_bkpt_addr(regs) \
+	((unsigned long)(regs->ip - sizeof(kprobe_opcode_t)))
+
+#define is_jprobe_bkpt(ptr) \
+	((ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end))
+
+
 #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\
	(((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) |   \
	  (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) |   \
@@ -519,7 +526,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
	struct kprobe *p;
	struct kprobe_ctlblk *kcb;
 
-	addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+	addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs);
	if (*addr != BREAKPOINT_INSTRUCTION) {
		/*
		 * The breakpoint instruction was removed right
@@ -1032,8 +1039,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, 
struct pt_regs *regs)
	u8 *addr = (u8 *) (regs->ip - 1);
	struct jprobe *jp = container_of(p, struct jprobe, kp);
 
-	if ((addr > (u8 *) jprobe_return) &&
-	    (addr < (u8 *) jprobe_return_end)) {
+	if (is_jprobe_bkpt(addr)) {
		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
			struct pt_regs *saved_regs = kcb->jprobe_saved_regs;
			printk(KERN_ERR


Re: [PATCH 2/3] x86: Macrofy reusable code

2008-01-27 Thread Abhishek Sagar
Sam Ravnborg wrote:

> Small static functions are preferred over macros.
> Any particular reason to use a macro here?
>
>	Sam

These macros have very limited (two) instantiations. But here's an alternative 
patch inlined.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a99e764..b7c2d20 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -159,6 +159,16 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {
 };
 const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);
 
+static inline unsigned long kprobe_bkpt_addr(struct pt_regs *regs)
+{
+	return (instruction_pointer(regs) - sizeof(kprobe_opcode_t));
+}
+
+static inline int is_jprobe_bkpt(u8 *ptr)
+{
+	return (ptr > (u8 *)jprobe_return) && (ptr < (u8 *)jprobe_return_end);
+}
+
 /* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
 static void __kprobes set_jmp_op(void *from, void *to)
 {
@@ -519,7 +529,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
	struct kprobe *p;
	struct kprobe_ctlblk *kcb;
 
-	addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+	addr = (kprobe_opcode_t *)kprobe_bkpt_addr(regs);
	if (*addr != BREAKPOINT_INSTRUCTION) {
		/*
		 * The breakpoint instruction was removed right
@@ -1032,8 +1042,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, 
struct pt_regs *regs)
	u8 *addr = (u8 *) (regs->ip - 1);
	struct jprobe *jp = container_of(p, struct jprobe, kp);
 
-	if ((addr > (u8 *) jprobe_return) &&
-	    (addr < (u8 *) jprobe_return_end)) {
+	if (is_jprobe_bkpt(addr)) {
		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
			struct pt_regs *saved_regs = kcb->jprobe_saved_regs;
			printk(KERN_ERR


Re: [PATCH 3/3] x86: WARN_ON breakpoints from .kprobes.text section

2008-01-27 Thread Abhishek Sagar
On 1/27/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> Sorry, I can not understand what issue these patches can solve.
> The breakpoint which is inserted by external debugger will be checked by
> kprobe_handler() and be passed to other exception_notify_handlers.
> In that case, we don't need to warn it.
> I think current code is enough simple.

kprobe_handler has a blanket check for all non-kprobe breakpoints.
They're all left to the kernel to handle. This is fine. Although
external debuggers are free to plant breakpoints anywhere, they should
be discouraged from doing so inside the .kprobes.text region. Placing them
there may lead to recursive page-fault/trap handling. It's a defensive
check. I hope I've been able to clarify.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED]

Thanks,
Abhishek Sagar


[PATCH -mm] kprobes: kretprobe user entry-handler (updated)

2008-01-26 Thread Abhishek Sagar
This is a repost of a patch which was reviewed earlier at:
http://lkml.org/lkml/2007/11/13/58 (thanks to Jim Keniston and Srinivasa for 
their review comments). This provides support to add an optional user defined 
callback to be run at function entry of a kretprobe'd function. It also 
modifies the kprobe smoke tests to include an entry-handler during the 
kretprobe sanity test.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff -upNr -X /home/guest/patches/dontdiff.txt 
linux-2.6.24-rc8-mm1/Documentation/kprobes.txt 
linux-2.6.24-rc8-mm1_kp/Documentation/kprobes.txt
--- linux-2.6.24-rc8-mm1/Documentation/kprobes.txt  2008-01-16 
09:52:48.0 +0530
+++ linux-2.6.24-rc8-mm1_kp/Documentation/kprobes.txt   2008-01-26 
17:12:43.0 +0530
@@ -96,7 +96,9 @@ or in registers (e.g., for x86_64 or for
 The jprobe will work in either case, so long as the handler's
 prototype matches that of the probed function.
 
-1.3 How Does a Return Probe Work?
+1.3 Return Probes
+
+1.3.1 How Does a Return Probe Work?
 
 When you call register_kretprobe(), Kprobes establishes a kprobe at
 the entry to the function.  When the probed function is called and this
@@ -107,9 +109,9 @@ At boot time, Kprobes registers a kprobe
 
 When the probed function executes its return instruction, control
 passes to the trampoline and that probe is hit.  Kprobes' trampoline
-handler calls the user-specified handler associated with the kretprobe,
-then sets the saved instruction pointer to the saved return address,
-and that's where execution resumes upon return from the trap.
+handler calls the user-specified return handler associated with the
+kretprobe, then sets the saved instruction pointer to the saved return
+address, and that's where execution resumes upon return from the trap.
 
 While the probed function is executing, its return address is
 stored in an object of type kretprobe_instance.  Before calling
@@ -131,6 +133,30 @@ zero when the return probe is registered
 time the probed function is entered but there is no kretprobe_instance
 object available for establishing the return probe.
 
+1.3.2 Kretprobe entry-handler
+
+Kretprobes also provides an optional user-specified handler which runs
+on function entry. This handler is specified by setting the entry_handler
+field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the
+function entry is hit, the user-defined entry_handler, if any, is invoked.
+If the entry_handler returns 0 (success) then a corresponding return handler
+is guaranteed to be called upon function return. If the entry_handler
+returns a non-zero error then Kprobes leaves the return address as is, and
+the kretprobe has no further effect for that particular function instance.
+
+Multiple entry and return handler invocations are matched using the unique
+kretprobe_instance object associated with them. Additionally, a user
+may also specify per return-instance private data to be part of each
+kretprobe_instance object. This is especially useful when sharing private
+data between corresponding user entry and return handlers. The size of each
+private data object can be specified at kretprobe registration time by
+setting the data_size field of the kretprobe struct. This data can be
+accessed through the data field of each kretprobe_instance object.
+
+In case probed function is entered but there is no kretprobe_instance
+object available, then in addition to incrementing the nmissed count,
+the user entry_handler invocation is also skipped.
+
 2. Architectures Supported
 
 Kprobes, jprobes, and return probes are implemented on the following
@@ -273,6 +299,8 @@ of interest:
 - ret_addr: the return address
 - rp: points to the corresponding kretprobe object
 - task: points to the corresponding task struct
+- data: points to per return-instance private data; see "Kretprobe
+   entry-handler" for details.
 
 The regs_return_value(regs) macro provides a simple abstraction to
 extract the return value from the appropriate register as defined by
@@ -555,23 +583,52 @@ report failed calls to sys_open().
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/kprobes.h>
+#include <linux/ktime.h>
+
+/* per-instance private data */
+struct my_data {
+   ktime_t entry_stamp;
+};
 
 static const char *probed_func = "sys_open";
 
-/* Return-probe handler: If the probed function fails, log the return value. */
-static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
+/* Timestamp function entry. */
+static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
+{
+   struct my_data *data;
+
+   if(!current->mm)
+   return 1; /* skip kernel threads */
+
+   data = (struct my_data *)ri->data;
+   data->entry_stamp = ktime_get();
+   return 0;
+}
+
+/* If the probed function failed, log the return value and duration.
+ * Duration may turn out to be zero consistently, depending upon the
+ * granularity of time accounting on the plat


[PATCH] x86: Fix singlestep handling in reenter_kprobe

2008-01-12 Thread Abhishek Sagar

Highlight peculiar cases in single-step kprobe handling.

In reenter_kprobe(), a breakpoint in KPROBE_HIT_SS case can only occur when 
single-stepping a breakpoint on which a probe was installed. Since such probes 
are single-stepped inline, identifying these cases is unambiguous. All other 
cases leading up to KPROBE_HIT_SS are possible bugs. Identify and WARN_ON such 
cases.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 93aff49..afb5c96 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -443,17 +443,6 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
*sara = (unsigned long) &kretprobe_trampoline;
 }
 
-static void __kprobes recursive_singlestep(struct kprobe *p,
-  struct pt_regs *regs,
-  struct kprobe_ctlblk *kcb)
-{
-   save_previous_kprobe(kcb);
-   set_current_kprobe(p, regs, kcb);
-   kprobes_inc_nmissed_count(p);
-   prepare_singlestep(p, regs);
-   kcb->kprobe_status = KPROBE_REENTER;
-}
-
 static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
   struct kprobe_ctlblk *kcb)
 {
@@ -492,20 +481,29 @@ static int __kprobes reenter_kprobe(struct kprobe *p, 
struct pt_regs *regs,
break;
 #endif
case KPROBE_HIT_ACTIVE:
-   recursive_singlestep(p, regs, kcb);
+   save_previous_kprobe(kcb);
+   set_current_kprobe(p, regs, kcb);
+   kprobes_inc_nmissed_count(p);
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_REENTER;
break;
case KPROBE_HIT_SS:
-   if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
+   if (p == kprobe_running()) {
regs->flags &= ~TF_MASK;
regs->flags |= kcb->kprobe_saved_flags;
return 0;
} else {
-   recursive_singlestep(p, regs, kcb);
+   /* A probe has been hit in the codepath leading up
+* to, or just after, single-stepping of a probed
+* instruction. This entire codepath should strictly
+* reside in .kprobes.text section. Raise a warning
+* to highlight this peculiar case.
+*/
}
-   break;
default:
/* impossible cases */
WARN_ON(1);
+   return 0;
}
 
return 1;


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-03 Thread Abhishek Sagar
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> > + case KPROBE_HIT_SS:
> > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
> > + regs->flags &= ~TF_MASK;
> > + regs->flags |= kcb->kprobe_saved_flags;
> > + return 0;
> > + } else {
> > + recursive_singlestep(p, regs, kcb);
> > + }
> > + break;
> > + default:
> > + /* impossible cases */
> > + WARN_ON(1);
>
> WARN_ON() does not call panic(). Since the kernel doesn't stop,
> we need to prepare singlestep after that.

We shouldn't panic in 'default'. First, we'll never hit this case, and
if we do then we can be sure that the fault is not due to a probe. So
instead of single-stepping we'll let the kernel handle it. If it can't,
it'll panic/die() for us. I'll try to clean up this case and
incorporate these suggestions in a separate patch, as you suggested.

> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]

--
Thanks,
Abhishek Sagar


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-03 Thread Abhishek Sagar
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> I could understand what the original code did at last.
> If a kprobe is inserted on a breakpoint which other debugger inserts,
> it single step inline instead of out-of-line.(this is done in 
> prepare_singlestep)
> In this case, (p && kprobe_running() && kcb->kprobe_status==KPROBE_HIT_SS)
> is true and we need pass the control to the debugger.
> And if (*p->ainsn.insn != BREAKPOINT_INSTRUCTION) (or (p != 
> kprobe_running())) in
> that case, there may be some bugs.

Yes, we can only fault while single-stepping in one unique case, which
is when we're single-stepping (inline) a breakpoint because a probe
was installed on it. All other scenarios are a BUG. That's also
assuming that no exception whose codepath has a probe on it will
preempt the single-stepping.

> Now I think your original suggestion is correct.
> Please fix it in another patch.

Ok.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]

Thanks,
Abhishek Sagar


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-03 Thread Abhishek Sagar
Masami Hiramatsu wrote:
>>> As a result, this function just setups re-entrance.
>> As you've also pointed out in your previous reply, this case is
>> peculiar and therefore I believe it should be marked as a BUG(). I've
>> left the original case, if (kcb->kprobe_status==KPROBE_HIT_SS) &&
>> (*p->ainsn.insn == BREAKPOINT_INSTRUCTION), untouched and is handled
>> as it was before. However, if (kcb->kprobe_status==KPROBE_HIT_SS) &&
>> !(*p->ainsn.insn == BREAKPOINT_INSTRUCTION), then instead of
>> incrementing nmissed count like before, it should cry out a BUG. This
>> is not an ordinary recursive probe handling case which should update
>> the nmissed count.
> 
> Hmm, I can not agree, because it is possible to insert a kprobe
> into kprobe's instruction buffer. If it should be a bug, we must
> check it when registering the kprobe.
> (And also, in *p->ainsn.insn == BREAKPOINT_INSTRUCTION case, I doubt
> that the kernel can handle this "orphaned" breakpoint, because the
> breakpoint address has been changed.)
> 
> Anyway, if you would like to change the logic, please separate it from
> cleanup patch.

Okay. For now, I've restored the original behavior.
 
>>>> -out:
>>>> + if (!ret)
>>>> + preempt_enable_no_resched();
>>> I think this is a bit ugly...
>>> Why would you avoid using mutiple "return"s in this function?
>>> I think you can remove "ret" from this function.
>> Hmm...there the are no deeply nested if-elses nor overlapping paths
>> which need to be avoided. All the nested checks unroll cleanly too.
> 
> Oh, I just mentioned about this "if (!ret)" condition.
> Could you try to remove this "ret" variable?
> I think some "ret=1" could be replaced by "return 1" easily.

Done. You should find the desired changes in this patch.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a72e02b..06f8d08 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -441,6 +441,34 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
 }
+
+static void __kprobes recursive_singlestep(struct kprobe *p,
+  struct pt_regs *regs,
+  struct kprobe_ctlblk *kcb)
+{
+   save_previous_kprobe(kcb);
+   set_current_kprobe(p, regs, kcb);
+   kprobes_inc_nmissed_count(p);
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_REENTER;
+}
+
+static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
+  struct kprobe_ctlblk *kcb)
+{
+#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
+   if (p->ainsn.boostable == 1 && !p->post_handler) {
+   /* Boost up -- we can execute copied instructions directly */
+   reset_current_kprobe();
+   regs->ip = (unsigned long)p->ainsn.insn;
+   preempt_enable_no_resched();
+   return;
+   }
+#endif
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_HIT_SS;
+}
+
 /*
  * We have reentered the kprobe_handler(), since another probe was hit while
  * within the handler. We save the original kprobes variables and just single
@@ -449,13 +477,9 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
 static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
 {
-   if (kcb->kprobe_status == KPROBE_HIT_SS &&
-   *p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
-   regs->flags &= ~X86_EFLAGS_TF;
-   regs->flags |= kcb->kprobe_saved_flags;
-   return 0;
+   switch (kcb->kprobe_status) {
+   case KPROBE_HIT_SSDONE:
 #ifdef CONFIG_X86_64
-   } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) {
/* TODO: Provide re-entrancy from post_kprobes_handler() and
 * avoid exception stack corruption while single-stepping on
 * the instruction of the new probe.
@@ -463,14 +487,26 @@ static int __kprobes reenter_kprobe(struct kprobe *p, 
struct pt_regs *regs,
arch_disarm_kprobe(p);
regs->ip = (unsigned long)p->addr;
reset_current_kprobe();
-   return 1;
+   preempt_enable_no_resched();
+   break;
 #endif
+   case KPROBE_HIT_ACTIVE:
+   recursive_singles

Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-03 Thread Abhishek Sagar
Masami Hiramatsu wrote:
> As a result, this function just setups re-entrance.
> As you've also pointed out in your previous reply, this case is
> peculiar and therefore I believe it should be marked as a BUG(). I've
> left the original case, if (kcb->kprobe_status==KPROBE_HIT_SS) &&
> (*p->ainsn.insn == BREAKPOINT_INSTRUCTION), untouched and is handled
> as it was before. However, if (kcb->kprobe_status==KPROBE_HIT_SS) &&
> !(*p->ainsn.insn == BREAKPOINT_INSTRUCTION), then instead of
> incrementing nmissed count like before, it should cry out a BUG. This
> is not an ordinary recursive probe handling case which should update
> the nmissed count.
>
> Hmm, I can not agree, because it is possible to insert a kprobe
> into kprobe's instruction buffer. If it should be a bug, we must
> check it when registering the kprobe.
> (And also, in *p->ainsn.insn == BREAKPOINT_INSTRUCTION case, I doubt
> that the kernel can handle this orphaned breakpoint, because the
> breakpoint address has been changed.)
>
> Anyway, if you would like to change the logic, please separate it from
> the cleanup patch.

Okay. For now, I've restored the original behavior.

> -out:
> + if (!ret)
> + preempt_enable_no_resched();
> I think this is a bit ugly...
> Why would you avoid using multiple returns in this function?
> I think you can remove ret from this function.
> Hmm...there are no deeply nested if-elses nor overlapping paths
> which need to be avoided. All the nested checks unroll cleanly too.
>
> Oh, I just mentioned about this if (!ret) condition.
> Could you try to remove this ret variable?
> I think some ret=1 could be replaced by return 1 easily.

Done. You should find the desired changes in this patch.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a72e02b..06f8d08 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -441,6 +441,34 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
 }
+
+static void __kprobes recursive_singlestep(struct kprobe *p,
+  struct pt_regs *regs,
+  struct kprobe_ctlblk *kcb)
+{
+   save_previous_kprobe(kcb);
+   set_current_kprobe(p, regs, kcb);
+   kprobes_inc_nmissed_count(p);
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_REENTER;
+}
+
+static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
+  struct kprobe_ctlblk *kcb)
+{
+#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
+   if (p->ainsn.boostable == 1 && !p->post_handler) {
+   /* Boost up -- we can execute copied instructions directly */
+   reset_current_kprobe();
+   regs->ip = (unsigned long)p->ainsn.insn;
+   preempt_enable_no_resched();
+   return;
+   }
+#endif
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_HIT_SS;
+}
+
 /*
  * We have reentered the kprobe_handler(), since another probe was hit while
  * within the handler. We save the original kprobes variables and just single
@@ -449,13 +477,9 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
 static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
 {
-   if (kcb->kprobe_status == KPROBE_HIT_SS &&
-   *p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
-   regs->flags &= ~X86_EFLAGS_TF;
-   regs->flags |= kcb->kprobe_saved_flags;
-   return 0;
+   switch (kcb->kprobe_status) {
+   case KPROBE_HIT_SSDONE:
 #ifdef CONFIG_X86_64
-   } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) {
/* TODO: Provide re-entrancy from post_kprobes_handler() and
 * avoid exception stack corruption while single-stepping on
 * the instruction of the new probe.
@@ -463,14 +487,26 @@ static int __kprobes reenter_kprobe(struct kprobe *p, 
struct pt_regs *regs,
arch_disarm_kprobe(p);
regs->ip = (unsigned long)p->addr;
reset_current_kprobe();
-   return 1;
+   preempt_enable_no_resched();
+   break;
 #endif
+   case KPROBE_HIT_ACTIVE:
+   recursive_singlestep(p, regs, kcb);
+   break;
+   case KPROBE_HIT_SS:
+   if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
+   regs->flags &= ~TF_MASK;
+   regs->flags |= kcb->kprobe_saved_flags;
+   return 0;
+   } else {
+   recursive_singlestep(p, regs, kcb);
+   }
+   break;
+   default:
+   /* impossible cases */

Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-03 Thread Abhishek Sagar
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> I could understand what the original code did at last.
> If a kprobe is inserted on a breakpoint which other debugger inserts,
> it single steps inline instead of out-of-line. (This is done in
> prepare_singlestep.)
> In this case, (p && kprobe_running() && kcb->kprobe_status==KPROBE_HIT_SS)
> is true and we need to pass the control to the debugger.
> And if (*p->ainsn.insn != BREAKPOINT_INSTRUCTION) (or (p !=
> kprobe_running())) in that case, there may be some bugs.

Yes, we can only fault while singlestepping for a unique case, which
is when we're singlestepping (in-line) a breakpoint because a probe
was installed on it. All other scenarios are a BUG. That's also
assuming that no exception will preempt singlestepping, whose codepath
has a probe on it.

> Now I think your original suggestion is correct.
> Please fix it in another patch.

Ok.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]

Thanks,
Abhishek Sagar
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-03 Thread Abhishek Sagar
On 1/4/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> > + case KPROBE_HIT_SS:
> > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
> > + regs->flags &= ~TF_MASK;
> > + regs->flags |= kcb->kprobe_saved_flags;
> > + return 0;
> > + } else {
> > + recursive_singlestep(p, regs, kcb);
> > + }
> > + break;
> > + default:
> > + /* impossible cases */
> > + WARN_ON(1);
>
> WARN_ON() does not call panic(). Since the kernel doesn't stop,
> we need to prepare singlestep after that.

We shouldn't panic in 'default'. First, we'll never hit this case, and
if we do then we can be sure that the fault is not due to a probe. So
instead of singlestepping we'll let the kernel handle it. If it can't,
it'll panic/die() for us. I'll try to clean up this case and
incorporate these suggestions in a separate patch, as you suggested.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]

--
Thanks,
Abhishek Sagar


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-02 Thread Abhishek Sagar
On 1/2/08, Masami Hiramatsu <[EMAIL PROTECTED]> wrote:
> I think setup_singlestep() in your first patch is better, because it avoided
> code duplication(*).

Will retain it then.

> >  static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
> >   struct kprobe_ctlblk *kcb)
> >  {
> > - if (kcb->kprobe_status == KPROBE_HIT_SS &&
> > - *p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
> > - regs->flags &= ~X86_EFLAGS_TF;
> > - regs->flags |= kcb->kprobe_saved_flags;
> > - return 0;
> > + int ret = 0;
> > +
> > + switch (kcb->kprobe_status) {
> > + case KPROBE_HIT_SSDONE:
> >  #ifdef CONFIG_X86_64
> > - } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) {
> > - /* TODO: Provide re-entrancy from post_kprobes_handler() and
> > -  * avoid exception stack corruption while single-stepping on
> > + /* TODO: Provide re-entrancy from
> > +  * post_kprobes_handler() and avoid exception
> > +  * stack corruption while single-stepping on
>
> Why would you modify it?

Do you mean the comment? I had this code in kprobe_handler earlier and
it consistently exceeded 80 columns in my case. Will fix it anyways.

> > + ret = 1;
> > + break;
> >  #endif
> > + case KPROBE_HIT_ACTIVE:
> > + /* a probe has been hit inside a
> > +  * user handler */
> > + save_previous_kprobe(kcb);
> > + set_current_kprobe(p, regs, kcb);
> > + kprobes_inc_nmissed_count(p);
> > + prepare_singlestep(p, regs);
> > + kcb->kprobe_status = KPROBE_REENTER;
> > + ret = 1;
> > + break;
> > + case KPROBE_HIT_SS:
> > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
> > + regs->flags &= ~TF_MASK;
> > + regs->flags |= kcb->kprobe_saved_flags;
> > + } else {
> > + /* BUG? */
> > + }
> > + break;
>
> If my thought is correct, we don't need to use swich-case here,
> Because there are only 2 cases, KPROBE_HIT_SSDONE (x86-64 only)
> or others.
> As a result, this function just setups re-entrance.

As you've also pointed out in your previous reply, this case is
peculiar and therefore I believe it should be marked as a BUG(). I've
left the original case, if (kcb->kprobe_status==KPROBE_HIT_SS) &&
(*p->ainsn.insn == BREAKPOINT_INSTRUCTION), untouched and is handled
as it was before. However, if (kcb->kprobe_status==KPROBE_HIT_SS) &&
!(*p->ainsn.insn == BREAKPOINT_INSTRUCTION), then instead of
incrementing nmissed count like before, it should cry out a BUG. This
is not an ordinary recursive probe handling case which should update
the nmissed count.

> > + default:
> > + /* impossible cases */
> > + BUG();
>
> I think no need to check that here.

It may not get hit at runtime but is quite informative.

> >  static int __kprobes kprobe_handler(struct pt_regs *regs)
> >  {
> > - struct kprobe *p;
> >   int ret = 0;
> >   kprobe_opcode_t *addr;
> > + struct kprobe *p, *cur;
> >   struct kprobe_ctlblk *kcb;
> >
> >   addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
> > + if (*addr != BREAKPOINT_INSTRUCTION) {
> > + /*
> > +  * The breakpoint instruction was removed right
> > +  * after we hit it.  Another cpu has removed
> > +  * either a probepoint or a debugger breakpoint
> > +  * at this address.  In either case, no further
> > +  * handling of this interrupt is appropriate.
> > +  * Back up over the (now missing) int3 and run
> > +  * the original instruction.
> > +  */
> > + regs->ip = (unsigned long)addr;
> > + return 1;
> > + }
>
> I think this block changing would better be separated from this patch,
> because it changes code flow logically.

Agreed. I'll address this in another thread, but for now it is safest
to check it for all cases right at the entry of kprobe_handler().

> >
> > - /*
> > -  * We don't want to be preempted for the entire
> > -  * duration of kprobe processing
> > -  */
> >   preempt_disable();
> >   kcb = get_kprobe_ctlblk();
> > -
> > + cur = kprobe_running();
>
> I think you don't need "cur", because kprobe_running() is called
> just once on each path.

Will get rid of that.

> > - regs->ip = (unsigned long)addr;
> > + if (!p->pre_handler || !p->pre_handler(p, regs)) {
> > + if (setup_boost(p, regs)) {
> > + prepare_singlestep(p, regs);
> > + kcb->kprobe_status = KPROBE_HIT_SS;
>
> (*) duplication 1
>
> > + }
> > + }

Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-01 Thread Abhishek Sagar
On 1/2/08, Harvey Harrison <[EMAIL PROTECTED]> wrote:
> > +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
> > +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs 
> > *regs)
> > +{
> > + if (p->ainsn.boostable == 1 && !p->post_handler) {
> > + /* Boost up -- we can execute copied instructions directly */
> > + reset_current_kprobe();
> > + regs->ip = (unsigned long)p->ainsn.insn;
> > + preempt_enable_no_resched();
> > + return 0;
> > + }
> > + return 1;
> > +}
> > +#else
> > +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs 
> > *regs)
> > +{
> > + return 1;
> > +}
> > +#endif
> > +
> In the kernel __always_inline == inline, also I think it's nicer to only
> have one function declaration, and then ifdef the body of the function.
>
> Something like:
>
> static inline int setup_boost(struct kprobe *p, struct pt_regs *regs)
> {
> #if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
> if (p->ainsn.boostable == 1 && !p->post_handler) {
> /* Boost up -- we can execute copied instructions directly */
> reset_current_kprobe();
> regs->ip = (unsigned long)p->ainsn.insn;
> preempt_enable_no_resched();
> return 0;
> }
> #endif
> return 1;
> }

Ok...will include this after I pick up some more comments.

> >  static int __kprobes kprobe_handler(struct pt_regs *regs)
> >  {
> > - struct kprobe *p;
> >   int ret = 0;
> >   kprobe_opcode_t *addr;
> > + struct kprobe *p, *cur;
> >   struct kprobe_ctlblk *kcb;
> >
> >   addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
> > + if (*addr != BREAKPOINT_INSTRUCTION) {
> > + /*
> > +  * The breakpoint instruction was removed right
> > +  * after we hit it.  Another cpu has removed
> > +  * either a probepoint or a debugger breakpoint
> > +  * at this address.  In either case, no further
> > +  * handling of this interrupt is appropriate.
> > +  * Back up over the (now missing) int3 and run
> > +  * the original instruction.
> > +  */
> > + regs->ip = (unsigned long)addr;
> > + return 1;
> > + }
>
> This return is fine I guess, but after the preempt_disable() I like
> the goto approach as it will be easier to see what paths enable
> preemption again and which don'tbonus points if we can move this
> to the caller or make sure we reenable in all cases before returning
> and pull in the code in the caller that does this for us.
>
> But I guess your approach of using ret to test whether we need to
> reenable preemption or not would work as a signal to the caller that
> they need to reenable preemption.

Hmm...since enabling preemption is tied to 'ret', anyone reading
kprobe_handler will have to follow around all calls which modify it.
There are some checks in the current kprobe_handler definition made
just to do what you're saying, i.e., to push all preemption
enable/disables in kprobe_handler. Like this one (from the current x86
kprobe_handler):


ret = reenter_kprobe(p, regs, kcb);
if (kcb->kprobe_status == KPROBE_REENTER)
{
ret = 1;
goto out;
}
goto preempt_out;

-

This is just confusing because we're not actually making any
exceptions here for the KPROBE_REENTER case (which has been partially
handled in reenter_kprobe), rather just tricking our way out of
preemption enabling for a couple of cases in reenter_kprobe.

> Cheers,
>
> Harvey

Thanks,
Abhishek


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-01 Thread Abhishek Sagar
> > + /* a probe has been hit inside a
> > +  * user handler */
> > + save_previous_kprobe(kcb);
> > + set_current_kprobe(p, regs, kcb);
> > + kprobes_inc_nmissed_count(p);
> > + prepare_singlestep(p, regs);
> > + kcb->kprobe_status = KPROBE_REENTER;
> >   ret = 1;
> > - goto no_kprobe;
> > - }
> > - p = __get_cpu_var(current_kprobe);
> > - if (p->break_handler && p->break_handler(p, regs)) {
> > - goto ss_probe;
> > + break;
> > + case KPROBE_HIT_SS:
> > + if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) 
> > {
> > + regs->eflags &= ~TF_MASK;
> > + regs->eflags |=
> > + kcb->kprobe_saved_eflags;
> > + } else {
> > + /* BUG? */
> > + }
> > + break;
> > + default:
> > + /* impossible cases */
> > + BUG();
> >   }
> > - }

Replaced deeply nested if-elses with a switch.

Please let me know if there are any changes on which you would like
further clarification.

> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]
--

Thanks,
Abhishek Sagar


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-01 Thread Abhishek Sagar
Ingo Molnar wrote:
> hm, this patch does not apply to x86.git#mm, due to the fixes, 
> unifications and cleanups done there. Could you send a patch against -mm 
> or against x86.git? (see the tree-fetching instructions below) Thanks,
> 
>   Ingo
> 
> --{ x86.git instructions }-->
> 
>  git-clone 
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 
> linux-2.6.git
>  cd linux-2.6.git
>  git-branch x86
>  git-checkout x86
>  git-pull git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git 
> mm
> 
> (do subsequent pulls via "git-pull --force", as we frequently rebase the
> git tree. NOTE: this might override your own local changes, so do this
> only if you don't mind about losing these changes in that tree.)
> 

Thanks for pointing me to the right tree. I have made some code re-arrangements 
in this one.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a72e02b..45adc8e 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -441,6 +441,26 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
 }
+
+#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
+static __always_inline int setup_boost(struct kprobe *p, struct pt_regs *regs)
+{
+   if (p->ainsn.boostable == 1 && !p->post_handler) {
+   /* Boost up -- we can execute copied instructions directly */
+   reset_current_kprobe();
+   regs->ip = (unsigned long)p->ainsn.insn;
+   preempt_enable_no_resched();
+   return 0;
+   }
+   return 1;
+}
+#else
+static __always_inline int setup_boost(struct kprobe *p, struct pt_regs *regs)
+{
+   return 1;
+}
+#endif
+
 /*
  * We have reentered the kprobe_handler(), since another probe was hit while
  * within the handler. We save the original kprobes variables and just single
@@ -449,29 +469,47 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
 static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
 {
-   if (kcb->kprobe_status == KPROBE_HIT_SS &&
-   *p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
-   regs->flags &= ~X86_EFLAGS_TF;
-   regs->flags |= kcb->kprobe_saved_flags;
-   return 0;
+   int ret = 0;
+
+   switch (kcb->kprobe_status) {
+   case KPROBE_HIT_SSDONE:
 #ifdef CONFIG_X86_64
-   } else if (kcb->kprobe_status == KPROBE_HIT_SSDONE) {
-   /* TODO: Provide re-entrancy from post_kprobes_handler() and
-* avoid exception stack corruption while single-stepping on
+   /* TODO: Provide re-entrancy from
+* post_kprobes_handler() and avoid exception
+* stack corruption while single-stepping on
 * the instruction of the new probe.
 */
arch_disarm_kprobe(p);
regs->ip = (unsigned long)p->addr;
reset_current_kprobe();
-   return 1;
+   preempt_enable_no_resched();
+   ret = 1;
+   break;
 #endif
+   case KPROBE_HIT_ACTIVE:
+   /* a probe has been hit inside a
+* user handler */
+   save_previous_kprobe(kcb);
+   set_current_kprobe(p, regs, kcb);
+   kprobes_inc_nmissed_count(p);
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_REENTER;
+   ret = 1;
+   break;
+   case KPROBE_HIT_SS:
+   if (*p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
+   regs->flags &= ~TF_MASK;
+   regs->flags |= kcb->kprobe_saved_flags;
+   } else {
+   /* BUG? */
+   }
+   break;
+   default:
+   /* impossible cases */
+   BUG();
}
-   save_previous_kprobe(kcb);
-   set_current_kprobe(p, regs, kcb);
-   kprobes_inc_nmissed_count(p);
-   prepare_singlestep(p, regs);
-   kcb->kprobe_status = KPROBE_REENTER;
-   return 1;
+
+   return ret;
 }
 
 /*
@@ -480,82 +518,67 @@ static int __kprobes reenter_kprobe(struct kprobe *p, 
struct pt_regs *regs,
  */
 static int __kprobes kprobe_handler(struct pt_regs *regs)
 {
-   struct kprobe *p;
int ret = 0;
kprobe_opcode_t *addr;
+   struct kprobe *p, *cur;
struct kprobe_ctlblk *kcb;
 
addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
+ 

Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-01 Thread Abhishek Sagar
Ingo Molnar wrote:
 hm, this patch does not apply to x86.git#mm, due to the fixes, 
 unifications and cleanups done there. Could you send a patch against -mm 
 or against x86.git? (see the tree-fetching instructions below) Thanks,
 
   Ingo
 
 --{ x86.git instructions }--
 
  git-clone 
 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 
 linux-2.6.git
  cd linux-2.6.git
  git-branch x86
  git-checkout x86
  git-pull git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git 
 mm
 
 (do subsequent pulls via git-pull --force, as we frequently rebase the
 git tree. NOTE: this might override your own local changes, so do this
 only if you dont mind about losing thse changes in that tree.)
 

Thanks for pointing me to the right tree. I have made some code re-arrangements 
in this one.

Signed-off-by: Abhishek Sagar [EMAIL PROTECTED]
Signed-off-by: Quentin Barnes [EMAIL PROTECTED]
---

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a72e02b..45adc8e 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -441,6 +441,26 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) kretprobe_trampoline;
 }
+
+#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
+static __always_inline int setup_boost(struct kprobe *p, struct pt_regs *regs)
+{
+   if (p-ainsn.boostable == 1  !p-post_handler) {
+   /* Boost up -- we can execute copied instructions directly */
+   reset_current_kprobe();
+   regs-ip = (unsigned long)p-ainsn.insn;
+   preempt_enable_no_resched();
+   return 0;
+   }
+   return 1;
+}
+#else
+static __always_inline int setup_boost(struct kprobe *p, struct pt_regs *regs)
+{
+   return 1;
+}
+#endif
+
 /*
  * We have reentered the kprobe_handler(), since another probe was hit while
  * within the handler. We save the original kprobes variables and just single
@@ -449,29 +469,47 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
 static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
 {
-   if (kcb-kprobe_status == KPROBE_HIT_SS 
-   *p-ainsn.insn == BREAKPOINT_INSTRUCTION) {
-   regs-flags = ~X86_EFLAGS_TF;
-   regs-flags |= kcb-kprobe_saved_flags;
-   return 0;
+   int ret = 0;
+
+   switch (kcb-kprobe_status) {
+   case KPROBE_HIT_SSDONE:
 #ifdef CONFIG_X86_64
-   } else if (kcb-kprobe_status == KPROBE_HIT_SSDONE) {
-   /* TODO: Provide re-entrancy from post_kprobes_handler() and
-* avoid exception stack corruption while single-stepping on
+   /* TODO: Provide re-entrancy from
+* post_kprobes_handler() and avoid exception
+* stack corruption while single-stepping on
 * the instruction of the new probe.
 */
arch_disarm_kprobe(p);
regs-ip = (unsigned long)p-addr;
reset_current_kprobe();
-   return 1;
+   preempt_enable_no_resched();
+   ret = 1;
+   break;
 #endif
+   case KPROBE_HIT_ACTIVE:
+   /* a probe has been hit inside a
+* user handler */
+   save_previous_kprobe(kcb);
+   set_current_kprobe(p, regs, kcb);
+   kprobes_inc_nmissed_count(p);
+   prepare_singlestep(p, regs);
+   kcb-kprobe_status = KPROBE_REENTER;
+   ret = 1;
+   break;
+   case KPROBE_HIT_SS:
+   if (*p-ainsn.insn == BREAKPOINT_INSTRUCTION) {
+   regs-flags = ~TF_MASK;
+   regs-flags |= kcb-kprobe_saved_flags;
+   } else {
+   /* BUG? */
+   }
+   break;
+   default:
+   /* impossible cases */
+   BUG();
}
-   save_previous_kprobe(kcb);
-   set_current_kprobe(p, regs, kcb);
-   kprobes_inc_nmissed_count(p);
-   prepare_singlestep(p, regs);
-   kcb-kprobe_status = KPROBE_REENTER;
-   return 1;
+
+   return ret;
 }
 
 /*
@@ -480,82 +518,67 @@ static int __kprobes reenter_kprobe(struct kprobe *p, 
struct pt_regs *regs,
  */
 static int __kprobes kprobe_handler(struct pt_regs *regs)
 {
-   struct kprobe *p;
int ret = 0;
kprobe_opcode_t *addr;
+   struct kprobe *p, *cur;
struct kprobe_ctlblk *kcb;
 
addr = (kprobe_opcode_t *)(regs-ip - sizeof(kprobe_opcode_t));
+   if (*addr != BREAKPOINT_INSTRUCTION) {
+   /*
+* The breakpoint instruction was removed right
+* after we hit it.  Another cpu has removed
+* either

Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-01 Thread Abhishek Sagar
;
  + break;
  + case KPROBE_HIT_SS:
  + if (*p-ainsn.insn == BREAKPOINT_INSTRUCTION) 
  {
  + regs-eflags = ~TF_MASK;
  + regs-eflags |=
  + kcb-kprobe_saved_eflags;
  + } else {
  + /* BUG? */
  + }
  + break;
  + default:
  + /* impossible cases */
  + BUG();
}
  - }

Replaced deeply nested if-elses with a switch.

Please let me know if there are any changes on which you would like
further clarification.

 --
 Masami Hiramatsu

 Software Engineer
 Hitachi Computer Products (America) Inc.
 Software Solutions Division

 e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]
--

Thanks,
Abhishek Sagar
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: kprobes change kprobe_handler flow

2008-01-01 Thread Abhishek Sagar
On 1/2/08, Harvey Harrison [EMAIL PROTECTED] wrote:
  +#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
  +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs 
  *regs)
  +{
  + if (p-ainsn.boostable == 1  !p-post_handler) {
  + /* Boost up -- we can execute copied instructions directly */
  + reset_current_kprobe();
  + regs-ip = (unsigned long)p-ainsn.insn;
  + preempt_enable_no_resched();
  + return 0;
  + }
  + return 1;
  +}
  +#else
  +static __always_inline int setup_boost(struct kprobe *p, struct pt_regs 
  *regs)
  +{
  + return 1;
  +}
  +#endif
  +
 In the kernel __always_inline == inline, also I think it's nicer to only
 have one function declaration, and then ifdef the body of the function.

 Something like:

 static inline int setup_boost(struct kprobe *p, struct pt_regs *regs)
 {
 #if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
 if (p-ainsn.boostable == 1  !p-post_handler) {
 /* Boost up -- we can execute copied instructions directly */
 reset_current_kprobe();
 regs-ip = (unsigned long)p-ainsn.insn;
 preempt_enable_no_resched();
 return 0;
 }
 #endif
 return 1;
 }

Ok...will include this after I pick up some more comments.

   static int __kprobes kprobe_handler(struct pt_regs *regs)
   {
  - struct kprobe *p;
int ret = 0;
kprobe_opcode_t *addr;
  + struct kprobe *p, *cur;
struct kprobe_ctlblk *kcb;
 
addr = (kprobe_opcode_t *)(regs-ip - sizeof(kprobe_opcode_t));
  + if (*addr != BREAKPOINT_INSTRUCTION) {
  + /*
  +  * The breakpoint instruction was removed right
  +  * after we hit it.  Another cpu has removed
  +  * either a probepoint or a debugger breakpoint
  +  * at this address.  In either case, no further
  +  * handling of this interrupt is appropriate.
  +  * Back up over the (now missing) int3 and run
  +  * the original instruction.
  +  */
  + regs-ip = (unsigned long)addr;
  + return 1;
  + }

 This return is fine I guess, but after the preempt_disable() I like
 the goto approach as it will be easier to see what paths enable
 preemption again and which don'tbonus points if we can move this
 to the caller or make sure we reenable in all cases before returning
 and pull in the code in the caller that does this for us.

 But I guess your approach of using ret to test whether we need to
 reenable preemption or not would work as a signal to the caller that
 they need to reenable preemption.

Hmm...since enabling preemption is tied to 'ret', anyone reading
kprobe_handler will have to follow around all calls which modify it.
There are some checks in the current kprobe_handler definition made
just to do what you're saying, i.e., to push all preemption
enable/disables into kprobe_handler. Like this one (from the current x86
kprobe_handler):


ret = reenter_kprobe(p, regs, kcb);
if (kcb->kprobe_status == KPROBE_REENTER) {
	ret = 1;
	goto out;
}
goto preempt_out;

-

This is just confusing because we're not actually making any
exceptions here for the KPROBE_REENTER case (which has been partially
handled in reenter_kprobe), rather just tricking our way out of
preemption enabling for a couple of cases in reenter_kprobe.
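The pattern under discussion -- the caller re-enabling preemption based on the
handler's return value -- boils down to something like this userspace sketch
(toy counters stand in for the real preemption primitives; none of this is
kernel code, and all names here are made up):

```c
/* Toy stand-ins for the real preemption primitives. */
static int preempt_count;
static void preempt_disable(void) { preempt_count++; }
static void preempt_enable_no_resched(void) { preempt_count--; }

/*
 * 'ret' doubles as the record of whether some inner path already
 * re-enabled preemption, so a reader has to chase every assignment
 * to it to know the preempt state at return.
 */
static int kprobe_handler_model(int boosted)
{
	int ret;

	preempt_disable();
	if (boosted) {
		/* boost path: re-enables preemption itself */
		preempt_enable_no_resched();
		ret = 0;
	} else {
		/* single-step path: stays disabled until the post-handler */
		ret = 1;
	}
	return ret;
}
```

The asymmetry is the point: one return value means "balanced", the other means
"still disabled", and only reading every branch tells you which.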

> Cheers,
>
> Harvey

Thanks,
Abhishek


Re: [PATCH] x86: kprobes change kprobe_handler flow

2007-12-31 Thread Abhishek Sagar
Harvey Harrison wrote:
> Make the control flow of kprobe_handler more obvious.
> 
> Collapse the separate if blocks/gotos with if/else blocks
> this unifies the duplication of the check for a breakpoint
> instruction race with another cpu.

This is a patch derived from kprobe_handler of the ARM kprobes port. This 
further simplifies the current x86 kprobe_handler. The resulting definition is 
smaller, more readable, has no goto's and contains only a single call to 
get_kprobe.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
Signed-off-by: Quentin Barnes <[EMAIL PROTECTED]>
---

diff --git a/arch/x86/kernel/kprobes_32.c b/arch/x86/kernel/kprobes_32.c
index 3a020f7..5a626fd 100644
--- a/arch/x86/kernel/kprobes_32.c
+++ b/arch/x86/kernel/kprobes_32.c
@@ -240,108 +240,110 @@ void __kprobes arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
*sara = (unsigned long) _trampoline;
 }
 
+static __always_inline void setup_singlestep(struct kprobe *p,
+struct pt_regs *regs,
+struct kprobe_ctlblk *kcb)
+{
+#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
+   if (p->ainsn.boostable == 1 && !p->post_handler) {
+   /* Boost up -- we can execute copied instructions directly */
+   reset_current_kprobe();
+   regs->eip = (unsigned long)p->ainsn.insn;
+   preempt_enable_no_resched();
+   } else {
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_HIT_SS;
+   }
+#else
+   prepare_singlestep(p, regs);
+   kcb->kprobe_status = KPROBE_HIT_SS;
+#endif
+}
+
 /*
  * Interrupts are disabled on entry as trap3 is an interrupt gate and they
  * remain disabled thorough out this function.
  */
 static int __kprobes kprobe_handler(struct pt_regs *regs)
 {
-   struct kprobe *p;
int ret = 0;
kprobe_opcode_t *addr;
+   struct kprobe *p, *cur;
struct kprobe_ctlblk *kcb;
 
addr = (kprobe_opcode_t *)(regs->eip - sizeof(kprobe_opcode_t));
+   if (*addr != BREAKPOINT_INSTRUCTION) {
+   /*
+* The breakpoint instruction was removed right
+* after we hit it.  Another cpu has removed
+* either a probepoint or a debugger breakpoint
+* at this address.  In either case, no further
+* handling of this interrupt is appropriate.
+* Back up over the (now missing) int3 and run
+* the original instruction.
+*/
+   regs->eip -= sizeof(kprobe_opcode_t);
+   return 1;
+   }
 
-   /*
-* We don't want to be preempted for the entire
-* duration of kprobe processing
-*/
preempt_disable();
kcb = get_kprobe_ctlblk();
+   cur = kprobe_running();
+   p = get_kprobe(addr);
 
-   /* Check we're not actually recursing */
-   if (kprobe_running()) {
-   p = get_kprobe(addr);
-   if (p) {
-   if (kcb->kprobe_status == KPROBE_HIT_SS &&
-   *p->ainsn.insn == BREAKPOINT_INSTRUCTION) {
-   regs->eflags &= ~TF_MASK;
-   regs->eflags |= kcb->kprobe_saved_eflags;
-   goto no_kprobe;
-   }
-   /* We have reentered the kprobe_handler(), since
-* another probe was hit while within the handler.
-* We here save the original kprobes variables and
-* just single step on the instruction of the new probe
-* without calling any user handlers.
-*/
-   save_previous_kprobe(kcb);
-   set_current_kprobe(p, regs, kcb);
-   kprobes_inc_nmissed_count(p);
-   prepare_singlestep(p, regs);
-   kcb->kprobe_status = KPROBE_REENTER;
-   return 1;
-   } else {
-   if (*addr != BREAKPOINT_INSTRUCTION) {
-   /* The breakpoint instruction was removed by
-* another cpu right after we hit, no further
-* handling of this interrupt is appropriate
-*/
-   regs->eip -= sizeof(kprobe_opcode_t);
+   if (p) {
+   if (cur) {
+   switch (kcb->kprobe_status) {
+   case KPROBE_HIT_ACTIVE:
+   case KPROBE_HIT_SSDONE:
+   /* a probe has been hit inside a
+* user handler */
+   save_previous_kprobe(kcb);
+   set_current_kprobe(p, regs, kcb);
+   kprobes_inc_nmissed_count(p

Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-21 Thread Abhishek Sagar
On 11/21/07, Jim Keniston <[EMAIL PROTECTED]> wrote:
> I have one minor suggestion on the code -- see below -- but I'm willing
> to ack that with or without the suggested change.  Please also see
> suggestions on kprobes.txt and the demo program.

Thanks...I've made the necessary changes. More comments below.

> It would be more consistent with the existing text in kprobes.txt to add
> a subsection labeled "User's entry handler (rp->entry_handler):" and
> document the entry_handler there.

I've moved almost all of the entry_handler discussion in a separate section.

> >  static struct kretprobe my_kretprobe = {
> > - .handler = ret_handler,
> > - /* Probe up to 20 instances concurrently. */
> > - .maxactive = 20
> > + .handler = return_handler,
> > + .entry_handler = entry_handler,
> > + .data_size = sizeof(struct my_data),
> > + .maxactive = 1, /* profile one invocation at a time */
>
> I don't like the idea of setting maxactive = 1 here.  That's not normal
> kretprobes usage, which is what we're trying to illustrate here.  This
> is no place for splitting hairs about profiling recursive functions.

I've reverted back to ".maxactive = 20". In any case, there is no need
to protect against recursive instances of sys_open.
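The maxactive/nmissed bookkeeping mentioned above amounts to a fixed,
preallocated instance pool; a toy userspace sketch (hypothetical names, not
the kprobes implementation):

```c
#include <stddef.h>

#define MAXACTIVE 2

/* Hypothetical stand-in for a kretprobe_instance. */
struct instance { int in_use; };

static struct instance pool[MAXACTIVE];	/* preallocated at register time */
static int nmissed;

/* Function entry: grab a free instance, or count a miss.  A miss means
 * no return probe is planted (and no entry_handler runs either). */
static struct instance *take_instance(void)
{
	size_t i;

	for (i = 0; i < MAXACTIVE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			return &pool[i];
		}
	}
	nmissed++;
	return NULL;
}

/* Function return: recycle the instance. */
static void put_instance(struct instance *ri)
{
	ri->in_use = 0;
}
```

With MAXACTIVE == 2 here, a third concurrent entry simply bumps nmissed,
which is why a recursion-prone probed function needs a deeper pool.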

> > @@ -699,6 +699,14 @@ static int __kprobes pre_handler_kretpro
> >struct kretprobe_instance, uflist);
> >   ri->rp = rp;
> >   ri->task = current;
> > +
> > + if (rp->entry_handler) {
> > + if (rp->entry_handler(ri, regs)) {
>
> Could also be
>if (rp->entry_handler && rp->entry_handler(ri, regs)) {

Done.

---
Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>

diff -upNr linux-2.6.24-rc3/Documentation/kprobes.txt
linux-2.6.24-rc3_kp/Documentation/kprobes.txt
--- linux-2.6.24-rc3/Documentation/kprobes.txt  2007-11-17
10:46:36.0 +0530
+++ linux-2.6.24-rc3_kp/Documentation/kprobes.txt   2007-11-21
15:20:53.0 +0530
@@ -96,7 +96,9 @@ or in registers (e.g., for x86_64 or for
 The jprobe will work in either case, so long as the handler's
 prototype matches that of the probed function.

-1.3 How Does a Return Probe Work?
+1.3 Return Probes
+
+1.3.1 How Does a Return Probe Work?

 When you call register_kretprobe(), Kprobes establishes a kprobe at
 the entry to the function.  When the probed function is called and this
@@ -107,7 +109,7 @@ At boot time, Kprobes registers a kprobe

 When the probed function executes its return instruction, control
 passes to the trampoline and that probe is hit.  Kprobes' trampoline
-handler calls the user-specified handler associated with the kretprobe,
+handler calls the user-specified return handler associated with the kretprobe,
 then sets the saved instruction pointer to the saved return address,
 and that's where execution resumes upon return from the trap.

@@ -131,6 +133,30 @@ zero when the return probe is registered
 time the probed function is entered but there is no kretprobe_instance
 object available for establishing the return probe.

+1.3.2 Kretprobe entry-handler
+
+Kretprobes also provides an optional user-specified handler which runs
+on function entry. This handler is specified by setting the entry_handler
+field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the
+function entry is hit, the user-defined entry_handler, if any, is invoked.
+If the entry_handler returns 0 (success) then a corresponding return handler
+is guaranteed to be called upon function return. If the entry_handler
+returns a non-zero error then Kprobes leaves the return address as is, and
+the kretprobe has no further effect for that particular function instance.
+
+Multiple entry and return handler invocations are matched using the unique
+kretprobe_instance object associated with them. Additionally, a user
+may also specify per return-instance private data to be part of each
+kretprobe_instance object. This is especially useful when sharing private
+data between corresponding user entry and return handlers. The size of each
+private data object can be specified at kretprobe registration time by
+setting the data_size field of the kretprobe struct. This data can be
+accessed through the data field of each kretprobe_instance object.
+
+In case probed function is entered but there is no kretprobe_instance
+object available, then in addition to incrementing the nmissed count,
+the user entry_handler invocation is also skipped.
+
 2. Architectures Supported

 Kprobes, jprobes, and return probes are implemented on the following
@@ -273,6 +299,8 @@ of interest:
 - ret_addr: the return address
 - rp: points to the corresponding kretprobe object
 - task: points to the corresponding task struct
+- data: points to per return-instance private data; see "Kretprobe entry-
+  handler" for details.

 The regs_return_value(regs) macro provides a simple abstraction to
 extract the return value from the appropriate register as defined by
@@ -555,23 +583,46 @@ report failed calls

Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-19 Thread Abhishek Sagar
On Nov 17, 2007 6:24 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> > True, some kind of data pointer/pouch is essential.
>
> Yes.  If the pouch idea is too weird, then the data pointer is a good
> compromise.
>
> With the above reservations, your enclosed patch looks OK.
>
> You should provide a patch #2 that updates Documentation/kprobes.txt.
> Maybe that will yield a little more review from other folks.

The inlined patch provides support for an optional user entry-handler
in kretprobes. It also adds provision for private data to be held in
each return instance based on Kevin Stafford's "data pouch" approach.
Kretprobe usage example in Documentation/kprobes.txt has also been
updated to demonstrate the usage of entry-handlers.

Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---
diff -upNr linux-2.6.24-rc2/Documentation/kprobes.txt
linux-2.6.24-rc2_kp/Documentation/kprobes.txt
--- linux-2.6.24-rc2/Documentation/kprobes.txt  2007-11-07
03:27:46.0 +0530
+++ linux-2.6.24-rc2_kp/Documentation/kprobes.txt   2007-11-19
17:41:27.0 +0530
@@ -100,16 +100,21 @@ prototype matches that of the probed fun

 When you call register_kretprobe(), Kprobes establishes a kprobe at
 the entry to the function.  When the probed function is called and this
-probe is hit, Kprobes saves a copy of the return address, and replaces
-the return address with the address of a "trampoline."  The trampoline
-is an arbitrary piece of code -- typically just a nop instruction.
-At boot time, Kprobes registers a kprobe at the trampoline.
-
-When the probed function executes its return instruction, control
-passes to the trampoline and that probe is hit.  Kprobes' trampoline
-handler calls the user-specified handler associated with the kretprobe,
-then sets the saved instruction pointer to the saved return address,
-and that's where execution resumes upon return from the trap.
+probe is hit, the user defined entry_handler is invoked (optional). If
+the entry_handler returns 0 (success) or is not present, then Kprobes
+saves a copy of the return address, and replaces the return address
+with the address of a "trampoline."  If the entry_handler returns a
+non-zero error, the function executes as normal, as if no probe was
+installed on it. The trampoline is an arbitrary piece of code --
+typically just a nop instruction. At boot time, Kprobes registers a
+kprobe at the trampoline.
+
+After the trampoline return address is planted, when the probed function
+executes its return instruction, control passes to the trampoline and
+that probe is hit.  Kprobes' trampoline handler calls the user-specified
+return handler associated with the kretprobe, then sets the saved
+instruction pointer to the saved return address, and that's where
+execution resumes upon return from the trap.

 While the probed function is executing, its return address is
 stored in an object of type kretprobe_instance.  Before calling
@@ -117,6 +122,9 @@ register_kretprobe(), the user sets the
 kretprobe struct to specify how many instances of the specified
 function can be probed simultaneously.  register_kretprobe()
 pre-allocates the indicated number of kretprobe_instance objects.
+Additionally, a user may also specify per-instance private data to
+be part of each return instance. This is useful when using kretprobes
+with a user entry_handler (see "register_kretprobe" for details).

 For example, if the function is non-recursive and is called with a
 spinlock held, maxactive = 1 should be enough.  If the function is
@@ -129,7 +137,8 @@ It's not a disaster if you set maxactive
 some probes.  In the kretprobe struct, the nmissed field is set to
 zero when the return probe is registered, and is incremented every
 time the probed function is entered but there is no kretprobe_instance
-object available for establishing the return probe.
+object available for establishing the return probe. A miss also prevents
+user entry_handler from being invoked.

 2. Architectures Supported

@@ -258,6 +267,16 @@ Establishes a return probe for the funct
 rp->kp.addr.  When that function returns, Kprobes calls rp->handler.
 You must set rp->maxactive appropriately before you call
 register_kretprobe(); see "How Does a Return Probe Work?" for details.
+An optional entry-handler can also be specified by initializing
+rp->entry_handler. This handler is called at the beginning of the
+probed function (except for instances exceeding rp->maxactive). On
+success the entry_handler return 0, which guarantees invocation of
+a corresponding return handler. Corresponding entry and return handler
+invocations can be matched using the return instance (ri) passed to them.
+Also, each ri can hold per-instance private data (ri->data), whose size
+is determined by rp->data_size. If the entry_handler returns a non-zero
+error, the current return instance is skipped and no return handler is
+called for the current instance.

 register_kretprobe() returns 0 on success, or a negative errno
 otherwise.
@@ -555,23 +574,46

Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-17 Thread Abhishek Sagar
On Nov 17, 2007 6:24 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> It'd be helpful to see others (especially kprobes maintainers) chime in
> on this.  In particular, if doing kmalloc/kfree of GFP_ATOMIC data at
> kretprobe-hit time is OK, as in Abhishek's approach, then we could also
> use GFP_ATOMIC (or at least GFP_NOWAIT) allocations to make up the
> difference when we run low on kretprobe_instances.

It might cause a problem with return instances having a large value of
entry_info_sz, being allocated in the page frame reclamation code
path.

> > > entry_info is, by default, a zero-length array, which adds nothing to
> > > the size of a uretprobe_instance -- at least on the 3 architectures I've
> > > tested on (i386, x86_64, powerpc).
> >
> > Strange, because from what I could gather, the data pouch patch had
> > the following in the kretprobe registration routine:
> >
> >
> > for (i = 0; i < rp->maxactive; i++) {
> > - inst = kmalloc(sizeof(struct kretprobe_instance), GFP_KERNEL);
> > + inst = kmalloc((sizeof(struct kretprobe_instance)
> > + + rp->entry_info_sz), GFP_KERNEL);
> >
> >
> > rp->entry_info_sz is presumably the size of the private data structure
> > of the registering module.
>
> ... which is zero for kretprobes that don't use the data pouch.
>
> > This is the bloat I was referring to. But
> > this difference should've showed up in your tests...?
>
> What bloat?  On my 32-bit system, the pouch to hold struct prof_data in
> your test_module example would be 20 bytes.  (For comparison,
> sizeof(struct kretprobe_instance) = 28, btw.)  Except for functions like
> schedule(), where a lot of tasks can be sleeping at the same time, an
> rp->maxactive value of 5 or 10 is typically plenty.  That's 100-200
> bytes of "bloat" spent at registration time (GFP_KERNEL), at least some
> of which will be saved at probe-hit time (GFP_ATOMIC).  (And if somebody
> says, "I always use a much higher value of rp->maxactive," then he/she's
> probably not really worried about bloat.)

Ok. Will make the necessary transition to registration time allocation
of private data.
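A registration-time layout along those lines could use a flexible array
member, so each instance carries its "pouch" in the same allocation and the
hit path needs no kmalloc at all (userspace sketch with made-up names, not
the actual patch):

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical instance type: the private "pouch" is a flexible array
 * member sized once, at registration time. */
struct ri_model {
	int in_use;
	char data[];
};

/* One allocation per instance when the probe is registered -- nothing
 * allocated on the hit path. */
static struct ri_model *alloc_instance(size_t data_size)
{
	return calloc(1, sizeof(struct ri_model) + data_size);
}

/* Example private struct, as a profiling module might define. */
struct my_data {
	long entry_stamp;
};
```

The caller would pass sizeof(struct my_data) as data_size and cast ri->data
back to its private type in the handlers.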

> Yes.  If the pouch idea is too weird, then the data pointer is a good
> compromise.
>
> With the above reservations, your enclosed patch looks OK.
>
> You should provide a patch #2 that updates Documentation/kprobes.txt.
> Maybe that will yield a little more review from other folks.

Will incorporate changes to kprobes.txt as well.

- Abhishek


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-17 Thread Abhishek Sagar
On Nov 17, 2007 4:39 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> First of all, as promised, here's what would be different if it were
> implemented using the data-pouch approach:
>
> --- abhishek1.c 2007-11-16 13:57:13.0 -0800
> +++ jim1.c  2007-11-16 14:20:39.0 -0800
> @@ -50,15 +50,12 @@
> if (stats)
> return 1; /* recursive/nested call */
>
> -   stats = kmalloc(sizeof(struct prof_data), GFP_ATOMIC);
> -   if (!stats)
> -   return 1;
> +   stats = (struct prof_data *) ri->entry_info;
>
> stats->entry_stamp = sched_clock();
> stats->task = current;
> INIT_LIST_HEAD(&stats->list);
> list_add(&stats->list, &stats_nodes);
> -   ri->data = stats;
> return 0;
>  }
>
> @@ -66,10 +63,9 @@
>  static int return_handler(struct kretprobe_instance *ri, struct pt_regs
> *regs)
>  {
> unsigned long flags;
> -   struct prof_data *stats = (struct prof_data *)ri->data;
> +   struct prof_data *stats = (struct prof_data *)ri->entry_info;
> u64 elapsed;
>
> -   BUG_ON(ri->data == NULL);
> elapsed = (long long)sched_clock() - (long long)stats->entry_stamp;
>
> /* update stats */
> @@ -79,13 +75,13 @@
> spin_unlock_irqrestore(&time_lock, flags);
>
> list_del(&stats->list);
> -   kfree(stats);
> return 0;
>  }
>
>  static struct kretprobe my_kretprobe = {
> .handler = return_handler,
> .entry_handler = entry_handler,
> +   .entry_info_sz = sizeof(struct prof_data)
>  };
>
>  /* called after every PRINT_DELAY seconds */
>
> So the data-pouch approach saves you a little code and a kmalloc/kfree
> round trip on each kretprobe hit.  A kmalloc/kfree round trip is about
> 80 ns on my system, or about 20% of the base cost of a kretprobe hit.  I
> don't know how important that is to people.
>
> I also note that this particular example maintains a per-task list of
> prof_data objects to avoid overcounting the time spent in a recursive
> function.  That adds about 30% to the size of your module source (136
> lines vs. 106, by my count).  I suspect that many instrumentation
> modules wouldn't need such a list.  However, without your ri->data
> pointer (or Kevin's ri->entry_info pouch), every instrumentation module
> that uses your enhancement would need such a list in order to map the ri
> to the per-instance data.

Those are interesting numbers. Will incorporate pouching in the next
patch. Even with a data pointer or pouch, the mapping of ri (or
ri->data) would sometimes be necessary. It's required to catch
recursive/nested invocation cases. In the case of the time measurement test
module, these invocations needed to be weeded out, and therefore such a
list was required. Other scenarios might not care for it. E.g a module
which measures the change in some global system state across every
call.
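The recursion/nested-call weeding described above can be sketched in plain userspace C. This is illustrative only, not kprobes API code: the depth counter stands in for the per-task prof_data lookup, and the handler names are made up for the sketch.

```c
#include <assert.h>

/* Userspace sketch: the entry handler bails out on recursive/nested
 * hits, so only the outermost invocation is profiled. */

static int depth;         /* stands in for the per-task prof_data lookup */
static int profiled_hits; /* how many outermost entries were accepted */

static int entry_handler(void)
{
	if (depth++ > 0)
		return 1;         /* recursive/nested call: skip profiling */
	profiled_hits++;      /* outermost entry: start the measurement here */
	return 0;
}

static void return_handler(void)
{
	depth--;              /* outermost exit would stop the measurement */
}

static long probed_factorial(int n)
{
	long r;

	entry_handler();
	r = (n <= 1) ? 1 : n * probed_factorial(n - 1);
	return_handler();
	return r;
}
```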

Thanks for the comments.

> Jim

- Abhishek



Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-16 Thread Abhishek Sagar
On Nov 16, 2007 5:37 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-11-15 at 20:30 +0530, Abhishek Sagar wrote:
> > On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> > > 2. Simplify the task of correlating data (e.g., timestamps) between
> > > function entry and function return.
> >
> > Would adding of data and len fields in ri help? Instead of "pouching"
> > data in one go at registration time, this would let user handlers do
> > the allocation
>
> Yes and no.  Adding just a data field -- void*, or maybe unsigned long
> long so it's big enought to accommodate big timestamps -- would be a big
> improvement on your current proposal.  That would save the user the
> drudgery of mapping the ri pointer to his/her per-instance data.
> There's plenty of precedent for passing "private_data" values to
> callbacks.
>
> I don't think a len field would help much.  If such info were needed, it
> could be stored in the data structure pointed to by the data field.
>
> I still don't think "letting [i.e., requiring that] user handlers do the
> allocation" is a win.  I'm still interested to see how this plays out in
> real examples.
>
> > and allow them to use different kinds of data
> > structures per-instance.
>
> I haven't been able to think of any scenarios where this would be
> useful.  A "data pouch" could always contain a union, FWIW.

I'm inlining a sample module which uses a data pointer in ri. I didn't
go for a timestamp because it's not reliable. Some platforms might
simply not have any h/w timestamp counters. For the same reason a lot
of platforms (on ARM, say) have their sched_clock() mapped on jiffies.
This may prevent timestamps from being distinct across function entry
and exit. Plus a data pointer looks pretty harmless.

--- test module ---

#include <linux/kernel.h>
#include <linux/version.h>
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/ktime.h>

#define PRINT_DELAY (10 * HZ)

/* This module calculates the total time and instances of func being called
 * across all cpu's. An average is calculated every 10 seconds and displayed.
 * Only one function instance per-task is monitored. This cuts out the
 * possibility of measuring time for recursive and nested function
 * invocations.
 *
 * Note: If compiling as a standalone module, make sure sched_clock() is
 * exported in the kernel. */

/* per-task data */
struct prof_data {
struct task_struct *task;
struct list_head list;
unsigned long long entry_stamp;
};

static const char *func = "sys_open";
static spinlock_t time_lock;
static ktime_t total_time;
static unsigned long hits;
static LIST_HEAD(data_nodes); /* list of per-task data */
static struct delayed_work work_print;

static struct prof_data *get_per_task_data(struct task_struct *tsk)
{
struct prof_data *p;

/* lookup prof_data corresponding to tsk */
list_for_each_entry(p, &data_nodes, list) {
if (p->task == tsk)
return p;
}
return NULL;
}

/* called with kretprobe_lock held */
static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
struct prof_data *stats;

stats = get_per_task_data(current);
if (stats)
return 1; /* recursive/nested call */

stats = kmalloc(sizeof(struct prof_data), GFP_ATOMIC);
if (!stats)
return 1;

stats->entry_stamp = sched_clock();
stats->task = current;
INIT_LIST_HEAD(&stats->list);
list_add(&stats->list, &data_nodes);
ri->data = stats;
return 0;
}

/* called with kretprobe_lock held */
static int return_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
unsigned long flags;
struct prof_data *stats = (struct prof_data *)ri->data;
u64 elapsed;

BUG_ON(ri->data == NULL);
elapsed = (long long)sched_clock() - (long long)stats->entry_stamp;

/* update stats */
spin_lock_irqsave(&time_lock, flags);
++hits;
total_time = ktime_add_ns(total_time, elapsed);
spin_unlock_irqrestore(&time_lock, flags);

list_del(&stats->list);
kfree(stats);
return 0;
}

static struct kretprobe my_kretprobe = {
.handler = return_handler,
.entry_handler = entry_handler,
};

/* called after every PRINT_DELAY seconds */
static void print_time(struct work_struct *work)
{
unsigned long flags;
s64 time_ns;
struct timespec ts;

BUG_ON(work != &work_print.work);
spin_lock_irqsave(&time_lock, flags);
time_ns = ktime_to_ns(total_time);
do_div(time_ns, hits);
spin_unlock_irqrestore(&time_lock, flags);

ts = ns_to_timespec(time_ns);
printk(KERN_DEBUG "Avg. running time of %s = %ld sec, %ld nsec\n",
   func, ts.tv_sec, ts.tv_nsec);
schedule_delayed_work(&work_print, PRINT_DELAY);
}

Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-16 Thread Abhishek Sagar
> > No, a copy is required because if the entry_handler() returns error (a
> > voluntary miss), then there has to be a way to roll-back all the
> > changes that arch_prepare_kretprobe() and entry_handler() made to
> > *regs. Such an instance is considered "aborted".
>
> No.  Handlers shouldn't be writing to the pt_regs struct unless they
> want to change the operation of the probed code.  If an entry handler
> scribbles on *regs and then decides to "abort", it's up to that handler
> to restore the contents of *regs.  It's that way with all kprobe and
> kretprobe handlers, and the proposed entry handler is no different, as
> far as I can see.

I had the copy mainly to undo the effect of arch_prepare_kretprobe()
on regs if entry_handler decided to abort. But I think it didn't serve
that purpose as well since arch_prepare_kretprobe() ends up modifying
the stack on x86. So I've gotten rid of the copy now. The
entry_handler now gets called before arch_prepare_kretprobe() and
therefore shouldn't expect to use ri->ret_addr. In effect the
entry_handler is an early decision maker on whether the return
instance setup should even take place for the current function hit.
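The proposed dispatch order can be sketched as a small userspace gatekeeper (illustrative only — `on_function_entry` and the counter are invented names, not the real pre_handler_kretprobe): the entry handler runs first, and the return instance is set up only when it returns 0, so return handlers fire exactly once per accepted entry.

```c
#include <assert.h>

/* Sketch of the gatekeeper dispatch: a nonzero return from the entry
 * handler is a voluntary miss, and no return instance is armed. */

typedef int (*entry_handler_t)(void);

static int instances_armed; /* stands in for arch_prepare_kretprobe() */

static void on_function_entry(entry_handler_t eh)
{
	if (eh && eh() != 0)
		return;          /* voluntary miss: no instance, no return handler */
	instances_armed++;   /* instance set up; the return handler will run */
}
```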

> > Wouldn't the return addresses be different for recursive/nested calls?
>
> Not for recursive calls.  Think about it.  Or write a little program
> that calls a recursive function like factorial(), and have factorial()
> report the value of __builtin_return_address(0) each time it's called.

Oh yea got it...I was stuck with the nested call case and forgot about
direct recursion :)
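The "little program" suggested above is easy to write: every recursive invocation of factorial() is made from the same call site inside the function, so they all observe the same return address; only the top-level call from outside differs. This sketch assumes GCC/Clang (`__builtin_return_address`, `noinline`); the volatile function pointer keeps the compiler from flattening the recursion into a loop.

```c
#include <assert.h>

#define MAX_DEPTH 16
static void *ret_addrs[MAX_DEPTH]; /* return address seen on each entry */
static int ncalls;

static long factorial(int n);
/* Calling through a volatile pointer prevents tail-recursion
 * elimination, so each inner call really happens. */
static long (*volatile factorial_ptr)(int) = factorial;

__attribute__((noinline))
static long factorial(int n)
{
	if (ncalls < MAX_DEPTH)
		ret_addrs[ncalls++] = __builtin_return_address(0);
	return (n <= 1) ? 1 : n * factorial_ptr(n - 1);
}
```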

> > > > The fact that ri is passed to both handlers should allow any user
> > > > handler to identify each of these cases and take appropriate
> > > > synchronization action pertaining to its private data, if needed.
> > >
> > > I don't think Abhishek has made his case here.  See below.
> > >
> > > >
> > > > > (Hence I feel sol a) would be nice).
> > > >
> > > > With an entry-handler, any module aiming to profile running time of a
> > > > function (say) can simply do the following without being "return
> > > > instance" conscious. Note however that I'm not trying to address just
> > > > this scenario but trying to provide a general way to use
> > > > entry-handlers in kretprobes:
> > > >
> > > > static unsigned  long flag = 0; /* use bit 0 as a global flag */
> > > > unsigned long long entry, exit;
> > > >
> > > > int my_entry_handler(struct kretprobe_instance *ri, struct pt_regs 
> > > > *regs)
> > > > {
> > > > if (!test_and_set_bit(0, &flag))
> > > > /* this instance claims the first entry to kretprobe'd function */
> > > > entry = sched_clock();
> > > > /* do other stuff */
> > > > return 0; /* right on! */
> > > > }
> > > > return 1; /* error: no return instance to be allocated for this
> > > > function entry */
> > > > }
> > > >
> > > > /* will only be called iff flag == 1 */
> > > > int my_return_handler(struct kretprobe_instance *ri, struct pt_regs 
> > > > *regs)
> > > > {
> > > > BUG_ON(!flag);
> > > > exit = sched_clock();
> > > > set_bit(0, &flag);
> > > > }
> > > >
> > > > I think something like this should do the trick for you.
> > >
> > > Since flag is static, it seems to me that if there were instances of the
> > > probed function active concurrently in multiple tasks, only the
> > > first-called instance would be  profiled.
> >
> > Oh that's right, or we could use a per-cpu flag which would restrict
> > us to only one profiling instance per processor.
>
> If the probed function can sleep, then it could return on a different
> CPU; a per-cpu flag wouldn't work in such cases.

Hmmm...since disabling preemption from entry_handler won't work, a
more practical approach would be to associate all return instances to
tasks. I'll try to come up with an example to exploit the per-task
return instance associativity.

> >
> > > Jim Keniston
> >
> > --
> > Thanks & Regards
> > Abhishek Sagar
> >
> > ---
> > Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
> >
> > diff -upNr linux-2.6.24-rc2/include/linux/kprobes.h
> > linux-2.6.24-rc2_kp/include/linux/kprobes.h
> > --- linux-2.6.24-rc2/include/linux/kprobes.h  2007-11-07 03:27:46.0 
> > +0530
> > +++ linux-2.6.24-rc2_kp/include/linux/kprobes.h   2007-1


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-15 Thread Abhishek Sagar
On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> 2. Simplify the task of correlating data (e.g., timestamps) between
> function entry and function return.

Would adding of data and len fields in ri help? Instead of "pouching"
data in one go at registration time, this would let user handlers do
the allocation and allow them to use different kinds of data
structures per-instance.

- Abhishek Sagar


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-15 Thread Abhishek Sagar
On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> On Wed, 2007-11-14 at 19:00 +0530, Abhishek Sagar wrote:
>
> First of all, some general comments.  We seem to be trying to solve two
> problems here:
> 1. Prevent the asymmetry in entry- vs. return-handler calls that can
> develop when we temporarily run out of kretprobe_instances.  E.g., if we
> have m kretprobe "misses", we may report n calls but only (n-m) returns.

That has already been taken care of. The entry-handler is called iff
'hlist_empty(&rp->free_instances)' is false. Additionally, if
entry_handler() returns an error then the corresponding return handler
is also not called because that particular "return instance" is
aborted/voluntarily-missed. Hence the following guarantee is implied:

No. of return handler calls = No. of entry_handler calls which
returned 0 (success).

The number of failed entry_handler calls doesn't update any kind of
ri->voluntary_nmissed count since the user handlers are internally
aware of them (unlike ri->nmissed).

> 2. Simplify the task of correlating data (e.g., timestamps) between
> function entry and function return.

Right. Srinivasa and you have been hinting at the use of per-instance
private data for the same. I think ri should be enough.

> Problem #1 wouldn't exist if we could solve problem #1a:
> 1a. Ensure that we never run out of kretprobe_instances (for some
> appropriate value of "never").
>
> I've thought of various approaches to #1a -- e.g., allocate
> kretprobe_instances from GFP_ATOMIC memory when we're out of
> preallocated instances -- but I've never found time to pursue them.
> This might be a good time to brainstorm solutions to that problem.
>
> Lacking a solution to #1a, I think Abhishek's approach provides a
> reasonable solution to problem #1.

If you're not convinced that problem #1 is appropriately handled,
then we should look for something like that, I guess.

> I don't think it takes us very far toward solving #2, though.  I agree
> with Srinivasa that it would be more helpful if we could store the data
> collected at function-entry time right in the kretprobe_instance. Kevin
> Stafford prototyped this "data pouch" idea a couple of years ago --
> http://sourceware.org/ml/systemtap/2005-q3/msg00593.html
> -- but an analogous feature was implemented at a higher level in
> SystemTap.  Either approach -- in kprobes or SystemTap -- would benefit
> from a fix to #1 (or #1a).

There are three problems in the "data pouch" approach, which is a
generalized case of Srinivasa's timestamp case:

1. It bloats the size of each return instance to a run-time defined
value. I'm certain that quite a few memory starved ARM kprobes users
would certainly be wishing they could install more probes on their
system without taking away too much from the precious memory pools
which can impact their system performance. This is not a deal breaker
though, just an annoyance.

2. Forces user entry/return handlers to use ri->entry_info (see
Kevin's patch) even if their design only wanted such private data to
be used in certain instances. Therefore ideally, any per-instance data
allocation should be left to user entry handlers, IMO. Even if we
allow a pouch for private data in a return instance, the user handlers
would still need be aware of "return instances" to actually use them
globally.

3. It's redundant. 'ri' can uniquely identify any entry-return handler
pair. This itself solves your problem #2. It only moves the onus of
private data allocation to user handlers.
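Correlating entry and exit data by keying on the instance pointer, as argued above, can be sketched in userspace C (illustrative only — `instance` stands in for a `kretprobe_instance *`, and the list is the analogue of the module-maintained mapping):

```c
#include <assert.h>
#include <stdlib.h>

/* Map an opaque instance pointer to per-instance entry data, so the
 * return side can look up (and free) what the entry side recorded. */

struct node {
	const void *instance; /* key: unique per entry/return pair */
	long entry_stamp;
	struct node *next;
};

static struct node *head;

static int on_entry(const void *instance, long stamp)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return 1;         /* voluntary miss */
	n->instance = instance;
	n->entry_stamp = stamp;
	n->next = head;
	head = n;
	return 0;
}

static long on_return(const void *instance, long stamp)
{
	struct node **pp, *n;

	for (pp = &head; (n = *pp) != NULL; pp = &n->next) {
		if (n->instance == instance) {
			long elapsed = stamp - n->entry_stamp;

			*pp = n->next;
			free(n);
			return elapsed;
		}
	}
	return -1;            /* no matching entry was recorded */
}
```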

> Review of Abhishek's patch:
>
> I see no reason to save a copy of *regs and pass that to the entry
> handler.  Passing the real regs pointer is good enough for other kprobe
> handlers.

No, a copy is required because if the entry_handler() returns error (a
voluntary miss), then there has to be a way to roll-back all the
changes that arch_prepare_kretprobe() and entry_handler() made to
*regs. Such an instance is considered "aborted".

> And if a handler on i386 uses &regs->esp as the value of the
> stack pointer (which is correct -- long story), it'll get the wrong
> value if its regs arg points at the copy.

That's a catch! I've made the fix (see inlined patch below). It now
passes regs instead of &copy to both the entry_handler() and
arch_prepare_kretprobe().

> More comments below.
[snip]
> > 1. Multiple function entries from various tasks (the one you've just
> > pointed out).
> > 2. Multiple kretprobe registration on the same function.
> > 3. Nested calls of kretprobe'd function.
> >
> > In cases 1 and 3, the following information can be used to match
> > corresponding entry and return handlers inside a user handler (if
> > needed):
> >
> > (ri->task, ri->ret_addr)
> > where ri is st

Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-15 Thread Abhishek Sagar
On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> On Wed, 2007-11-14 at 19:00 +0530, Abhishek Sagar wrote:
>
> First of all, some general comments.  We seem to be trying to solve two
> problems here:
> 1. Prevent the asymmetry in entry- vs. return-handler calls that can
> develop when we temporarily run out of kretprobe_instances.  E.g., if we
> have m kretprobe misses, we may report n calls but only (n-m) returns.

That has already been taken care of. The entry-handler is called iff
'hlist_empty(&rp->free_instances)' is false. Additionally, if
entry_handler() returns an error then the corresponding return handler
is also not called because that particular return instance is
aborted/voluntarily-missed. Hence the following guarantee is implied:

No. of return handler calls = No. of entry_handler calls which
returned 0 (success).

The number of failed entry_handler calls doesn't update any kind of
ri->voluntary_nmissed count since the user handlers are internally
aware of them (unlike ri->nmissed).

> 2. Simplify the task of correlating data (e.g., timestamps) between
> function entry and function return.

Right. Srinivasa and you have been hinting at the use of per-instance
private data for the same. I think ri should be enough.

> Problem #1 wouldn't exist if we could solve problem #1a:
> 1a. Ensure that we never run out of kretprobe_instances (for some
> appropriate value of never).
>
> I've thought of various approaches to #1a -- e.g., allocate
> kretprobe_instances from GFP_ATOMIC memory when we're out of
> preallocated instances -- but I've never found time to pursue them.
> This might be a good time to brainstorm solutions to that problem.
>
> Lacking a solution to #1a, I think Abhishek's approach provides a
> reasonable solution to problem #1.

If you're not convinced that problem #1 isn't appropriately handled,
we should look for something like that I guess.

> I don't think it takes us very far toward solving #2, though.  I agree
> with Srinivasa that it would be more helpful if we could store the data
> collected at function-entry time right in the kretprobe_instance. Kevin
> Stafford prototyped this "data pouch" idea a couple of years ago --
> http://sourceware.org/ml/systemtap/2005-q3/msg00593.html
> -- but an analogous feature was implemented at a higher level in
> SystemTap.  Either approach -- in kprobes or SystemTap -- would benefit
> from a fix to #1 (or #1a).

There are three problems in the data pouch approach, which is a
generalized case of Srinivasa's timestamp case:

1. It bloats the size of each return instance to a run-time defined
value. I'm certain that quite a few memory starved ARM kprobes users
would certainly be wishing they could install more probes on their
system without taking away too much from the precious memory pools
which can impact their system performance. This is not a deal breaker
though, just an annoyance.

2. Forces user entry/return handlers to use ri->entry_info (see
Kevin's patch) even if their design only wanted such private data to
be used in certain instances. Therefore ideally, any per-instance data
allocation should be left to user entry handlers, IMO. Even if we
allow a pouch for private data in a return instance, the user handlers
would still need be aware of return instances to actually use them
globally.

3. It's redundant. 'ri' can uniquely identify any entry-return handler
pair. This itself solves your problem #2. It only moves the onus of
private data allocation to user handlers.

> Review of Abhishek's patch:
>
> I see no reason to save a copy of *regs and pass that to the entry
> handler.  Passing the real regs pointer is good enough for other kprobe
> handlers.

No, a copy is required because if the entry_handler() returns error (a
voluntary miss), then there has to be a way to roll-back all the
changes that arch_prepare_kretprobe() and entry_handler() made to
*regs. Such an instance is considered "aborted".

> And if a handler on i386 uses &regs->esp as the value of the
> stack pointer (which is correct -- long story), it'll get the wrong
> value if its regs arg points at the copy.

That's a catch! I've made the fix (see inlined patch below). It now
passes regs instead of &copy to both the entry_handler() and
arch_prepare_kretprobe().

> More comments below.
[snip]
> > 1. Multiple function entries from various tasks (the one you've just
> > pointed out).
> > 2. Multiple kretprobe registration on the same function.
> > 3. Nested calls of kretprobe'd function.
> >
> > In cases 1 and 3, the following information can be used to match
> > corresponding entry and return handlers inside a user handler (if
> > needed):
> >
> > (ri->task, ri->ret_addr)
> > where ri is struct kretprobe_instance *
> >
> > This tuple should uniquely identify a return address (right?).
>
> But if it's a recursive function, there could be multiple instances in
> the same task with the same return address.  The stack pointer would be
> different, FWIW.

Wouldn't the return addresses be different for recursive/nested calls?
I think the only

Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-15 Thread Abhishek Sagar
On Nov 15, 2007 4:21 AM, Jim Keniston <[EMAIL PROTECTED]> wrote:
> 2. Simplify the task of correlating data (e.g., timestamps) between
> function entry and function return.

Would adding of data and len fields in ri help? Instead of pouching
data in one go at registration time, this would let user handlers do
the allocation and allow them to use different kinds of data
structures per-instance.

- Abhishek Sagar
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-14 Thread Abhishek Sagar
On Nov 14, 2007 3:53 PM, Srinivasa Ds <[EMAIL PROTECTED]> wrote:
> No, even though return instances are chained in an order, order of execution of
> return handler entirely depends on which process returns first

Right...the LIFO chain analogy holds true for return instances for the
same task only. As you've pointed out, kretprobe_instance is the only
thing that can bind corresponding entry and return handlers together,
which has been taken care of.

> So entry_handler() which gets executed last doesn't guarantee
> that its return handler will be executed first(because it took a lot time
> to return).

Only if there are return instances pending belonging to different tasks.

> So only thing to match the entry_handler() with its return_handler() is
> return probe instance(ri)'s address, which user has to take care explicitly

Lets see how entry and return handlers can be matched up in three
different scenarios:-

1. Multiple function entries from various tasks (the one you've just
pointed out).
2. Multiple kretprobe registration on the same function.
3. Nested calls of kretprobe'd function.

In cases 1 and 3, the following information can be used to match
corresponding entry and return handlers inside a user handler (if
needed):

(ri->task, ri->ret_addr)
where ri is struct kretprobe_instance *

This tuple should uniquely identify a return address (right?).


In case 2, entry and return handlers are anyways called in the right
order (taken care of by
trampoline_handler() due to LIFO insertion in ri->hlist).

The fact that ri is passed to both handlers should allow any user
handler to identify each of these cases and take appropriate
synchronization action pertaining to its private data, if needed.

> (Hence I feel sol a) would be nice).

With an entry-handler, any module aiming to profile running time of a
function (say) can simply do the following without being "return
instance" conscious. Note however that I'm not trying to address just
this scenario but trying to provide a general way to use
entry-handlers in kretprobes:

static unsigned long flag = 0; /* use bit 0 as a global flag */
unsigned long long entry, exit;

int my_entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	if (!test_and_set_bit(0, &flag)) {
		/* this instance claims the first entry to kretprobe'd function */
		entry = sched_clock();
		/* do other stuff */
		return 0; /* right on! */
	}
	return 1; /* error: no return instance to be allocated for this
		     function entry */
}

/* will only be called iff flag == 1 */
int my_return_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	BUG_ON(!flag);
	exit = sched_clock();
	clear_bit(0, &flag); /* release the flag for the next entry */
	return 0;
}

I think something like this should do the trick for you.

> Thanks
>  Srinivasa DS

--
Thanks & Regards
Abhishek Sagar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-14 Thread Abhishek Sagar
On Nov 14, 2007 1:27 PM, Srinivasa Ds <[EMAIL PROTECTED]> wrote:
> 1) How do you map the entry_handler(which gets executed when a process enters 
> the function) with each instance of return probe handler.
> I accept that entry_handler() will execute each time process enters the 
> function, but to  calculate time, one needs to know corresponding instance of 
> return probe handler(there should be a map for each return handler).

Yes, a one-to-one correspondence between the entry and return handlers
is important. I've tried to address this by passing the same
kretprobe_instance to both the entry and return handlers.

>   Let me explain briefly.
> Suppose in a SMP system, 4 processes  enter the same function at almost 
> sametime(by executing entry_hanlder()) and returns from 4 different locations 
> by executing the return handler.  Now how do I match entry_handler() with 
> corresponding instance of return handler for calculating time.

Right, and this scenario would also occur on UPs where the kretprobe'd
function has nested calls. This has been taken care of (see below).

> Now What I think is, there could be 2 solutions to these problem.
>
> a) Collect the entry time and exit time and put it in that kretprobe_instance 
> structure and fetch it before freeing that instance.

In case someone wants to calculate the entry and exit timestamps of a
function using kretprobes, the appropriate place for it is not the
entry and return handlers. That's because the path from when a function
is called or from when it returns, to the point of invocation of the
entry or return handler is not O(1).

Looking at trampoline_handler(), it actually traverses a list of all
pending return instances to reach the correct return instance. So any
kind of exit timestamp must be placed just before that and passed to
the return handler via kretprobe_instance (as you just suggested).

> b) Or pass ri(kretprobe_instance address to entry_handler) and match it with 
> return probe handler.

This is what I'm trying to do with this patch. I hope I've not misread
what you meant here, but as you'll notice from the patch:


+   if (rp->entry_handler) {
+   copy = *regs;
+   arch_prepare_kretprobe(ri, &copy);
+   if (rp->entry_handler(ri, &copy))
< (entry-handler with ri)
+   goto out; /* skip current kretprobe instance */
+   *regs = copy;
+   } else {
+   arch_prepare_kretprobe(ri, regs);
+   }

the entry handler is called with the appropriate return instance. I
haven't put any explicit "match" test here for ri. The reason is that
the correct ri would be passed to both the entry and return handlers
as trampoline_handler() explicitly matches them to the correct task.
Note that all pending return instances of a function are chained in
LIFO order. So the entry-handler which gets called last, should have
its return handler called first (in case of multiple pending return
instances).

Another cool thing here is that if the entry handler returns a
non-zero value, the current return instance is aborted altogether. So
if the entry-handler fails, no return handler gets called.

Does this address your concerns?

--
Regards
Abhishek Sagar

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-13 Thread Abhishek Sagar
On Nov 13, 2007 12:09 AM, Abhishek Sagar <[EMAIL PROTECTED]> wrote:
> Whoops...sry for the repeated email..emailer trouble.

Expecting this one to make it to the list. Summary again:

This patch introduces a provision to specify a user-defined callback
to run at function entry to complement the return handler in
kretprobes. Currently, whenever a kretprobe is registered, a user can
specify a callback (return-handler) to be run each time the target
function returns. This is also not guaranteed and is limited by the
number of concurrently pending return instances of the target function
in the current process's context.

This patch will now allow registration of another user defined handler
which is guaranteed to run each time the current return instance is
allocated and the return handler is set-up. Conversely, if the
entry-handler returns an error, it'll cause the current return
instance to be dropped and the return handler will also not run. The
purpose is to provide flexibility to do certain kinds of function
level profiling using kretprobes. By being able to register function
entry and return handlers, kretprobes will now be able to reduce an
extra probe registration (and associated race) for scenarios where an
entry handler is required to capture the function call/entry event
along with the corresponding function exit event.

Hope these simple changes would suffice; all suggestions/corrections
are welcome.


Signed-off-by: Abhishek Sagar <[EMAIL PROTECTED]>
---

diff -X /home/sagara/kprobes/dontdiff -upNr
linux-2.6.24-rc2/include/linux/kprobes.h
linux-2.6.24-rc2_kp/include/linux/kprobes.h
--- linux-2.6.24-rc2/include/linux/kprobes.h2007-11-07 03:27:46.0 
+0530
+++ linux-2.6.24-rc2_kp/include/linux/kprobes.h 2007-11-13
16:04:35.0 +0530
@@ -152,6 +152,7 @@ static inline int arch_trampoline_kprobe
 struct kretprobe {
struct kprobe kp;
kretprobe_handler_t handler;
+   kretprobe_handler_t entry_handler;
int maxactive;
int nmissed;
struct hlist_head free_instances;
diff -X /home/sagara/kprobes/dontdiff -upNr
linux-2.6.24-rc2/kernel/kprobes.c linux-2.6.24-rc2_kp/kernel/kprobes.c
--- linux-2.6.24-rc2/kernel/kprobes.c   2007-11-07 03:27:46.0 +0530
+++ linux-2.6.24-rc2_kp/kernel/kprobes.c2007-11-13 16:04:17.0 
+0530
@@ -694,12 +694,22 @@ static int __kprobes pre_handler_kretpro
spin_lock_irqsave(&kretprobe_lock, flags);
if (!hlist_empty(&rp->free_instances)) {
struct kretprobe_instance *ri;
+   struct pt_regs copy;

ri = hlist_entry(rp->free_instances.first,
 struct kretprobe_instance, uflist);
ri->rp = rp;
ri->task = current;
-   arch_prepare_kretprobe(ri, regs);
+
+   if (rp->entry_handler) {
+   copy = *regs;
+   arch_prepare_kretprobe(ri, &copy);
+   if (rp->entry_handler(ri, &copy))
+   goto out; /* skip current kretprobe instance */
+   *regs = copy;
+   } else {
+   arch_prepare_kretprobe(ri, regs);
+   }

/* XXX(hch): why is there no hlist_move_head? */
hlist_del(&ri->uflist);
@@ -707,6 +717,7 @@ static int __kprobes pre_handler_kretpro
hlist_add_head(&ri->hlist, kretprobe_inst_table_head(ri->task));
} else
rp->nmissed++;
+out:
spin_unlock_irqrestore(&kretprobe_lock, flags);
return 0;
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] kprobes: Add user entry-handler in kretprobes

2007-11-12 Thread Abhishek Sagar
On Nov 13, 2007 12:01 AM, Abhishek Sagar <[EMAIL PROTECTED]> wrote:
> Hope these simple changes would suffice; all suggestions/corrections are 
> welcome.

Whoops...sry for the repeated email..emailer trouble.

--
Regards,
Abhishek
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Out of memory management in embedded systems

2007-09-29 Thread Abhishek Sagar
On 9/29/07, Daniel Spång <[EMAIL PROTECTED]> wrote:
> > An embedded system is NOT an ordinary system that happens to
> > boot from flash. An embedded system requires intelligent design.
>
> We might be talking about slightly different systems. I agree that
> systems that are really embedded, in the classic meaning often with
> real time constraints, should be designed as you suggests. But there
> are a lot of other systems that almost actually are ordinary systems
> but with limited memory and often without demand paging.

[snip]

> For those systems I think we need a method to dynamically decrease the
> working set of a process when memory is scarce, and not just accept
> that we "are screwed" and let the OOM killer solve the problem.

In certain cases, I guess it could be a problem in the embedded
environment. Especially while running general purpose applications
with carefully designed real-time tasks. An OOM in such a case is
unacceptable.

The whole problem looks like an extension of page frame reclamation in
user space. If the user application's cache was owned by the kernel
(something like vmsplice with SPLICE_F_GIFT?), and the application
managed it accordingly, then they could probably be brought under the
purview of kernel's memory reclamation. This would mean that
applications wouldn't need to be triggered on low memory, and leave
memory freeing to the kernel (simpler and uniform). Perhaps it is even
possible to do this in the kernel currently somehow...?

--
Abhishek Sagar

-
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: KPROBES: Instrumenting a function's call site

2007-09-26 Thread Abhishek Sagar
On 9/26/07, Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:

> PS: There was a thought of providing a facility to run a handler at
> function entry even when just a kretprobe is used. Maybe we need to
> relook at that; it'd have been useful in this case.

That would be really useful. I was writing a tool to dump some
function profiling info through /proc. This would help save an extra
probe on each function's entry.



Re: KPROBES: Instrumenting a function's call site

2007-09-26 Thread Abhishek Sagar
On 9/26/07, Avishay Traeger <[EMAIL PROTECTED]> wrote:
> So to measure the latency of foo(), I basically want kprobes to do this:
> pre_handler();
> foo();
> post_handler();
>
> The problem is that the latencies that I am getting are consistently low
> (~10,000 cycles).  When I manually instrument the functions, the latency
> is about 20,000,000 cycles.  Clearly something is not right here.

Single-stepping is done with preemption (and on some archs like ARM,
interrupts) disabled. Maybe that is contributing to such a skew.

> Is this a known issue?  Instead of using the post-handler, I can try to
> add a kprobe to the following instruction with a pre-handler.  I was
> just curious if there was something fundamentally wrong with the
> approach I took, or maybe a bug that you should be made aware of.
>
> Please CC me on any replies (not subscribed to LKML).
>
> Thanks,
> Avishay
>
--
Regards
Abhishek Sagar



Re: [PATCH 0/4] build system: section garbage collection for vmlinux

2007-09-14 Thread Abhishek Sagar
On 9/14/07, Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> > With some tweaks it worked for me.
> Could you post your tweaked version - against an older kernel is OK.

The inlined patch should apply cleanly on top of the patch posted on
the link I mentioned before. The *.S files are the ones I chose to
bring under the purview of -ffunction-sections. My observation
remains that if a fine-grained function/data/exported-symbol level
garbage collection can be incorporated into the build environment,
then it'll be something really useful.

--
Abhishek Sagar

---
diff -upNr linux_orig-2.6.12/arch/arm/kernel/armksyms.c
linux-2.6.12/arch/arm/kernel/armksyms.c
--- linux_orig-2.6.12/arch/arm/kernel/armksyms.c2005-06-18
01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/armksyms.c 2007-09-14 09:00:03.0 
+0530
@@ -44,10 +44,17 @@ extern void fp_enter(void);
  * This has a special calling convention; it doesn't
  * modify any of the usual registers, except for LR.
  */
+#ifndef CONFIG_GCSECTIONS
 #define EXPORT_SYMBOL_ALIAS(sym,orig)  \
  const struct kernel_symbol __ksymtab_##sym\
   __attribute__((section("__ksymtab"))) =  \
 { (unsigned long)&orig, #sym };
+#else
+#define EXPORT_SYMBOL_ALIAS(sym,orig)\
+ const struct kernel_symbol __ksymtab_##sym  \
+  __attribute__((section("___ksymtab_" #sym))) = \
+{ (unsigned long)&orig, #sym };
+#endif /* CONFIG_GCSECTIONS */

 /*
  * floating point math emulator support.
diff -upNr linux_orig-2.6.12/arch/arm/kernel/iwmmxt.S
linux-2.6.12/arch/arm/kernel/iwmmxt.S
--- linux_orig-2.6.12/arch/arm/kernel/iwmmxt.S  2005-06-18
01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/iwmmxt.S   2007-09-14 09:21:01.0 
+0530
@@ -55,7 +55,11 @@
  *
  * called from prefetch exception handler with interrupts disabled
  */
-
+#ifdef CONFIG_GCSECTIONS
+  .section ".text.iwmmxt_task_enable"
+#else
+  .text
+#endif
 ENTRY(iwmmxt_task_enable)

mrc p15, 0, r2, c15, c1, 0
diff -upNr linux_orig-2.6.12/arch/arm/kernel/vmlinux.lds.S
linux-2.6.12/arch/arm/kernel/vmlinux.lds.S
--- linux_orig-2.6.12/arch/arm/kernel/vmlinux.lds.S 2005-06-18
01:18:29.0 +0530
+++ linux-2.6.12/arch/arm/kernel/vmlinux.lds.S  2007-09-14
08:58:30.0 +0530
@@ -20,50 +20,50 @@ SECTIONS
.init : {   /* Init code and data   */
_stext = .;
_sinittext = .;
-   *(.init.text)
+   KEEP(*(.init.text))
_einittext = .;
__proc_info_begin = .;
-   *(.proc.info)
+   KEEP(*(.proc.info))
__proc_info_end = .;
__arch_info_begin = .;
-   *(.arch.info)
+   KEEP(*(.arch.info))
__arch_info_end = .;
__tagtable_begin = .;
-   *(.taglist)
+   KEEP(*(.taglist))
__tagtable_end = .;
. = ALIGN(16);
__setup_start = .;
-   *(.init.setup)
+   KEEP(*(.init.setup))
__setup_end = .;
__early_begin = .;
-   *(__early_param)
+   KEEP(*(__early_param))
__early_end = .;
__initcall_start = .;
-   *(.initcall1.init)
-   *(.initcall2.init)
-   *(.initcall3.init)
-   *(.initcall4.init)
-   *(.initcall5.init)
-   *(.initcall6.init)
-   *(.initcall7.init)
+   KEEP(*(.initcall1.init))
+   KEEP(*(.initcall2.init))
+   KEEP(*(.initcall3.init))
+   KEEP(*(.initcall4.init))
+   KEEP(*(.initcall5.init))
+   KEEP(*(.initcall6.init))
+   KEEP(*(.initcall7.init))
__initcall_end = .;
__con_initcall_start = .;
-   *(.con_initcall.init)
+   KEEP(*(.con_initcall.init))
__con_initcall_end = .;
__security_initcall_start = .;
-   *(.security_initcall.init)
+   KEEP(*(.security_initcall.init))
__security_initcall_end = .;
. = ALIGN(32);
__initramfs_start = .;
-   usr/built-in.o(.init.ramfs)
+   KEEP(usr/built-in.o(.init.ramfs))
__initramfs_end = .;
. = ALIGN(64);
__per_cpu_start = .;
-   *(.data.percpu)
+   KEEP(*(.data.percpu))
__per_cpu_end = .;
 #ifndef CONFIG_XIP_KERNEL
__init_begin = _stext;
-   *(.init.data

Re: [PATCH 0/4] build system: section garbage collection for vmlinux

2007-09-13 Thread Abhishek Sagar
On 9/12/07, Denys Vlasenko <[EMAIL PROTECTED]> wrote:
> Patches were run-tested on x86_64, and likely do not work on any other arch
> (need to add KEEP() to arch/*/kernel/vmlinux.lds.S for each arch).

This is good stuff. I had been using a ported variant of this
optimization for ARM on quite an older 2.6 kernel for a while now. I
derived that port from:
http://lkml.org/lkml/2006/6/4/169

With some tweaks it worked for me. Could you also have a look at the
mentioned link and see if that's a superset of what you're trying to
achieve?

--
Abhishek Sagar



Re: instrumentation and kprobes really still "EXPERIMENTAL"?

2007-07-02 Thread Abhishek Sagar

On 7/2/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:

consider this random Kconfig file arch/x86_64/oprofile/Kconfig:

=
config PROFILING
bool "Profiling support (EXPERIMENTAL)"
help
  Say Y here to enable the extended profiling support mechanisms used
  by profilers such as OProfile.


config OPROFILE
tristate "OProfile system profiling (EXPERIMENTAL)"
depends on PROFILING
help
  OProfile is a profiling system capable of profiling the
  whole system, include the kernel, kernel modules, libraries,
  and applications.

  If unsure, say N.
==

  the above is a bit silly.  note that both prompts advertise
themselves as "EXPERIMENTAL" even though neither of them has such a
dependency.  they *are*, however, part of an submenu that *is*
dependent on EXPERIMENTAL (although you'd never know that just by
looking at that Kconfig file).  in short, it's just kind of ugly
hackery at the moment.


Yes but pushing the EXPERIMENTAL check to individual menu members
shouldn't be done in a way which leads to an empty menu for any kernel
configuration. I suspect that this is why the EXPERIMENTAL check is on
the instrumentation menu in the first place.
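
Pushing the check down to the entries might look something like this
(a sketch only; whether each entry really still warrants EXPERIMENTAL
would need auditing per arch):

```
menu "Instrumentation Support"
	# no blanket 'depends on EXPERIMENTAL' on the menu any more

config PROFILING
	bool "Profiling support (EXPERIMENTAL)"
	depends on EXPERIMENTAL

config OPROFILE
	tristate "OProfile system profiling (EXPERIMENTAL)"
	depends on PROFILING && EXPERIMENTAL

endmenu
```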


  my original point was simply that, based on its acceptance, it would
seem kprobes has progressed beyond the EXPERIMENTAL phase, that's all.


It's worth a look on some archs.

--
Abhishek Sagar


Re: instrumentation and kprobes really still "EXPERIMENTAL"?

2007-07-02 Thread Abhishek Sagar

On 7/2/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:

On Mon, 2 Jul 2007, Abhishek Sagar wrote:

> On 7/1/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> > isn't kprobes mature enough to not be considered experimental anymore?
>
> That would vary from arch to arch.

fair enough.  however, at the very least, i'm thinking that the entire
"Instrumentation support" submenu can be made non-EXPERIMENTAL, while
individual entries therein can be EXPERIMENTAL on a choice-by-choice
basis or something like that.


There's not a lot in that menu. It holds oprofile on every arch I
grepped on. Oprofile might come across as being marked
non-experimental but that may very well be due to the assumption that
the menu that holds that option will enforce the EXPERIMENTAL check.
How many archs do you see where oprofile and kprobes differ in their
dependency on EXPERIMENTAL?

Removing the EXPERIMENTAL check from the instrumentation menu will
also require you to fix/double-check dependency on EXPERIMENTAL of
oprofile/kprobes, provided you know that for sure. Which would bring
you back to the same question you began with, namely, whether option
xyz is EXPERIMENTAL or not.

--
Abhishek Sagar


Re: instrumentation and kprobes really still "EXPERIMENTAL"?

2007-07-02 Thread Abhishek Sagar

On 7/1/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:

isn't kprobes mature enough to not be considered experimental anymore?


That would vary from arch to arch.


  in addition, while most of the KPROBES config options depend on

  KALLSYMS && EXPERIMENTAL && MODULES

the s390 architecture depends only on

  EXPERIMENTAL && MODULES

and its instrumentation support is *not* listed as experimental.

  also, the avr32 entry is in the file Kconfig.debug, and depends only
on DEBUG_KERNEL.  just an observation.


This probably stems from the episodic growth of kprobes across different
archs. It can be cleaned up, I suppose.

--
Abhishek Sagar

P.S: Adding systemtap in CC.


Re: [PATCH 3/3] Make jprobes a little safer for users

2007-06-26 Thread Abhishek Sagar

On 6/26/07, Michael Ellerman <[EMAIL PROTECTED]> wrote:

It did occur to me that someone might be doing something crazy like
branching to code that's not in the kernel/module text - but I was
hoping that wouldn't be the case. I'm not sure what ITCM is?


The reference to tightly coupled memory (ITCM) was just to have you
consider the possibility of the jprobe handler being outside kernel
text region. Totally paranoid really.


> >  int __kprobes register_jprobe(struct jprobe *jp)
> >  {
> > +   unsigned long addr = arch_deref_entry_point(jp->entry);
> > +
> > +   if (!kernel_text_address(addr))
> > +   return -EINVAL;
>
> Seems like you're checking for the jprobe handler to be within
> kernel/module range. Why not narrow this down to just module range
> (!module_text_address(addr), say)? Core kernel functions would not be
> ending with a 'jprobe_return()' anyway.

There's jprobe code in net/ipv4/tcp_probe.c and net/dccp/probe.c that
can be builtin or modular, so I think kernel_text_address() is right.


Ok..thanks for that clarification.

--
Abhishek Sagar



Re: [PATCH 3/3] Make jprobes a little safer for users

2007-06-26 Thread Abhishek Sagar

On 6/26/07, Michael Ellerman <[EMAIL PROTECTED]> wrote:


We can then use that in register_jprobe() to check that the entry point
we're passed is actually in the kernel text, rather than just some random
value.


A similar cleanup is possible even for return probes then. I wonder if
there are any kprobe related scenarios where the executable code may
be located outside the core kernel text region (e.g, ITCM?). In that
case would it also be wrong to assume that the jprobe handler may be
situated outside the kernel core text / module  region? Would it then
make sense to move this check from register_jprobe() to the arch
dependent code?


 int __kprobes register_jprobe(struct jprobe *jp)
 {
+   unsigned long addr = arch_deref_entry_point(jp->entry);
+
+   if (!kernel_text_address(addr))
+   return -EINVAL;


Seems like you're checking for the jprobe handler to be within
kernel/module range. Why not narrow this down to just module range
(!module_text_address(addr), say)? Core kernel functions would not be
ending with a 'jprobe_return()' anyway.

--
Abhishek Sagar


