Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-24 Thread Namhyung Kim
Hi Oleg,

On Tue, 12 Nov 2013 17:00:01 +0900, Namhyung Kim wrote:
> For @+addr syntax: user-space uses relative symbol address from a loaded
>base address and kernel calculates the base address
>using "current->utask->vaddr - tu->offset".

I tried this approach and realized that current->utask is not set or has
an invalid vaddr when handler_chain() is called.  So I had to apply
following patch and it seems to work well for me.  Could you confirm it?

Thanks,
Namhyung


diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ad8e1bdca70e..e63748d3520e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1456,7 +1456,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, 
struct pt_regs *regs)
 
 /* Prepare to single-step probed instruction out of line. */
 static int
-pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
+pre_ssout(struct uprobe *uprobe, struct pt_regs *regs)
 {
struct uprobe_task *utask;
unsigned long xol_vaddr;
@@ -1471,7 +1471,6 @@ pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, 
unsigned long bp_vaddr)
return -ENOMEM;
 
utask->xol_vaddr = xol_vaddr;
-   utask->vaddr = bp_vaddr;
 
err = arch_uprobe_pre_xol(>arch, regs);
if (unlikely(err)) {
@@ -1701,6 +1700,7 @@ static bool handle_trampoline(struct pt_regs *regs)
 static void handle_swbp(struct pt_regs *regs)
 {
struct uprobe *uprobe;
+   struct uprobe_task *utask;
unsigned long bp_vaddr;
int uninitialized_var(is_swbp);
 
@@ -1744,11 +1744,17 @@ static void handle_swbp(struct pt_regs *regs)
if (unlikely(!test_bit(UPROBE_COPY_INSN, >flags)))
goto out;
 
+   utask = get_utask();
+   if (!utask)
+   goto out;
+
+   utask->vaddr = bp_vaddr;
+
handler_chain(uprobe, regs);
if (can_skip_sstep(uprobe, regs))
goto out;
 
-   if (!pre_ssout(uprobe, regs, bp_vaddr))
+   if (!pre_ssout(uprobe, regs))
return;
 
/* can_skip_sstep() succeeded, or restart if can't singlestep */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-24 Thread Namhyung Kim
Hi Oleg,

On Tue, 12 Nov 2013 17:00:01 +0900, Namhyung Kim wrote:
 For @+addr syntax: user-space uses relative symbol address from a loaded
base address and kernel calculates the base address
using current-utask-vaddr - tu-offset.

I tried this approach and realized that current-utask is not set or has
an invalid vaddr when handler_chain() is called.  So I had to apply
following patch and it seems to work well for me.  Could you confirm it?

Thanks,
Namhyung


diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ad8e1bdca70e..e63748d3520e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1456,7 +1456,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, 
struct pt_regs *regs)
 
 /* Prepare to single-step probed instruction out of line. */
 static int
-pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, unsigned long bp_vaddr)
+pre_ssout(struct uprobe *uprobe, struct pt_regs *regs)
 {
struct uprobe_task *utask;
unsigned long xol_vaddr;
@@ -1471,7 +1471,6 @@ pre_ssout(struct uprobe *uprobe, struct pt_regs *regs, 
unsigned long bp_vaddr)
return -ENOMEM;
 
utask-xol_vaddr = xol_vaddr;
-   utask-vaddr = bp_vaddr;
 
err = arch_uprobe_pre_xol(uprobe-arch, regs);
if (unlikely(err)) {
@@ -1701,6 +1700,7 @@ static bool handle_trampoline(struct pt_regs *regs)
 static void handle_swbp(struct pt_regs *regs)
 {
struct uprobe *uprobe;
+   struct uprobe_task *utask;
unsigned long bp_vaddr;
int uninitialized_var(is_swbp);
 
@@ -1744,11 +1744,17 @@ static void handle_swbp(struct pt_regs *regs)
if (unlikely(!test_bit(UPROBE_COPY_INSN, uprobe-flags)))
goto out;
 
+   utask = get_utask();
+   if (!utask)
+   goto out;
+
+   utask-vaddr = bp_vaddr;
+
handler_chain(uprobe, regs);
if (can_skip_sstep(uprobe, regs))
goto out;
 
-   if (!pre_ssout(uprobe, regs, bp_vaddr))
+   if (!pre_ssout(uprobe, regs))
return;
 
/* can_skip_sstep() succeeded, or restart if can't singlestep */
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-12 Thread Oleg Nesterov
Hi Namhyung,

On 11/12, Namhyung Kim wrote:
>
> Let me clarify what I understand.
>
> For @addr syntax: kernel does no translation and uses given address

Yes,

> For @+addr syntax: user-space uses relative symbol address from a loaded
>base address and kernel calculates the base address
>using "current->utask->vaddr - tu->offset".

Looks right to me.

IOW, when user-space calculates the "addr" for @+ argument, it should
assume that this binary will be mmaped at "NULL" (so that the virtual
address of the probed function is always equal to its offset in the
probed binary).

Masami, Steven, et al, do you agree?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-12 Thread Namhyung Kim
Hi Oleg and Masami,

On Sat, 9 Nov 2013 16:23:13 +0100, Oleg Nesterov wrote:
> On 11/09, Masami Hiramatsu wrote:
>>
>> In that case, I suggest you to use "@+addr" for the relative address,
>> since that is an offset, isn't that? :)
>
> Agreed, @+addr looks better!

Looks good to me too.

>
>> BTW, it seems that @addr syntax is hard to use for uprobes, because
>> current uprobes is based on a binary, not a process, we cannot specify
>> which process is probed when we define it.
>
> Yes, exactly. That is why we suggest that user-space should pass the
> ip-relative address (actually offset). This should hopefully solve all
> problems with relocations.

Let me clarify what I understand.

For @addr syntax: kernel does no translation and uses given address

For @+addr syntax: user-space uses relative symbol address from a loaded
   base address and kernel calculates the base address
   using "current->utask->vaddr - tu->offset".

Is that right?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-12 Thread Namhyung Kim
Hi Oleg and Masami,

On Sat, 9 Nov 2013 16:23:13 +0100, Oleg Nesterov wrote:
 On 11/09, Masami Hiramatsu wrote:

 In that case, I suggest you to use @+addr for the relative address,
 since that is an offset, isn't that? :)

 Agreed, @+addr looks better!

Looks good to me too.


 BTW, it seems that @addr syntax is hard to use for uprobes, because
 current uprobes is based on a binary, not a process, we cannot specify
 which process is probed when we define it.

 Yes, exactly. That is why we suggest that user-space should pass the
 ip-relative address (actually offset). This should hopefully solve all
 problems with relocations.

Let me clarify what I understand.

For @addr syntax: kernel does no translation and uses given address

For @+addr syntax: user-space uses relative symbol address from a loaded
   base address and kernel calculates the base address
   using current-utask-vaddr - tu-offset.

Is that right?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-12 Thread Oleg Nesterov
Hi Namhyung,

On 11/12, Namhyung Kim wrote:

 Let me clarify what I understand.

 For @addr syntax: kernel does no translation and uses given address

Yes,

 For @+addr syntax: user-space uses relative symbol address from a loaded
base address and kernel calculates the base address
using current-utask-vaddr - tu-offset.

Looks right to me.

IOW, when user-space calculates the addr for @+ argument, it should
assume that this binary will be mmaped at NULL (so that the virtual
address of the probed function is always equal to its offset in the
probed binary).

Masami, Steven, et al, do you agree?

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-11 Thread Namhyung Kim
Hi Oleg,

On Fri, 8 Nov 2013 18:00:17 +0100, Oleg Nesterov wrote:
> On 11/07, Namhyung Kim wrote:
>>
>> On Wed, 6 Nov 2013 19:24:08 +0100, Oleg Nesterov wrote:
>> >
>> > Note that instruction_pointer_set() is not really nice in any case,
>> > this can obviously confuse FETCH_MTD_reg("ip").
>>
>> Yes.
>>
>> >
>> > But lets ignore this. The solution is simple, we can pass/use this
>> > info via current->utask. We can either add the new member, or add
>> > a union. Or simply reuse xol_vaddr. Doesn't matter.
>>
>> Okay, I'll take a look.
>
> I guess we need the union in uprobe_task anyway... I'll send the patch
> soon.
>
> Until we have the new members in uprobe_task, you can reuse utask->vaddr,
> this is safe.
>
> IOW, you can use current->utask->vaddr instead of regs->ip (as we did
> in the previous discussin) to pass the necessary info to ->fetch().

Thanks for the info!
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-11 Thread Namhyung Kim
Hi Oleg,

On Fri, 8 Nov 2013 18:00:17 +0100, Oleg Nesterov wrote:
 On 11/07, Namhyung Kim wrote:

 On Wed, 6 Nov 2013 19:24:08 +0100, Oleg Nesterov wrote:
 
  Note that instruction_pointer_set() is not really nice in any case,
  this can obviously confuse FETCH_MTD_reg(ip).

 Yes.

 
  But lets ignore this. The solution is simple, we can pass/use this
  info via current-utask. We can either add the new member, or add
  a union. Or simply reuse xol_vaddr. Doesn't matter.

 Okay, I'll take a look.

 I guess we need the union in uprobe_task anyway... I'll send the patch
 soon.

 Until we have the new members in uprobe_task, you can reuse utask-vaddr,
 this is safe.

 IOW, you can use current-utask-vaddr instead of regs-ip (as we did
 in the previous discussin) to pass the necessary info to -fetch().

Thanks for the info!
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-09 Thread Oleg Nesterov
On 11/09, Masami Hiramatsu wrote:
>
> In that case, I suggest you to use "@+addr" for the relative address,
> since that is an offset, isn't that? :)

Agreed, @+addr looks better!

> BTW, it seems that @addr syntax is hard to use for uprobes, because
> current uprobes is based on a binary, not a process, we cannot specify
> which process is probed when we define it.

Yes, exactly. That is why we suggest that user-space should pass the
ip-relative address (actually offset). This should hopefully solve all
problems with relocations.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-09 Thread Oleg Nesterov
On 11/09, Masami Hiramatsu wrote:

 In that case, I suggest you to use @+addr for the relative address,
 since that is an offset, isn't that? :)

Agreed, @+addr looks better!

 BTW, it seems that @addr syntax is hard to use for uprobes, because
 current uprobes is based on a binary, not a process, we cannot specify
 which process is probed when we define it.

Yes, exactly. That is why we suggest that user-space should pass the
ip-relative address (actually offset). This should hopefully solve all
problems with relocations.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-08 Thread Masami Hiramatsu
(2013/11/07 17:48), Namhyung Kim wrote:
> On Wed, 6 Nov 2013 18:37:54 +0100, Oleg Nesterov wrote:
>> On 11/06, Namhyung Kim wrote:
>>>
>>> On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
 On 11/05, Oleg Nesterov wrote:
>
> As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
> the "@" argument anyway, why it can't also substruct this offset?
>
> Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
> in this case it needs another argument, not sure...

 Or,

>   +   if (is_ret_probe(tu)) {
>   +   saved_ip = instruction_pointer(regs);
>   +   instruction_pointer_set(func);
>   +   }
>   store_trace_args(...);
>   +   if (is_ret_probe(tu))
>   +   instruction_pointer_set(saved_ip);

 we can put "-= tu->offset" here.
>>>
>>> I don't think I get the point.
>>
>> I meant,
>>
>>  saved_ip = instruction_pointer(regs);
>>
>>  // pass the "ip" which was used to calculate
>>  // the @addr argument to fetch_*() methods
>>
>>  temp_ip = is_ret_probe(tu) ? func : saved_ip;
>>  temp_ip -= tu->offset;
>>  instruction_pointer_set(temp_ip);
>>
>>  store_trace_args(...);
>>
>>  instruction_pointer_set(saved_ip);
>>
>> This way we can avoid the new "void *" argument for fetch_func_t,
>> we do not need it to calculate the address.
> 
> Okay, but as I said before, subtracting tu->offset part can be removed.

Ah, that's good to me too :)


>>
>> However, now I think it would be more clean to leave FETCH_MTD_memory
>> alone and add FETCH_MTD_memory_dotranslate instead.
>>
>> So trace_uprobes.c should define
>>
>>  void FETCH_FUNC_NAME(memory, type)(addr, ...)
>>  {
>>  copy_from_user((void __user *)addr);
>>  }
>>
>>  void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
>>  {
>>  void __user *uaddr = get_user_vaddr(regs, addr);
>>  copy_from_user(uaddr);
>>  }
> 
> Looks good.
> 
>>
>> Then,
>>
 Or. Perhaps we can leave "case '@'" in parse_probe_arg() and
 FETCH_MTD_memory alone. You seem to agree that "absolute address"
 can be useful anyway.
>>>
>>> Yes, but it's only meaningful to process-wide tracing sessions IMHO.
>>
>> Yes, yes, sure.
>>
>> I meant, we need both. Say, "perf probe "func global=@addr" means
>> FETCH_MTD_memory, and "perf probe "func global=*addr" means
>> FETCH_MTD_memory_dotranslate.
>>
>> Just in case, of course I do not care about the syntax, for example we
>> can use "@~addr" for translate (or not translate) or whatever.
> 
> Yeah, and I want to hear from Masami.

Hm, this part I need to clarify. So you mean the @addr is for referring
the absolute address in a user process, but @~addr is for referring
the relative address of a executable or library, correct?
In that case, I suggest you to use "@+addr" for the relative address,
since that is an offset, isn't that? :)

BTW, it seems that @addr syntax is hard to use for uprobes, because
current uprobes is based on a binary, not a process, we cannot specify
which process is probed when we define it.


>> My only point: I think we need both to
>>
>>  1. avoid the new argument in fetch_func_t
>>
>>  2. allow the dump the data from the absolute address

Looks good to me :)

Thank you!

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-08 Thread Oleg Nesterov
On 11/07, Namhyung Kim wrote:
>
> On Wed, 6 Nov 2013 19:24:08 +0100, Oleg Nesterov wrote:
> >
> > Note that instruction_pointer_set() is not really nice in any case,
> > this can obviously confuse FETCH_MTD_reg("ip").
>
> Yes.
>
> >
> > But lets ignore this. The solution is simple, we can pass/use this
> > info via current->utask. We can either add the new member, or add
> > a union. Or simply reuse xol_vaddr. Doesn't matter.
>
> Okay, I'll take a look.

I guess we need the union in uprobe_task anyway... I'll send the patch
soon.

Until we have the new members in uprobe_task, you can reuse utask->vaddr,
this is safe.

IOW, you can use current->utask->vaddr instead of regs->ip (as we did
in the previous discussin) to pass the necessary info to ->fetch().

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-08 Thread Oleg Nesterov
Hi Namhyung,

sorry for delay.

On 11/07, Namhyung Kim wrote:
>
> >> > As for "-= tu->offset"... Can't we avoid it? User-space needs to 
> >> > calculate
> >> > the "@" argument anyway, why it can't also substruct this offset?
> >>
> >> Hmm.. it makes sense too. :)
> >
> > I am no longer sure ;)
> >
> > This way the "@" argument will look more confusing, it will depend on the
> > address/offset of the probed insn. But again, I do not know, this is up
> > to you.
>
> That said, I'd prefer the original "-= -tu->offset" approach.  It'll
> make debugging easier IMHO.

I do not really mind, and probably you are right. Actually it seems
that I was confused, if user-space does "-= -tu->offset" itself then
the "@" argument will look more consistent (contrary to what I said
above).

In any case we should make the calculation of "@" argument (in user
space) as simple/clear as possible, it is very easy to add the
additional hacks in kernel if necessary.

And this is very (if not most) important part, we can change the kernel
later, but it is not easy to change the already working semantics, so I'd
like to know what other reviewers think.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-08 Thread Oleg Nesterov
Hi Namhyung,

sorry for delay.

On 11/07, Namhyung Kim wrote:

   As for -= tu-offset... Can't we avoid it? User-space needs to 
   calculate
   the @ argument anyway, why it can't also substruct this offset?
 
  Hmm.. it makes sense too. :)
 
  I am no longer sure ;)
 
  This way the @ argument will look more confusing, it will depend on the
  address/offset of the probed insn. But again, I do not know, this is up
  to you.

 That said, I'd prefer the original -= -tu-offset approach.  It'll
 make debugging easier IMHO.

I do not really mind, and probably you are right. Actually it seems
that I was confused, if user-space does -= -tu-offset itself then
the @ argument will look more consistent (contrary to what I said
above).

In any case we should make the calculation of @ argument (in user
space) as simple/clear as possible, it is very easy to add the
additional hacks in kernel if necessary.

And this is very (if not most) important part, we can change the kernel
later, but it is not easy to change the already working semantics, so I'd
like to know what other reviewers think.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-08 Thread Oleg Nesterov
On 11/07, Namhyung Kim wrote:

 On Wed, 6 Nov 2013 19:24:08 +0100, Oleg Nesterov wrote:
 
  Note that instruction_pointer_set() is not really nice in any case,
  this can obviously confuse FETCH_MTD_reg(ip).

 Yes.

 
  But lets ignore this. The solution is simple, we can pass/use this
  info via current-utask. We can either add the new member, or add
  a union. Or simply reuse xol_vaddr. Doesn't matter.

 Okay, I'll take a look.

I guess we need the union in uprobe_task anyway... I'll send the patch
soon.

Until we have the new members in uprobe_task, you can reuse utask-vaddr,
this is safe.

IOW, you can use current-utask-vaddr instead of regs-ip (as we did
in the previous discussin) to pass the necessary info to -fetch().

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-08 Thread Masami Hiramatsu
(2013/11/07 17:48), Namhyung Kim wrote:
 On Wed, 6 Nov 2013 18:37:54 +0100, Oleg Nesterov wrote:
 On 11/06, Namhyung Kim wrote:

 On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
 On 11/05, Oleg Nesterov wrote:

 As for -= tu-offset... Can't we avoid it? User-space needs to calculate
 the @ argument anyway, why it can't also substruct this offset?

 Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
 in this case it needs another argument, not sure...

 Or,

   +   if (is_ret_probe(tu)) {
   +   saved_ip = instruction_pointer(regs);
   +   instruction_pointer_set(func);
   +   }
   store_trace_args(...);
   +   if (is_ret_probe(tu))
   +   instruction_pointer_set(saved_ip);

 we can put -= tu-offset here.

 I don't think I get the point.

 I meant,

  saved_ip = instruction_pointer(regs);

  // pass the ip which was used to calculate
  // the @addr argument to fetch_*() methods

  temp_ip = is_ret_probe(tu) ? func : saved_ip;
  temp_ip -= tu-offset;
  instruction_pointer_set(temp_ip);

  store_trace_args(...);

  instruction_pointer_set(saved_ip);

 This way we can avoid the new void * argument for fetch_func_t,
 we do not need it to calculate the address.
 
 Okay, but as I said before, subtracting tu-offset part can be removed.

Ah, that's good to me too :)



 However, now I think it would be more clean to leave FETCH_MTD_memory
 alone and add FETCH_MTD_memory_dotranslate instead.

 So trace_uprobes.c should define

  void FETCH_FUNC_NAME(memory, type)(addr, ...)
  {
  copy_from_user((void __user *)addr);
  }

  void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
  {
  void __user *uaddr = get_user_vaddr(regs, addr);
  copy_from_user(uaddr);
  }
 
 Looks good.
 

 Then,

 Or. Perhaps we can leave case '@' in parse_probe_arg() and
 FETCH_MTD_memory alone. You seem to agree that absolute address
 can be useful anyway.

 Yes, but it's only meaningful to process-wide tracing sessions IMHO.

 Yes, yes, sure.

 I meant, we need both. Say, perf probe func global=@addr means
 FETCH_MTD_memory, and perf probe func global=*addr means
 FETCH_MTD_memory_dotranslate.

 Just in case, of course I do not care about the syntax, for example we
 can use @~addr for translate (or not translate) or whatever.
 
 Yeah, and I want to hear from Masami.

Hm, this part I need to clarify. So you mean the @addr is for referring
the absolute address in a user process, but @~addr is for referring
the relative address of a executable or library, correct?
In that case, I suggest you to use @+addr for the relative address,
since that is an offset, isn't that? :)

BTW, it seems that @addr syntax is hard to use for uprobes, because
current uprobes is based on a binary, not a process, we cannot specify
which process is probed when we define it.


 My only point: I think we need both to

  1. avoid the new argument in fetch_func_t

  2. allow the dump the data from the absolute address

Looks good to me :)

Thank you!

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-07 Thread Namhyung Kim
On Wed, 6 Nov 2013 19:24:08 +0100, Oleg Nesterov wrote:
> Forgot to mention,
>
> On 11/06, Oleg Nesterov wrote:
>>
>> I meant,
>>
>>  saved_ip = instruction_pointer(regs);
>>
>>  // pass the "ip" which was used to calculate
>>  // the @addr argument to fetch_*() methods
>>
>>  temp_ip = is_ret_probe(tu) ? func : saved_ip;
>>  temp_ip -= tu->offset;
>>  instruction_pointer_set(temp_ip);
>>
>>  store_trace_args(...);
>
> Note that instruction_pointer_set() is not really nice in any case,
> this can obviously confuse FETCH_MTD_reg("ip").

Yes.

>
> But lets ignore this. The solution is simple, we can pass/use this
> info via current->utask. We can either add the new member, or add
> a union. Or simply reuse xol_vaddr. Doesn't matter.

Okay, I'll take a look.

>
> So the only question is should we rely on instruction_pointer/func
> to translate the address or we should do something else (say, vma).
>
> So far I like this approach.

Me too. :)

So, I'll wait for more feedbacks from others and then send a new
version.  Actually I don't have much to do this these days - probably
until next week (there'll be the KLF).  Please forgive my brevity in
replies.

I'll be back with a new patchset then. :)

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-07 Thread Namhyung Kim
On Wed, 6 Nov 2013 18:37:54 +0100, Oleg Nesterov wrote:
> On 11/06, Namhyung Kim wrote:
>>
>> On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
>> > On 11/05, Oleg Nesterov wrote:
>> >>
>> >> As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
>> >> the "@" argument anyway, why it can't also substruct this offset?
>> >>
>> >> Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
>> >> in this case it needs another argument, not sure...
>> >
>> > Or,
>> >
>> >>   +   if (is_ret_probe(tu)) {
>> >>   +   saved_ip = instruction_pointer(regs);
>> >>   +   instruction_pointer_set(func);
>> >>   +   }
>> >>   store_trace_args(...);
>> >>   +   if (is_ret_probe(tu))
>> >>   +   instruction_pointer_set(saved_ip);
>> >
>> > we can put "-= tu->offset" here.
>>
>> I don't think I get the point.
>
> I meant,
>
>   saved_ip = instruction_pointer(regs);
>
>   // pass the "ip" which was used to calculate
>   // the @addr argument to fetch_*() methods
>
>   temp_ip = is_ret_probe(tu) ? func : saved_ip;
>   temp_ip -= tu->offset;
>   instruction_pointer_set(temp_ip);
>
>   store_trace_args(...);
>
>   instruction_pointer_set(saved_ip);
>
> This way we can avoid the new "void *" argument for fetch_func_t,
> we do not need it to calculate the address.

Okay, but as I said before, subtracting tu->offset part can be removed.

>
> But: we still need the additional "bool translate_vaddr" to solve
> the problems with FETCH_MTD_deref.
>
> We already discussed this a bit, previously I suggested the new
> FETCH_MTD_memory_notranslate and
>
> -   dprm->fetch = t->fetch[FETCH_MTD_memory];
> +   dprm->fetch = t->fetch[FETCH_MTD_memory_notranslate];
>
> change in parse_probe_arg().

Okay, I agree with you that adding one more fetch method will make
things simpler.

>
> However, now I think it would be more clean to leave FETCH_MTD_memory
> alone and add FETCH_MTD_memory_dotranslate instead.
>
> So trace_uprobes.c should define
>
>   void FETCH_FUNC_NAME(memory, type)(addr, ...)
>   {
>   copy_from_user((void __user *)addr);
>   }
>
>   void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
>   {
>   void __user *uaddr = get_user_vaddr(regs, addr);
>   copy_from_user(uaddr);
>   }

Looks good.

>
> Then,
>
>> > Or. Perhaps we can leave "case '@'" in parse_probe_arg() and
>> > FETCH_MTD_memory alone. You seem to agree that "absolute address"
>> > can be useful anyway.
>>
>> Yes, but it's only meaningful to process-wide tracing sessions IMHO.
>
> Yes, yes, sure.
>
> I meant, we need both. Say, "perf probe "func global=@addr" means
> FETCH_MTD_memory, and "perf probe "func global=*addr" means
> FETCH_MTD_memory_dotranslate.
>
> Just in case, of course I do not care about the syntax, for example we
> can use "@~addr" for translate (or not translate) or whatever.

Yeah, and I want to hear from Masami.

>
> My only point: I think we need both to
>
>   1. avoid the new argument in fetch_func_t
>
>   2. allow the dump the data from the absolute address

I got it.

>
> And just to simplify the discussion, lets assume we use "*addr" for
> FETCH_MTD_memory_dotranslate and thus parse_probe_arg() gets the new
>
>   case '*':
>   if (is_kprobe)
>   return -EINVAL;
>
>   kstrtoul(arg + 1, 0, );
>   f->fn = t->fetch[FETCH_MTD_memory_dotranslate];
>   f->data = (void *)param;
>   break;
>   
> branch.

Looks good.

>
>> > Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
>> > and, say, the new "case '*'" in parse_probe_arg() should add all the
>> > neccessary info as f->data (like, say, FETCH_MTD_symbol).
>>
>> Could you elaborate this more?
>
> Yes, I was confusing sorry.
>
> As for FETCH_MTD_memory_do_fancy_addr_translation, please see above.

Okay.

>
> As for "neccessary info as f->data". Suppose that we still have a reason
> for the additional argument in FETCH_MTD_memory_dotranslate method. Even
> in this case I don't think we should change the signature of fetch_func_t.
>
> What I think we can do is something like
>
>   1. Changed parse_probe_arg() to accept "struct trace_uprobe *tu"
>  instead of is_kprobe. Naturally, !tu can be used instead.
>
>   2. Introduce
>
>   struct dotranslate_fetch_param {
>   struct trace_uprobe *tu;
>   fetch_func_tfetch;
>   fetch_func_tfetch_size;
>   };
>
>   3. Change the "case '*'" above to do
>
>   case '*':
>   if (!tu)
>   return -EINVAL;
>
>   struct dotranslate_fetch_param *xxx = kmalloc(..);
>
>  

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-07 Thread Namhyung Kim
On Wed, 6 Nov 2013 18:37:54 +0100, Oleg Nesterov wrote:
 On 11/06, Namhyung Kim wrote:

 On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
  On 11/05, Oleg Nesterov wrote:
 
  As for -= tu-offset... Can't we avoid it? User-space needs to calculate
  the @ argument anyway, why it can't also substruct this offset?
 
  Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
  in this case it needs another argument, not sure...
 
  Or,
 
+   if (is_ret_probe(tu)) {
+   saved_ip = instruction_pointer(regs);
+   instruction_pointer_set(func);
+   }
store_trace_args(...);
+   if (is_ret_probe(tu))
+   instruction_pointer_set(saved_ip);
 
  we can put -= tu-offset here.

 I don't think I get the point.

 I meant,

   saved_ip = instruction_pointer(regs);

   // pass the ip which was used to calculate
   // the @addr argument to fetch_*() methods

   temp_ip = is_ret_probe(tu) ? func : saved_ip;
   temp_ip -= tu-offset;
   instruction_pointer_set(temp_ip);

   store_trace_args(...);

   instruction_pointer_set(saved_ip);

 This way we can avoid the new void * argument for fetch_func_t,
 we do not need it to calculate the address.

Okay, but as I said before, subtracting tu-offset part can be removed.


 But: we still need the additional bool translate_vaddr to solve
 the problems with FETCH_MTD_deref.

 We already discussed this a bit, previously I suggested the new
 FETCH_MTD_memory_notranslate and

 -   dprm-fetch = t-fetch[FETCH_MTD_memory];
 +   dprm-fetch = t-fetch[FETCH_MTD_memory_notranslate];

 change in parse_probe_arg().

Okay, I agree with you that adding one more fetch method will make
things simpler.


 However, now I think it would be more clean to leave FETCH_MTD_memory
 alone and add FETCH_MTD_memory_dotranslate instead.

 So trace_uprobes.c should define

   void FETCH_FUNC_NAME(memory, type)(addr, ...)
   {
   copy_from_user((void __user *)addr);
   }

   void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
   {
   void __user *uaddr = get_user_vaddr(regs, addr);
   copy_from_user(uaddr);
   }

Looks good.


 Then,

  Or. Perhaps we can leave case '@' in parse_probe_arg() and
  FETCH_MTD_memory alone. You seem to agree that absolute address
  can be useful anyway.

 Yes, but it's only meaningful to process-wide tracing sessions IMHO.

 Yes, yes, sure.

 I meant, we need both. Say, perf probe func global=@addr means
 FETCH_MTD_memory, and perf probe func global=*addr means
 FETCH_MTD_memory_dotranslate.

 Just in case, of course I do not care about the syntax, for example we
 can use @~addr for translate (or not translate) or whatever.

Yeah, and I want to hear from Masami.


 My only point: I think we need both to

   1. avoid the new argument in fetch_func_t

   2. allow the dump the data from the absolute address

I got it.


 And just to simplify the discussion, lets assume we use *addr for
 FETCH_MTD_memory_dotranslate and thus parse_probe_arg() gets the new

   case '*':
   if (is_kprobe)
   return -EINVAL;

   kstrtoul(arg + 1, 0, param);
   f-fn = t-fetch[FETCH_MTD_memory_dotranslate];
   f-data = (void *)param;
   break;
   
 branch.

Looks good.


  Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
  and, say, the new case '*' in parse_probe_arg() should add all the
  neccessary info as f-data (like, say, FETCH_MTD_symbol).

 Could you elaborate this more?

 Yes, I was confusing sorry.

 As for FETCH_MTD_memory_do_fancy_addr_translation, please see above.

Okay.


 As for neccessary info as f-data. Suppose that we still have a reason
 for the additional argument in FETCH_MTD_memory_dotranslate method. Even
 in this case I don't think we should change the signature of fetch_func_t.

 What I think we can do is something like

   1. Changed parse_probe_arg() to accept struct trace_uprobe *tu
  instead of is_kprobe. Naturally, !tu can be used instead.

   2. Introduce

   struct dotranslate_fetch_param {
   struct trace_uprobe *tu;
   fetch_func_tfetch;
   fetch_func_tfetch_size;
   };

   3. Change the case '*' above to do

   case '*':
   if (!tu)
   return -EINVAL;

   struct dotranslate_fetch_param *xxx = kmalloc(..);

   xxx-fetch = t-fetch[FETCH_MTD_memory];

   // ... kstrtoul, fetch_size, etc, ...

   f-fn = t-fetch[FETCH_MTD_memory_dotranslate];
   f-data = (void *)xxx;

   4. Update 

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-07 Thread Namhyung Kim
On Wed, 6 Nov 2013 19:24:08 +0100, Oleg Nesterov wrote:
 Forgot to mention,

 On 11/06, Oleg Nesterov wrote:

 I meant,

  saved_ip = instruction_pointer(regs);

  // pass the ip which was used to calculate
  // the @addr argument to fetch_*() methods

  temp_ip = is_ret_probe(tu) ? func : saved_ip;
  temp_ip -= tu-offset;
  instruction_pointer_set(temp_ip);

  store_trace_args(...);

 Note that instruction_pointer_set() is not really nice in any case,
 this can obviously confuse FETCH_MTD_reg(ip).

Yes.


 But lets ignore this. The solution is simple, we can pass/use this
 info via current-utask. We can either add the new member, or add
 a union. Or simply reuse xol_vaddr. Doesn't matter.

Okay, I'll take a look.


 So the only question is should we rely on instruction_pointer/func
 to translate the address or we should do something else (say, vma).

 So far I like this approach.

Me too. :)

So, I'll wait for more feedbacks from others and then send a new
version.  Actually I don't have much to do this these days - probably
until next week (there'll be the KLF).  Please forgive my brevity in
replies.

I'll be back with a new patchset then. :)

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
Hi Oleg,

On Wed, 6 Nov 2013 17:28:06 +0100, Oleg Nesterov wrote:
> On 11/06, Namhyung Kim wrote:
>>
>> On Tue, 5 Nov 2013 18:45:35 +0100, Oleg Nesterov wrote:
>> > On 11/05, Namhyung Kim wrote:
>> >>
>> >> This is what I have for now:
>> >>
>> >> static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
>> >> addr,
>> >>  struct trace_uprobe *tu)
>> >> {
>> >>   unsigned long base_addr;
>> >>   unsigned long vaddr;
>> >>
>> >>   base_addr = instruction_pointer(regs) - tu->offset;
>> >>   vaddr = base_addr + addr;
>> >>
>> >>   return (void __force __user *) vaddr;
>> >> }
>> >>
>> >> When I tested it, it was able to fetch global and bss data from both of
>> >> executable and library properly.
>> >
>> > Heh ;) I didn't expect you will agree with this suggestion. But if you
>> > think it can work - great!
>>
>> It seems to work for me well except the cross-fetch.
>
> Yes, but cross-fetching needs something different anyway, so I think we
> should discuss this separately.

Okay.

>
>> But I'm not sure it'll work for every cases.
>
> I think "ip - tu->offset + vaddr" trick should always work, just we need
> to calculate this "vaddr" passed as an argument correctly.

Right.

>
> Except: user-space can create another executable mapping and call the
> probed function via another address, but I think we can ignore this.
> And I think we can do nothing in this case, because in this case we
> can't even rely on tu->inode.

Agreed.


>> > As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
>> > the "@" argument anyway, why it can't also substruct this offset?
>>
>> Hmm.. it makes sense too. :)
>
> I am no longer sure ;)
>
> This way the "@" argument will look more confusing, it will depend on the
> address/offset of the probed insn. But again, I do not know, this is up
> to you.

That said, I'd prefer the original "-= -tu->offset" approach.  It'll
make debugging easier IMHO.

>
>> >> But it still doesn't work for uretprobes
>> >> as you said before.
>> >
>> > This looks simple,
>> >
>> >+   if (is_ret_probe(tu)) {
>> >+   saved_ip = instruction_pointer(regs);
>> >+   instruction_pointer_set(func);
>> >+   }
>> >store_trace_args(...);
>> >+   if (is_ret_probe(tu))
>> >+   instruction_pointer_set(saved_ip);
>> >
>> > although not pretty.
>>
>> So for normal non-uretprobes, func == instruction_pointer(), right?
>
> No, for normal non-uretprobes func == 0 (actually, undefined).

Ah, okay.

>
>> If so, just passing func as you suggested looks better than this.
>
> Not sure I understand... OK, we can change uprobe_trace_func() and
> uprobe_perf_func()
>
>   if (!is_ret_probe(tu))
>   -   uprobe_trace_print(tu, 0, regs);
>   +   uprobe_trace_print(tu, instruction_pointer(regs), regs);
>   return 0;
>
> but why?
>
> We need the "saved_ip" ugly hack above only if is_ret_probe() == T and
> thus instruction_pointer() doesn't match the address of the probed function.
> And there is no way to pass some additional info to call_fetch/etc from
> uprobe_*_print().

Ah, I was confused that the 'func' is somewhat available in trace_uprobe
and it can make to avoid passing regs to get_user_vaddr().

Sorry for the noise.


> See also another email...

Will do.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Oleg Nesterov
Forgot to mention,

On 11/06, Oleg Nesterov wrote:
>
> I meant,
>
>   saved_ip = instruction_pointer(regs);
>
>   // pass the "ip" which was used to calculate
>   // the @addr argument to fetch_*() methods
>
>   temp_ip = is_ret_probe(tu) ? func : saved_ip;
>   temp_ip -= tu->offset;
>   instruction_pointer_set(temp_ip);
>
>   store_trace_args(...);

Note that instruction_pointer_set() is not really nice in any case,
this can obviously confuse FETCH_MTD_reg("ip").

But lets ignore this. The solution is simple, we can pass/use this
info via current->utask. We can either add the new member, or add
a union. Or simply reuse xol_vaddr. Doesn't matter.

So the only question is should we rely on instruction_pointer/func
to translate the address or we should do something else (say, vma).

So far I like this approach.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Oleg Nesterov
On 11/06, Namhyung Kim wrote:
>
> On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
> > On 11/05, Oleg Nesterov wrote:
> >>
> >> As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
> >> the "@" argument anyway, why it can't also substruct this offset?
> >>
> >> Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
> >> in this case it needs another argument, not sure...
> >
> > Or,
> >
> >>+   if (is_ret_probe(tu)) {
> >>+   saved_ip = instruction_pointer(regs);
> >>+   instruction_pointer_set(func);
> >>+   }
> >>store_trace_args(...);
> >>+   if (is_ret_probe(tu))
> >>+   instruction_pointer_set(saved_ip);
> >
> > we can put "-= tu->offset" here.
>
> I don't think I get the point.

I meant,

saved_ip = instruction_pointer(regs);

// pass the "ip" which was used to calculate
// the @addr argument to fetch_*() methods

temp_ip = is_ret_probe(tu) ? func : saved_ip;
temp_ip -= tu->offset;
instruction_pointer_set(temp_ip);

store_trace_args(...);

instruction_pointer_set(saved_ip);

This way we can avoid the new "void *" argument for fetch_func_t,
we do not need it to calculate the address.

But: we still need the additional "bool translate_vaddr" to solve
the problems with FETCH_MTD_deref.

We already discussed this a bit, previously I suggested the new
FETCH_MTD_memory_notranslate and

-   dprm->fetch = t->fetch[FETCH_MTD_memory];
+   dprm->fetch = t->fetch[FETCH_MTD_memory_notranslate];

change in parse_probe_arg().

However, now I think it would be more clean to leave FETCH_MTD_memory
alone and add FETCH_MTD_memory_dotranslate instead.

So trace_uprobes.c should define

void FETCH_FUNC_NAME(memory, type)(addr, ...)
{
copy_from_user((void __user *)addr);
}

void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
{
void __user *uaddr = get_user_vaddr(regs, addr);
copy_from_user(uaddr);
}

Then,

> > Or. Perhaps we can leave "case '@'" in parse_probe_arg() and
> > FETCH_MTD_memory alone. You seem to agree that "absolute address"
> > can be useful anyway.
>
> Yes, but it's only meaningful to process-wide tracing sessions IMHO.

Yes, yes, sure.

I meant, we need both. Say, "perf probe "func global=@addr" means
FETCH_MTD_memory, and "perf probe "func global=*addr" means
FETCH_MTD_memory_dotranslate.

Just in case, of course I do not care about the syntax, for example we
can use "@~addr" for translate (or not translate) or whatever.

My only point: I think we need both to

1. avoid the new argument in fetch_func_t

2. allow the dump the data from the absolute address

And just to simplify the discussion, lets assume we use "*addr" for
FETCH_MTD_memory_dotranslate and thus parse_probe_arg() gets the new

case '*':
if (is_kprobe)
return -EINVAL;

kstrtoul(arg + 1, 0, );
f->fn = t->fetch[FETCH_MTD_memory_dotranslate];
f->data = (void *)param;
break;

branch.

> > Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
> > and, say, the new "case '*'" in parse_probe_arg() should add all the
> > neccessary info as f->data (like, say, FETCH_MTD_symbol).
>
> Could you elaborate this more?

Yes, I was confusing sorry.

As for FETCH_MTD_memory_do_fancy_addr_translation, please see above.

As for "neccessary info as f->data". Suppose that we still have a reason
for the additional argument in FETCH_MTD_memory_dotranslate method. Even
in this case I don't think we should change the signature of fetch_func_t.

What I think we can do is something like

1. Changed parse_probe_arg() to accept "struct trace_uprobe *tu"
   instead of is_kprobe. Naturally, !tu can be used instead.

2. Introduce

struct dotranslate_fetch_param {
struct trace_uprobe *tu;
fetch_func_tfetch;
fetch_func_tfetch_size;
};

3. Change the "case '*'" above to do

case '*':
if (!tu)
return -EINVAL;

struct dotranslate_fetch_param *xxx = kmalloc(..);

xxx->fetch = t->fetch[FETCH_MTD_memory];

// ... kstrtoul, fetch_size, etc, ...

f->fn = t->fetch[FETCH_MTD_memory_dotranslate];
f->data = (void *)xxx;

4. Update traceprobe_free_probe_arg/etc.

5. Now,

void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
{

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Oleg Nesterov
On 11/06, Namhyung Kim wrote:
>
> On Tue, 5 Nov 2013 18:45:35 +0100, Oleg Nesterov wrote:
> > On 11/05, Namhyung Kim wrote:
> >>
> >> This is what I have for now:
> >>
> >> static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
> >> addr,
> >>   struct trace_uprobe *tu)
> >> {
> >>unsigned long base_addr;
> >>unsigned long vaddr;
> >>
> >>base_addr = instruction_pointer(regs) - tu->offset;
> >>vaddr = base_addr + addr;
> >>
> >>return (void __force __user *) vaddr;
> >> }
> >>
> >> When I tested it, it was able to fetch global and bss data from both of
> >> executable and library properly.
> >
> > Heh ;) I didn't expect you will agree with this suggestion. But if you
> > think it can work - great!
>
> It seems to work for me well except the cross-fetch.

Yes, but cross-fetching needs something different anyway, so I think we
should discuss this separately.

> But I'm not sure it'll work for every cases.

I think "ip - tu->offset + vaddr" trick should always work, just we need
to calculate this "vaddr" passed as an argument correctly.

Except: user-space can create another executable mapping and call the
probed function via another address, but I think we can ignore this.
And I think we can do nothing in this case, because in this case we
can't even rely on tu->inode.

But,

> It would be great if some
> elf gurus come up and give some feedbacks.
>
> Masami?

Yes.

> > As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
> > the "@" argument anyway, why it can't also substruct this offset?
>
> Hmm.. it makes sense too. :)

I am no longer sure ;)

This way the "@" argument will look more confusing, it will depend on the
address/offset of the probed insn. But again, I do not know, this is up
to you.

> >> But it still doesn't work for uretprobes
> >> as you said before.
> >
> > This looks simple,
> >
> > +   if (is_ret_probe(tu)) {
> > +   saved_ip = instruction_pointer(regs);
> > +   instruction_pointer_set(func);
> > +   }
> > store_trace_args(...);
> > +   if (is_ret_probe(tu))
> > +   instruction_pointer_set(saved_ip);
> >
> > although not pretty.
>
> So for normal non-uretprobes, func == instruction_pointer(), right?

No, for normal non-uretprobes func == 0 (actually, undefined).

> If so, just passing func as you suggested looks better than this.

Not sure I understand... OK, we can change uprobe_trace_func() and
uprobe_perf_func()

if (!is_ret_probe(tu))
-   uprobe_trace_print(tu, 0, regs);
+   uprobe_trace_print(tu, instruction_pointer(regs), regs);
return 0;

but why?

We need the "saved_ip" ugly hack above only if is_ret_probe() == T and
thus instruction_pointer() doesn't match the address of the probed function.
And there is no way to pass some additional info to call_fetch/etc from
uprobe_*_print().

See also another email...

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
> On 11/05, Oleg Nesterov wrote:
>>
>> As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
>> the "@" argument anyway, why it can't also substruct this offset?
>>
>> Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
>> in this case it needs another argument, not sure...
>
> Or,
>
>>  +   if (is_ret_probe(tu)) {
>>  +   saved_ip = instruction_pointer(regs);
>>  +   instruction_pointer_set(func);
>>  +   }
>>  store_trace_args(...);
>>  +   if (is_ret_probe(tu))
>>  +   instruction_pointer_set(saved_ip);
>
> we can put "-= tu->offset" here.

I don't think I get the point.

>
> Or. Perhaps we can leave "case '@'" in parse_probe_arg() and
> FETCH_MTD_memory alone. You seem to agree that "absolute address"
> can be useful anyway.

Yes, but it's only meaningful to process-wide tracing sessions IMHO.

>
> Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
> and, say, the new "case '*'" in parse_probe_arg() should add all the
> neccessary info as f->data (like, say, FETCH_MTD_symbol).

Could you elaborate this more?

>
> But, just in case, I do not have a strong opinion. Just I think it
> is better to discuss every choice we have.

Okay.  I really appreciate your reviews.


Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 18:45:35 +0100, Oleg Nesterov wrote:
> On 11/05, Namhyung Kim wrote:
>>
>> This is what I have for now:
>>
>> static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long addr,
>> struct trace_uprobe *tu)
>> {
>>  unsigned long base_addr;
>>  unsigned long vaddr;
>>
>>  base_addr = instruction_pointer(regs) - tu->offset;
>>  vaddr = base_addr + addr;
>>
>>  return (void __force __user *) vaddr;
>> }
>>
>> When I tested it, it was able to fetch global and bss data from both of
>> executable and library properly.
>
> Heh ;) I didn't expect you will agree with this suggestion. But if you
> think it can work - great!

It seems to work for me well except the cross-fetch.

But I'm not sure it'll work for every cases.  It would be great if some
elf gurus come up and give some feedbacks.

Masami?

>
> Let me clarify just in case. Yes, _personally_ I think we should try
> to avoid the vma games, and it looks better to me this way. But I won't
> argue if you change your mind, I understand this approach has its own
> disadvantages.
>
> As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
> the "@" argument anyway, why it can't also substruct this offset?

Hmm.. it makes sense too. :)

>
> Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
> in this case it needs another argument, not sure...
>
>> But it still doesn't work for uretprobes
>> as you said before.
>
> This looks simple,
>
>   +   if (is_ret_probe(tu)) {
>   +   saved_ip = instruction_pointer(regs);
>   +   instruction_pointer_set(func);
>   +   }
>   store_trace_args(...);
>   +   if (is_ret_probe(tu))
>   +   instruction_pointer_set(saved_ip);
>
> although not pretty.

So for normal non-uretprobes, func == instruction_pointer(), right?

If so, just passing func as you suggested looks better than this.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 17:41:02 +0100, Oleg Nesterov wrote:
> On 11/05, Namhyung Kim wrote:
>>
>> On Mon, 4 Nov 2013 19:57:54 +0100, Oleg Nesterov wrote:
>> >>
>> >>   static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
>> >> addr)
>> >>   {
>> >>   return (void __force __user *)addr + instruction_pointer(regs);
>> >>   }
>> >>
>> >> ?
>> >>
>> >> This should solve the problems with relocations/randomization/bss.
>> >>
>> >> The obvious disadvantage is that it is not easy to calculate the
>> >> offset we need to pass as an argument, it depends on the probed
>> >> function.
>> >
>> > forgot to mention... and instruction_pointer() can't work in ret-probe,
>> > we need to pass the "unsigned long func" arg somehow...
>>
>> Hmm.. what's the value of tu->offset in this case?  Does it have the
>> offset of the return address or the start of the function?
>
> It is the offest of function. IOW, it is the same regardless of
> is_ret_probe().

Ah, okay.  Thanks for the info.  Then yes, it'd probably better to pass
the func rather than regs since it's the only info we need, right?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 17:33:41 +0100, Oleg Nesterov wrote:
> On 11/05, Namhyung Kim wrote:
>>
>> On Mon, 4 Nov 2013 17:22:29 +0100, Oleg Nesterov wrote:
>> >
>> > So probably I was wrong and this all needs more thinking. Damn.
>>
>> :)
>
> Don't laugh at me ;)

:)

(And now I'm feeling guilty..)

>
>> > Perhaps we really need to pass @file/offset, but it is not clear what
>> > we can do with bss/anon-mapping.
>>
>> The @file/offset should work with bss since data in bss is accessed via
>> an offset in the program,
>
> Yes, yes, but it is still not clear to me how we can identify the "right"
> vma which doesn't cover the .bss address but can be used to calculate the
> offset.

Yes, that's the open problem for the cross-fetch.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
Hi Oleg,

On Tue, 5 Nov 2013 17:28:20 +0100, Oleg Nesterov wrote:
> On 11/05, Namhyung Kim wrote:
>>
>> On Mon, 4 Nov 2013 16:01:12 +0100, Oleg Nesterov wrote:
>> > Or the syntax should be "name=probe @file/addr" or something like this.
>>
>> Okay.  Let's call this kind of thing "cross-fetch" (or a better name can
>> be suggested).
>
> Yes ;) and I am afraid there was some confusion in our discussion.
> I probably confused "probe other binaries" with cross-fetch and vice
> versa sometimes.

Sorry for not being clear enough.

>
>> > So far I think that trace_uprobes.c should not play games with vma. At all.
>>
>> Yes, playing with vma is fragile.  But otherwise how can we get the
>> address from the file+offset in random processes?
>
> Yes, this is not as simple as I thought.
>
> Let me repeat, somehow I completely forgot we need to probe other (libc)
> binaries (not only the executable itself) and dump the data from that
> binary. That is why I wrongly thought that that ->start_data trick can
> work.
>
> OK, I see other emails from you. Perhaps we can rely on instruction_pointer(),
> (I'll write more emails on this), But if not, then we probably need tu->inode
> and vma games in fetch/get_user_vaddr(). I'd still like to avoid this, if
> possible, but I am no longer sure.

Yes, I think it'll be necessary for the cross-fetch anyway.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
Hi Oleg,

On Tue, 5 Nov 2013 17:28:20 +0100, Oleg Nesterov wrote:
 On 11/05, Namhyung Kim wrote:

 On Mon, 4 Nov 2013 16:01:12 +0100, Oleg Nesterov wrote:
  Or the syntax should be name=probe @file/addr or something like this.

 Okay.  Let's call this kind of thing cross-fetch (or a better name can
 be suggested).

 Yes ;) and I am afraid there was some confusion in our discussion.
 I probably confused probe other binaries with cross-fetch and vice
 versa sometimes.

Sorry for not being clear enough.


  So far I think that trace_uprobes.c should not play games with vma. At all.

 Yes, playing with vma is fragile.  But otherwise how can we get the
 address from the file+offset in random processes?

 Yes, this is not as simple as I thought.

 Let me repeat, somehow I completely forgot we need to probe other (libc)
 binaries (not only the executable itself) and dump the data from that
 binary. That is why I wrongly thought that that -start_data trick can
 work.

 OK, I see other emails from you. Perhaps we can rely on instruction_pointer(),
 (I'll write more emails on this), But if not, then we probably need tu-inode
 and vma games in fetch/get_user_vaddr(). I'd still like to avoid this, if
 possible, but I am no longer sure.

Yes, I think it'll be necessary for the cross-fetch anyway.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 17:33:41 +0100, Oleg Nesterov wrote:
 On 11/05, Namhyung Kim wrote:

 On Mon, 4 Nov 2013 17:22:29 +0100, Oleg Nesterov wrote:
 
  So probably I was wrong and this all needs more thinking. Damn.

 :)

 Don't laugh at me ;)

:)

(And now I'm feeling guilty..)


  Perhaps we really need to pass @file/offset, but it is not clear what
  we can do with bss/anon-mapping.

 The @file/offset should work with bss since data in bss is accessed via
 an offset in the program,

 Yes, yes, but it is still not clear to me how we can identify the right
 vma which doesn't cover the .bss address but can be used to calculate the
 offset.

Yes, that's the open problem for the cross-fetch.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 17:41:02 +0100, Oleg Nesterov wrote:
 On 11/05, Namhyung Kim wrote:

 On Mon, 4 Nov 2013 19:57:54 +0100, Oleg Nesterov wrote:
 
static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
  addr)
{
return (void __force __user *)addr + instruction_pointer(regs);
}
 
  ?
 
  This should solve the problems with relocations/randomization/bss.
 
  The obvious disadvantage is that it is not easy to calculate the
  offset we need to pass as an argument, it depends on the probed
  function.
 
  forgot to mention... and instruction_pointer() can't work in ret-probe,
  we need to pass the unsigned long func arg somehow...

 Hmm.. what's the value of tu-offset in this case?  Does it have the
 offset of the return address or the start of the function?

 It is the offest of function. IOW, it is the same regardless of
 is_ret_probe().

Ah, okay.  Thanks for the info.  Then yes, it'd probably better to pass
the func rather than regs since it's the only info we need, right?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 18:45:35 +0100, Oleg Nesterov wrote:
 On 11/05, Namhyung Kim wrote:

 This is what I have for now:

 static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long addr,
 struct trace_uprobe *tu)
 {
  unsigned long base_addr;
  unsigned long vaddr;

  base_addr = instruction_pointer(regs) - tu-offset;
  vaddr = base_addr + addr;

  return (void __force __user *) vaddr;
 }

 When I tested it, it was able to fetch global and bss data from both of
 executable and library properly.

 Heh ;) I didn't expect you will agree with this suggestion. But if you
 think it can work - great!

It seems to work for me well except the cross-fetch.

But I'm not sure it'll work for every cases.  It would be great if some
elf gurus come up and give some feedbacks.

Masami?


 Let me clarify just in case. Yes, _personally_ I think we should try
 to avoid the vma games, and it looks better to me this way. But I won't
 argue if you change your mind, I understand this approach has its own
 disadvantages.

 As for -= tu-offset... Can't we avoid it? User-space needs to calculate
 the @ argument anyway, why it can't also substruct this offset?

Hmm.. it makes sense too. :)


 Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
 in this case it needs another argument, not sure...

 But it still doesn't work for uretprobes
 as you said before.

 This looks simple,

   +   if (is_ret_probe(tu)) {
   +   saved_ip = instruction_pointer(regs);
   +   instruction_pointer_set(func);
   +   }
   store_trace_args(...);
   +   if (is_ret_probe(tu))
   +   instruction_pointer_set(saved_ip);

 although not pretty.

So for normal non-uretprobes, func == instruction_pointer(), right?

If so, just passing func as you suggested looks better than this.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
 On 11/05, Oleg Nesterov wrote:

 As for -= tu-offset... Can't we avoid it? User-space needs to calculate
 the @ argument anyway, why it can't also substruct this offset?

 Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
 in this case it needs another argument, not sure...

 Or,

  +   if (is_ret_probe(tu)) {
  +   saved_ip = instruction_pointer(regs);
  +   instruction_pointer_set(func);
  +   }
  store_trace_args(...);
  +   if (is_ret_probe(tu))
  +   instruction_pointer_set(saved_ip);

 we can put -= tu-offset here.

I don't think I get the point.


 Or. Perhaps we can leave case '@' in parse_probe_arg() and
 FETCH_MTD_memory alone. You seem to agree that absolute address
 can be useful anyway.

Yes, but it's only meaningful to process-wide tracing sessions IMHO.


 Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
 and, say, the new case '*' in parse_probe_arg() should add all the
 neccessary info as f-data (like, say, FETCH_MTD_symbol).

Could you elaborate this more?


 But, just in case, I do not have a strong opinion. Just I think it
 is better to discuss every choice we have.

Okay.  I really appreciate your reviews.


Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Oleg Nesterov
On 11/06, Namhyung Kim wrote:

 On Tue, 5 Nov 2013 18:45:35 +0100, Oleg Nesterov wrote:
  On 11/05, Namhyung Kim wrote:
 
  This is what I have for now:
 
  static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
  addr,
struct trace_uprobe *tu)
  {
 unsigned long base_addr;
 unsigned long vaddr;
 
 base_addr = instruction_pointer(regs) - tu-offset;
 vaddr = base_addr + addr;
 
 return (void __force __user *) vaddr;
  }
 
  When I tested it, it was able to fetch global and bss data from both of
  executable and library properly.
 
  Heh ;) I didn't expect you will agree with this suggestion. But if you
  think it can work - great!

 It seems to work for me well except the cross-fetch.

Yes, but cross-fetching needs something different anyway, so I think we
should discuss this separately.

 But I'm not sure it'll work for every cases.

I think ip - tu-offset + vaddr trick should always work, just we need
to calculate this vaddr passed as an argument correctly.

Except: user-space can create another executable mapping and call the
probed function via another address, but I think we can ignore this.
And I think we can do nothing in this case, because in this case we
can't even rely on tu-inode.

But,

 It would be great if some
 elf gurus come up and give some feedbacks.

 Masami?

Yes.

  As for -= tu-offset... Can't we avoid it? User-space needs to calculate
  the @ argument anyway, why it can't also substruct this offset?

 Hmm.. it makes sense too. :)

I am no longer sure ;)

This way the @ argument will look more confusing, it will depend on the
address/offset of the probed insn. But again, I do not know, this is up
to you.

  But it still doesn't work for uretprobes
  as you said before.
 
  This looks simple,
 
  +   if (is_ret_probe(tu)) {
  +   saved_ip = instruction_pointer(regs);
  +   instruction_pointer_set(func);
  +   }
  store_trace_args(...);
  +   if (is_ret_probe(tu))
  +   instruction_pointer_set(saved_ip);
 
  although not pretty.

 So for normal non-uretprobes, func == instruction_pointer(), right?

No, for normal non-uretprobes func == 0 (actually, undefined).

 If so, just passing func as you suggested looks better than this.

Not sure I understand... OK, we can change uprobe_trace_func() and
uprobe_perf_func()

if (!is_ret_probe(tu))
-   uprobe_trace_print(tu, 0, regs);
+   uprobe_trace_print(tu, instruction_pointer(regs), regs);
return 0;

but why?

We need the saved_ip ugly hack above only if is_ret_probe() == T and
thus instruction_pointer() doesn't match the address of the probed function.
And there is no way to pass some additional info to call_fetch/etc from
uprobe_*_print().

See also another email...

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Oleg Nesterov
On 11/06, Namhyung Kim wrote:

 On Tue, 5 Nov 2013 20:24:01 +0100, Oleg Nesterov wrote:
  On 11/05, Oleg Nesterov wrote:
 
  As for -= tu-offset... Can't we avoid it? User-space needs to calculate
  the @ argument anyway, why it can't also substruct this offset?
 
  Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
  in this case it needs another argument, not sure...
 
  Or,
 
 +   if (is_ret_probe(tu)) {
 +   saved_ip = instruction_pointer(regs);
 +   instruction_pointer_set(func);
 +   }
 store_trace_args(...);
 +   if (is_ret_probe(tu))
 +   instruction_pointer_set(saved_ip);
 
  we can put -= tu-offset here.

 I don't think I get the point.

I meant,

saved_ip = instruction_pointer(regs);

// pass the ip which was used to calculate
// the @addr argument to fetch_*() methods

temp_ip = is_ret_probe(tu) ? func : saved_ip;
temp_ip -= tu-offset;
instruction_pointer_set(temp_ip);

store_trace_args(...);

instruction_pointer_set(saved_ip);

This way we can avoid the new void * argument for fetch_func_t,
we do not need it to calculate the address.

But: we still need the additional bool translate_vaddr to solve
the problems with FETCH_MTD_deref.

We already discussed this a bit, previously I suggested the new
FETCH_MTD_memory_notranslate and

-   dprm-fetch = t-fetch[FETCH_MTD_memory];
+   dprm-fetch = t-fetch[FETCH_MTD_memory_notranslate];

change in parse_probe_arg().

However, now I think it would be more clean to leave FETCH_MTD_memory
alone and add FETCH_MTD_memory_dotranslate instead.

So trace_uprobes.c should define

void FETCH_FUNC_NAME(memory, type)(addr, ...)
{
copy_from_user((void __user *)addr);
}

void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
{
void __user *uaddr = get_user_vaddr(regs, addr);
copy_from_user(uaddr);
}

Then,

  Or. Perhaps we can leave case '@' in parse_probe_arg() and
  FETCH_MTD_memory alone. You seem to agree that absolute address
  can be useful anyway.

 Yes, but it's only meaningful to process-wide tracing sessions IMHO.

Yes, yes, sure.

I meant, we need both. Say, perf probe func global=@addr means
FETCH_MTD_memory, and perf probe func global=*addr means
FETCH_MTD_memory_dotranslate.

Just in case, of course I do not care about the syntax, for example we
can use @~addr for translate (or not translate) or whatever.

My only point: I think we need both to

1. avoid the new argument in fetch_func_t

2. allow the dump the data from the absolute address

And just to simplify the discussion, lets assume we use *addr for
FETCH_MTD_memory_dotranslate and thus parse_probe_arg() gets the new

case '*':
if (is_kprobe)
return -EINVAL;

kstrtoul(arg + 1, 0, param);
f-fn = t-fetch[FETCH_MTD_memory_dotranslate];
f-data = (void *)param;
break;

branch.

  Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
  and, say, the new case '*' in parse_probe_arg() should add all the
  neccessary info as f-data (like, say, FETCH_MTD_symbol).

 Could you elaborate this more?

Yes, I was confusing sorry.

As for FETCH_MTD_memory_do_fancy_addr_translation, please see above.

As for neccessary info as f-data. Suppose that we still have a reason
for the additional argument in FETCH_MTD_memory_dotranslate method. Even
in this case I don't think we should change the signature of fetch_func_t.

What I think we can do is something like

1. Changed parse_probe_arg() to accept struct trace_uprobe *tu
   instead of is_kprobe. Naturally, !tu can be used instead.

2. Introduce

struct dotranslate_fetch_param {
struct trace_uprobe *tu;
fetch_func_tfetch;
fetch_func_tfetch_size;
};

3. Change the case '*' above to do

case '*':
if (!tu)
return -EINVAL;

struct dotranslate_fetch_param *xxx = kmalloc(..);

xxx-fetch = t-fetch[FETCH_MTD_memory];

// ... kstrtoul, fetch_size, etc, ...

f-fn = t-fetch[FETCH_MTD_memory_dotranslate];
f-data = (void *)xxx;

4. Update traceprobe_free_probe_arg/etc.

5. Now,

void FETCH_FUNC_NAME(memory_dotranslate, type)(addr, ...)
{
struct dotranslate_fetch_param *xxx = data;
void __user *uaddr = get_user_vaddr(regs, 

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Oleg Nesterov
Forgot to mention,

On 11/06, Oleg Nesterov wrote:

 I meant,

   saved_ip = instruction_pointer(regs);

   // pass the ip which was used to calculate
   // the @addr argument to fetch_*() methods

   temp_ip = is_ret_probe(tu) ? func : saved_ip;
   temp_ip -= tu-offset;
   instruction_pointer_set(temp_ip);

   store_trace_args(...);

Note that instruction_pointer_set() is not really nice in any case,
this can obviously confuse FETCH_MTD_reg(ip).

But lets ignore this. The solution is simple, we can pass/use this
info via current-utask. We can either add the new member, or add
a union. Or simply reuse xol_vaddr. Doesn't matter.

So the only question is should we rely on instruction_pointer/func
to translate the address or we should do something else (say, vma).

So far I like this approach.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-06 Thread Namhyung Kim
Hi Oleg,

On Wed, 6 Nov 2013 17:28:06 +0100, Oleg Nesterov wrote:
 On 11/06, Namhyung Kim wrote:

 On Tue, 5 Nov 2013 18:45:35 +0100, Oleg Nesterov wrote:
  On 11/05, Namhyung Kim wrote:
 
  This is what I have for now:
 
  static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
  addr,
   struct trace_uprobe *tu)
  {
unsigned long base_addr;
unsigned long vaddr;
 
base_addr = instruction_pointer(regs) - tu-offset;
vaddr = base_addr + addr;
 
return (void __force __user *) vaddr;
  }
 
  When I tested it, it was able to fetch global and bss data from both of
  executable and library properly.
 
  Heh ;) I didn't expect you will agree with this suggestion. But if you
  think it can work - great!

 It seems to work for me well except the cross-fetch.

 Yes, but cross-fetching needs something different anyway, so I think we
 should discuss this separately.

Okay.


 But I'm not sure it'll work for every cases.

 I think ip - tu-offset + vaddr trick should always work, just we need
 to calculate this vaddr passed as an argument correctly.

Right.


 Except: user-space can create another executable mapping and call the
 probed function via another address, but I think we can ignore this.
 And I think we can do nothing in this case, because in this case we
 can't even rely on tu-inode.

Agreed.


  As for -= tu-offset... Can't we avoid it? User-space needs to calculate
  the @ argument anyway, why it can't also substruct this offset?

 Hmm.. it makes sense too. :)

 I am no longer sure ;)

 This way the @ argument will look more confusing, it will depend on the
 address/offset of the probed insn. But again, I do not know, this is up
 to you.

That said, I'd prefer the original -= -tu-offset approach.  It'll
make debugging easier IMHO.


  But it still doesn't work for uretprobes
  as you said before.
 
  This looks simple,
 
 +   if (is_ret_probe(tu)) {
 +   saved_ip = instruction_pointer(regs);
 +   instruction_pointer_set(func);
 +   }
 store_trace_args(...);
 +   if (is_ret_probe(tu))
 +   instruction_pointer_set(saved_ip);
 
  although not pretty.

 So for normal non-uretprobes, func == instruction_pointer(), right?

 No, for normal non-uretprobes func == 0 (actually, undefined).

Ah, okay.


 If so, just passing func as you suggested looks better than this.

 Not sure I understand... OK, we can change uprobe_trace_func() and
 uprobe_perf_func()

   if (!is_ret_probe(tu))
   -   uprobe_trace_print(tu, 0, regs);
   +   uprobe_trace_print(tu, instruction_pointer(regs), regs);
   return 0;

 but why?

 We need the saved_ip ugly hack above only if is_ret_probe() == T and
 thus instruction_pointer() doesn't match the address of the probed function.
 And there is no way to pass some additional info to call_fetch/etc from
 uprobe_*_print().

Ah, I was confused that the 'func' is somewhat available in trace_uprobe
and it can make to avoid passing regs to get_user_vaddr().

Sorry for the noise.


 See also another email...

Will do.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Oleg Nesterov wrote:
>
> As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
> the "@" argument anyway, why it can't also substruct this offset?
>
> Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
> in this case it needs another argument, not sure...

Or,

>   +   if (is_ret_probe(tu)) {
>   +   saved_ip = instruction_pointer(regs);
>   +   instruction_pointer_set(func);
>   +   }
>   store_trace_args(...);
>   +   if (is_ret_probe(tu))
>   +   instruction_pointer_set(saved_ip);

we can put "-= tu->offset" here.

> although not pretty.

Yes.

Or. Perhaps we can leave "case '@'" in parse_probe_arg() and
FETCH_MTD_memory alone. You seem to agree that "absolute address"
can be useful anyway.

Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
and, say, the new "case '*'" in parse_probe_arg() should add all the
neccessary info as f->data (like, say, FETCH_MTD_symbol).

But, just in case, I do not have a strong opinion. Just I think it
is better to discuss every choice we have.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:
>
> This is what I have for now:
>
> static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long addr,
>  struct trace_uprobe *tu)
> {
>   unsigned long base_addr;
>   unsigned long vaddr;
>
>   base_addr = instruction_pointer(regs) - tu->offset;
>   vaddr = base_addr + addr;
>
>   return (void __force __user *) vaddr;
> }
>
> When I tested it, it was able to fetch global and bss data from both of
> executable and library properly.

Heh ;) I didn't expect you will agree with this suggestion. But if you
think it can work - great!

Let me clarify just in case. Yes, _personally_ I think we should try
to avoid the vma games, and it looks better to me this way. But I won't
argue if you change your mind, I understand this approach has its own
disadvantages.

As for "-= tu->offset"... Can't we avoid it? User-space needs to calculate
the "@" argument anyway, why it can't also substruct this offset?

Or perhaps we can change parse_probe_arg("@") to update "param" ? Yes,
in this case it needs another argument, not sure...

> But it still doesn't work for uretprobes
> as you said before.

This looks simple,

+   if (is_ret_probe(tu)) {
+   saved_ip = instruction_pointer(regs);
+   instruction_pointer_set(func);
+   }
store_trace_args(...);
+   if (is_ret_probe(tu))
+   instruction_pointer_set(saved_ip);

although not pretty.

> This symbol offset calculation was done in the getsymoff which implemented
> like below (I'm sure there's a much simpler way to do this, but ...).

Perhaps I'll even try to read/understand it later, but this elf stuff is
the black magic to me ;)

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:
>
> On Mon, 4 Nov 2013 16:01:12 +0100, Oleg Nesterov wrote:
> > Or the syntax should be "name=probe @file/addr" or something like this.
>
> Okay.  Let's call this kind of thing "cross-fetch" (or a better name can
> be suggested).

Yes ;) and I am afraid there was some confusion in our discussion.
I probably confused "probe other binaries" with cross-fetch and vice
versa sometimes.

> This is more complex situation and needs more discussion
> as you said.  So let's skip the discussion for now. :)

Agreed.

> > So far I think that trace_uprobes.c should not play games with vma. At all.
>
> Yes, playing with vma is fragile.  But otherwise how can we get the
> address from the file+offset in random processes?

Yes, this is not as simple as I thought.

Let me repeat, somehow I completely forgot we need to probe other (libc)
binaries (not only the executable itself) and dump the data from that
binary. That is why I wrongly thought that that ->start_data trick can
work.

OK, I see other emails from you. Perhaps we can rely on instruction_pointer(),
(I'll write more emails on this), But if not, then we probably need tu->inode
and vma games in fetch/get_user_vaddr(). I'd still like to avoid this, if
possible, but I am no longer sure.


> >> > Can't we simply implement get_user_vaddr() as
> >> >
> >> >  static void __user *get_user_vaddr(unsigned long addr, struct 
> >> > trace_uprobe *tu)
> >> >  {
> >> >  void __user *vaddr = (void __force __user *)addr;
> >> >
> >> >  /* A NULL tu means that we already got the vaddr */
> >> >  if (tu)
> >> >  vaddr += (current->mm->start_data & PAGE_MASK);
> >> >
> >> >  return vaddr;
> >> >  }
> >> >
> >> This can only work with the probes fetching data from the executable,
> >> right?  But as I said it should support any other binaries too.
> >
> > See above, we can't in general read other binaries.
>
> Okay, I need to clarify my words.  I'm not saying about "cross-fetch"
> here, what I wanted to say is adding a probe in some dso and fetch data
> from the dso.
>
> [...snip...]

Yes, sorry for confusion, see above.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:
>
> On Mon, 4 Nov 2013 17:22:29 +0100, Oleg Nesterov wrote:
> >
> > So probably I was wrong and this all needs more thinking. Damn.
>
> :)

Don't laugh at me ;)

> > Perhaps we really need to pass @file/offset, but it is not clear what
> > we can do with bss/anon-mapping.
>
> The @file/offset should work with bss since data in bss is accessed via
> an offset in the program,

Yes, yes, but it is still not clear to me how we can identify the "right"
vma which doesn't cover the .bss address but can be used to calculate the
offset.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:
>
> On Mon, 4 Nov 2013 19:57:54 +0100, Oleg Nesterov wrote:
> >>
> >>static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
> >> addr)
> >>{
> >>return (void __force __user *)addr + instruction_pointer(regs);
> >>}
> >>
> >> ?
> >>
> >> This should solve the problems with relocations/randomization/bss.
> >>
> >> The obvious disadvantage is that it is not easy to calculate the
> >> offset we need to pass as an argument, it depends on the probed
> >> function.
> >
> > forgot to mention... and instruction_pointer() can't work in ret-probe,
> > we need to pass the "unsigned long func" arg somehow...
>
> Hmm.. what's the value of tu->offset in this case?  Does it have the
> offset of the return address or the start of the function?

It is the offest of function. IOW, it is the same regardless of
is_ret_probe().

Ignoring probes_seq_show() we only need it for uprobe_register().

(yes, it is also used in uprobe_unregister/apply, but this is only
 because this API ugly and should be cleanuped).

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:

 On Mon, 4 Nov 2013 19:57:54 +0100, Oleg Nesterov wrote:
 
 static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
  addr)
 {
 return (void __force __user *)addr + instruction_pointer(regs);
 }
 
  ?
 
  This should solve the problems with relocations/randomization/bss.
 
  The obvious disadvantage is that it is not easy to calculate the
  offset we need to pass as an argument, it depends on the probed
  function.
 
  forgot to mention... and instruction_pointer() can't work in ret-probe,
  we need to pass the unsigned long func arg somehow...

 Hmm.. what's the value of tu-offset in this case?  Does it have the
 offset of the return address or the start of the function?

It is the offest of function. IOW, it is the same regardless of
is_ret_probe().

Ignoring probes_seq_show() we only need it for uprobe_register().

(yes, it is also used in uprobe_unregister/apply, but this is only
 because this API ugly and should be cleanuped).

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:

 On Mon, 4 Nov 2013 17:22:29 +0100, Oleg Nesterov wrote:
 
  So probably I was wrong and this all needs more thinking. Damn.

 :)

Don't laugh at me ;)

  Perhaps we really need to pass @file/offset, but it is not clear what
  we can do with bss/anon-mapping.

 The @file/offset should work with bss since data in bss is accessed via
 an offset in the program,

Yes, yes, but it is still not clear to me how we can identify the right
vma which doesn't cover the .bss address but can be used to calculate the
offset.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:

 On Mon, 4 Nov 2013 16:01:12 +0100, Oleg Nesterov wrote:
  Or the syntax should be name=probe @file/addr or something like this.

 Okay.  Let's call this kind of thing cross-fetch (or a better name can
 be suggested).

Yes ;) and I am afraid there was some confusion in our discussion.
I probably confused probe other binaries with cross-fetch and vice
versa sometimes.

 This is more complex situation and needs more discussion
 as you said.  So let's skip the discussion for now. :)

Agreed.

  So far I think that trace_uprobes.c should not play games with vma. At all.

 Yes, playing with vma is fragile.  But otherwise how can we get the
 address from the file+offset in random processes?

Yes, this is not as simple as I thought.

Let me repeat, somehow I completely forgot we need to probe other (libc)
binaries (not only the executable itself) and dump the data from that
binary. That is why I wrongly thought that that -start_data trick can
work.

OK, I see other emails from you. Perhaps we can rely on instruction_pointer(),
(I'll write more emails on this), But if not, then we probably need tu-inode
and vma games in fetch/get_user_vaddr(). I'd still like to avoid this, if
possible, but I am no longer sure.


   Can't we simply implement get_user_vaddr() as
  
static void __user *get_user_vaddr(unsigned long addr, struct 
   trace_uprobe *tu)
{
void __user *vaddr = (void __force __user *)addr;
  
/* A NULL tu means that we already got the vaddr */
if (tu)
vaddr += (current-mm-start_data  PAGE_MASK);
  
return vaddr;
}
  
  This can only work with the probes fetching data from the executable,
  right?  But as I said it should support any other binaries too.
 
  See above, we can't in general read other binaries.

 Okay, I need to clarify my words.  I'm not saying about cross-fetch
 here, what I wanted to say is adding a probe in some dso and fetch data
 from the dso.

 [...snip...]

Yes, sorry for confusion, see above.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Namhyung Kim wrote:

 This is what I have for now:

 static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long addr,
  struct trace_uprobe *tu)
 {
   unsigned long base_addr;
   unsigned long vaddr;

   base_addr = instruction_pointer(regs) - tu-offset;
   vaddr = base_addr + addr;

   return (void __force __user *) vaddr;
 }

 When I tested it, it was able to fetch global and bss data from both of
 executable and library properly.

Heh ;) I didn't expect you will agree with this suggestion. But if you
think it can work - great!

Let me clarify just in case. Yes, _personally_ I think we should try
to avoid the vma games, and it looks better to me this way. But I won't
argue if you change your mind, I understand this approach has its own
disadvantages.

As for -= tu-offset... Can't we avoid it? User-space needs to calculate
the @ argument anyway, why it can't also substruct this offset?

Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
in this case it needs another argument, not sure...

 But it still doesn't work for uretprobes
 as you said before.

This looks simple,

+   if (is_ret_probe(tu)) {
+   saved_ip = instruction_pointer(regs);
+   instruction_pointer_set(func);
+   }
store_trace_args(...);
+   if (is_ret_probe(tu))
+   instruction_pointer_set(saved_ip);

although not pretty.

 This symbol offset calculation was done in the getsymoff which implemented
 like below (I'm sure there's a much simpler way to do this, but ...).

Perhaps I'll even try to read/understand it later, but this elf stuff is
the black magic to me ;)

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-05 Thread Oleg Nesterov
On 11/05, Oleg Nesterov wrote:

 As for -= tu-offset... Can't we avoid it? User-space needs to calculate
 the @ argument anyway, why it can't also substruct this offset?

 Or perhaps we can change parse_probe_arg(@) to update param ? Yes,
 in this case it needs another argument, not sure...

Or,

   +   if (is_ret_probe(tu)) {
   +   saved_ip = instruction_pointer(regs);
   +   instruction_pointer_set(func);
   +   }
   store_trace_args(...);
   +   if (is_ret_probe(tu))
   +   instruction_pointer_set(saved_ip);

we can put -= tu-offset here.

 although not pretty.

Yes.

Or. Perhaps we can leave case '@' in parse_probe_arg() and
FETCH_MTD_memory alone. You seem to agree that absolute address
can be useful anyway.

Instead, perhaps we can add FETCH_MTD_memory_do_fancy_addr_translation,
and, say, the new case '*' in parse_probe_arg() should add all the
neccessary info as f-data (like, say, FETCH_MTD_symbol).

But, just in case, I do not have a strong opinion. Just I think it
is better to discuss every choice we have.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
This is what I have for now:

static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long addr,
   struct trace_uprobe *tu)
{
unsigned long base_addr;
unsigned long vaddr;

base_addr = instruction_pointer(regs) - tu->offset;
vaddr = base_addr + addr;

return (void __force __user *) vaddr;
}

When I tested it, it was able to fetch global and bss data from both of
executable and library properly.  But it still doesn't work for uretprobes
as you said before.

  # perf probe -x ./uprobe-test -a "t1=test1 bss=@0x203000:s32 
global=@0x201250:s32 str=@0x201254:string"
  # perf probe -x ./uprobe-test -a "t2=test2 bss=@0x203000:s32 
global=@0x201250:s32 str=@0x201254:string"
  # perf probe -x ./uprobe-test -a "t3=test3 bss=@0x203000:s32 
global=@0x201250:s32 str=@0x201254:string"
  # perf probe -x ./libfoo.so -a "t4=foo1 bar=@0x201258:s32 baz=@0x203000:s32"
  # perf probe -x ./libfoo.so -a "t5=foo2 bar=@0x201258:s32 baz=@0x203000:s32"
  # perf probe -x ./libfoo.so -a "t6=foo3 bar=@0x201258:s32 baz=@0x203000:s32"
  # perf record -e probe_uprobe:* -e probe_libfoo:* -- ./uprobe-test

  # perf script | grep -v ^#
 uprobe-test  2997 [002] 13108.308952: probe_uprobe:t1: (400660) bss=0 
global=1 str="hello uprobe"
 uprobe-test  2997 [002] 13108.322479: probe_uprobe:t2: (400666) bss=0 
global=2 str="hello uprobe"
 uprobe-test  2997 [002] 13108.335552: probe_uprobe:t3: (40066c) bss=1 
global=2 str="hello uprobe"
 uprobe-test  2997 [002] 13108.342182: probe_libfoo:t4: (7f5eb977b798) 
bar=7 baz=0
 uprobe-test  2997 [002] 13108.348982: probe_libfoo:t5: (7f5eb977b79e) 
bar=8 baz=0
 uprobe-test  2997 [002] 13108.356041: probe_libfoo:t6: (7f5eb977b7a4) 
bar=8 baz=9


As you can see symbol offset passed to the uprobes now look like 0x203000
since it's the difference to the base mapping address.  For a dso, it's same
as the symbol value, but for an executable the symbol value would be larger
value like 0x603000 since the text segment would be mapped to 0x40.
But still the difference is same, and I believe this applies to the
randomization too.

This symbol offset calculation was done in the getsymoff which implemented
like below (I'm sure there's a much simpler way to do this, but ...).

And I revised my toy test program like this:


/* - 8< - test.c - 8< - */
#include 
#include 

int global = 1;
char str[] = "hello uprobe";
int bss __attribute__((aligned(4096)));

/* this came from libfoo.so */
extern void foo(void);

void test1(void)
{
  /* only for adding probe */
}

void test2(void)
{
  /* only for adding probe */
}

void test3(void)
{
  /* only for adding probe */
}

int main(void)
{
  int local = 3;
  char buf[128];

  test1();
  global = 2;
  test2();
  bss = 1;
  test3();
  foo();
  //  snprintf(buf, sizeof(buf), "cat /proc/%d/maps", getpid());
  //  system(buf);
  return 0;
}


/* - 8< - foo.c - 8< - */
int bar = 7;
int baz __attribute__((aligned(4096)));

void foo1(void)
{
  /* only for adding probe */
}

void foo2(void)
{
  /* only for adding probe */
}

void foo3(void)
{
  /* only for adding probe */
}

void foo(void)
{
  foo1();
  bar = 8;
  foo2();
  baz = 9;
  foo3();
}


/* - 8< - Makefile - 8< - */
PERF=/home/namhyung/project/linux/tools/perf/perf
GETSYMOFF=./getsymoff

define make-args
$(eval ARG1 := $(shell echo "bss=@`${GETSYMOFF} uprobe-test bss`:s32"))
$(eval ARG2 := $(shell echo "global=@`${GETSYMOFF} uprobe-test global`:s32"))
$(eval ARG3 := $(shell echo "str=@`${GETSYMOFF} uprobe-test str`:string"))
$(eval ARG4 := $(shell echo "bar=@`${GETSYMOFF} libfoo.so bar`:s32"))
$(eval ARG5 := $(shell echo "baz=@`${GETSYMOFF} libfoo.so baz`:s32"))
endef

all: uprobe-test

uprobe-test: test.c foo.c
gcc -shared -g -fpic -o libfoo.so foo.c
gcc -g -o $@ test.c -Wl,-rpath,. -L. -lfoo

getsymoff: getsymoff.c
gcc -g -o $@ getsymoff.c -lelf

test: uprobe-test getsymoff
$(call make-args)
${PERF} probe -x ./uprobe-test -a "t1=test1 ${ARG1} ${ARG2} ${ARG3}"
${PERF} probe -x ./uprobe-test -a "t2=test2 ${ARG1} ${ARG2} ${ARG3}"
${PERF} probe -x ./uprobe-test -a "t3=test3 ${ARG1} ${ARG2} ${ARG3}"
${PERF} probe -x ./libfoo.so -a "t4=foo1 ${ARG4} ${ARG5}"
${PERF} probe -x ./libfoo.so -a "t5=foo2 ${ARG4} ${ARG5}"
${PERF} probe -x ./libfoo.so -a "t6=foo3 ${ARG4} ${ARG5}"
${PERF} record -e probe_uprobe:* -e probe_libfoo:* -- ./uprobe-test
${PERF} script | grep -v ^#
${PERF} probe -d probe_uprobe:*
${PERF} probe -d probe_libfoo:*

clean:
rm -f uprobe-test libfoo.so getsymoff *.o *~
#   ${PERF} probe -d probe_uprobe:* -d probe_libfoo:*


/* - 8< - getsymoff.c - 8< - */
#define _GNU_SOURCE
#include 
#include 
#include 
#include 
#include 
#include 

struct sym {
unsigned long addr;
unsigned long size;
char *name;
};


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 19:57:54 +0100, Oleg Nesterov wrote:
> On 11/04, Oleg Nesterov wrote:
>>
>> On 11/04, Oleg Nesterov wrote:
>> >
>> > On 11/04, Oleg Nesterov wrote:
>> > >
>> > > But in any case, I strongly believe that it doesn't make any sense to
>> > > rely on tu->inode in get_user_vaddr().
>> >
>> > Hmm. But I forgot about the case when you probe the function in libc
>> > and want to dump the variable in libc...
>> >
>> > So probably I was wrong and this all needs more thinking. Damn.
>> > Perhaps we really need to pass @file/offset, but it is not clear what
>> > we can do with bss/anon-mapping.
>>
>> Or. Not that I really like this, but just for discussion...
>>
>> How about
>>
>>  static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
>> addr)
>>  {
>>  return (void __force __user *)addr + instruction_pointer(regs);
>>  }
>>
>> ?
>>
>> This should solve the problems with relocations/randomization/bss.
>>
>> The obvious disadvantage is that it is not easy to calculate the
>> offset we need to pass as an argument, it depends on the probed
>> function.
>
> forgot to mention... and instruction_pointer() can't work in ret-probe,
> we need to pass the "unsigned long func" arg somehow...

Hmm.. what's the value of tu->offset in this case?  Does it have the
offset of the return address or the start of the function?

Anyway, what we really need is the base address of the text mapping
IMHO.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 19:47:41 +0100, Oleg Nesterov wrote:
> On 11/04, Oleg Nesterov wrote:
>>
>> On 11/04, Oleg Nesterov wrote:
>> >
>> > But in any case, I strongly believe that it doesn't make any sense to
>> > rely on tu->inode in get_user_vaddr().
>>
>> Hmm. But I forgot about the case when you probe the function in libc
>> and want to dump the variable in libc...
>>
>> So probably I was wrong and this all needs more thinking. Damn.
>> Perhaps we really need to pass @file/offset, but it is not clear what
>> we can do with bss/anon-mapping.
>
> Or. Not that I really like this, but just for discussion...
>
> How about
>
>   static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
> addr)
>   {
>   return (void __force __user *)addr + instruction_pointer(regs);
>   }
>
> ?
>
> This should solve the problems with relocations/randomization/bss.

Right.  I think this approach is more reliable than playing with vma.

>
> The obvious disadvantage is that it is not easy to calculate the
> offset we need to pass as an argument, it depends on the probed
> function.

Well, maybe it's not that hard if we use symbol address in elf image
rather than file offset for the data.

IOW we can get the base mapping address by subtracting tu->offset from
instruction pointer.  And then adding symbol address of the data should
give us the final virtual address, yay!

I'll try to play with it after lunch.

>
> And this still doesn't allow to, say, probe the executable but read
> the data from libc. Unless, again, we attach to the running process
> or randomize_va_space = 0, so we can know it in advance. But otherwise
> I do not think there is any solution.

Yes, cross-fetching is hard, let's go lunch. :)

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 17:22:29 +0100, Oleg Nesterov wrote:
> On 11/04, Oleg Nesterov wrote:
>>
>> But in any case, I strongly believe that it doesn't make any sense to
>> rely on tu->inode in get_user_vaddr().
>
> Hmm. But I forgot about the case when you probe the function in libc
> and want to dump the variable in libc...

Right.  Actually that's what I really wanted.

>
> So probably I was wrong and this all needs more thinking. Damn.

:)

> Perhaps we really need to pass @file/offset, but it is not clear what
> we can do with bss/anon-mapping.

The @file/offset should work with bss since data in bss is accessed via
an offset in the program, but still anon-mapping has nothing to do with
it.  Hmm...

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 16:51:31 +0100, Oleg Nesterov wrote:
> On 11/04, Namhyung Kim wrote:
>> On Mon, 04 Nov 2013 17:46:41 +0900, Namhyung Kim wrote:
>> > On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
>> >> - this only allows to read the data from the same binary.
>> >
>> > Right.  This is also an unnecessary restriction.  We should be able to
>> > access data in other binary.
>>
>> Hmm.. I guess this gonna be not simple
>
> Yes ;)
>
> - perhaps it can only be
>> supported for per-process uprobe with known virtual addresses?
>
> "Known" is very limited. Even in the simplest case (like your test-case
> from from 0/13), you simply can't know the address of "int glob" if you
> compile it with "-pie -fPIC".
>
> As for other binaries (say libc) the problem is even worse, and
> randomize_va_space adds even more pain.

Hmm.. right.  We should deal with relative addresses from the base
mapping address of a binary.

>
> But in any case, I strongly believe that it doesn't make any sense to
> rely on tu->inode in get_user_vaddr().

I'll think about it more.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 16:01:12 +0100, Oleg Nesterov wrote:
> On 11/04, Namhyung Kim wrote:
>>
>> On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
>> >
>> > This does not look right to me.
>> >
>> > - get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
>> >   ->i_mmap_mutex.
>>
>> Hmm.. yes, I think this is not needed.  I guess it should lookup a
>> proper vma in current->mm with mmap_sem read-locked.
>>
>> >
>> > - this only allows to read the data from the same binary.
>>
>> Right.  This is also an unnecessary restriction.  We should be able to
>> access data in other binary.
>
> Yes... but this needs another discussion. In general, we simply can not
> do this with the suggested syntax.

Agreed.

>
> Say you want to probe this "foo" binary and dump "stdin" from libc.so.
> You can't do this. You simply can't know where libc.so will be mmaped.
>
> But: if we attach the event to the already running process, or if we
> disable the randomization, then we can probably do this, see below.
>
> Or the syntax should be "name=probe @file/addr" or something like this.

Okay.  Let's call this kind of thing "cross-fetch" (or a better name can
be suggested).  This is more complex situation and needs more discussion
as you said.  So let's skip the discussion for now. :)

>
>> > - in particular, you can't read the data from bss
>>
>> I can't understand why..  The bss region should also be in a same vma of
>> normal data, no?
>
> No, no. bss is mmaped anonymously, at least in general. See set_brk() in
> load_elf().

Ah, thanks for the pointer.  I also need to say that I'm not familiar
with the code base.

Looking at the code, it seems to add a anon mapping iff the bss region
spans on two or more pages - that's why I missed it from my simple
test. :/

>
>> I thought the gcc somehow aligns data to next page boundary.
>
> And perhaps it even should, my system is old. But this doesn't really
> matter, the process itself can create another mapping.

Right.

>
>> But if
>> it's not the case, we need to recognize which is the proper one..
>>
>> Simply preferring a writable vma to a read-only vma is what's came to my
>> head now.  Do you have an idea?
>
> So far I think that trace_uprobes.c should not play games with vma. At all.

Yes, playing with vma is fragile.  But otherwise how can we get the
address from the file+offset in random processes?

>
>> > ---
>> > Can't we simply implement get_user_vaddr() as
>> >
>> >static void __user *get_user_vaddr(unsigned long addr, struct 
>> > trace_uprobe *tu)
>> >{
>> >void __user *vaddr = (void __force __user *)addr;
>> >
>> >/* A NULL tu means that we already got the vaddr */
>> >if (tu)
>> >vaddr += (current->mm->start_data & PAGE_MASK);
>> >
>> >return vaddr;
>> >}
>> >
>> > ?
>> >
>> > I did this change, and now the test-case above works. And it also works
>> > with "cc -pie -fPIC",
>> >
>> ># nm foo | grep -w global
>> >00200c9c D global
>> >
>> ># perf probe -x ./foo -a "func var=@0xc9c:u32"
>> ># perf record -e probe_foo:func ./foo
>> >...
>> ># perf script | tail -1
>> >foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
>> > var=4321
>> >
>> > What do you think?
>>
>> This can only work with the probes fetching data from the executable,
>> right?  But as I said it should support any other binaries too.
>
> See above, we can't in general read other binaries.

Okay, I need to clarify my words.  I'm not saying about "cross-fetch"
here, what I wanted to say is adding a probe in some dso and fetch data
from the dso.

Primary usecase I have in mind is supporting SDTs in the perf probe
tool.  Currently many libraries including glibc add tracepoints (SDTs)
within themselves to be traced/profilied easily.

You can see Hemant's work on this here:

  https://lkml.org/lkml/2013/10/18/274

>
> But: if we know know where it is mmapped we can do this, just we need
> to calculate the right addr passed to trace_uprobes.
>
> Or: we should support both absolute and relative addresses, this is what
> I was going to discuss later.

But I guess this "specifying address directly" is hard to apply to
multiple processes - like system-wide tracing in perf.

>
>> static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe 
>> *tu)
>> {
>>  unsigned long pgoff = addr >> PAGE_SHIFT;
>>  struct vm_area_struct *vma, *orig_vma = NULL;
>>  unsigned long vaddr = 0;
>>
>>  if (tu == NULL) {
>>  /* A NULL tu means that we already got the vaddr */
>>  return (void __force __user *) addr;
>>  }
>>
>>  down_read(>mm->mmap_sem);
>>
>>  vma = current->mm->mmap;
>
> Cough, it can be null if another thread does munmap(0, TASK_SIZE) ;)
>
> But this doesn't matter.

:)

>
>>  do {
>>  if (!vma->vm_file || 

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Oleg Nesterov wrote:
>
> On 11/04, Oleg Nesterov wrote:
> >
> > On 11/04, Oleg Nesterov wrote:
> > >
> > > But in any case, I strongly believe that it doesn't make any sense to
> > > rely on tu->inode in get_user_vaddr().
> >
> > Hmm. But I forgot about the case when you probe the function in libc
> > and want to dump the variable in libc...
> >
> > So probably I was wrong and this all needs more thinking. Damn.
> > Perhaps we really need to pass @file/offset, but it is not clear what
> > we can do with bss/anon-mapping.
>
> Or. Not that I really like this, but just for discussion...
>
> How about
>
>   static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
> addr)
>   {
>   return (void __force __user *)addr + instruction_pointer(regs);
>   }
>
> ?
>
> This should solve the problems with relocations/randomization/bss.
>
> The obvious disadvantage is that it is not easy to calculate the
> offset we need to pass as an argument, it depends on the probed
> function.

forgot to mention... and instruction_pointer() can't work in ret-probe,
we need to pass the "unsigned long func" arg somehow...

>
> And this still doesn't allow to, say, probe the executable but read
> the data from libc. Unless, again, we attach to the running process
> or randomize_va_space = 0, so we can know it in advance. But otherwise
> I do not think there is any solution.
>
> Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Oleg Nesterov wrote:
>
> On 11/04, Oleg Nesterov wrote:
> >
> > But in any case, I strongly believe that it doesn't make any sense to
> > rely on tu->inode in get_user_vaddr().
>
> Hmm. But I forgot about the case when you probe the function in libc
> and want to dump the variable in libc...
>
> So probably I was wrong and this all needs more thinking. Damn.
> Perhaps we really need to pass @file/offset, but it is not clear what
> we can do with bss/anon-mapping.

Or. Not that I really like this, but just for discussion...

How about

static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
addr)
{
return (void __force __user *)addr + instruction_pointer(regs);
}

?

This should solve the problems with relocations/randomization/bss.

The obvious disadvantage is that it is not easy to calculate the
offset we need to pass as an argument, it depends on the probed
function.

And this still doesn't allow to, say, probe the executable but read
the data from libc. Unless, again, we attach to the running process
or randomize_va_space = 0, so we can know it in advance. But otherwise
I do not think there is any solution.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Oleg Nesterov wrote:
>
> But in any case, I strongly believe that it doesn't make any sense to
> rely on tu->inode in get_user_vaddr().

Hmm. But I forgot about the case when you probe the function in libc
and want to dump the variable in libc...

So probably I was wrong and this all needs more thinking. Damn.
Perhaps we really need to pass @file/offset, but it is not clear what
we can do with bss/anon-mapping.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Namhyung Kim wrote:
> On Mon, 04 Nov 2013 17:46:41 +0900, Namhyung Kim wrote:
> > On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
> >> - this only allows to read the data from the same binary.
> >
> > Right.  This is also an unnecessary restriction.  We should be able to
> > access data in other binary.
>
> Hmm.. I guess this gonna be not simple

Yes ;)

- perhaps it can only be
> supported for per-process uprobe with known virtual addresses?

"Known" is very limited. Even in the simplest case (like your test-case
from from 0/13), you simply can't know the address of "int glob" if you
compile it with "-pie -fPIC".

As for other binaries (say libc) the problem is even worse, and
randomize_va_space adds even more pain.

But in any case, I strongly believe that it doesn't make any sense to
rely on tu->inode in get_user_vaddr().

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Namhyung Kim wrote:
>
> On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
> >
> > This does not look right to me.
> >
> > - get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
> >   ->i_mmap_mutex.
>
> Hmm.. yes, I think this is not needed.  I guess it should lookup a
> proper vma in current->mm with mmap_sem read-locked.
>
> >
> > - this only allows to read the data from the same binary.
>
> Right.  This is also an unnecessary restriction.  We should be able to
> access data in other binary.

Yes... but this needs another discussion. In general, we simply can not
do this with the suggested syntax.

Say you want to probe this "foo" binary and dump "stdin" from libc.so.
You can't do this. You simply can't know where libc.so will be mmaped.

But: if we attach the event to the already running process, or if we
disable the randomization, then we can probably do this, see below.

Or the syntax should be "name=probe @file/addr" or something like this.

> > - in particular, you can't read the data from bss
>
> I can't understand why..  The bss region should also be in a same vma of
> normal data, no?

No, no. bss is mmaped anonymously, at least in general. See set_brk() in
load_elf().

> > #include 
> > #include 
> > #include 
> >
> > unsigned int global = 0x1234;
> >
> > void func(void)
> > {
> > }
> >
> > int main(void)
> > {
> > char cmd[64];
> >
> > global = 0x4321;
> > func();
> >
> > printf("addr = %p\n", );
> >
> > sprintf(cmd, "cat /proc/%d/maps", getpid());
> > system(cmd);
> >
> > return 0;
> > }
> >
> > # nm foo | grep -w global
> > 00600a04 D global
> >
> > # perf probe -x ./foo -a "func var=@0xa04:u32"
> > # perf record -e probe_foo:func ./foo
> > addr = 0x600a04
> > 0040-00401000 r-xp  fe:01 20958 
> >  /root/foo
> > 0060-00601000 rw-p  fe:01 20958 
> >  /root/foo
> > ...
> >
> > # perf script | tail -1
> > foo   555 [000]  1302.345642: probe_foo:func: (40059c) var=1234
> >
> > Note that it reports "1234", not "4321". This is because
> > get_user_vaddr() finds another (1st) read-only mapping, and
> > prints the initial value of "global".
> >
> > IOW, it reads the memory from 0x400a04, not from 0x600a04.
>
> Argh..  This is a problem.
>
> I thought the gcc somehow aligns data to next page boundary.

And perhaps it even should, my system is old. But this doesn't really
matter, the process itself can create another mapping.

> But if
> it's not the case, we need to recognize which is the proper one..
>
> Simply preferring a writable vma to a read-only vma is what's came to my
> head now.  Do you have an idea?

So far I think that trace_uprobes.c should not play games with vma. At all.

> > ---
> > Can't we simply implement get_user_vaddr() as
> >
> > static void __user *get_user_vaddr(unsigned long addr, struct 
> > trace_uprobe *tu)
> > {
> > void __user *vaddr = (void __force __user *)addr;
> >
> > /* A NULL tu means that we already got the vaddr */
> > if (tu)
> > vaddr += (current->mm->start_data & PAGE_MASK);
> >
> > return vaddr;
> > }
> >
> > ?
> >
> > I did this change, and now the test-case above works. And it also works
> > with "cc -pie -fPIC",
> >
> > # nm foo | grep -w global
> > 00200c9c D global
> >
> > # perf probe -x ./foo -a "func var=@0xc9c:u32"
> > # perf record -e probe_foo:func ./foo
> > ...
> > # perf script | tail -1
> > foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
> > var=4321
> >
> > What do you think?
>
> This can only work with the probes fetching data from the executable,
> right?  But as I said it should support any other binaries too.

See above, we can't in general read other binaries.

But: if we know know where it is mmapped we can do this, just we need
to calculate the right addr passed to trace_uprobes.

Or: we should support both absolute and relative addresses, this is what
I was going to discuss later.

> static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe 
> *tu)
> {
>   unsigned long pgoff = addr >> PAGE_SHIFT;
>   struct vm_area_struct *vma, *orig_vma = NULL;
>   unsigned long vaddr = 0;
>
>   if (tu == NULL) {
>   /* A NULL tu means that we already got the vaddr */
>   return (void __force __user *) addr;
>   }
>
>   down_read(>mm->mmap_sem);
>
>   vma = current->mm->mmap;

Cough, it can be null if another thread does munmap(0, TASK_SIZE) ;)

But this doesn't matter.

>   do {
>   if (!vma->vm_file || vma->vm_file->f_inode != tu->inode) {
>  

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 04 Nov 2013 17:46:41 +0900, Namhyung Kim wrote:
> On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
>> - this only allows to read the data from the same binary.
>
> Right.  This is also an unnecessary restriction.  We should be able to
> access data in other binary.

Hmm.. I guess this gonna be not simple - perhaps it can only be
supported for per-process uprobe with known virtual addresses?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
> Hello,
>
> Let me first apologize again if this was already discussed. And I also
> need to mention that I know almost nothing about elf/randomization/etc.

No no, this was not discussed enough.  And I really appreciate your
thorough review! :)

>
> However,
>
> On 10/29, Namhyung Kim wrote:
>>
>>   # nm foo | grep -e glob$ -e str -e foo
>>   006008bc D foo
>>   006008a8 D glob
>>   006008ac D str
>>
>>   # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
>
> This does not look right to me.
>
> - get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
>   ->i_mmap_mutex.

Hmm.. yes, I think this is not needed.  I guess it should lookup a
proper vma in current->mm with mmap_sem read-locked.

>
> - this only allows to read the data from the same binary.

Right.  This is also an unnecessary restriction.  We should be able to
access data in other binary.

>
> - in particular, you can't read the data from bss

I can't understand why..  The bss region should also be in a same vma of
normal data, no?

>
> - get_user_vaddr() looks simply wrong. I blindly applied the whole series
>   and did the test to ensure.
>
>   Test-case:
>
>   #include 
>   #include 
>   #include 
>
>   unsigned int global = 0x1234;
>
>   void func(void)
>   {
>   }
>
>   int main(void)
>   {
>   char cmd[64];
>
>   global = 0x4321;
>   func();
>
>   printf("addr = %p\n", );
>
>   sprintf(cmd, "cat /proc/%d/maps", getpid());
>   system(cmd);
>
>   return 0;
>   }
>
>   # nm foo | grep -w global
>   00600a04 D global
>
>   # perf probe -x ./foo -a "func var=@0xa04:u32"
>   # perf record -e probe_foo:func ./foo
>   addr = 0x600a04
>   0040-00401000 r-xp  fe:01 20958 
>  /root/foo
>   0060-00601000 rw-p  fe:01 20958 
>  /root/foo
>   ...
>
>   # perf script | tail -1
>   foo   555 [000]  1302.345642: probe_foo:func: (40059c) var=1234
>
>   Note that it reports "1234", not "4321". This is because
>   get_user_vaddr() finds another (1st) read-only mapping, and
>   prints the initial value of "global".
>
>   IOW, it reads the memory from 0x400a04, not from 0x600a04.

Argh..  This is a problem.

I thought the gcc somehow aligns data to next page boundary.  But if
it's not the case, we need to recognize which is the proper one..

Simply preferring a writable vma to a read-only vma is what's came to my
head now.  Do you have an idea?

>
> ---
> Can't we simply implement get_user_vaddr() as
>
>   static void __user *get_user_vaddr(unsigned long addr, struct 
> trace_uprobe *tu)
>   {
>   void __user *vaddr = (void __force __user *)addr;
>
>   /* A NULL tu means that we already got the vaddr */
>   if (tu)
>   vaddr += (current->mm->start_data & PAGE_MASK);
>
>   return vaddr;
>   }
>
> ?
>
> I did this change, and now the test-case above works. And it also works
> with "cc -pie -fPIC",
>
>   # nm foo | grep -w global
>   00200c9c D global
>
>   # perf probe -x ./foo -a "func var=@0xc9c:u32"
>   # perf record -e probe_foo:func ./foo
>   ...
>   # perf script | tail -1
>   foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
> var=4321
>
> What do you think?

This can only work with the probes fetching data from the executable,
right?  But as I said it should support any other binaries too.

What about this?


static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe *tu)
{
unsigned long pgoff = addr >> PAGE_SHIFT;
struct vm_area_struct *vma, *orig_vma = NULL;
unsigned long vaddr = 0;

if (tu == NULL) {
/* A NULL tu means that we already got the vaddr */
return (void __force __user *) addr;
}

down_read(>mm->mmap_sem);

vma = current->mm->mmap;
do {
if (!vma->vm_file || vma->vm_file->f_inode != tu->inode) {
/*
 * We found read-only mapping for this inode.
 * (provided that all mappings for this inode
 * have consecutive addresses)
 */
if (orig_vma)
break;
continue;
}

if (vma->vm_pgoff > pgoff ||
(vma->vm_pgoff + vma_pages(vma) <= pgoff))
continue;

orig_vma = vma;

/*
 * We prefer writable mapping over read-only 

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Oleg Nesterov wrote:

 But in any case, I strongly believe that it doesn't make any sense to
 rely on tu-inode in get_user_vaddr().

Hmm. But I forgot about the case when you probe the function in libc
and want to dump the variable in libc...

So probably I was wrong and this all needs more thinking. Damn.
Perhaps we really need to pass @file/offset, but it is not clear what
we can do with bss/anon-mapping.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Oleg Nesterov wrote:

 On 11/04, Oleg Nesterov wrote:
 
  But in any case, I strongly believe that it doesn't make any sense to
  rely on tu-inode in get_user_vaddr().

 Hmm. But I forgot about the case when you probe the function in libc
 and want to dump the variable in libc...

 So probably I was wrong and this all needs more thinking. Damn.
 Perhaps we really need to pass @file/offset, but it is not clear what
 we can do with bss/anon-mapping.

Or. Not that I really like this, but just for discussion...

How about

static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
addr)
{
return (void __force __user *)addr + instruction_pointer(regs);
}

?

This should solve the problems with relocations/randomization/bss.

The obvious disadvantage is that it is not easy to calculate the
offset we need to pass as an argument, it depends on the probed
function.

And this still doesn't allow to, say, probe the executable but read
the data from libc. Unless, again, we attach to the running process
or randomize_va_space = 0, so we can know it in advance. But otherwise
I do not think there is any solution.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Oleg Nesterov wrote:

 On 11/04, Oleg Nesterov wrote:
 
  On 11/04, Oleg Nesterov wrote:
  
   But in any case, I strongly believe that it doesn't make any sense to
   rely on tu-inode in get_user_vaddr().
 
  Hmm. But I forgot about the case when you probe the function in libc
  and want to dump the variable in libc...
 
  So probably I was wrong and this all needs more thinking. Damn.
  Perhaps we really need to pass @file/offset, but it is not clear what
  we can do with bss/anon-mapping.

 Or. Not that I really like this, but just for discussion...

 How about

   static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
 addr)
   {
   return (void __force __user *)addr + instruction_pointer(regs);
   }

 ?

 This should solve the problems with relocations/randomization/bss.

 The obvious disadvantage is that it is not easy to calculate the
 offset we need to pass as an argument, it depends on the probed
 function.

forgot to mention... and instruction_pointer() can't work in ret-probe,
we need to pass the unsigned long func arg somehow...


 And this still doesn't allow to, say, probe the executable but read
 the data from libc. Unless, again, we attach to the running process
 or randomize_va_space = 0, so we can know it in advance. But otherwise
 I do not think there is any solution.

 Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 16:01:12 +0100, Oleg Nesterov wrote:
 On 11/04, Namhyung Kim wrote:

 On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
 
  This does not look right to me.
 
  - get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
-i_mmap_mutex.

 Hmm.. yes, I think this is not needed.  I guess it should lookup a
 proper vma in current-mm with mmap_sem read-locked.

 
  - this only allows to read the data from the same binary.

 Right.  This is also an unnecessary restriction.  We should be able to
 access data in other binary.

 Yes... but this needs another discussion. In general, we simply can not
 do this with the suggested syntax.

Agreed.


 Say you want to probe this foo binary and dump stdin from libc.so.
 You can't do this. You simply can't know where libc.so will be mmaped.

 But: if we attach the event to the already running process, or if we
 disable the randomization, then we can probably do this, see below.

 Or the syntax should be name=probe @file/addr or something like this.

Okay.  Let's call this kind of thing cross-fetch (or a better name can
be suggested).  This is more complex situation and needs more discussion
as you said.  So let's skip the discussion for now. :)


  - in particular, you can't read the data from bss

 I can't understand why..  The bss region should also be in a same vma of
 normal data, no?

 No, no. bss is mmaped anonymously, at least in general. See set_brk() in
 load_elf().

Ah, thanks for the pointer.  I also need to say that I'm not familiar
with the code base.

Looking at the code, it seems to add a anon mapping iff the bss region
spans on two or more pages - that's why I missed it from my simple
test. :/


 I thought the gcc somehow aligns data to next page boundary.

 And perhaps it even should, my system is old. But this doesn't really
 matter, the process itself can create another mapping.

Right.


 But if
 it's not the case, we need to recognize which is the proper one..

 Simply preferring a writable vma to a read-only vma is what's came to my
 head now.  Do you have an idea?

 So far I think that trace_uprobes.c should not play games with vma. At all.

Yes, playing with vma is fragile.  But otherwise how can we get the
address from the file+offset in random processes?


  ---
  Can't we simply implement get_user_vaddr() as
 
 static void __user *get_user_vaddr(unsigned long addr, struct 
  trace_uprobe *tu)
 {
 void __user *vaddr = (void __force __user *)addr;
 
 /* A NULL tu means that we already got the vaddr */
 if (tu)
 vaddr += (current-mm-start_data  PAGE_MASK);
 
 return vaddr;
 }
 
  ?
 
  I did this change, and now the test-case above works. And it also works
  with cc -pie -fPIC,
 
 # nm foo | grep -w global
 00200c9c D global
 
 # perf probe -x ./foo -a func var=@0xc9c:u32
 # perf record -e probe_foo:func ./foo
 ...
 # perf script | tail -1
 foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
  var=4321
 
  What do you think?

 This can only work with the probes fetching data from the executable,
 right?  But as I said it should support any other binaries too.

 See above, we can't in general read other binaries.

Okay, I need to clarify my words.  I'm not saying about cross-fetch
here, what I wanted to say is adding a probe in some dso and fetch data
from the dso.

Primary usecase I have in mind is supporting SDTs in the perf probe
tool.  Currently many libraries including glibc add tracepoints (SDTs)
within themselves to be traced/profilied easily.

You can see Hemant's work on this here:

  https://lkml.org/lkml/2013/10/18/274


 But: if we know know where it is mmapped we can do this, just we need
 to calculate the right addr passed to trace_uprobes.

 Or: we should support both absolute and relative addresses, this is what
 I was going to discuss later.

But I guess this specifying address directly is hard to apply to
multiple processes - like system-wide tracing in perf.


 static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe 
 *tu)
 {
  unsigned long pgoff = addr  PAGE_SHIFT;
  struct vm_area_struct *vma, *orig_vma = NULL;
  unsigned long vaddr = 0;

  if (tu == NULL) {
  /* A NULL tu means that we already got the vaddr */
  return (void __force __user *) addr;
  }

  down_read(current-mm-mmap_sem);

  vma = current-mm-mmap;

 Cough, it can be null if another thread does munmap(0, TASK_SIZE) ;)

 But this doesn't matter.

:)


  do {
  if (!vma-vm_file || vma-vm_file-f_inode != tu-inode) {
  /*
   * We found read-only mapping for this inode.
   * (provided that all mappings for this inode
   * have consecutive addresses)
  

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 16:51:31 +0100, Oleg Nesterov wrote:
 On 11/04, Namhyung Kim wrote:
 On Mon, 04 Nov 2013 17:46:41 +0900, Namhyung Kim wrote:
  On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
  - this only allows to read the data from the same binary.
 
  Right.  This is also an unnecessary restriction.  We should be able to
  access data in other binary.

 Hmm.. I guess this gonna be not simple

 Yes ;)

 - perhaps it can only be
 supported for per-process uprobe with known virtual addresses?

 Known is very limited. Even in the simplest case (like your test-case
 from from 0/13), you simply can't know the address of int glob if you
 compile it with -pie -fPIC.

 As for other binaries (say libc) the problem is even worse, and
 randomize_va_space adds even more pain.

Hmm.. right.  We should deal with relative addresses from the base
mapping address of a binary.


 But in any case, I strongly believe that it doesn't make any sense to
 rely on tu-inode in get_user_vaddr().

I'll think about it more.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 17:22:29 +0100, Oleg Nesterov wrote:
 On 11/04, Oleg Nesterov wrote:

 But in any case, I strongly believe that it doesn't make any sense to
 rely on tu-inode in get_user_vaddr().

 Hmm. But I forgot about the case when you probe the function in libc
 and want to dump the variable in libc...

Right.  Actually that's what I really wanted.


 So probably I was wrong and this all needs more thinking. Damn.

:)

 Perhaps we really need to pass @file/offset, but it is not clear what
 we can do with bss/anon-mapping.

The @file/offset should work with bss since data in bss is accessed via
an offset in the program, but still anon-mapping has nothing to do with
it.  Hmm...

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 19:47:41 +0100, Oleg Nesterov wrote:
 On 11/04, Oleg Nesterov wrote:

 On 11/04, Oleg Nesterov wrote:
 
  But in any case, I strongly believe that it doesn't make any sense to
  rely on tu-inode in get_user_vaddr().

 Hmm. But I forgot about the case when you probe the function in libc
 and want to dump the variable in libc...

 So probably I was wrong and this all needs more thinking. Damn.
 Perhaps we really need to pass @file/offset, but it is not clear what
 we can do with bss/anon-mapping.

 Or. Not that I really like this, but just for discussion...

 How about

   static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
 addr)
   {
   return (void __force __user *)addr + instruction_pointer(regs);
   }

 ?

 This should solve the problems with relocations/randomization/bss.

Right.  I think this approach is more reliable than playing with vma.


 The obvious disadvantage is that it is not easy to calculate the
 offset we need to pass as an argument, it depends on the probed
 function.

Well, maybe it's not that hard if we use symbol address in elf image
rather than file offset for the data.

IOW we can get the base mapping address by subtracting tu-offset from
instruction pointer.  And then adding symbol address of the data should
give us the final virtual address, yay!

I'll try to play with it after lunch.


 And this still doesn't allow to, say, probe the executable but read
 the data from libc. Unless, again, we attach to the running process
 or randomize_va_space = 0, so we can know it in advance. But otherwise
 I do not think there is any solution.

Yes, cross-fetching is hard, let's go lunch. :)

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 4 Nov 2013 19:57:54 +0100, Oleg Nesterov wrote:
 On 11/04, Oleg Nesterov wrote:

 On 11/04, Oleg Nesterov wrote:
 
  On 11/04, Oleg Nesterov wrote:
  
   But in any case, I strongly believe that it doesn't make any sense to
   rely on tu-inode in get_user_vaddr().
 
  Hmm. But I forgot about the case when you probe the function in libc
  and want to dump the variable in libc...
 
  So probably I was wrong and this all needs more thinking. Damn.
  Perhaps we really need to pass @file/offset, but it is not clear what
  we can do with bss/anon-mapping.

 Or. Not that I really like this, but just for discussion...

 How about

  static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long 
 addr)
  {
  return (void __force __user *)addr + instruction_pointer(regs);
  }

 ?

 This should solve the problems with relocations/randomization/bss.

 The obvious disadvantage is that it is not easy to calculate the
 offset we need to pass as an argument, it depends on the probed
 function.

 forgot to mention... and instruction_pointer() can't work in ret-probe,
 we need to pass the unsigned long func arg somehow...

Hmm.. what's the value of tu-offset in this case?  Does it have the
offset of the return address or the start of the function?

Anyway, what we really need is the base address of the text mapping
IMHO.

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
This is what I have for now:

static void __user *get_user_vaddr(struct pt_regs *regs, unsigned long addr,
   struct trace_uprobe *tu)
{
unsigned long base_addr;
unsigned long vaddr;

base_addr = instruction_pointer(regs) - tu-offset;
vaddr = base_addr + addr;

return (void __force __user *) vaddr;
}

When I tested it, it was able to fetch global and bss data from both of
executable and library properly.  But it still doesn't work for uretprobes
as you said before.

  # perf probe -x ./uprobe-test -a t1=test1 bss=@0x203000:s32 
global=@0x201250:s32 str=@0x201254:string
  # perf probe -x ./uprobe-test -a t2=test2 bss=@0x203000:s32 
global=@0x201250:s32 str=@0x201254:string
  # perf probe -x ./uprobe-test -a t3=test3 bss=@0x203000:s32 
global=@0x201250:s32 str=@0x201254:string
  # perf probe -x ./libfoo.so -a t4=foo1 bar=@0x201258:s32 baz=@0x203000:s32
  # perf probe -x ./libfoo.so -a t5=foo2 bar=@0x201258:s32 baz=@0x203000:s32
  # perf probe -x ./libfoo.so -a t6=foo3 bar=@0x201258:s32 baz=@0x203000:s32
  # perf record -e probe_uprobe:* -e probe_libfoo:* -- ./uprobe-test

  # perf script | grep -v ^#
 uprobe-test  2997 [002] 13108.308952: probe_uprobe:t1: (400660) bss=0 
global=1 str=hello uprobe
 uprobe-test  2997 [002] 13108.322479: probe_uprobe:t2: (400666) bss=0 
global=2 str=hello uprobe
 uprobe-test  2997 [002] 13108.335552: probe_uprobe:t3: (40066c) bss=1 
global=2 str=hello uprobe
 uprobe-test  2997 [002] 13108.342182: probe_libfoo:t4: (7f5eb977b798) 
bar=7 baz=0
 uprobe-test  2997 [002] 13108.348982: probe_libfoo:t5: (7f5eb977b79e) 
bar=8 baz=0
 uprobe-test  2997 [002] 13108.356041: probe_libfoo:t6: (7f5eb977b7a4) 
bar=8 baz=9


As you can see symbol offset passed to the uprobes now look like 0x203000
since it's the difference to the base mapping address.  For a dso, it's same
as the symbol value, but for an executable the symbol value would be larger
value like 0x603000 since the text segment would be mapped to 0x40.
But still the difference is same, and I believe this applies to the
randomization too.

This symbol offset calculation was done in the getsymoff which implemented
like below (I'm sure there's a much simpler way to do this, but ...).

And I revised my toy test program like this:


/* - 8 - test.c - 8 - */
#include stdio.h
#include stdlib.h

int global = 1;
char str[] = hello uprobe;
int bss __attribute__((aligned(4096)));

/* this came from libfoo.so */
extern void foo(void);

void test1(void)
{
  /* only for adding probe */
}

void test2(void)
{
  /* only for adding probe */
}

void test3(void)
{
  /* only for adding probe */
}

int main(void)
{
  int local = 3;
  char buf[128];

  test1();
  global = 2;
  test2();
  bss = 1;
  test3();
  foo();
  //  snprintf(buf, sizeof(buf), cat /proc/%d/maps, getpid());
  //  system(buf);
  return 0;
}


/* - 8 - foo.c - 8 - */
int bar = 7;
int baz __attribute__((aligned(4096)));

void foo1(void)
{
  /* only for adding probe */
}

void foo2(void)
{
  /* only for adding probe */
}

void foo3(void)
{
  /* only for adding probe */
}

void foo(void)
{
  foo1();
  bar = 8;
  foo2();
  baz = 9;
  foo3();
}


/* - 8 - Makefile - 8 - */
PERF=/home/namhyung/project/linux/tools/perf/perf
GETSYMOFF=./getsymoff

define make-args
$(eval ARG1 := $(shell echo bss=@`${GETSYMOFF} uprobe-test bss`:s32))
$(eval ARG2 := $(shell echo global=@`${GETSYMOFF} uprobe-test global`:s32))
$(eval ARG3 := $(shell echo str=@`${GETSYMOFF} uprobe-test str`:string))
$(eval ARG4 := $(shell echo bar=@`${GETSYMOFF} libfoo.so bar`:s32))
$(eval ARG5 := $(shell echo baz=@`${GETSYMOFF} libfoo.so baz`:s32))
endef

all: uprobe-test

uprobe-test: test.c foo.c
gcc -shared -g -fpic -o libfoo.so foo.c
gcc -g -o $@ test.c -Wl,-rpath,. -L. -lfoo

getsymoff: getsymoff.c
gcc -g -o $@ getsymoff.c -lelf

test: uprobe-test getsymoff
$(call make-args)
${PERF} probe -x ./uprobe-test -a t1=test1 ${ARG1} ${ARG2} ${ARG3}
${PERF} probe -x ./uprobe-test -a t2=test2 ${ARG1} ${ARG2} ${ARG3}
${PERF} probe -x ./uprobe-test -a t3=test3 ${ARG1} ${ARG2} ${ARG3}
${PERF} probe -x ./libfoo.so -a t4=foo1 ${ARG4} ${ARG5}
${PERF} probe -x ./libfoo.so -a t5=foo2 ${ARG4} ${ARG5}
${PERF} probe -x ./libfoo.so -a t6=foo3 ${ARG4} ${ARG5}
${PERF} record -e probe_uprobe:* -e probe_libfoo:* -- ./uprobe-test
${PERF} script | grep -v ^#
${PERF} probe -d probe_uprobe:*
${PERF} probe -d probe_libfoo:*

clean:
rm -f uprobe-test libfoo.so getsymoff *.o *~
#   ${PERF} probe -d probe_uprobe:* -d probe_libfoo:*


/* - 8 - getsymoff.c - 8 - */
#define _GNU_SOURCE
#include stdio.h
#include stdlib.h
#include string.h
#include fcntl.h
#include errno.h
#include gelf.h

struct sym {
unsigned long addr;
unsigned long size;
char *name;
};


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
 Hello,

 Let me first apologize again if this was already discussed. And I also
 need to mention that I know almost nothing about elf/randomization/etc.

No no, this was not discussed enough.  And I really appreciate your
thorough review! :)


 However,

 On 10/29, Namhyung Kim wrote:

   # nm foo | grep -e glob$ -e str -e foo
   006008bc D foo
   006008a8 D glob
   006008ac D str

   # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \

 This does not look right to me.

 - get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
   -i_mmap_mutex.

Hmm.. yes, I think this is not needed.  I guess it should lookup a
proper vma in current-mm with mmap_sem read-locked.


 - this only allows to read the data from the same binary.

Right.  This is also an unnecessary restriction.  We should be able to
access data in other binary.


 - in particular, you can't read the data from bss

I can't understand why..  The bss region should also be in a same vma of
normal data, no?


 - get_user_vaddr() looks simply wrong. I blindly applied the whole series
   and did the test to ensure.

   Test-case:

   #include stdio.h
   #include stdlib.h
   #include unistd.h

   unsigned int global = 0x1234;

   void func(void)
   {
   }

   int main(void)
   {
   char cmd[64];

   global = 0x4321;
   func();

   printf(addr = %p\n, global);

   sprintf(cmd, cat /proc/%d/maps, getpid());
   system(cmd);

   return 0;
   }

   # nm foo | grep -w global
   00600a04 D global

   # perf probe -x ./foo -a func var=@0xa04:u32
   # perf record -e probe_foo:func ./foo
   addr = 0x600a04
   0040-00401000 r-xp  fe:01 20958 
  /root/foo
   0060-00601000 rw-p  fe:01 20958 
  /root/foo
   ...

   # perf script | tail -1
   foo   555 [000]  1302.345642: probe_foo:func: (40059c) var=1234

   Note that it reports 1234, not 4321. This is because
   get_user_vaddr() finds another (1st) read-only mapping, and
   prints the initial value of global.

   IOW, it reads the memory from 0x400a04, not from 0x600a04.

Argh..  This is a problem.

I thought the gcc somehow aligns data to next page boundary.  But if
it's not the case, we need to recognize which is the proper one..

Simply preferring a writable vma to a read-only vma is what's came to my
head now.  Do you have an idea?


 ---
 Can't we simply implement get_user_vaddr() as

   static void __user *get_user_vaddr(unsigned long addr, struct 
 trace_uprobe *tu)
   {
   void __user *vaddr = (void __force __user *)addr;

   /* A NULL tu means that we already got the vaddr */
   if (tu)
   vaddr += (current-mm-start_data  PAGE_MASK);

   return vaddr;
   }

 ?

 I did this change, and now the test-case above works. And it also works
 with cc -pie -fPIC,

   # nm foo | grep -w global
   00200c9c D global

   # perf probe -x ./foo -a func var=@0xc9c:u32
   # perf record -e probe_foo:func ./foo
   ...
   # perf script | tail -1
   foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
 var=4321

 What do you think?

This can only work with the probes fetching data from the executable,
right?  But as I said it should support any other binaries too.

What about this?


static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe *tu)
{
unsigned long pgoff = addr  PAGE_SHIFT;
struct vm_area_struct *vma, *orig_vma = NULL;
unsigned long vaddr = 0;

if (tu == NULL) {
/* A NULL tu means that we already got the vaddr */
return (void __force __user *) addr;
}

down_read(current-mm-mmap_sem);

vma = current-mm-mmap;
do {
if (!vma-vm_file || vma-vm_file-f_inode != tu-inode) {
/*
 * We found read-only mapping for this inode.
 * (provided that all mappings for this inode
 * have consecutive addresses)
 */
if (orig_vma)
break;
continue;
}

if (vma-vm_pgoff  pgoff ||
(vma-vm_pgoff + vma_pages(vma) = pgoff))
continue;

orig_vma = vma;

/*
 * We prefer writable mapping over read-only since
 * data is usually in read/write memory region.  But
 * in case of 

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Namhyung Kim
On Mon, 04 Nov 2013 17:46:41 +0900, Namhyung Kim wrote:
 On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
 - this only allows to read the data from the same binary.

 Right.  This is also an unnecessary restriction.  We should be able to
 access data in other binary.

Hmm.. I guess this gonna be not simple - perhaps it can only be
supported for per-process uprobe with known virtual addresses?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Namhyung Kim wrote:

 On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
 
  This does not look right to me.
 
  - get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
-i_mmap_mutex.

 Hmm.. yes, I think this is not needed.  I guess it should lookup a
 proper vma in current-mm with mmap_sem read-locked.

 
  - this only allows to read the data from the same binary.

 Right.  This is also an unnecessary restriction.  We should be able to
 access data in other binary.

Yes... but this needs another discussion. In general, we simply can not
do this with the suggested syntax.

Say you want to probe this foo binary and dump stdin from libc.so.
You can't do this. You simply can't know where libc.so will be mmaped.

But: if we attach the event to the already running process, or if we
disable the randomization, then we can probably do this, see below.

Or the syntax should be name=probe @file/addr or something like this.

  - in particular, you can't read the data from bss

 I can't understand why..  The bss region should also be in a same vma of
 normal data, no?

No, no. bss is mmaped anonymously, at least in general. See set_brk() in
load_elf().

  #include stdio.h
  #include stdlib.h
  #include unistd.h
 
  unsigned int global = 0x1234;
 
  void func(void)
  {
  }
 
  int main(void)
  {
  char cmd[64];
 
  global = 0x4321;
  func();
 
  printf(addr = %p\n, global);
 
  sprintf(cmd, cat /proc/%d/maps, getpid());
  system(cmd);
 
  return 0;
  }
 
  # nm foo | grep -w global
  00600a04 D global
 
  # perf probe -x ./foo -a func var=@0xa04:u32
  # perf record -e probe_foo:func ./foo
  addr = 0x600a04
  0040-00401000 r-xp  fe:01 20958 
   /root/foo
  0060-00601000 rw-p  fe:01 20958 
   /root/foo
  ...
 
  # perf script | tail -1
  foo   555 [000]  1302.345642: probe_foo:func: (40059c) var=1234
 
  Note that it reports 1234, not 4321. This is because
  get_user_vaddr() finds another (1st) read-only mapping, and
  prints the initial value of global.
 
  IOW, it reads the memory from 0x400a04, not from 0x600a04.

 Argh..  This is a problem.

 I thought the gcc somehow aligns data to next page boundary.

And perhaps it even should, my system is old. But this doesn't really
matter, the process itself can create another mapping.

 But if
 it's not the case, we need to recognize which is the proper one..

 Simply preferring a writable vma to a read-only vma is what's came to my
 head now.  Do you have an idea?

So far I think that trace_uprobes.c should not play games with vma. At all.

  ---
  Can't we simply implement get_user_vaddr() as
 
  static void __user *get_user_vaddr(unsigned long addr, struct 
  trace_uprobe *tu)
  {
  void __user *vaddr = (void __force __user *)addr;
 
  /* A NULL tu means that we already got the vaddr */
  if (tu)
  vaddr += (current-mm-start_data  PAGE_MASK);
 
  return vaddr;
  }
 
  ?
 
  I did this change, and now the test-case above works. And it also works
  with cc -pie -fPIC,
 
  # nm foo | grep -w global
  00200c9c D global
 
  # perf probe -x ./foo -a func var=@0xc9c:u32
  # perf record -e probe_foo:func ./foo
  ...
  # perf script | tail -1
  foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
  var=4321
 
  What do you think?

 This can only work with the probes fetching data from the executable,
 right?  But as I said it should support any other binaries too.

See above, we can't in general read other binaries.

But: if we know know where it is mmapped we can do this, just we need
to calculate the right addr passed to trace_uprobes.

Or: we should support both absolute and relative addresses, this is what
I was going to discuss later.

 static void __user *get_user_vaddr(unsigned long addr, struct trace_uprobe 
 *tu)
 {
   unsigned long pgoff = addr  PAGE_SHIFT;
   struct vm_area_struct *vma, *orig_vma = NULL;
   unsigned long vaddr = 0;

   if (tu == NULL) {
   /* A NULL tu means that we already got the vaddr */
   return (void __force __user *) addr;
   }

   down_read(current-mm-mmap_sem);

   vma = current-mm-mmap;

Cough, it can be null if another thread does munmap(0, TASK_SIZE) ;)

But this doesn't matter.

   do {
   if (!vma-vm_file || vma-vm_file-f_inode != tu-inode) {
   /*
* We found read-only mapping for this inode.
* (provided that all mappings for this inode
* have consecutive addresses)
  

Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-04 Thread Oleg Nesterov
On 11/04, Namhyung Kim wrote:
 On Mon, 04 Nov 2013 17:46:41 +0900, Namhyung Kim wrote:
  On Sat, 2 Nov 2013 16:54:58 +0100, Oleg Nesterov wrote:
  - this only allows to read the data from the same binary.
 
  Right.  This is also an unnecessary restriction.  We should be able to
  access data in other binary.

 Hmm.. I guess this gonna be not simple

Yes ;)

- perhaps it can only be
 supported for per-process uprobe with known virtual addresses?

Known is very limited. Even in the simplest case (like your test-case
from from 0/13), you simply can't know the address of int glob if you
compile it with -pie -fPIC.

As for other binaries (say libc) the problem is even worse, and
randomize_va_space adds even more pain.

But in any case, I strongly believe that it doesn't make any sense to
rely on tu-inode in get_user_vaddr().

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-02 Thread Oleg Nesterov
Hello,

Let me first apologize again if this was already discussed. And I also
need to mention that I know almost nothing about elf/randomization/etc.

However,

On 10/29, Namhyung Kim wrote:
>
>   # nm foo | grep -e glob$ -e str -e foo
>   006008bc D foo
>   006008a8 D glob
>   006008ac D str
>
>   # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \

This does not look right to me.

- get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
  ->i_mmap_mutex.

- this only allows to read the data from the same binary.

- in particular, you can't read the data from bss

- get_user_vaddr() looks simply wrong. I blindly applied the whole series
  and did the test to ensure.

  Test-case:

#include 
#include 
#include 

unsigned int global = 0x1234;

void func(void)
{
}

int main(void)
{
char cmd[64];

global = 0x4321;
func();

printf("addr = %p\n", );

sprintf(cmd, "cat /proc/%d/maps", getpid());
system(cmd);

return 0;
}

# nm foo | grep -w global
00600a04 D global

# perf probe -x ./foo -a "func var=@0xa04:u32"
# perf record -e probe_foo:func ./foo
addr = 0x600a04
0040-00401000 r-xp  fe:01 20958 
 /root/foo
0060-00601000 rw-p  fe:01 20958 
 /root/foo
...

# perf script | tail -1
foo   555 [000]  1302.345642: probe_foo:func: (40059c) var=1234

Note that it reports "1234", not "4321". This is because
get_user_vaddr() finds another (1st) read-only mapping, and
prints the initial value of "global".

IOW, it reads the memory from 0x400a04, not from 0x600a04.

---
Can't we simply implement get_user_vaddr() as

static void __user *get_user_vaddr(unsigned long addr, struct 
trace_uprobe *tu)
{
void __user *vaddr = (void __force __user *)addr;

/* A NULL tu means that we already got the vaddr */
if (tu)
vaddr += (current->mm->start_data & PAGE_MASK);

return vaddr;
}

?

I did this change, and now the test-case above works. And it also works
with "cc -pie -fPIC",

# nm foo | grep -w global
00200c9c D global

# perf probe -x ./foo -a "func var=@0xc9c:u32"
# perf record -e probe_foo:func ./foo
...
# perf script | tail -1
foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
var=4321

What do you think?

---
Note:
- I think that /* A NULL tu means that we already got the vaddr */
  needs more discussion... IOW, I am not sure about 11/13.

- Perhaps it also makes sense to allow to pass the absolute address
  (iow, += start_data should be conditional)

but lets ignore this for now.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-11-02 Thread Oleg Nesterov
Hello,

Let me first apologize again if this was already discussed. And I also
need to mention that I know almost nothing about elf/randomization/etc.

However,

On 10/29, Namhyung Kim wrote:

   # nm foo | grep -e glob$ -e str -e foo
   006008bc D foo
   006008a8 D glob
   006008ac D str

   # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \

This does not look right to me.

- get_user_vaddr() is costly, it does vma_interval_tree_foreach() under
  -i_mmap_mutex.

- this only allows to read the data from the same binary.

- in particular, you can't read the data from bss

- get_user_vaddr() looks simply wrong. I blindly applied the whole series
  and did the test to ensure.

  Test-case:

#include stdio.h
#include stdlib.h
#include unistd.h

unsigned int global = 0x1234;

void func(void)
{
}

int main(void)
{
char cmd[64];

global = 0x4321;
func();

printf(addr = %p\n, global);

sprintf(cmd, cat /proc/%d/maps, getpid());
system(cmd);

return 0;
}

# nm foo | grep -w global
00600a04 D global

# perf probe -x ./foo -a func var=@0xa04:u32
# perf record -e probe_foo:func ./foo
addr = 0x600a04
0040-00401000 r-xp  fe:01 20958 
 /root/foo
0060-00601000 rw-p  fe:01 20958 
 /root/foo
...

# perf script | tail -1
foo   555 [000]  1302.345642: probe_foo:func: (40059c) var=1234

Note that it reports 1234, not 4321. This is because
get_user_vaddr() finds another (1st) read-only mapping, and
prints the initial value of global.

IOW, it reads the memory from 0x400a04, not from 0x600a04.

---
Can't we simply implement get_user_vaddr() as

static void __user *get_user_vaddr(unsigned long addr, struct 
trace_uprobe *tu)
{
void __user *vaddr = (void __force __user *)addr;

/* A NULL tu means that we already got the vaddr */
if (tu)
vaddr += (current-mm-start_data  PAGE_MASK);

return vaddr;
}

?

I did this change, and now the test-case above works. And it also works
with cc -pie -fPIC,

# nm foo | grep -w global
00200c9c D global

# perf probe -x ./foo -a func var=@0xc9c:u32
# perf record -e probe_foo:func ./foo
...
# perf script | tail -1
foo   576 [001]   475.519940: probe_foo:func: (7ffe95ca3814) 
var=4321

What do you think?

---
Note:
- I think that /* A NULL tu means that we already got the vaddr */
  needs more discussion... IOW, I am not sure about 11/13.

- Perhaps it also makes sense to allow to pass the absolute address
  (iow, += start_data should be conditional)

but lets ignore this for now.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-10-30 Thread Masami Hiramatsu
(2013/10/29 15:53), Namhyung Kim wrote:
> Hello,
> 
> This patchset implements memory (address), stack[N], deference,
> bitfield and retval (it needs uretprobe tho) fetch methods for
> uprobes.  It's based on the previous work [1] done by Hyeoncheol Lee.
> 
> Now kprobes and uprobes have their own fetch_type_tables and, in turn,
> memory and stack access methods.  Other fetch methods are shared.
> 
> For the dereference method, I added a new argument to fetch functions.
> It's because for uprobes it needs to know whether the given address is
> a file offset or a virtual address in an user process.  For instance,
> in case of fetching from a memory directly (like @offset) it should
> convert the address (offset) to a virtual address of the process, but
> if it's a dereferencing, the given address already has the virtual
> address.
> 
> To determine this in a fetch function, I passed a pointer to
> trace_uprobe for direct fetch, and passed NULL for dereference.
> 
> The patch 1-2 are bug fixes and can be applied independently.

You'd better add [BUGFIX] and send those separately. ;)
But anyway, I'm OK to pull those first two (and others too).


> Please look at patch 10 that uses per-cpu buffer for accessing user
> memory as suggested by Steven.  While I tried hard not to mess things
> up there might be a chance I did something horrible.  It'd be great if
> you guys take a look and give comments.
> 
> 
>  * v6 changes:
>   - add more Ack's from Masami
>   - fix ref count of uprobe_cpu_buffer (thanks to Jovi)
> 
>  * v5 changes:
>   - use user_stack_pointer() instead of GET_USP()
>   - fix a bug in 'stack' fetch method of uprobes
> 
>  * v4 changes:
>   - add Ack's from Masami
>   - rearrange patches to make it easy for simple fixes to be applied
>   - update documentation
>   - use per-cpu buffer for storing args (thanks to Steve!)
> 
> 
> [1] https://lkml.org/lkml/2012/11/14/84
> 
> A simple example:
> 
>   # cat foo.c
>   int glob = -1;
>   char str[] = "hello uprobe.";
> 
>   struct foo {
> unsigned int unused: 2;
> unsigned int foo: 20;
> unsigned int bar: 10;
>   } foo = {
> .foo = 5,
>   };
> 
>   int main(int argc, char *argv[])
>   {
> long local = 0x1234;
> 
> return 127;
>   }
> 
>   # gcc -o foo -g foo.c
> 
>   # objdump -d foo | grep -A9 -F ''
>   004004b0 :
> 4004b0:   55  push   %rbp
> 4004b1:   48 89 e5mov%rsp,%rbp
> 4004b4:   89 7d ecmov%edi,-0x14(%rbp)
> 4004b7:   48 89 75 e0 mov%rsi,-0x20(%rbp)
> 4004bb:   48 c7 45 f8 34 12 00movq   $0x1234,-0x8(%rbp)
> 4004c2:   00 
> 4004c3:   b8 7f 00 00 00  mov$0x7f,%eax
> 4004c8:   5d  pop%rbp
> 4004c9:   c3  retq   
> 
>   # nm foo | grep -e glob$ -e str -e foo
>   006008bc D foo
>   006008a8 D glob
>   006008ac D str
> 
>   # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
>   > str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
>   Added new event:
> probe_foo:foo  (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string 
>  bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))
> 
>   You can now use it in all perf tools, such as:
> 
>   perf record -e probe_foo:foo -aR sleep 1
> 
>   # perf record -e probe_foo:foo ./foo
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]
> 
>   # perf script | grep -v ^#
>foo  2008 [002  2199.867154: probe_foo:foo (4004c3)
>glob=-1 str="hello uprobe." bit=5 argc=1 local=1234

Nice ! :)

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-10-30 Thread Masami Hiramatsu
(2013/10/29 15:53), Namhyung Kim wrote:
 Hello,
 
 This patchset implements memory (address), stack[N], deference,
 bitfield and retval (it needs uretprobe tho) fetch methods for
 uprobes.  It's based on the previous work [1] done by Hyeoncheol Lee.
 
 Now kprobes and uprobes have their own fetch_type_tables and, in turn,
 memory and stack access methods.  Other fetch methods are shared.
 
 For the dereference method, I added a new argument to fetch functions.
 It's because for uprobes it needs to know whether the given address is
 a file offset or a virtual address in an user process.  For instance,
 in case of fetching from a memory directly (like @offset) it should
 convert the address (offset) to a virtual address of the process, but
 if it's a dereferencing, the given address already has the virtual
 address.
 
 To determine this in a fetch function, I passed a pointer to
 trace_uprobe for direct fetch, and passed NULL for dereference.
 
 The patch 1-2 are bug fixes and can be applied independently.

You'd better add [BUGFIX] and send those separately. ;)
But anyway, I'm OK to pull those first two (and others too).


 Please look at patch 10 that uses per-cpu buffer for accessing user
 memory as suggested by Steven.  While I tried hard not to mess things
 up there might be a chance I did something horrible.  It'd be great if
 you guys take a look and give comments.
 
 
  * v6 changes:
   - add more Ack's from Masami
   - fix ref count of uprobe_cpu_buffer (thanks to Jovi)
 
  * v5 changes:
   - use user_stack_pointer() instead of GET_USP()
   - fix a bug in 'stack' fetch method of uprobes
 
  * v4 changes:
   - add Ack's from Masami
   - rearrange patches to make it easy for simple fixes to be applied
   - update documentation
   - use per-cpu buffer for storing args (thanks to Steve!)
 
 
 [1] https://lkml.org/lkml/2012/11/14/84
 
 A simple example:
 
   # cat foo.c
   int glob = -1;
   char str[] = hello uprobe.;
 
   struct foo {
 unsigned int unused: 2;
 unsigned int foo: 20;
 unsigned int bar: 10;
   } foo = {
 .foo = 5,
   };
 
   int main(int argc, char *argv[])
   {
 long local = 0x1234;
 
 return 127;
   }
 
   # gcc -o foo -g foo.c
 
   # objdump -d foo | grep -A9 -F 'main'
   004004b0 main:
 4004b0:   55  push   %rbp
 4004b1:   48 89 e5mov%rsp,%rbp
 4004b4:   89 7d ecmov%edi,-0x14(%rbp)
 4004b7:   48 89 75 e0 mov%rsi,-0x20(%rbp)
 4004bb:   48 c7 45 f8 34 12 00movq   $0x1234,-0x8(%rbp)
 4004c2:   00 
 4004c3:   b8 7f 00 00 00  mov$0x7f,%eax
 4004c8:   5d  pop%rbp
 4004c9:   c3  retq   
 
   # nm foo | grep -e glob$ -e str -e foo
   006008bc D foo
   006008a8 D glob
   006008ac D str
 
   # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
   Added new event:
 probe_foo:foo  (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string 
  bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))
 
   You can now use it in all perf tools, such as:
 
   perf record -e probe_foo:foo -aR sleep 1
 
   # perf record -e probe_foo:foo ./foo
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]
 
   # perf script | grep -v ^#
foo  2008 [002  2199.867154: probe_foo:foo (4004c3)
glob=-1 str=hello uprobe. bit=5 argc=1 local=1234

Nice ! :)

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-10-29 Thread Namhyung Kim
Hello,

This patchset implements memory (address), stack[N], deference,
bitfield and retval (it needs uretprobe tho) fetch methods for
uprobes.  It's based on the previous work [1] done by Hyeoncheol Lee.

Now kprobes and uprobes have their own fetch_type_tables and, in turn,
memory and stack access methods.  Other fetch methods are shared.

For the dereference method, I added a new argument to fetch functions.
It's because for uprobes it needs to know whether the given address is
a file offset or a virtual address in an user process.  For instance,
in case of fetching from a memory directly (like @offset) it should
convert the address (offset) to a virtual address of the process, but
if it's a dereferencing, the given address already has the virtual
address.

To determine this in a fetch function, I passed a pointer to
trace_uprobe for direct fetch, and passed NULL for dereference.

The patch 1-2 are bug fixes and can be applied independently.

Please look at patch 10 that uses per-cpu buffer for accessing user
memory as suggested by Steven.  While I tried hard not to mess things
up there might be a chance I did something horrible.  It'd be great if
you guys take a look and give comments.


 * v6 changes:
  - add more Ack's from Masami
  - fix ref count of uprobe_cpu_buffer (thanks to Jovi)

 * v5 changes:
  - use user_stack_pointer() instead of GET_USP()
  - fix a bug in 'stack' fetch method of uprobes

 * v4 changes:
  - add Ack's from Masami
  - rearrange patches to make it easy for simple fixes to be applied
  - update documentation
  - use per-cpu buffer for storing args (thanks to Steve!)


[1] https://lkml.org/lkml/2012/11/14/84

A simple example:

  # cat foo.c
  int glob = -1;
  char str[] = "hello uprobe.";

  struct foo {
unsigned int unused: 2;
unsigned int foo: 20;
unsigned int bar: 10;
  } foo = {
.foo = 5,
  };

  int main(int argc, char *argv[])
  {
long local = 0x1234;

return 127;
  }

  # gcc -o foo -g foo.c

  # objdump -d foo | grep -A9 -F ''
  004004b0 :
4004b0: 55  push   %rbp
4004b1: 48 89 e5mov%rsp,%rbp
4004b4: 89 7d ecmov%edi,-0x14(%rbp)
4004b7: 48 89 75 e0 mov%rsi,-0x20(%rbp)
4004bb: 48 c7 45 f8 34 12 00movq   $0x1234,-0x8(%rbp)
4004c2: 00 
4004c3: b8 7f 00 00 00  mov$0x7f,%eax
4004c8: 5d  pop%rbp
4004c9: c3  retq   

  # nm foo | grep -e glob$ -e str -e foo
  006008bc D foo
  006008a8 D glob
  006008ac D str

  # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
  > str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
  Added new event:
probe_foo:foo  (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string 
 bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))

  You can now use it in all perf tools, such as:

  perf record -e probe_foo:foo -aR sleep 1

  # perf record -e probe_foo:foo ./foo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]

  # perf script | grep -v ^#
   foo  2008 [002  2199.867154: probe_foo:foo (4004c3)
   glob=-1 str="hello uprobe." bit=5 argc=1 local=1234


This patchset is based on the current for-next branch of the Steven
Rostedt's linux-trace tree.  I also put this on my 'uprobe/fetch-v6'
branch in my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Masami Hiramatsu 
Cc: Srikar Dronamraju 
Cc: Oleg Nesterov 
Cc: zhangwei(Jovi) 
Cc: Arnaldo Carvalho de Melo 
Cc: Hemant Kumar 


Hyeoncheol Lee (2):
  tracing/kprobes: Move fetch functions to trace_kprobe.c
  tracing/kprobes: Add fetch{,_size} member into deref fetch method

Namhyung Kim (11):
  tracing/uprobes: Fix documentation of uprobe registration syntax
  tracing/probes: Fix basic print type functions
  tracing/kprobes: Staticize stack and memory fetch functions
  tracing/kprobes: Factor out struct trace_probe
  tracing/uprobes: Convert to struct trace_probe
  tracing/kprobes: Move common functions to trace_probe.h
  tracing/kprobes: Integrate duplicate set_print_fmt()
  tracing/uprobes: Fetch args before reserving a ring buffer
  tracing/kprobes: Add priv argument to fetch functions
  tracing/uprobes: Add more fetch functions
  tracing/uprobes: Add support for full argument access methods

 Documentation/trace/uprobetracer.txt |  35 +-
 kernel/trace/trace_kprobe.c  | 642 +++
 kernel/trace/trace_probe.c   | 453 +---
 kernel/trace/trace_probe.h   | 202 ++-
 kernel/trace/trace_uprobe.c  | 458 +
 5 files changed, 1063 insertions(+), 727 deletions(-)

-- 
1.7.11.7

--
To unsubscribe from 

[PATCHSET 00/13] tracing/uprobes: Add support for more fetch methods (v6)

2013-10-29 Thread Namhyung Kim
Hello,

This patchset implements memory (address), stack[N], deference,
bitfield and retval (it needs uretprobe tho) fetch methods for
uprobes.  It's based on the previous work [1] done by Hyeoncheol Lee.

Now kprobes and uprobes have their own fetch_type_tables and, in turn,
memory and stack access methods.  Other fetch methods are shared.

For the dereference method, I added a new argument to fetch functions.
It's because for uprobes it needs to know whether the given address is
a file offset or a virtual address in an user process.  For instance,
in case of fetching from a memory directly (like @offset) it should
convert the address (offset) to a virtual address of the process, but
if it's a dereferencing, the given address already has the virtual
address.

To determine this in a fetch function, I passed a pointer to
trace_uprobe for direct fetch, and passed NULL for dereference.

The patch 1-2 are bug fixes and can be applied independently.

Please look at patch 10 that uses per-cpu buffer for accessing user
memory as suggested by Steven.  While I tried hard not to mess things
up there might be a chance I did something horrible.  It'd be great if
you guys take a look and give comments.


 * v6 changes:
  - add more Ack's from Masami
  - fix ref count of uprobe_cpu_buffer (thanks to Jovi)

 * v5 changes:
  - use user_stack_pointer() instead of GET_USP()
  - fix a bug in 'stack' fetch method of uprobes

 * v4 changes:
  - add Ack's from Masami
  - rearrange patches to make it easy for simple fixes to be applied
  - update documentation
  - use per-cpu buffer for storing args (thanks to Steve!)


[1] https://lkml.org/lkml/2012/11/14/84

A simple example:

  # cat foo.c
  int glob = -1;
  char str[] = hello uprobe.;

  struct foo {
unsigned int unused: 2;
unsigned int foo: 20;
unsigned int bar: 10;
  } foo = {
.foo = 5,
  };

  int main(int argc, char *argv[])
  {
long local = 0x1234;

return 127;
  }

  # gcc -o foo -g foo.c

  # objdump -d foo | grep -A9 -F 'main'
  004004b0 main:
4004b0: 55  push   %rbp
4004b1: 48 89 e5mov%rsp,%rbp
4004b4: 89 7d ecmov%edi,-0x14(%rbp)
4004b7: 48 89 75 e0 mov%rsi,-0x20(%rbp)
4004bb: 48 c7 45 f8 34 12 00movq   $0x1234,-0x8(%rbp)
4004c2: 00 
4004c3: b8 7f 00 00 00  mov$0x7f,%eax
4004c8: 5d  pop%rbp
4004c9: c3  retq   

  # nm foo | grep -e glob$ -e str -e foo
  006008bc D foo
  006008a8 D glob
  006008ac D str

  # perf probe -x /home/namhyung/tmp/foo -a 'foo=main+0x13 glob=@0x8a8:s32 \
   str=@0x8ac:string bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp)'
  Added new event:
probe_foo:foo  (on 0x4c3 with glob=@0x8a8:s32 str=@0x8ac:string 
 bit=@0x8bc:b10@2/32 argc=%di local=-0x8(%bp))

  You can now use it in all perf tools, such as:

  perf record -e probe_foo:foo -aR sleep 1

  # perf record -e probe_foo:foo ./foo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (~33 samples) ]

  # perf script | grep -v ^#
   foo  2008 [002  2199.867154: probe_foo:foo (4004c3)
   glob=-1 str=hello uprobe. bit=5 argc=1 local=1234


This patchset is based on the current for-next branch of the Steven
Rostedt's linux-trace tree.  I also put this on my 'uprobe/fetch-v6'
branch in my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Masami Hiramatsu masami.hiramatsu...@hitachi.com
Cc: Srikar Dronamraju sri...@linux.vnet.ibm.com
Cc: Oleg Nesterov o...@redhat.com
Cc: zhangwei(Jovi) jovi.zhang...@huawei.com
Cc: Arnaldo Carvalho de Melo a...@ghostprotocols.net
Cc: Hemant Kumar hks...@linux.vnet.ibm.com


Hyeoncheol Lee (2):
  tracing/kprobes: Move fetch functions to trace_kprobe.c
  tracing/kprobes: Add fetch{,_size} member into deref fetch method

Namhyung Kim (11):
  tracing/uprobes: Fix documentation of uprobe registration syntax
  tracing/probes: Fix basic print type functions
  tracing/kprobes: Staticize stack and memory fetch functions
  tracing/kprobes: Factor out struct trace_probe
  tracing/uprobes: Convert to struct trace_probe
  tracing/kprobes: Move common functions to trace_probe.h
  tracing/kprobes: Integrate duplicate set_print_fmt()
  tracing/uprobes: Fetch args before reserving a ring buffer
  tracing/kprobes: Add priv argument to fetch functions
  tracing/uprobes: Add more fetch functions
  tracing/uprobes: Add support for full argument access methods

 Documentation/trace/uprobetracer.txt |  35 +-
 kernel/trace/trace_kprobe.c  | 642 +++
 kernel/trace/trace_probe.c   | 453 +---
 kernel/trace/trace_probe.h   | 202 ++-