Re: [PATCH v3 6/9] trace_uprobe: Support SDT markers having reference count (semaphore)

2018-05-07 Thread Masami Hiramatsu
On Mon, 7 May 2018 13:51:21 +0530
Ravi Bangoria <ravi.bango...@linux.ibm.com> wrote:

> Hi Masami,
> 
> On 05/04/2018 07:51 PM, Ravi Bangoria wrote:
> >
> >>> +}
> >>> +
> >>> +static void sdt_increment_ref_ctr(struct trace_uprobe *tu)
> >>> +{
> >>> + struct uprobe_map_info *info;
> >>> +
> >>> + uprobe_down_write_dup_mmap();
> >>> + info = uprobe_build_map_info(tu->inode->i_mapping,
> >>> + tu->ref_ctr_offset, false);
> >>> + if (IS_ERR(info))
> >>> + goto out;
> >>> +
> >>> + while (info) {
> >>> + down_write(>mm->mmap_sem);
> >>> +
> >>> + if (sdt_find_vma(tu, info->mm, info->vaddr))
> >>> + sdt_update_ref_ctr(info->mm, info->vaddr, 1);
> >> Don't you have to handle the error to map pages here?
> > Correct.. I think, I've to feedback error code to 
> > probe_event_{enable|disable}
> > and handler failure there.
> 
> I looked at this. Actually, It looks difficult to feedback errors to
> probe_event_{enable|disable}, esp. in the mmap() case.

Hmm, can't you roll that back if sdt_increment_ref_ctr() fails?
If so, how does sdt_decrement_ref_ctr() work in that case?

> Is it fine if we just warn sdt_update_ref_ctr() failures in dmesg? I'm
> doing this in [PATCH 7]. (Though, it makes more sense to do that in
> [PATCH 6], will change it in next version).

Of course we need to warn it at least, but the best is rejecting to
enable it.

> 
> Any better ideas?
> 
> BTW, same issue exists for normal uprobe. If uprobe_mmap() fails,
> there is no feedback to trace_uprobe and no warnigns in dmesg as
> well !! There was a patch by Naveen to warn such failures in dmesg
> but that didn't go in: https://lkml.org/lkml/2017/9/22/155

Oops, that's a real bug. It seems the ball is in Naveen's hand.
Naveen, could you update it according to Oleg's comment, and resend it?

> 
> Also, I'll add a check in sdt_update_ref_ctr() to make sure reference
> counter never goes to negative incase increment fails but decrement
> succeeds. OTOH, if increment succeeds but decrement fails, the
> counter remains >0 but there is no harm as such, except we will
> execute some unnecessary code.

I see. Please carefully clarify whether such case is kernel's bug or not.
I would like to know what the condition causes that uneven behavior.

Thank you,

> 
> Thanks,
> Ravi
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 6/9] trace_uprobe: Support SDT markers having reference count (semaphore)

2018-05-03 Thread Masami Hiramatsu
 uprobe_up_write_dup_mmap();
> +}
> +
> +/* Called with down_write(>vm_mm->mmap_sem) */
> +static void trace_uprobe_mmap(struct vm_area_struct *vma)
> +{
> + struct trace_uprobe *tu;
> + unsigned long vaddr;
> +
> + if (!(vma->vm_flags & VM_WRITE))
> + return;
> +
> + mutex_lock(_lock);
> + list_for_each_entry(tu, _list, list) {
> + if (!trace_probe_is_enabled(>tp))
> + continue;
> +
> + vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> + if (!sdt_valid_vma(tu, vma, vaddr))
> + continue;
> +
> + sdt_update_ref_ctr(vma->vm_mm, vaddr, 1);

Same here.

> + }
> + mutex_unlock(_lock);
> +}
> +
> +static void sdt_decrement_ref_ctr(struct trace_uprobe *tu)
> +{
> + struct uprobe_map_info *info;
> +
> + uprobe_down_write_dup_mmap();
> + info = uprobe_build_map_info(tu->inode->i_mapping,
> + tu->ref_ctr_offset, false);
> + if (IS_ERR(info))
> + goto out;
> +
> + while (info) {
> + down_write(>mm->mmap_sem);
> +
> + if (sdt_find_vma(tu, info->mm, info->vaddr))
> + sdt_update_ref_ctr(info->mm, info->vaddr, -1);

Ditto.

Thank you,

> +
> + up_write(>mm->mmap_sem);
> + info = uprobe_free_map_info(info);
> +     }
> +
> +out:
> + uprobe_up_write_dup_mmap();
> +}
> +
>  typedef bool (*filter_func_t)(struct uprobe_consumer *self,
>   enum uprobe_filter_ctx ctx,
>   struct mm_struct *mm);
> @@ -941,6 +1091,9 @@ typedef bool (*filter_func_t)(struct uprobe_consumer 
> *self,
>   if (ret)
>   goto err_buffer;
>  
> + if (tu->ref_ctr_offset)
> + sdt_increment_ref_ctr(tu);
> +
>   return 0;
>  
>   err_buffer:
> @@ -981,6 +1134,9 @@ typedef bool (*filter_func_t)(struct uprobe_consumer 
> *self,
>  
>   WARN_ON(!uprobe_filter_is_empty(>filter));
>  
> + if (tu->ref_ctr_offset)
> + sdt_decrement_ref_ctr(tu);
> +
>   uprobe_unregister(tu->inode, tu->offset, >consumer);
>   tu->tp.flags &= file ? ~TP_FLAG_TRACE : ~TP_FLAG_PROFILE;
>  
> @@ -1425,6 +1581,8 @@ static __init int init_uprobe_trace(void)
>   /* Profile interface */
>   trace_create_file("uprobe_profile", 0444, d_tracer,
>   NULL, _profile_ops);
> +
> + uprobe_mmap_callback = trace_uprobe_mmap;
>   return 0;
>  }
>  
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 8/9] trace_uprobe/sdt: Document about reference counter

2018-05-03 Thread Masami Hiramatsu
On Tue, 17 Apr 2018 10:02:43 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:

> From: Ravi Bangoria <ravi.bango...@linux.ibm.com>
> 
> Reference counter gate the invocation of probe. If present,
> by default reference count is 0. Kernel needs to increment
> it before tracing the probe and decrement it when done. This
> is identical to semaphore in Userspace Statically Defined
> Tracepoints (USDT).
> 
> Document usage of reference counter.
> 
> Signed-off-by: Ravi Bangoria <ravi.bango...@linux.ibm.com>

Looks good to me.

Acked-by: Masami Hiramatsu <mhira...@kernel.org>

Thanks!

> ---
>  Documentation/trace/uprobetracer.txt | 16 +---
>  kernel/trace/trace.c |  2 +-
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/trace/uprobetracer.txt 
> b/Documentation/trace/uprobetracer.txt
> index bf526a7c..cb6751d 100644
> --- a/Documentation/trace/uprobetracer.txt
> +++ b/Documentation/trace/uprobetracer.txt
> @@ -19,15 +19,25 @@ user to calculate the offset of the probepoint in the 
> object.
>  
>  Synopsis of uprobe_tracer
>  -
> -  p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
> -  r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
> -  -:[GRP/]EVENT   : Clear uprobe or uretprobe event
> +  p[:[GRP/]EVENT] PATH:OFFSET[(REF_CTR_OFFSET)] [FETCHARGS]
> +  r[:[GRP/]EVENT] PATH:OFFSET[(REF_CTR_OFFSET)] [FETCHARGS]
> +  -:[GRP/]EVENT
> +
> +  p : Set a uprobe
> +  r : Set a return uprobe (uretprobe)
> +  - : Clear uprobe or uretprobe event
>  
>GRP   : Group name. If omitted, "uprobes" is the default value.
>EVENT : Event name. If omitted, the event name is generated based
>on PATH+OFFSET.
>PATH  : Path to an executable or a library.
>OFFSET: Offset where the probe is inserted.
> +  REF_CTR_OFFSET: Reference counter offset. Optional field. Reference count
> +   gate the invocation of probe. If present, by default
> +   reference count is 0. Kernel needs to increment it before
> +   tracing the probe and decrement it when done. This is
> +   identical to semaphore in Userspace Statically Defined
> +   Tracepoints (USDT).
>  
>FETCHARGS : Arguments. Each probe can have up to 128 args.
> %REG : Fetch register REG
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 300f4ea..d211937 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -4604,7 +4604,7 @@ static int tracing_trace_options_open(struct inode 
> *inode, struct file *file)
>"place (kretprobe): [:][+]|\n"
>  #endif
>  #ifdef CONFIG_UPROBE_EVENTS
> - "\tplace: :\n"
> +  "   place (uprobe): :[(ref_ctr_offset)]\n"
>  #endif
>   "\t args: =fetcharg[:type]\n"
>   "\t fetcharg: %, @, @[+|-],\n"
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 9/9] perf probe: Support SDT markers having reference counter (semaphore)

2018-05-03 Thread Masami Hiramatsu
On Tue, 17 Apr 2018 10:02:44 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:

> From: Ravi Bangoria <ravi.bango...@linux.ibm.com>
> 
> With this, perf buildid-cache will save SDT markers with reference
> counter in probe cache. Perf probe will be able to probe markers
> having reference counter. Ex,
> 
>   # readelf -n /tmp/tick | grep -A1 loop2
> Name: loop2
> ... Semaphore: 0x10020036
> 
>   # ./perf buildid-cache --add /tmp/tick
>   # ./perf probe sdt_tick:loop2
>   # ./perf stat -e sdt_tick:loop2 /tmp/tick
> hi: 0
> hi: 1
> hi: 2
> ^C
>  Performance counter stats for '/tmp/tick':
>  3  sdt_tick:loop2
>2.561851452 seconds time elapsed
> 
> Signed-off-by: Ravi Bangoria <ravi.bango...@linux.ibm.com>

Looks good to me.

Acked-by: Masami Hiramatsu <mhira...@kernel.org>

Thanks!

> ---
>  tools/perf/util/probe-event.c | 39 
>  tools/perf/util/probe-event.h |  1 +
>  tools/perf/util/probe-file.c  | 34 ++--
>  tools/perf/util/probe-file.h  |  1 +
>  tools/perf/util/symbol-elf.c  | 46 
> ---
>  tools/perf/util/symbol.h  |  7 +++
>  6 files changed, 106 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index e1dbc98..9b9c26e 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -1832,6 +1832,12 @@ int parse_probe_trace_command(const char *cmd, struct 
> probe_trace_event *tev)
>   tp->offset = strtoul(fmt2_str, NULL, 10);
>   }
>  
> + if (tev->uprobes) {
> + fmt2_str = strchr(p, '(');
> + if (fmt2_str)
> + tp->ref_ctr_offset = strtoul(fmt2_str + 1, NULL, 0);
> + }
> +
>   tev->nargs = argc - 2;
>   tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs);
>   if (tev->args == NULL) {
> @@ -2025,6 +2031,22 @@ static int synthesize_probe_trace_arg(struct 
> probe_trace_arg *arg,
>   return err;
>  }
>  
> +static int
> +synthesize_uprobe_trace_def(struct probe_trace_event *tev, struct strbuf 
> *buf)
> +{
> + struct probe_trace_point *tp = >point;
> + int err;
> +
> + err = strbuf_addf(buf, "%s:0x%lx", tp->module, tp->address);
> +
> + if (err >= 0 && tp->ref_ctr_offset) {
> + if (!uprobe_ref_ctr_is_supported())
> + return -1;
> + err = strbuf_addf(buf, "(0x%lx)", tp->ref_ctr_offset);
> + }
> + return err >= 0 ? 0 : -1;
> +}
> +
>  char *synthesize_probe_trace_command(struct probe_trace_event *tev)
>  {
>   struct probe_trace_point *tp = >point;
> @@ -2054,15 +2076,17 @@ char *synthesize_probe_trace_command(struct 
> probe_trace_event *tev)
>   }
>  
>   /* Use the tp->address for uprobes */
> - if (tev->uprobes)
> - err = strbuf_addf(, "%s:0x%lx", tp->module, tp->address);
> - else if (!strncmp(tp->symbol, "0x", 2))
> + if (tev->uprobes) {
> + err = synthesize_uprobe_trace_def(tev, );
> + } else if (!strncmp(tp->symbol, "0x", 2)) {
>   /* Absolute address. See try_to_find_absolute_address() */
>   err = strbuf_addf(, "%s%s0x%lx", tp->module ?: "",
> tp->module ? ":" : "", tp->address);
> - else
> + } else {
>   err = strbuf_addf(, "%s%s%s+%lu", tp->module ?: "",
>   tp->module ? ":" : "", tp->symbol, tp->offset);
> + }
> +
>   if (err)
>   goto error;
>  
> @@ -2646,6 +2670,13 @@ static void warn_uprobe_event_compat(struct 
> probe_trace_event *tev)
>  {
>   int i;
>   char *buf = synthesize_probe_trace_command(tev);
> + struct probe_trace_point *tp = >point;
> +
> + if (tp->ref_ctr_offset && !uprobe_ref_ctr_is_supported()) {
> + pr_warning("A semaphore is associated with %s:%s and "
> +"seems your kernel doesn't support it.\n",
> +tev->group, tev->event);
> + }
>  
>   /* Old uprobe event doesn't support memory dereference */
>   if (!tev->uprobes || tev->nargs == 0 || !buf)
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index 45b14f0..15a98c3 10

Re: [PATCH v2 9/9] perf probe: Support SDT markers having reference counter (semaphore)

2018-04-09 Thread Masami Hiramatsu
urn note->bit32 ?
> + (unsigned long long)note->addr.a32[SDT_NOTE_IDX_REFCTR] :
> + (unsigned long long)note->addr.a64[SDT_NOTE_IDX_REFCTR];
>  }
>  
>  static const char * const type_to_suffix[] = {
> @@ -776,14 +784,21 @@ static char *synthesize_sdt_probe_command(struct 
> sdt_note *note,
>  {
>   struct strbuf buf;
>   char *ret = NULL, **args;
> - int i, args_count;
> + int i, args_count, err;
> + unsigned long long ref_ctr_offset;
>  
>   if (strbuf_init(, 32) < 0)
>   return NULL;
>  
> - if (strbuf_addf(, "p:%s/%s %s:0x%llx",
> - sdtgrp, note->name, pathname,
> - sdt_note__get_addr(note)) < 0)
> + err = strbuf_addf(, "p:%s/%s %s:0x%llx",
> + sdtgrp, note->name, pathname,
> + sdt_note__get_addr(note));
> +
> + ref_ctr_offset = sdt_note__get_ref_ctr_offset(note);
> + if (uprobe_ref_ctr_is_supported() && ref_ctr_offset && err >= 0)
> + err = strbuf_addf(, "(0x%llx)", ref_ctr_offset);

We don't have to care about uprobe_ref_ctr support here, because
this information will be just cached, not directly written to
uprobe_events.

Other parts look good to me.

Thanks,

-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] trace_uprobe/sdt: Document about reference counter

2018-03-16 Thread Masami Hiramatsu
On Fri, 16 Mar 2018 15:12:38 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:

> On 03/15/2018 06:17 PM, Masami Hiramatsu wrote:
> > Hi Ravi,
> >
> > On Wed, 14 Mar 2018 20:52:59 +0530
> > Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:
> >
> >> On 03/14/2018 07:20 PM, Masami Hiramatsu wrote:
> >>> On Tue, 13 Mar 2018 18:26:03 +0530
> >>> Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:
> >>>
> >>>> No functionality changes.
> >>> Please consider to describe what is this change and why, here.
> >> Will add in next version.
> > Thanks, and could you also move this before perf-probe patch?
> > Also Could you make perf-probe check the tracing/README whether
> > the kernel supports reference counter syntax or not?
> >
> > perf-tool can be used on older (or stable) kernel.
> 
> Sure, Will do that.

Please see scan_ftrace_readme@util/probe-file.c :)
It is easy to expand the pattern table.

Thank you,

-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] trace_uprobe/sdt: Document about reference counter

2018-03-15 Thread Masami Hiramatsu
Hi Ravi,

On Wed, 14 Mar 2018 20:52:59 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:

> On 03/14/2018 07:20 PM, Masami Hiramatsu wrote:
> > On Tue, 13 Mar 2018 18:26:03 +0530
> > Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:
> >
> >> No functionality changes.
> > Please consider to describe what is this change and why, here.
> 
> Will add in next version.

Thanks, and could you also move this before perf-probe patch?
Also Could you make perf-probe check the tracing/README whether
the kernel supports reference counter syntax or not?

perf-tool can be used on older (or stable) kernel.

Thank you,

> 
> >> Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
> >> ---
> >>  Documentation/trace/uprobetracer.txt | 16 +---
> >>  kernel/trace/trace.c |  2 +-
> >>  2 files changed, 14 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/Documentation/trace/uprobetracer.txt 
> >> b/Documentation/trace/uprobetracer.txt
> >> index bf526a7c..8fb13b0 100644
> >> --- a/Documentation/trace/uprobetracer.txt
> >> +++ b/Documentation/trace/uprobetracer.txt
> >> @@ -19,15 +19,25 @@ user to calculate the offset of the probepoint in the 
> >> object.
> >>  
> >>  Synopsis of uprobe_tracer
> >>  -
> >> -  p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
> >> -  r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe 
> >> (uretprobe)
> >> -  -:[GRP/]EVENT   : Clear uprobe or uretprobe 
> >> event
> >> +  p[:[GRP/]EVENT] PATH:OFFSET[(REF_CTR_OFFSET)] [FETCHARGS]
> >> +  r[:[GRP/]EVENT] PATH:OFFSET[(REF_CTR_OFFSET)] [FETCHARGS]
> > Ah, OK in this context, [] means optional syntax :)
> 
> Correct.
> 
> Thanks,
> Ravi
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] trace_uprobe/sdt: Fix multiple update of same reference counter

2018-03-14 Thread Masami Hiramatsu
> +
> +static void sdt_flush_mm_list(struct trace_uprobe *tu)
> +{
> + struct sdt_mm_list *next, *curr = tu->sml;
> +
> + while (curr) {
> + next = curr->next;
> + kfree(curr);
> + curr = next;
> + }
> + tu->sml = NULL;
> +}
> +
>  static bool sdt_valid_vma(struct trace_uprobe *tu, struct vm_area_struct 
> *vma)
>  {
>   unsigned long vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> @@ -989,17 +1065,25 @@ static void sdt_increment_ref_ctr(struct trace_uprobe 
> *tu)
>   if (IS_ERR(info))
>   goto out;
>  
> + down_write(>sml_rw_sem);
>   while (info) {
> + if (sdt_check_mm_list(tu, info->mm))
> + goto cont;
> +
>   down_write(>mm->mmap_sem);
>  
>   vma = sdt_find_vma(info->mm, tu);
>   vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> - sdt_update_ref_ctr(info->mm, vaddr, 1);
> + if (!sdt_update_ref_ctr(info->mm, vaddr, 1))
> + sdt_add_mm_list(tu, info->mm);
>  
>   up_write(>mm->mmap_sem);
> +
> +cont:
>   mmput(info->mm);
>   info = uprobe_free_map_info(info);
>   }
> + up_write(>sml_rw_sem);
>  
>  out:
>   uprobe_end_dup_mmap();
> @@ -1020,8 +1104,16 @@ void trace_uprobe_mmap_callback(struct vm_area_struct 
> *vma)
>   !trace_probe_is_enabled(>tp))
>   continue;
>  
> + down_write(>sml_rw_sem);
> + if (sdt_check_mm_list(tu, vma->vm_mm))
> + goto cont;
> +
>   vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> - sdt_update_ref_ctr(vma->vm_mm, vaddr, 1);
> + if (!sdt_update_ref_ctr(vma->vm_mm, vaddr, 1))
> + sdt_add_mm_list(tu, vma->vm_mm);
> +
> +cont:
> + up_write(>sml_rw_sem);
>   }
>   mutex_unlock(_lock);
>  }
> @@ -1038,7 +1130,11 @@ static void sdt_decrement_ref_ctr(struct trace_uprobe 
> *tu)
>   if (IS_ERR(info))
>   goto out;
>  
> + down_write(>sml_rw_sem);
>   while (info) {
> + if (!sdt_check_mm_list(tu, info->mm))
> + goto cont;
> +
>   down_write(>mm->mmap_sem);
>  
>   vma = sdt_find_vma(info->mm, tu);
> @@ -1046,9 +1142,14 @@ static void sdt_decrement_ref_ctr(struct trace_uprobe 
> *tu)
>   sdt_update_ref_ctr(info->mm, vaddr, -1);
>  
>   up_write(>mm->mmap_sem);
> + sdt_del_mm_list(tu, info->mm);
> +
> +cont:
>   mmput(info->mm);
>   info = uprobe_free_map_info(info);
>   }
> + sdt_flush_mm_list(tu);
> + up_write(>sml_rw_sem);
>  
>  out:
>   uprobe_end_dup_mmap();
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] perf probe: Support SDT markers having reference counter (semaphore)

2018-03-14 Thread Masami Hiramatsu
sdt_note__get_addr(note), ref_ctr_offset);
> + else
> + err = strbuf_addf(, "p:%s/%s %s:0x%llx",
> + sdtgrp, note->name, pathname,
> + sdt_note__get_addr(note));

This can be minimized (and avoid repeating) by using 2 strbuf_addf()s, like

err = strbuf_addf(, "p:%s/%s %s:0x%llx",
sdtgrp, note->name, pathname,
sdt_note__get_addr(note));
if (ref_ctr_offset && !err < 0)
err = strbuf_addf("(0x%llx)", ref_ctr_offset);


> +
> + if (err < 0)
>   goto error;
>  
>   if (!note->args)
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 2de7705..76c7b54 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1928,6 +1928,16 @@ static int populate_sdt_note(Elf **elf, const char 
> *data, size_t len,
>   }
>   }
>  
> + /* Adjust reference counter offset */
> + if (elf_section_by_name(*elf, , , SDT_PROBES_SCN, NULL)) {
> + if (shdr.sh_offset) {
> + if (tmp->bit32)
> +         tmp->addr.a32[2] -= (shdr.sh_addr - 
> shdr.sh_offset);
> + else
> + tmp->addr.a64[2] -= (shdr.sh_addr - 
> shdr.sh_offset);

Here we should use enum above too.

Thank you,

> + }
> + }
> +
>   list_add_tail(>note_list, sdt_notes);
>   return 0;
>  
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 70c16741..ad0c4f2 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -384,6 +384,7 @@ struct sdt_note {
>  int cleanup_sdt_note_list(struct list_head *sdt_notes);
>  int sdt_notes__get_count(struct list_head *start);
>  
> +#define SDT_PROBES_SCN ".probes"
>  #define SDT_BASE_SCN ".stapsdt.base"
>  #define SDT_NOTE_SCN  ".note.stapsdt"
>  #define SDT_NOTE_TYPE 3
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] trace_uprobe/sdt: Document about reference counter

2018-03-14 Thread Masami Hiramatsu
On Tue, 13 Mar 2018 18:26:03 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:

> No functionality changes.

Please consider to describe what is this change and why, here.

> 
> Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
> ---
>  Documentation/trace/uprobetracer.txt | 16 +---
>  kernel/trace/trace.c |  2 +-
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/trace/uprobetracer.txt 
> b/Documentation/trace/uprobetracer.txt
> index bf526a7c..8fb13b0 100644
> --- a/Documentation/trace/uprobetracer.txt
> +++ b/Documentation/trace/uprobetracer.txt
> @@ -19,15 +19,25 @@ user to calculate the offset of the probepoint in the 
> object.
>  
>  Synopsis of uprobe_tracer
>  -
> -  p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
> -  r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
> -  -:[GRP/]EVENT   : Clear uprobe or uretprobe event
> +  p[:[GRP/]EVENT] PATH:OFFSET[(REF_CTR_OFFSET)] [FETCHARGS]
> +  r[:[GRP/]EVENT] PATH:OFFSET[(REF_CTR_OFFSET)] [FETCHARGS]

Ah, OK in this context, [] means optional syntax :)

> +  -:[GRP/]EVENT
> +
> +  p : Set a uprobe
> +  r : Set a return uprobe (uretprobe)
> +  - : Clear uprobe or uretprobe event
>  
>GRP   : Group name. If omitted, "uprobes" is the default value.
>EVENT : Event name. If omitted, the event name is generated based
>on PATH+OFFSET.
>PATH  : Path to an executable or a library.
>OFFSET: Offset where the probe is inserted.
> +  REF_CTR_OFFSET: Reference counter offset. Optional field. Reference count
> +  gate the invocation of probe. If present, by default
> +  reference count is 0. Kernel needs to increment it before
> +  tracing the probe and decrement it when done. This is
> +  identical to semaphore in Userspace Statically Defined
> +  Tracepoints (USDT).
>  
>FETCHARGS : Arguments. Each probe can have up to 128 args.
> %REG : Fetch register REG
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 20a2300..2104d03 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -4604,7 +4604,7 @@ static int tracing_trace_options_open(struct inode 
> *inode, struct file *file)
>"place (kretprobe): [:][+]|\n"
>  #endif
>  #ifdef CONFIG_UPROBE_EVENTS
> -     "\tplace: :\n"
> +  "   place (uprobe): :[(ref_ctr_offset)]\n"
>  #endif
>   "\t args: =fetcharg[:type]\n"
>   "\t fetcharg: %, @, @[+|-],\n"
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/8] trace_uprobe: Support SDT markers having reference count (semaphore)

2018-03-14 Thread Masami Hiramatsu
, , , NULL);
> + if (ret <= 0)
> + return ret;
> +
> + kaddr = kmap_atomic(page);
> + memcpy(, kaddr + (vaddr & ~PAGE_MASK), sizeof(orig));
> + orig += d;
> + memcpy(kaddr + (vaddr & ~PAGE_MASK), , sizeof(orig));
> + kunmap_atomic(kaddr);
> +
> + put_page(page);
> + return 0;
> +}
> +
> +static void sdt_increment_ref_ctr(struct trace_uprobe *tu)
> +{
> + struct uprobe_map_info *info;
> + struct vm_area_struct *vma;
> + unsigned long vaddr;
> +
> + uprobe_start_dup_mmap();
> + info = uprobe_build_map_info(tu->inode->i_mapping,
> + tu->ref_ctr_offset, false);
> + if (IS_ERR(info))
> + goto out;
> +
> + while (info) {
> + down_write(>mm->mmap_sem);
> +
> + vma = sdt_find_vma(info->mm, tu);
> + vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> + sdt_update_ref_ctr(info->mm, vaddr, 1);
> +
> + up_write(>mm->mmap_sem);
> + mmput(info->mm);
> + info = uprobe_free_map_info(info);
> + }
> +
> +out:
> + uprobe_end_dup_mmap();
> +}
> +
> +/* Called with down_write(>vm_mm->mmap_sem) */
> +void trace_uprobe_mmap_callback(struct vm_area_struct *vma)
> +{
> + struct trace_uprobe *tu;
> + unsigned long vaddr;
> +
> + if (!(vma->vm_flags & VM_WRITE))
> + return;
> +
> + mutex_lock(_lock);
> + list_for_each_entry(tu, _list, list) {
> + if (!sdt_valid_vma(tu, vma) ||
> + !trace_probe_is_enabled(>tp))
> + continue;
> +
> + vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> + sdt_update_ref_ctr(vma->vm_mm, vaddr, 1);
> + }
> + mutex_unlock(_lock);
> +}
> +
> +static void sdt_decrement_ref_ctr(struct trace_uprobe *tu)
> +{
> + struct vm_area_struct *vma;
> + unsigned long vaddr;
> + struct uprobe_map_info *info;
> +
> + uprobe_start_dup_mmap();
> + info = uprobe_build_map_info(tu->inode->i_mapping,
> + tu->ref_ctr_offset, false);
> + if (IS_ERR(info))
> + goto out;
> +
> + while (info) {
> + down_write(>mm->mmap_sem);
> +
> + vma = sdt_find_vma(info->mm, tu);
> + vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> + sdt_update_ref_ctr(info->mm, vaddr, -1);
> +
> + up_write(>mm->mmap_sem);
> + mmput(info->mm);
> + info = uprobe_free_map_info(info);
> + }
> +
> +out:
> + uprobe_end_dup_mmap();
> +}
> +
>  typedef bool (*filter_func_t)(struct uprobe_consumer *self,
>   enum uprobe_filter_ctx ctx,
>   struct mm_struct *mm);
> @@ -939,6 +1099,9 @@ typedef bool (*filter_func_t)(struct uprobe_consumer 
> *self,
>   if (ret)
>   goto err_buffer;
>  
> + if (tu->ref_ctr_offset)
> + sdt_increment_ref_ctr(tu);
> +
>   return 0;
>  
>   err_buffer:
> @@ -979,6 +1142,9 @@ typedef bool (*filter_func_t)(struct uprobe_consumer 
> *self,
>  
>   WARN_ON(!uprobe_filter_is_empty(>filter));
>  
> + if (tu->ref_ctr_offset)
> + sdt_decrement_ref_ctr(tu);
> +
>   uprobe_unregister(tu->inode, tu->offset, >consumer);
>   tu->tp.flags &= file ? ~TP_FLAG_TRACE : ~TP_FLAG_PROFILE;
>  
> @@ -1423,6 +1589,8 @@ static __init int init_uprobe_trace(void)
>   /* Profile interface */
>   trace_create_file("uprobe_profile", 0444, d_tracer,
>   NULL, _profile_ops);
> +
> + uprobe_mmap_callback = trace_uprobe_mmap_callback;
>   return 0;
>  }
>  
> -- 
> 1.8.3.1
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/30] kprobes.txt: standardize document format

2017-05-19 Thread Masami Hiramatsu
On Thu, 18 May 2017 22:24:03 -0300
Mauro Carvalho Chehab <mche...@s-opensource.com> wrote:

> Each text file under Documentation follows a different
> format. Some doesn't even have titles!
> 
> Change its representation to follow the adopted standard,
> using ReST markups for it to be parseable by Sphinx:
> 
> - comment the contents;
> - add proper markups for titles;
> - mark literal blocks as such;
> - use :Author: for authorship;
> - use the right markups for footnotes;
> - escape some literals that would otherwise cause problems;
> - fix identation and add blank lines where needed.

OK, it seems good to me.

Acked-by: Masami Hiramatsu <mhira...@kernel.org>

By the way, there are some words which referes other section
in this document, should it also be updated according to
Sphinx format?

e.g.
[snip]
> @@ -53,7 +59,8 @@ a post_handler, and how to use the maxactive and nmissed 
> fields of
>  a kretprobe.  But if you're in a hurry to start using Kprobes, you
>  can skip ahead to section 2.

here, it refers "section 2".

Thank you,


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-04-04 Thread Masami Hiramatsu
On Mon,  3 Apr 2017 12:36:22 +0200
Alban Crequy <alban.cre...@gmail.com> wrote:

> From: Alban Crequy <al...@kinvolk.io>
> 
> When a kretprobe is installed on a kernel function, there is a maximum
> limit of how many calls in parallel it can catch (aka "maxactive"). A
> kernel module could call register_kretprobe() and initialize maxactive
> (see example in samples/kprobes/kretprobe_example.c).
> 
> But that is not exposed to userspace and it is currently not possible to
> choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events
> 
> The default maxactive can be as low as 1 on single-core with a
> non-preemptive kernel. This is too low and we need to increase it not
> only for recursive functions, but for functions that sleep or resched.
> 
> This patch updates the format of the command that can be written to
> kprobe_events so that maxactive can be optionally specified.
> 
> I need this for a bpf program attached to the kretprobe of
> inet_csk_accept, which can sleep for a long time.
> 
> This patch includes a basic selftest:
> 
> > # ./ftracetest -v  test.d/kprobe/
> > === Ftrace unit tests ===
> > [1] Kprobe dynamic event - adding and removing  [PASS]
> > [2] Kprobe dynamic event - busy event check [PASS]
> > [3] Kprobe dynamic event with arguments [PASS]
> > [4] Kprobes event arguments with types  [PASS]
> > [5] Kprobe dynamic event with function tracer   [PASS]
> > [6] Kretprobe dynamic event with arguments  [PASS]
> > [7] Kretprobe dynamic event with maxactive  [PASS]
> >
> > # of passed:  7
> > # of failed:  0
> > # of unresolved:  0
> > # of untested:  0
> > # of unsupported:  0
> > # of xfailed:  0
> > # of undefined(test bug):  0
> 
> BugLink: https://github.com/iovisor/bcc/issues/1072
> Signed-off-by: Alban Crequy <al...@kinvolk.io>

Looks good to me.

Acked-by: Masami Hiramatsu <mhira...@kernel.org>

Thanks!

> 
> ---
> 
> Changes since v2:
> - Explain the default maxactive value in the documentation.
>   (Review from Steven Rostedt)
> 
> Changes since v1:
> - Remove "(*)" from documentation. (Review from Masami Hiramatsu)
> - Fix support for "r100" without the event name (Review from Masami Hiramatsu)
> - Get rid of magic numbers within the code.  (Review from Steven Rostedt)
>   Note that I didn't use KRETPROBE_MAXACTIVE_ALLOC since that patch is not
>   merged.
> - Return -E2BIG when maxactive is too big.
> - Add basic selftest
> ---
>  Documentation/trace/kprobetrace.txt|  5 ++-
>  kernel/trace/trace_kprobe.c| 39 
> ++
>  .../ftrace/test.d/kprobe/kretprobe_maxactive.tc| 39 
> ++
>  3 files changed, 76 insertions(+), 7 deletions(-)
>  create mode 100644 
> tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc
> 
> diff --git a/Documentation/trace/kprobetrace.txt 
> b/Documentation/trace/kprobetrace.txt
> index 41ef9d8..25f3960 100644
> --- a/Documentation/trace/kprobetrace.txt
> +++ b/Documentation/trace/kprobetrace.txt
> @@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
>  Synopsis of kprobe_events
>  -
>p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS]   : Set a probe
> -  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]  : Set a return probe
> +  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]   : Set a return 
> probe
>-:[GRP/]EVENT  : Clear a probe
>  
>   GRP : Group name. If omitted, use "kprobes" for it.
> @@ -32,6 +32,9 @@ Synopsis of kprobe_events
>   MOD : Module name which has given SYM.
>   SYM[+offs]  : Symbol+offset where the probe is inserted.
>   MEMADDR : Address where the probe is inserted.
> + MAXACTIVE   : Maximum number of instances of the specified function that
> +   can be probed simultaneously, or 0 for the default value
> +   as defined in Documentation/kprobes.txt section 1.3.1.
>  
>   FETCHARGS   : Arguments. Each probe can have up to 128 args.
>%REG   : Fetch register REG
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index c5089c7..ae81f3c 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -25,6 +25,7 @@
>  #include "trace_probe.h"
>  
>  #define KPROBE_EVENT_SYSTEM "kprobes"
> +#define KRETPROBE_MAXACTIVE_MAX 4096
>  
>  /**
>   * Kprobe event core functions
> @@ -282,6 +283,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char 
> *group,
>   

Re: [RFC PATCH tip/master 2/3] kprobes: Allocate kretprobe instance if its free list is empty

2017-03-30 Thread Masami Hiramatsu
On Thu, 30 Mar 2017 08:53:32 +0200
Ingo Molnar <mi...@kernel.org> wrote:

> 
> * Masami Hiramatsu <mhira...@kernel.org> wrote:
> 
> > > So this is something I missed while the original code was merged, but the 
> > > concept 
> > > looks a bit weird: why do we do any "allocation" while a handler is 
> > > executing?
> > > 
> > > That's fundamentally fragile. What's the maximum number of parallel 
> > > 'kretprobe_instance' required per kretprobe - one per CPU?
> > 
> > It depends on the place where we put the probe. If the probed function will 
> > be
> > blocked (yield to other tasks), then we need a same number of threads on
> > the system which can invoke the function. So, ultimately, it is same
> > as function_graph tracer, we need it for each thread.
> 
> So then put it into task_struct (assuming there's no 
> kretprobe-inside-kretprobe 
> nesting allowed).

No, that is possible to put several kretprobes on same thread, e.g.
the func1() is called from func2(), user can put kretprobes for each
function at same time.
So the possible solution is to allocate new return-stack for each task_struct,
and that is what the function-graph tracer did.

Anyway, I'm considering to integrate kretprobe_instance with the ret_stack.
It will increase memory usage for kretprobes, but can provide safer way
to allocate kretprobe_instance.

> There's just no way in hell we should be calling any complex 
> kernel function from kernel probes!

OK, so let's drop this, since it may easily cause deadlock... 


> I mean, think about it, a kretprobe can be installed in a lot of places, and 
> now 
> we want to call get_free_pages() from it?? This would add a massive amount of 
> fragility.

I thought it was safe because GFP_ATOMIC is safe at interrupt handler.

> Instrumentation must be _simple_, every patch that adds more complexity to 
> the 
> most fundamental code path of it should raise a red flag ...
> 
> So let's make this more robust, ok?

Yeah, in that case, I think Alban's patch is enough at this point since
it gives user to tune their kretprobe events not to be missed.

Thank you,

> 
> Thanks,
> 
>   Ingo


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH tip/master 2/3] kprobes: Allocate kretprobe instance if its free list is empty

2017-03-29 Thread Masami Hiramatsu
On Wed, 29 Mar 2017 10:18:48 -0700
Josh Stone <jist...@redhat.com> wrote:

> On 03/29/2017 01:25 AM, Masami Hiramatsu wrote:
> > On Wed, 29 Mar 2017 08:30:05 +0200
> > Ingo Molnar <mi...@kernel.org> wrote:
> >>
> >> * Masami Hiramatsu <mhira...@kernel.org> wrote:
> >>
> >>> @@ -1824,6 +1823,30 @@ void unregister_jprobes(struct jprobe **jps, int 
> >>> num)
> >>>  EXPORT_SYMBOL_GPL(unregister_jprobes);
> >>>  
> >>>  #ifdef CONFIG_KRETPROBES
> >>> +
> >>> +/* Try to use free instance first, if failed, try to allocate new 
> >>> instance */
> >>> +struct kretprobe_instance *kretprobe_alloc_instance(struct kretprobe *rp)
> >>> +{
> >>> + struct kretprobe_instance *ri = NULL;
> >>> + unsigned long flags = 0;
> >>> +
> >>> + raw_spin_lock_irqsave(>lock, flags);
> >>> + if (!hlist_empty(>free_instances)) {
> >>> + ri = hlist_entry(rp->free_instances.first,
> >>> + struct kretprobe_instance, hlist);
> >>> + hlist_del(>hlist);
> >>> + }
> >>> + raw_spin_unlock_irqrestore(>lock, flags);
> >>> +
> >>> + /* Populate max active instance if possible */
> >>> + if (!ri && rp->maxactive < KRETPROBE_MAXACTIVE_ALLOC) {
> >>> + ri = kmalloc(sizeof(*ri) + rp->data_size, GFP_ATOMIC);
> >>> + if (ri)
> >>> + rp->maxactive++;
> >>> + }
> >>> +
> >>> + return ri;
> >>> +}
> >>>  /*
> >>>   * This kprobe pre_handler is registered with every kretprobe. When probe
> >>>   * hits it will set up the return probe.
> >>> @@ -1846,14 +1869,8 @@ static int pre_handler_kretprobe(struct kprobe *p, 
> >>> struct pt_regs *regs)
> >>>   }
> >>>  
> >>>   /* TODO: consider to only swap the RA after the last pre_handler fired 
> >>> */
> >>> - hash = hash_ptr(current, KPROBE_HASH_BITS);
> >>> - raw_spin_lock_irqsave(>lock, flags);
> >>> - if (!hlist_empty(>free_instances)) {
> >>> - ri = hlist_entry(rp->free_instances.first,
> >>> - struct kretprobe_instance, hlist);
> >>> - hlist_del(>hlist);
> >>> - raw_spin_unlock_irqrestore(>lock, flags);
> >>> -
> >>> + ri = kretprobe_alloc_instance(rp);
> >>> + if (ri) {
> >>>   ri->rp = rp;
> >>>   ri->task = current;
> >>>  
> >>> @@ -1868,13 +1885,13 @@ static int pre_handler_kretprobe(struct kprobe 
> >>> *p, struct pt_regs *regs)
> >>>  
> >>>   /* XXX(hch): why is there no hlist_move_head? */
> >>>   INIT_HLIST_NODE(>hlist);
> >>> + hash = hash_ptr(current, KPROBE_HASH_BITS);
> >>>   kretprobe_table_lock(hash, );
> >>>   hlist_add_head(>hlist, _inst_table[hash]);
> >>>   kretprobe_table_unlock(hash, );
> >>> - } else {
> >>> + } else
> >>>   rp->nmissed++;
> >>> - raw_spin_unlock_irqrestore(>lock, flags);
> >>> - }
> >>> +
> >>>   return 0;
> >>>  }
> >>>  NOKPROBE_SYMBOL(pre_handler_kretprobe);
> >>
> >> So this is something I missed while the original code was merged, but the 
> >> concept 
> >> looks a bit weird: why do we do any "allocation" while a handler is 
> >> executing?
> >>
> >> That's fundamentally fragile. What's the maximum number of parallel 
> >> 'kretprobe_instance' required per kretprobe - one per CPU?
> > 
> > It depends on the place where we put the probe. If the probed function will 
> > be
> > blocked (yield to other tasks), then we need a same number of threads on
> > the system which can invoke the function. So, ultimately, it is same
> > as function_graph tracer, we need it for each thread.
> 
> Isn't it also possible that the function may be reentrant?  Whether by
> plain recursion or an interrupt call, this leads to multiple live
> instances even for a given thread.

Yes, that's another possible case, but I don't think that's so serious in kernel
because we have very limited kernel stack, which means the recursion may not
so deep.

Thank you,

-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH tip/master 2/3] kprobes: Allocate kretprobe instance if its free list is empty

2017-03-29 Thread Masami Hiramatsu
On Wed, 29 Mar 2017 08:30:05 +0200
Ingo Molnar <mi...@kernel.org> wrote:
> 
> * Masami Hiramatsu <mhira...@kernel.org> wrote:
> 
> > @@ -1824,6 +1823,30 @@ void unregister_jprobes(struct jprobe **jps, int num)
> >  EXPORT_SYMBOL_GPL(unregister_jprobes);
> >  
> >  #ifdef CONFIG_KRETPROBES
> > +
> > +/* Try to use free instance first, if failed, try to allocate new instance 
> > */
> > +struct kretprobe_instance *kretprobe_alloc_instance(struct kretprobe *rp)
> > +{
> > +   struct kretprobe_instance *ri = NULL;
> > +   unsigned long flags = 0;
> > +
> > +   raw_spin_lock_irqsave(>lock, flags);
> > +   if (!hlist_empty(>free_instances)) {
> > +   ri = hlist_entry(rp->free_instances.first,
> > +   struct kretprobe_instance, hlist);
> > +   hlist_del(>hlist);
> > +   }
> > +   raw_spin_unlock_irqrestore(>lock, flags);
> > +
> > +   /* Populate max active instance if possible */
> > +   if (!ri && rp->maxactive < KRETPROBE_MAXACTIVE_ALLOC) {
> > +   ri = kmalloc(sizeof(*ri) + rp->data_size, GFP_ATOMIC);
> > +   if (ri)
> > +   rp->maxactive++;
> > +   }
> > +
> > +   return ri;
> > +}
> >  /*
> >   * This kprobe pre_handler is registered with every kretprobe. When probe
> >   * hits it will set up the return probe.
> > @@ -1846,14 +1869,8 @@ static int pre_handler_kretprobe(struct kprobe *p, 
> > struct pt_regs *regs)
> > }
> >  
> > /* TODO: consider to only swap the RA after the last pre_handler fired 
> > */
> > -   hash = hash_ptr(current, KPROBE_HASH_BITS);
> > -   raw_spin_lock_irqsave(>lock, flags);
> > -   if (!hlist_empty(>free_instances)) {
> > -   ri = hlist_entry(rp->free_instances.first,
> > -   struct kretprobe_instance, hlist);
> > -   hlist_del(>hlist);
> > -   raw_spin_unlock_irqrestore(>lock, flags);
> > -
> > +   ri = kretprobe_alloc_instance(rp);
> > +   if (ri) {
> > ri->rp = rp;
> > ri->task = current;
> >  
> > @@ -1868,13 +1885,13 @@ static int pre_handler_kretprobe(struct kprobe *p, 
> > struct pt_regs *regs)
> >  
> > /* XXX(hch): why is there no hlist_move_head? */
> > INIT_HLIST_NODE(>hlist);
> > +   hash = hash_ptr(current, KPROBE_HASH_BITS);
> > kretprobe_table_lock(hash, );
> > hlist_add_head(>hlist, _inst_table[hash]);
> > kretprobe_table_unlock(hash, );
> > -   } else {
> > +   } else
> > rp->nmissed++;
> > -   raw_spin_unlock_irqrestore(>lock, flags);
> > -   }
> > +
> > return 0;
> >  }
> >  NOKPROBE_SYMBOL(pre_handler_kretprobe);
> 
> So this is something I missed while the original code was merged, but the 
> concept 
> looks a bit weird: why do we do any "allocation" while a handler is executing?
> 
> That's fundamentally fragile. What's the maximum number of parallel 
> 'kretprobe_instance' required per kretprobe - one per CPU?

It depends on the place where we put the probe. If the probed function will be
blocked (yield to other tasks), then we need a same number of threads on
the system which can invoke the function. So, ultimately, it is same
as function_graph tracer, we need it for each thread.

> 
> If so then we should preallocate all of them when they are installed and not 
> do 
> any alloc/free dance when executing them.
> 
> This will also speed them up, and increase robustness all around.

I see, kretprobe already do that, and I keep it on the code.

By default, kretprobe will allocate NR_CPU of kretprobe_instance for each
kretprobe. For usual usecase (deeper inside functions in kernel) that is OK.
However, as Lukasz reported, for the function near the syscall entry, it may
require more instances. In that case, kretprobe user needs to increase
maxactive before registering kretprobes, which will be done by Alban's patch.

However, the next question is, how many instances are actually needed.
User may have to do trial & error loop to find that. For professional users,
they will do that, but for the light users, they may not want to do that.

I'm also considering to provide a "knob" of disabing this dynamic allocation
feature on debugfs, which will help users who would like to avoid memory
allocation on kretprobe.

Thank you,

-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH tip/master 3/3] kprobes: Limit kretprobe maximum instances

2017-03-28 Thread Masami Hiramatsu
Limit kretprobe maximum instance up to MAXACTIVE_ALLOC.
Without this limit, kretprobe user can specify huge number
(e.g. forget to zero-fill struct kretprobe) to maxactive
and may cause out-of-memory.

Signed-off-by: Masami Hiramatsu <mhira...@kernel.org>
---
 kernel/kprobes.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 75c5390..f1bebcf 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1942,6 +1942,9 @@ int register_kretprobe(struct kretprobe *rp)
rp->kp.break_handler = NULL;
 
/* Pre-allocate memory for max kretprobe instances */
+   if (rp->maxactive > KRETPROBE_MAXACTIVE_ALLOC)
+   return -E2BIG;
+
if (rp->maxactive <= 0) {
 #ifdef CONFIG_PREEMPT
rp->maxactive = max_t(unsigned int, 10, 2*num_possible_cpus());

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH tip/master 2/3] kprobes: Allocate kretprobe instance if its free list is empty

2017-03-28 Thread Masami Hiramatsu
Try to allocate kretprobe instance by GFP_ATOMIC if kretprobe's
free list is empty. This can prevent kretprobe miss-hit on the
function which can be heavily invoked and slept inside (like
locking syscall entries.)

NOTE: This may easily cause nested kprobe call which will be
just skipped (but nmissed count is incremented), if someone
probe functions on the memory allocation path.

Signed-off-by: Masami Hiramatsu <mhira...@kernel.org>
---
 Documentation/kprobes.txt |   25 +++--
 include/linux/kprobes.h   |2 ++
 kernel/kprobes.c  |   41 +
 3 files changed, 46 insertions(+), 22 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 1f6d45a..2de6533 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -131,11 +131,13 @@ kretprobe, then sets the saved instruction pointer to the 
saved return
 address, and that's where execution resumes upon return from the trap.
 
 While the probed function is executing, its return address is
-stored in an object of type kretprobe_instance.  Before calling
-register_kretprobe(), the user sets the maxactive field of the
-kretprobe struct to specify how many instances of the specified
-function can be probed simultaneously.  register_kretprobe()
-pre-allocates the indicated number of kretprobe_instance objects.
+stored in an object of type kretprobe_instance. Usually, this
+kretprobe_instance will be allocated dynamically.
+Since the dynamic allocation can cause mis-hit of other probes
+on memory allocation path, user can set maxactive field for pooling
+pre-allocated instances before calling register_kretprobe().
+This maxactive indicates how many instances of the specified
+function can be probed simultaneously.
 
 For example, if the function is non-recursive and is called with a
 spinlock held, maxactive = 1 should be enough.  If the function is
@@ -144,11 +146,14 @@ or preemption), NR_CPUS should be enough.  If maxactive 
<= 0, it is
 set to a default value.  If CONFIG_PREEMPT is enabled, the default
 is max(10, 2*NR_CPUS).  Otherwise, the default is NR_CPUS.
 
-It's not a disaster if you set maxactive too low; you'll just miss
-some probes.  In the kretprobe struct, the nmissed field is set to
-zero when the return probe is registered, and is incremented every
-time the probed function is entered but there is no kretprobe_instance
-object available for establishing the return probe.
+It's not a disaster if you set maxactive too low; you'll just see
+some probes on memory allocation path missed if it exists.
+
+In the kretprobe struct, the nmissed field is set to zero when the
+return probe is registered, and is incremented every time the probed
+function is entered but there is no kretprobe_instance object available
+and it fails to allocate new one, or hit the upper limit of maxactive
+(KRETPROBE_MAXACTIVE_ALLOC, currently it is 4096.)
 
 1.3.2 Kretprobe entry-handler
 
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 47e4da5..8064e14 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -192,6 +192,8 @@ struct kretprobe {
struct hlist_head free_instances;
raw_spinlock_t lock;
 };
+/* Upper limit of maxactive for dynamic allocation */
+#define KRETPROBE_MAXACTIVE_ALLOC 4096
 
 struct kretprobe_instance {
struct hlist_node hlist;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d733479..75c5390 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -57,7 +57,6 @@
 #define KPROBE_HASH_BITS 6
 #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
 
-
 /*
  * Some oddball architectures like 64bit powerpc have function descriptors
  * so this must be overridable.
@@ -1824,6 +1823,30 @@ void unregister_jprobes(struct jprobe **jps, int num)
 EXPORT_SYMBOL_GPL(unregister_jprobes);
 
 #ifdef CONFIG_KRETPROBES
+
+/* Try to use free instance first, if failed, try to allocate new instance */
+struct kretprobe_instance *kretprobe_alloc_instance(struct kretprobe *rp)
+{
+   struct kretprobe_instance *ri = NULL;
+   unsigned long flags = 0;
+
+   raw_spin_lock_irqsave(>lock, flags);
+   if (!hlist_empty(>free_instances)) {
+   ri = hlist_entry(rp->free_instances.first,
+   struct kretprobe_instance, hlist);
+   hlist_del(>hlist);
+   }
+   raw_spin_unlock_irqrestore(>lock, flags);
+
+   /* Populate max active instance if possible */
+   if (!ri && rp->maxactive < KRETPROBE_MAXACTIVE_ALLOC) {
+   ri = kmalloc(sizeof(*ri) + rp->data_size, GFP_ATOMIC);
+   if (ri)
+   rp->maxactive++;
+   }
+
+   return ri;
+}
 /*
  * This kprobe pre_handler is registered with every kretprobe. When probe
  * hits it will set up the return probe.
@@ -1846,14 +1869,8 @@ static int pre_handler_kretprobe(struct kprobe *p, 
struct pt_regs *r

[RFC PATCH tip/master 1/3] trace: kprobes: Show sum of probe/retprobe nmissed count

2017-03-28 Thread Masami Hiramatsu
Show sum of probe and retprobe nmissed count in
kprobe_profile, since retprobe can be missed even
if the kprobe itself succeeeded.
This explains user why their return probe didn't hit
sometimes.

Signed-off-by: Masami Hiramatsu <mhira...@kernel.org>
---
 kernel/trace/trace_kprobe.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 013f4e7..bbdc3de 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -896,7 +896,7 @@ static int probes_profile_seq_show(struct seq_file *m, void 
*v)
seq_printf(m, "  %-44s %15lu %15lu\n",
   trace_event_name(>tp.call),
   trace_kprobe_nhit(tk),
-  tk->rp.kp.nmissed);
+  tk->rp.kp.nmissed + tk->rp.nmissed);
 
return 0;
 }

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH tip/master 0/3] kprobes: tracing: kretprobe_instance dynamic allocation

2017-03-28 Thread Masami Hiramatsu
Here is a correction of patches to introduce kretprobe_instance
dynamic allocation for avoiding kretprobe silently miss-hits.

Original issue was reported by Lukasz on linux-trace ml.
 https://www.spinics.net/lists/linux-trace/msg00448.html

Also Alban is working on kprobe-tracer side because of
iovisor's issue.
 https://www.spinics.net/lists/netdev/msg427149.html
(Note that this series is independently applicable, no conflict)

This series is a kind of complementary patches for
Alban's patch. So I think both of them are needed.

 [1/3]: Add kretprobe's miss-hit counter to miss-hit
column on kprobe_profile. This helps user to
see what happened.
 [2/3]: Introduce kretprobe_instance dynamic allocation.
This will help user not to miss the ret probes
even it has low number of maxactive.
 [3/3]: Set maximum limitation for pre-allocated maxactive.
This can avoid miss configuration of struct kretprobe.

The downside of this patch is, dynamic allocation will
involve memory allocation, which sometimes traced by
kprobes. In that case those nested kprobes are missed.
To avoid this kind of miss-hit, user may need to make
maxactive enough large when registering kretprobes.

However, in other case, this can reduce the possibility
of miss-hit of kretprobes. Since the maxactive increased
automatically, user will not need to retry tracing with
larger maxactive.

Alban, you can reuse KRETPROBE_MAXACTIVE_ALLOC for checking
upper limiation in trace_kprobe.c too.

Thank you,

---

Masami Hiramatsu (3):
  trace: kprobes: Show sum of probe/retprobe nmissed count
  kprobes: Allocate kretprobe instance if its free list is empty
  kprobes: Limit kretprobe maximum instances


 Documentation/kprobes.txt   |   25 +++-
 include/linux/kprobes.h |2 ++
 kernel/kprobes.c|   44 +++
 kernel/trace/trace_kprobe.c |2 +-
 4 files changed, 50 insertions(+), 23 deletions(-)

--
Masami Hiramatsu (Linaro) <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-03-28 Thread Masami Hiramatsu
On Tue, 28 Mar 2017 18:08:16 +0200
Alban Crequy <al...@kinvolk.io> wrote:

> Thanks for the review,
> 
> On Tue, Mar 28, 2017 at 5:23 PM, Masami Hiramatsu <mhira...@kernel.org> wrote:
> > On Tue, 28 Mar 2017 15:52:22 +0200
> > Alban Crequy <alban.cre...@gmail.com> wrote:
> >
> >> When a kretprobe is installed on a kernel function, there is a maximum
> >> limit of how many calls in parallel it can catch (aka "maxactive"). A
> >> kernel module could call register_kretprobe() and initialize maxactive
> >> (see example in samples/kprobes/kretprobe_example.c).
> >>
> >> But that is not exposed to userspace and it is currently not possible to
> >> choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events
> >
> > I see, I found same issue last week...
> >
> >>
> >> The default maxactive can be as low as 1 on single-core with a
> >> non-preemptive kernel. This is too low and we need to increase it not
> >> only for recursive functions, but for functions that sleep or resched.
> >>
> >> This patch updates the format of the command that can be written to
> >> kprobe_events so that maxactive can be optionally specified.
> >
> > Good! this is completely same what I'm planning to add.
> >
> >>
> >> I need this for a bpf program attached to the kretprobe of
> >> inet_csk_accept, which can sleep for a long time.
> >
> > I'm also preparing another patch for using kmalloc(GFP_ATOMIC) in
> > kretprobe_pre_handler(), since this manual way is sometimes hard to
> > estimate how many instances needed. Anyway, since that may involve
> > unwilling memory allocation, this patch also needed.
> 
> Where is that kretprobe_pre_handler()? I don't see any allocations in
> pre_handler_kretprobe().

pre_handler_kretprobe(), I'll send my patch, but note that I also considered
to introduce a patch which exactly same as yours. So even with my patch,
this patch should be introduced.

> 
> >> BugLink: https://github.com/iovisor/bcc/issues/1072
> >
> > Could you also add Lukasz to Cc list, since he had reported an issue
> > related this one.
> >
> > https://www.spinics.net/lists/linux-trace/msg00448.html
> >
> > Please see my comments below.
> >
> >> Signed-off-by: Alban Crequy <al...@kinvolk.io>
> >> ---
> >>  Documentation/trace/kprobetrace.txt |  4 +++-
> >>  kernel/trace/trace_kprobe.c | 34 
> >> +-
> >>  2 files changed, 32 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/Documentation/trace/kprobetrace.txt 
> >> b/Documentation/trace/kprobetrace.txt
> >> index 41ef9d8..655ca7e 100644
> >> --- a/Documentation/trace/kprobetrace.txt
> >> +++ b/Documentation/trace/kprobetrace.txt
> >> @@ -23,7 +23,7 @@ current_tracer. Instead of that, add probe points via
> >>  Synopsis of kprobe_events
> >>  -
> >>p[:[GRP/]EVENT] [MOD:]SYM[+offs]|MEMADDR [FETCHARGS]   : Set a probe
> >> -  r[:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]  : Set a return probe
> >> +  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS]   : Set a 
> >> return probe
> >>-:[GRP/]EVENT  : Clear a 
> >> probe
> >>
> >>   GRP : Group name. If omitted, use "kprobes" for it.
> >> @@ -32,6 +32,8 @@ Synopsis of kprobe_events
> >>   MOD : Module name which has given SYM.
> >>   SYM[+offs]  : Symbol+offset where the probe is inserted.
> >>   MEMADDR : Address where the probe is inserted.
> >> + MAXACTIVE   : Maximum number of instances of the specified function that
> >> +   can be probed simultaneously, or 0 for the default.(*)
> >
> > Here, yoy don't need (*), since [MAXACTIVE] is not a FETCHARGS.
> 
> Why not? (*) refers to "(*) only for return probe." and the maxactive
> only makes sense for the kretprobe.

Because simply synopsis already explained MAXACTIVE is only for 'r' :)
The reason why I added the (*) note is that FETCHARGS are shown in
both synopsis of 'p' and 'r'.

> 
> >>   FETCHARGS   : Arguments. Each probe can have up to 128 args.
> >>%REG   : Fetch register REG
> >> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> >> index 5f688cc..807e01c 100644
> >> --- a/kernel/trace/trace_kprobe.c
> >> +++ b/kernel/trace/trace_kprobe.c
> >> @@ -282,6 +282,7 @@ stati

Re: [PATCH v1] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-03-28 Thread Masami Hiramatsu
On Tue, 28 Mar 2017 11:34:07 -0400
Steven Rostedt <rost...@goodmis.org> wrote:

> On Wed, 29 Mar 2017 00:23:35 +0900
> Masami Hiramatsu <mhira...@kernel.org> wrote:
> 
> > > @@ -598,8 +601,10 @@ static int create_trace_kprobe(int argc, char **argv)
> > >  {
> > >   /*
> > >* Argument syntax:
> > > -  *  - Add kprobe: p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
> > > -  *  - Add kretprobe: r[:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS]
> > > +  *  - Add kprobe:
> > > +  *  p[:[GRP/]EVENT] [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
> > > +  *  - Add kretprobe:
> > > +  *  r[MAXACTIVE][:[GRP/]EVENT] [MOD:]KSYM[+0] [FETCHARGS]
> > >* Fetch args:
> > >*  $retval : fetch return value
> > >*  $stack  : fetch stack address
> > > @@ -619,6 +624,7 @@ static int create_trace_kprobe(int argc, char **argv)
> > >   int i, ret = 0;
> > >   bool is_return = false, is_delete = false;
> > >   char *symbol = NULL, *event = NULL, *group = NULL;
> > > + int maxactive = 0;
> > >   char *arg;
> > >   unsigned long offset = 0;
> > >   void *addr = NULL;
> > > @@ -637,8 +643,26 @@ static int create_trace_kprobe(int argc, char **argv)
> > >   return -EINVAL;
> > >   }
> > >  
> > > - if (argv[0][1] == ':') {
> > > + if (is_return && isdigit(argv[0][1]) && strchr([0][1], ':')) {  
> > 
> > This only supports r[MAXACTIVE:[GRP/]EVENT]. e.g "r100" without event name.
> 
> You mean it doesn't support adding MAXACTIVE without the ':event'.

Yeah, sorry for lacking the explanation...

> 
> > 
> > Thank you,
> > 
> > > + event = strchr([0][1], ':') + 1;
> > > + event[-1] = '\0';
> > > + ret = kstrtouint([0][1], 0, );
> > > + if (ret) {
> > > + pr_info("Failed to parse maxactive.\n");
> > > + return ret;
> > > + }
> > > + /* kretprobes instances are iterated over via a list. The
> > > +  * maximum should stay reasonable.
> > > +  */
> > > + if (maxactive > 1024) {
> 
> Also, can we get rid of magic numbers within the code. There should be a
> const or macro defined as MAX_MAXACTIVE or something, and use that here.

Good catch!

Thanks,

> 
> -- Steve
> 
> 
> > > + pr_info("Maxactive is too big.\n");
> > > + return -EINVAL;
> > > + }
> > > + } else if (argv[0][1] == ':') {
> > >   event = [0][2];
> > > + }
> > > +
> > > + if (event) {
> > >   if (strchr(event, '/')) {
> > >   group = event;
> > >   event = strchr(group, '/') + 1;
> > > @@ -718,8 +742,8 @@ static int create_trace_kprobe(int argc, char **argv)
> > >is_return ? 'r' : 'p', addr);
> > >   event = buf;
> > >   }
> > > - tk = alloc_trace_kprobe(group, event, addr, symbol, offset, argc,
> > > -is_return);
> > > + tk = alloc_trace_kprobe(group, event, addr, symbol, offset, maxactive,
> > > +argc, is_return);
> > >   if (IS_ERR(tk)) {
> > >   pr_info("Failed to allocate trace_probe.(%d)\n",
> > >   (int)PTR_ERR(tk));
> > > -- 
> > > 2.7.4
> > >   
> > 
> > 
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] tracing/kprobes: expose maxactive for kretprobe in kprobe_events

2017-03-28 Thread Masami Hiramatsu
t = 0;
>   bool is_return = false, is_delete = false;
>   char *symbol = NULL, *event = NULL, *group = NULL;
> + int maxactive = 0;
>   char *arg;
>   unsigned long offset = 0;
>   void *addr = NULL;
> @@ -637,8 +643,26 @@ static int create_trace_kprobe(int argc, char **argv)
>   return -EINVAL;
>   }
>  
> - if (argv[0][1] == ':') {
> + if (is_return && isdigit(argv[0][1]) && strchr([0][1], ':')) {

This only supports r[MAXACTIVE:[GRP/]EVENT]. e.g "r100" without event name.

Thank you,

> + event = strchr([0][1], ':') + 1;
> + event[-1] = '\0';
> + ret = kstrtouint([0][1], 0, );
> + if (ret) {
> + pr_info("Failed to parse maxactive.\n");
> + return ret;
> + }
> + /* kretprobes instances are iterated over via a list. The
> +  * maximum should stay reasonable.
> +  */
> + if (maxactive > 1024) {
> + pr_info("Maxactive is too big.\n");
> + return -EINVAL;
> + }
> + } else if (argv[0][1] == ':') {
>   event = [0][2];
> + }
> +
> + if (event) {
>   if (strchr(event, '/')) {
>   group = event;
>   event = strchr(group, '/') + 1;
> @@ -718,8 +742,8 @@ static int create_trace_kprobe(int argc, char **argv)
>is_return ? 'r' : 'p', addr);
>   event = buf;
>   }
> - tk = alloc_trace_kprobe(group, event, addr, symbol, offset, argc,
> -is_return);
> + tk = alloc_trace_kprobe(group, event, addr, symbol, offset, maxactive,
> +argc, is_return);
>   if (IS_ERR(tk)) {
>   pr_info("Failed to allocate trace_probe.(%d)\n",
>   (int)PTR_ERR(tk));
> -- 
> 2.7.4
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation: kprobes: Document jprobes stack copying limitations

2016-08-12 Thread Masami Hiramatsu
On Fri, 12 Aug 2016 16:24:44 -0400
David Long <dave.l...@linaro.org> wrote:

> From: "David A. Long" <dave.l...@linaro.org>
> 
> Some architectures (i.e.: sparc64 and arm64) make reasonable partial stack
> duplication for jprobes problematic. Document this.
> 
> Signed-off-by: David A. Long <dave.l...@linaro.org>

Looks good to me.

Acked-by: Masami Hiramatsu <mhira...@kernel.org>

Thanks!

> ---
>  Documentation/kprobes.txt | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
> index 1f9b3e2..1f6d45a 100644
> --- a/Documentation/kprobes.txt
> +++ b/Documentation/kprobes.txt
> @@ -103,6 +103,16 @@ Note that the probed function's args may be passed on 
> the stack
>  or in registers.  The jprobe will work in either case, so long as the
>  handler's prototype matches that of the probed function.
>  
> +Note that in some architectures (e.g.: arm64 and sparc64) the stack
> +copy is not done, as the actual location of stacked parameters may be
> +outside of a reasonable MAX_STACK_SIZE value and because that location
> +cannot be determined by the jprobes code. In this case the jprobes
> +user must be careful to make certain the calling signature of the
> +function does not cause parameters to be passed on the stack (e.g.:
> +more than eight function arguments, an argument of more than sixteen
> +bytes, or more than 64 bytes of argument data, depending on
> +architecture).
> +
>  1.3 Return Probes
>  
>  1.3.1 How Does a Return Probe Work?
> -- 
> 2.5.0
> 


-- 
Masami Hiramatsu <mhira...@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html