Hi Masami, I just came across this patch set (buried deep in my INBOX). Are you still doing anything with this?
-- Steve

On Tue, 22 Aug 2017 00:40:05 +0900
Masami Hiramatsu <[email protected]> wrote:

> Hello,
>
> Here is a feasibility-study patch set that uses the function_graph
> tracer's per-thread return stack for storing kretprobe return
> addresses as a fast path.
>
> Currently, kretprobe keeps its own instance hash list for storing
> return addresses. However, that introduces a spinlock per hash list
> entry and compels users to estimate how many probes will run
> concurrently (and to set kretprobe->maxactive accordingly).
>
> To solve this issue, this series reuses function_graph's per-thread
> ret_stack for kretprobes as a fast path, instead of the hash list,
> whenever possible. Note that if a kretprobe has a custom
> entry_handler and stores data in its kretprobe_instance, we cannot
> use the fast path, since the current per-thread return stack is
> fixed-size. (This feature is used by some systemtap scripts.)
>
> This series also includes showing the missed count of kretprobes
> via ftrace's kprobe_profile interface, which was posted this March.
> That is required for the test case below (without it, we cannot
> see any kretprobe miss counts).
>
> Usage
> =====
> Note that this is just feasibility-study code, and since the
> per-thread ret_stack is initialized only when the function_graph
> tracer is enabled, you have to perform the following operations to
> enable it.
>
> # echo '*' > <tracefs>/set_graph_notrace
> # echo function_graph > <tracefs>/current_tracer
>
> After that, add a kretprobe event with just 1 instance (we don't
> use it anyway).
>
> # echo r1 vfs_write > <tracefs>/kprobe_events
> # echo 1 > <tracefs>/events/kprobes/enable
>
> And run the "yes" command concurrently.
>
> # for i in {0..31}; do yes > /dev/null & done
> # cat <tracefs>/kprobe_profile
>   r_vfs_write_0      4756473      0
>
> You will see that the error count (the last column) is zero.
> Currently, this feature is disabled when the function_graph tracer
> is stopped, so if you set the nop tracer as below,
>
> # echo nop > <tracefs>/current_tracer
>
> then you'll see the error count increasing.
>
> # cat <tracefs>/kprobe_profile
>   r_vfs_write_0      7663462      238537
>
> This may improve kretprobe performance, but I haven't benchmarked
> it yet.
>
>
> TODO
> ====
> This is just feasibility-study code; I haven't tested it deeply,
> and it may still have some bugs. Anyway, if it looks good, I would
> like to split the per-thread return stack code out of ftrace and
> make it a new generic feature (e.g. CONFIG_THREAD_RETURN_STACK) so
> that both kprobes and ftrace can share it. It may also be better to
> make return-stack allocation a direct call instead of an event
> handler.
>
> Any comments?
>
> Thank you,
>
> ---
>
> Masami Hiramatsu (2):
>       trace: kprobes: Show sum of probe/retprobe nmissed count
>       kprobes/x86: Use graph_tracer's per-thread return stack for kretprobe
>
>
>  arch/x86/kernel/kprobes/core.c       |   95 ++++++++++++++++++++++++++++++++++
>  include/linux/ftrace.h               |    3 +
>  kernel/kprobes.c                     |   11 ++++
>  kernel/trace/trace_functions_graph.c |    5 +-
>  kernel/trace/trace_kprobe.c          |    2 -
>  5 files changed, 112 insertions(+), 4 deletions(-)
>
> --
> Masami Hiramatsu (Linaro) <[email protected]>
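For anyone skimming the thread: the maxactive estimation problem Masami
describes shows up directly in the current registration API. Below is a
minimal sketch of a kretprobe module under today's hash-list scheme; the
probed symbol, handler body, and maxactive value are illustrative, not
taken from the patch set.

#include <linux/kprobes.h>
#include <linux/module.h>

static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	/* regs_return_value() extracts the probed function's return value */
	unsigned long retval = regs_return_value(regs);

	pr_info("vfs_write returned %lu\n", retval);
	return 0;
}

static struct kretprobe my_kretprobe = {
	.handler        = ret_handler,
	.kp.symbol_name = "vfs_write",
	/*
	 * The user must guess an upper bound on concurrent activations.
	 * If the guess is too low, activations are dropped and counted
	 * in my_kretprobe.nmissed -- the very problem the proposed
	 * fast path is meant to remove.
	 */
	.maxactive      = 20,
};

static int __init kret_init(void)
{
	return register_kretprobe(&my_kretprobe);
}

static void __exit kret_exit(void)
{
	unregister_kretprobe(&my_kretprobe);
	pr_info("missed %d activations\n", my_kretprobe.nmissed);
}

module_init(kret_init);
module_exit(kret_exit);
MODULE_LICENSE("GPL");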

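The entry_handler restriction is also easy to see in code. A probe like
the following stores a timestamp in its kretprobe_instance at entry time
(the same pattern as samples/kprobes/kretprobe_example.c), so it needs
per-instance storage that a fixed-size per-thread ret_stack cannot
provide; per the cover letter, such probes would have to stay on the
hash-list slow path. The symbol and payload are again just examples.

#include <linux/kprobes.h>
#include <linux/ktime.h>
#include <linux/module.h>

/* Per-instance payload, allocated for each activation via data_size */
struct my_data {
	ktime_t entry_stamp;
};

static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	struct my_data *data = (struct my_data *)ri->data;

	data->entry_stamp = ktime_get();	/* runs at function entry */
	return 0;
}

static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	struct my_data *data = (struct my_data *)ri->data;
	s64 delta = ktime_to_ns(ktime_sub(ktime_get(), data->entry_stamp));

	pr_info("vfs_write took %lld ns\n", delta);
	return 0;
}

static struct kretprobe timing_kretprobe = {
	.entry_handler  = entry_handler,
	.handler        = ret_handler,
	/* per-instance data is why this probe cannot use the fast path */
	.data_size      = sizeof(struct my_data),
	.kp.symbol_name = "vfs_write",
	.maxactive      = 20,
};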

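As for why the per-thread stack is a fast path at all: each task pushes
return addresses onto its own private array, so there is no shared state
to lock, and the only failure mode is overflow, which the kprobe_profile
miss column would report. What follows is a purely conceptual sketch
with hypothetical names -- it is not the actual patch code.

#include <linux/errno.h>

/*
 * Hypothetical per-thread return stack, mirroring the idea of
 * function_graph's ret_stack; every name below is made up for
 * illustration only.
 */
#define RET_STACK_DEPTH 50	/* fixed depth: no room for per-instance data */

struct ret_entry {
	unsigned long ret_addr;	/* original return address to restore */
	unsigned long func;	/* address of the probed function */
};

struct thread_ret_stack {
	int index;
	struct ret_entry stack[RET_STACK_DEPTH];
};

/*
 * Runs at probed-function entry. No spinlock is taken: the stack
 * belongs to the current task only, unlike the shared hash list.
 */
static int push_kretprobe_return(struct thread_ret_stack *rs,
				 unsigned long ret_addr, unsigned long func)
{
	if (rs->index >= RET_STACK_DEPTH)
		return -EBUSY;	/* overflow would show up as a miss */
	rs->stack[rs->index].ret_addr = ret_addr;
	rs->stack[rs->index].func = func;
	rs->index++;
	return 0;
}

/* Runs from the return trampoline: pop and restore the saved address. */
static unsigned long pop_kretprobe_return(struct thread_ret_stack *rs)
{
	rs->index--;
	return rs->stack[rs->index].ret_addr;
}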