Hi,

On Thu, 15 Apr 2021 17:00:07 -0400
Steven Rostedt <rost...@goodmis.org> wrote:

> 
> [
>   Added Masami, as I didn't realize he wasn't on Cc. He's the maintainer of
>   kretprobes.
> 
>   Masami, you may want to use lore.kernel.org to read the history of this
>   thread.
> ]
> 
> On Thu, 15 Apr 2021 13:45:06 -0700
> Andrii Nakryiko <andrii.nakry...@gmail.com> wrote:
> 
> > > I don't know how the BPF code does it, but if you are tracing the exit
> > > of a function, I'm assuming that you hijack the return pointer and replace
> > > it with a call to a trampoline that has access to the arguments. To do  
> > 
> > As Jiri replied, BPF trampoline doesn't do it the same way as
> > kretprobe does it. Which gives the fexit BPF program another critical
> > advantage over kretprobe -- we know traced function's entry IP in both
> > entry and exit cases, which allows us to generically correlate them.
> > 
> > I've tried to figure out how to get that entry IP from kretprobe and
> > couldn't find any way. Do you know if it's possible at all or it's a
> > fundamental limitation of the way kretprobe is implemented (through
> > hijacking return address)?

Inside the kretprobe handler, you can get the entry IP from kretprobe as below;

static int my_kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs 
*regs)
{
        struct kretprobe *rp = get_kretprobe(ri);
        unsigned long entry = (unsigned long)rp->kp.addr;
        unsigned long retaddr = (unsigned long)ri->ret_addr;
        ...
}

It is ensured that rp != NULL in the handler.

> 
> The function graph tracer has the entry IP on exit, today. That's how we
> can trace and show this:
> 
>  # cd /sys/kernel/tracing
>  # echo 1 > echo 1 > options/funcgraph-tail
>  # echo function_graph > current_tracer
>  # cat trace
> # tracer: function_graph
> #
> # CPU  DURATION                  FUNCTION CALLS
> # |     |   |                     |   |   |   |
>  7)   1.358 us    |  rcu_idle_exit();
>  7)   0.169 us    |  sched_idle_set_state();
>  7)               |  cpuidle_reflect() {
>  7)               |    menu_reflect() {
>  7)   0.170 us    |      tick_nohz_idle_got_tick();
>  7)   0.585 us    |    } /* menu_reflect */
>  7)   1.115 us    |  } /* cpuidle_reflect */
> 
> That's how we can show the tail function that's called. I'm sure kreprobes
> could do the same thing.

Yes, I have to update the document how to do that (and maybe introduce 2 
functions
to wrap the entry/retaddr code)

> 
> The patch series I shared with Jiri, was work to allow kretprobes to be
> built on top of the function graph tracer.
> 
> https://lore.kernel.org/lkml/20190525031633.811342...@goodmis.org/
> 
> The feature missing from that series, and why I didn't push it (as I had
> ran out of time to work on it), was that kreprobes wants the full regs
> stack as well. And since kretprobes was the main client of this work, that
> I decided to work on this at another time. But like everything else, I got
> distracted by other work, and didn't realize it has been almost 2 years
> since looking at it :-p
> 
> Anyway, IIRC, Masami wasn't sure that the full regs was ever needed for the
> return (who cares about the registers on return, except for the return
> value?)

I think kretprobe and ftrace are for a bit different usage. kretprobe can be
used for something like debugger. In that case, accessing full regs stack
will be more preferrable. (BTW, what the not "full regs" means? Does that
save partial registers?)


Thank you,

> But this code could easily save the parameters as well.
> 
> > 
> > > this you need a shadow stack to save the real return as well as the
> > > parameters of the function. This is something that I have patches that do
> > > similar things with function graph.
> > >
> > > If you want this feature, lets work together and make this work for both
> > > BPF and ftrace.  
> > 
> > Absolutely, ultimately for users it doesn't matter what specific
> > mechanism is used under the cover. It just seemed like BPF trampoline
> > has all the useful tracing features (entry IP and input arguments in
> > fexit) already and is just mostly missing a quick batch attach API. If
> > we can get the same from ftrace, all the better.
> 
> Let me pull these patches out again, and see what we can do. Since then,
> I've added the code that lets function tracer save parameters and the
> stack, and function graph can use that as well.
> 
> 
> -- Steve


-- 
Masami Hiramatsu <mhira...@kernel.org>

Reply via email to