> What's the source code of that kernel module?

I adapted code from an example in a repository Spotify have, which is here: 
https://github.com/spotify/linux/blob/master/samples/kprobes/kretprobe_example.c.
 The only change I made was just to print some extra information to correlate 
the calls with specific threads, and I added a hack before the printk call to 
only do so if the process ID was within a boundary which my program's process 
IDs are located within. The changes look like this:

    if (ri->task->pid >= 7000 || ri->task->pid <= 5000) return 0;
    printk(KERN_INFO "%s returned %d and took %lld ns to execute %d task %s 
%d\n",
            func_name, retval, (long long)delta, calls, ri->task->comm, 
ri->task->pid);

If I don't filter the print calls with the pid check, I start getting high 
numbers of missed calls, which I believe is because the printk call causes 
rsyslog to make futex calls; re-invoking the probe and saturating itself with 
calls. I wonder if this is what was happening with your sys_write code? I'm 
able to reliably track sys_write only when I'm not causing writing calls from 
within the eBPF program.

Another thing of note is that lowering the maxactive parameter to 1 or 10 leads 
to it never printing anything, which I find strange too. Is the maxactive 
parameter local to this specific kernel function and probe? I find it quite 
strange that with the default eBPF configuration I can get plenty of retprobes 
for syscalls like write but seemingly few for futex operations (which aren't 
necessarily denser)

If I track specific futex operations, such as `futex_wake` and `futex_wait`, it 
seems like the retprobes for these work perfectly, which I think is also a bit 
strange.



One thing that comes to mind: is it possible that some optimisations are 
interfering with this? In `do_futex` (which I assume `sys_futex` is hooking 
into), most operations are of the form:

return futex_wake(uaddr, flags, val, val3);

I seem to be able to reliably trace the return of these specific futex 
operations from eBPF (I have tried `_wake` and `_wait`), but they mostly return 
values, rather than another function call. Is it possible that the retprobe 
isn't being triggered because control jumps into `futex_X` and returns straight 
back to user space?

This wouldn't explain why the probes added by the kernel module can detect 
them, though, so I doubt this is the problem.

>I'm thinking it could be caused by a difference in kprobes configuration
>between the kernel module and bcc.

I'll look into playing with the parameters in BCC and see if I get anywhere 
with that. I've been using the pre-built binaries to this point so I'll have to 
spend some time setting that up, so it might be a few days before I get 
anywhere with this. I'll update you if I find anything with this.

Ashley

On Sat, Feb 9, 2019 at 6:03 PM Paul Chaignon 
<[email protected]<mailto:[email protected]>> wrote:
On Fri, Feb 08, 2019 at 01:49:44PM -0800, Ashley wrote:
> Hi Paul,
>
> Thanks for the prompt and detailed response! I hadn't considered trying it 
> with a kernel module, but I set one up to measure this as per your 
> suggestion, and -- almost frustratingly -- it seems to catch them all.
>
> Here's output from the kernel module:
> [ 2130.067718] sys_futex returned 1 and took 6419 ns to execute 372 task 6920
> [ 2130.067829] sys_futex returned 0 and took 500407745 ns to execute 376 task 
> 6921
> [ 2130.567894] sys_futex returned 1 and took 6279 ns to execute 380 task 6921
> [ 2130.567923] sys_futex returned 1 and took 3998 ns to execute 381 task 6921
> [ 2130.567925] sys_futex returned 0 and took 500442358 ns to execute 381 task 
> 6920
> [ 2130.568043] sys_futex returned 0 and took 1000852410 ns to execute 385 
> task 6922
> [ 2131.174619] sys_futex returned 0 and took 779 ns to execute 399 task 6922
> [ 2131.174713] sys_futex returned 1 and took 5393 ns to execute 400 task 6922
> [ 2131.174887] sys_futex returned 0 and took 607250418 ns to execute 403 task 
> 6920

What's the source code of that kernel module?

>
> And here's my output from my eBPF program:
> ashley@ashley-VirtualBox ~/4/e> cat test | grep "FUTEX <<"
>             main-6921  [000] ....  2129.844847: 0x00000001: FUTEX <<
>             main-6922  [000] ....  2129.844870: 0x00000001: FUTEX <<
>             main-6920  [000] ....  2130.344893: 0x00000001: FUTEX <<
>             main-6920  [000] ....  2130.344906: 0x00000001: FUTEX <<
>             main-6921  [000] ....  2130.845068: 0x00000001: FUTEX <<
>             main-6921  [000] ....  2130.845102: 0x00000001: FUTEX <<
>             main-6920  [000] ....  2130.845113: 0x00000001: FUTEX <<
>             main-6922  [000] ....  2131.451799: 0x00000001: FUTEX <<
>             main-6922  [000] .N..  2131.451887: 0x00000001: FUTEX <<
> ashley@ashley-VirtualBox ~/4/e> cat test | grep ">> FUTEX"
>             main-6921  [000] d...  2130.345015: 0x00000001: >> FUTEX
>             main-6922  [000] d...  2130.845228: 0x00000001: >> FUTEX
>
> So my eBPF program received all 9 futex calls as expected, but only 2 
> returns, whereas the kernel module successfully gets all of them consistently.
>
> I'm also able to accurately track k[ret]probes for sys_write and sys_read 
> through eBPF, even though they seem to happen in higher density on my system. 
> The problem still occurs even if I don't use bpf_trace_printk within the 
> retprobe also.

Weird, I have the same problem you have for sys_futex if I adapt the
script to track sys_write instead...

>
> Any ideas where might be a good place to look from here? I'm totally stumped; 
> it's quite tempting to just migrate my code to kernel modules, but everything 
> else I'm instrumenting seems to work perfectly in eBPF, so it'd be a shame to 
> do that.

I'm thinking it could be caused by a difference in kprobes configuration
between the kernel module and bcc.

>
> Best,
> Ashley

Paul

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1575): https://lists.iovisor.org/g/iovisor-dev/message/1575
Mute This Topic: https://lists.iovisor.org/mt/29702757/21656
Group Owner: [email protected]
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to