On Thu, Oct 30, 2025 at 8:36 PM Menglong Dong <[email protected]> wrote: > > On Fri, Oct 31, 2025 at 9:42 AM Alexei Starovoitov > <[email protected]> wrote: > > > > On Sat, Oct 25, 2025 at 8:02 PM Menglong Dong <[email protected]> > > wrote: > > > > > > Add BPF_TRACE_SESSION supporting to x86_64. invoke_bpf_session_entry and > > > invoke_bpf_session_exit is introduced for this purpose. > > > > > > In invoke_bpf_session_entry(), we will check if the return value of the > > > fentry is 0, and set the corresponding session flag if not. And in > > > invoke_bpf_session_exit(), we will check if the corresponding flag is > > > set. If set, the fexit will be skipped. > > > > > > As designed, the session flags and session cookie address is stored after > > > the return value, and the stack look like this: > > > > > > cookie ptr -> 8 bytes > > > session flags -> 8 bytes > > > return value -> 8 bytes > > > argN -> 8 bytes > > > ... > > > arg1 -> 8 bytes > > > nr_args -> 8 bytes > > > ... > > > cookieN -> 8 bytes > > > cookie1 -> 8 bytes > > > > > > In the entry of the session, we will clear the return value, so the fentry > > > will always get 0 with ctx[nr_args] or bpf_get_func_ret(). > > > > > > Before the execution of the BPF prog, the "cookie ptr" will be filled with > > > the corresponding cookie address, which is done in > > > invoke_bpf_session_entry() and invoke_bpf_session_exit(). > > > > ... > > > > > + if (session->nr_links) { > > > + for (i = 0; i < session->nr_links; i++) { > > > + if > > > (session->links[i]->link.prog->call_session_cookie) > > > + stack_size += 8; > > > + } > > > + } > > > + cookies_off = stack_size; > > > > This is not great. It's all root and such, > > but if somebody attaches 64 progs that use session cookies > > then the trampoline will consume 64*8 of stack space just for > > these cookies. Plus more for args, cookie, ptr, session_flag, etc. > > The session cookie stuff does take a lot of stack memory. > For fprobe, it will store the cookie into its shadow stack, which > can free the stack. > > How about we remove the session cookie stuff? Therefore, > only 8-bytes(session flags) are used on the stack. And if we reuse > the nr_regs slot, no stack will be consumed. However, it will make > thing complex, which I don't think we should do. > > > Sigh. > > I understand that cookie from one session shouldn't interfere > > with another, but it's all getting quite complex > > especially when everything is in assembly. > > And this is just x86 JIT. Other JITs would need to copy > > this complex logic :( > > Without the session cookie, it will be much easier to implement > in another arch. And with the hepler of AI(such as cursor), it can > be done easily ;)
The reality is the opposite. We see plenty of AI generated garbage. Please stay human. > > > At this point I'm not sure that "symmetry with kprobe_multi_session" > > is justified as a reason to add all that. > > We don't have a kprobe_session for individual kprobes after all. > > As for my case, the tracing session can make my code much > simpler, as I always use the fentry+fexit to hook a function. And > the fexit skipping according to the return value of fentry can also > achieve better performance. I don't buy the argument that 'if (cond) goto skip_fexit_prog' in the generated trampoline is measurably faster than 'if (cond) return' inside the fexit program. > AFAIT, the mast usage of session cookie in kprobe is passing the > function arguments to the exit. For tracing, we can get the args > in the fexit. So the session cookie in tracing is not as important as > in kprobe. Since kprobe_multi was introduced, retsnoop and tetragon adopted it to do mass attach, and both use bpf_get_attach_cookie(). While both don't use bpf_session_cookie(). Searching things around I also didn't find a single real user of bpf_session_cookie() other than selftests/bpf and Jiri's slides :) So, doing all this work in trampoline for bpf_session_cookie() doesn't seem warranted, but with that doing session in trampoline also doesn't look useful, since the only benefit vs a pair of fentry/fexit is skip of fexit, which can be done already. Plus complexity in all JITs... so, I say, let's shelve the whole thing for now.
