On Tue, 29 Jul 2025 11:33:35 -0400 Steven Rostedt <rost...@goodmis.org> wrote:
> From: Steven Rostedt <rost...@goodmis.org> > > Add syntax to the FETCHARGS parsing of probes to allow the use of > structure and member names to get the offsets to dereference pointers. > > Currently, a dereference must be a number, where the user has to figure > out manually the offset of a member of a structure that they want to > reference. For example, to get the size of a kmem_cache that was passed to > the function kmem_cache_alloc_noprof, one would need to do: > > # cd /sys/kernel/tracing > # echo 'f:cache kmem_cache_alloc_noprof size=+0x18($arg1):u32' >> > dynamic_events > > This requires knowing that the offset of size is 0x18, which can be found > with gdb: > > (gdb) p &((struct kmem_cache *)0)->size > $1 = (unsigned int *) 0x18 > > If BTF is in the kernel, it can be used to find this with names, where the > user doesn't need to find the actual offset: > > # echo 'f:cache kmem_cache_alloc_noprof size=+kmem_cache.size($arg1):u32' >> > dynamic_events Great! This is something like a evolution of assembler to "symbolic" assembler. ;) > > Instead of the "+0x18", it would have "+kmem_cache.size" where the format is: > > +STRUCT.MEMBER[.MEMBER[..]] Yeah, and using '.' delimiter looks nice to me. > > The delimiter is '.' and the first item is the structure name. Then the > member of the structure to get the offset of. If that member is an > embedded structure, another '.MEMBER' may be added to get the offset of > its members with respect to the original value. > > "+kmem_cache.size($arg1)" is equivalent to: > > (*(struct kmem_cache *)$arg1).size > > Anonymous structures are also handled: > > # echo 'e:xmit net.net_dev_xmit > +net_device.name(+sk_buff.dev($skbaddr)):string' >> dynamic_events So this only replaces the "offset" part. So we still need to use +OFFS() syntax for dereferencing the pointer. > > Where "+net_device.name(+sk_buff.dev($skbaddr))" is equivalent to: > > (*(struct net_device *)((*(struct sk_buff *)($skbaddr))->dev)->name) > > Note that "dev" of struct sk_buff is inside an anonymous structure: > > struct sk_buff { > union { > struct { > /* These two members must be first to match > sk_buff_head. */ > struct sk_buff *next; > struct sk_buff *prev; > > union { > struct net_device *dev; > [..] > }; > }; > [..] > }; > > This will allow up to three deep of anonymous structures before it will > fail to find a member. > > The above produces: > > sshd-session-1080 [000] b..5. 1526.337161: xmit: (net.net_dev_xmit) > arg1="enp7s0" > > And nested structures can be found by adding more members to the arg: > > # echo 'f:read filemap_readahead.isra.0 > file=+0(+dentry.d_name.name(+file.f_path.dentry($arg2))):string' >> > dynamic_events > > The above is equivalent to: > > *((*(struct dentry *)(*(struct file *)$arg2)->f_path.dentry)->d_name.name) > > And produces: > > trace-cmd-1381 [002] ...1. 2082.676268: read: > (filemap_readahead.isra.0+0x0/0x150) file="trace.dat" > OK, the desgin looks good to me. I have some comments below. > Signed-off-by: Steven Rostedt (Google) <rost...@goodmis.org> > --- > Documentation/trace/kprobetrace.rst | 3 + > kernel/trace/trace_btf.c | 106 ++++++++++++++++++++++++++++ > kernel/trace/trace_btf.h | 10 +++ > kernel/trace/trace_probe.c | 7 +- > 4 files changed, 124 insertions(+), 2 deletions(-) > > diff --git a/Documentation/trace/kprobetrace.rst > b/Documentation/trace/kprobetrace.rst > index 3b6791c17e9b..00273157100c 100644 > --- a/Documentation/trace/kprobetrace.rst > +++ b/Documentation/trace/kprobetrace.rst > @@ -54,6 +54,8 @@ Synopsis of kprobe_events > $retval : Fetch return value.(\*2) > $comm : Fetch current task comm. > +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*3)(\*4) > + +STRUCT.MEMBER[.MEMBER[..]](FETCHARG) : If BTF is supported, Fetch memory > + at FETCHARG + the offset of MEMBER inside of STRUCT.(\*5) > \IMM : Store an immediate value to the argument. > NAME=FETCHARG : Set NAME as the argument name of FETCHARG. > FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types > @@ -70,6 +72,7 @@ Synopsis of kprobe_events > accesses one register. > (\*3) this is useful for fetching a field of data structures. > (\*4) "u" means user-space dereference. See :ref:`user_mem_access`. > + (\*5) +STRUCT.MEMBER(FETCHARG) is equivalent to (*(struct STRUCT > *)(FETCHARG)).MEMBER > > Function arguments at kretprobe > ------------------------------- > diff --git a/kernel/trace/trace_btf.c b/kernel/trace/trace_btf.c > index 5bbdbcbbde3c..b69404451410 100644 > --- a/kernel/trace/trace_btf.c > +++ b/kernel/trace/trace_btf.c > @@ -120,3 +120,109 @@ const struct btf_member *btf_find_struct_member(struct > btf *btf, > return member; > } > > +#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3) > + > +static int find_member(const char *ptr, struct btf *btf, > + const struct btf_type **type, int level) > +{ > + const struct btf_member *member; > + const struct btf_type *t = *type; > + int i; > + > + /* Max of 3 depth of anonymous structures */ > + if (level > 3) > + return -1; Please return an error code, maybe this is -E2BIG? > + > + for_each_member(i, t, member) { > + const char *tname = btf_name_by_offset(btf, member->name_off); > + > + if (strcmp(ptr, tname) == 0) { > + *type = btf_type_by_id(btf, member->type); > + return BITS_ROUNDDOWN_BYTES(member->offset); > + } > + > + /* Handle anonymous structures */ > + if (strlen(tname)) > + continue; > + > + *type = btf_type_by_id(btf, member->type); > + if (btf_type_is_struct(*type)) { > + int offset = find_member(ptr, btf, type, level + 1); > + > + if (offset < 0) > + continue; > + > + return offset + BITS_ROUNDDOWN_BYTES(member->offset); > + } > + } > + > + return -1; return -ENOENT; > +} > + > +/** > + * btf_find_offset - Find an offset of a member for a structure > + * @arg: A structure name followed by one or more members > + * @offset_p: A pointer to where to store the offset > + * > + * Will parse @arg with the expected format of: struct.member[[.member]..] > + * It is delimited by '.'. The first item must be a structure type. > + * The next are its members. If the member is also of a structure type it > + * another member may follow ".member". > + * > + * Note, @arg is modified but will be put back to what it was on return. > + * > + * Returns: 0 on success and -EINVAL if no '.' is present > + * or -ENXIO if the structure or member is not found. > + * Returns -EINVAL if BTF is not defined. > + * On success, @offset_p will contain the offset of the member specified > + * by @arg. > + */ > +int btf_find_offset(char *arg, long *offset_p) > +{ > + const struct btf_type *t; > + struct btf *btf; > + long offset = 0; > + char *ptr; > + int ret; > + s32 id; > + > + ptr = strchr(arg, '.'); > + if (!ptr) > + return -EINVAL; Instead of just returning error, can't we log an error for helping user? trace_probe_log_err(BYTE_OFFSET, ERROR_CODE); The base offset is stored in ctx->offset, so you can use it. ERROR_CODE is defined in trace_probe.h. Maybe you can add something like C(BAD_STRUCT_FMT, "Symbolic offset must be +STRUCT.MEMBER format"),\ And for other cases, you can use C(BAD_BTF_TID, "Failed to get BTF type info."),\ > + > + *ptr = '\0'; > + > + id = bpf_find_btf_id(arg, BTF_KIND_STRUCT, &btf); > + if (id < 0) > + goto error; > + > + /* Get BTF_KIND_FUNC type */ > + t = btf_type_by_id(btf, id); > + > + /* May allow more than one member, as long as they are structures */ > + do { > + if (!t || !btf_type_is_struct(t)) > + goto error; > + > + *ptr++ = '.'; > + arg = ptr; > + ptr = strchr(ptr, '.'); > + if (ptr) > + *ptr = '\0'; > + > + ret = find_member(arg, btf, &t, 0); > + if (ret < 0) > + goto error; > + > + offset += ret; > + > + } while (ptr); > + > + *offset_p = offset; > + return 0; > + > +error: > + if (ptr) > + *ptr = '.'; > + return -ENXIO; > +} > diff --git a/kernel/trace/trace_btf.h b/kernel/trace/trace_btf.h > index 4bc44bc261e6..7b0797a6050b 100644 > --- a/kernel/trace/trace_btf.h > +++ b/kernel/trace/trace_btf.h > @@ -9,3 +9,13 @@ const struct btf_member *btf_find_struct_member(struct btf > *btf, > const struct btf_type *type, > const char *member_name, > u32 *anon_offset); > + > +#ifdef CONFIG_PROBE_EVENTS_BTF_ARGS > +/* Will modify arg, but will put it back before returning. */ > +int btf_find_offset(char *arg, long *offset); > +#else > +static inline int btf_find_offset(char *arg, long *offset) > +{ Here also should use C(NOSUP_BTFARG, "BTF is not available or not supported"), \ Thank you, > + return -EINVAL; > +} > +#endif > diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c > index 424751cdf31f..4c13e51ea481 100644 > --- a/kernel/trace/trace_probe.c > +++ b/kernel/trace/trace_probe.c > @@ -1137,7 +1137,7 @@ parse_probe_arg(char *arg, const struct fetch_type > *type, > > case '+': /* deref memory */ > case '-': > - if (arg[1] == 'u') { > + if (arg[1] == 'u' && isdigit(arg[2])) { > deref = FETCH_OP_UDEREF; > arg[1] = arg[0]; > arg++; > @@ -1150,7 +1150,10 @@ parse_probe_arg(char *arg, const struct fetch_type > *type, > return -EINVAL; > } > *tmp = '\0'; > - ret = kstrtol(arg, 0, &offset); > + if (arg[0] != '-' && !isdigit(*arg)) > + ret = btf_find_offset(arg, &offset); > + else > + ret = kstrtol(arg, 0, &offset); > if (ret) { > trace_probe_log_err(ctx->offset, BAD_DEREF_OFFS); > break; > -- > 2.47.2 > -- Masami Hiramatsu (Google) <mhira...@kernel.org>