On Tue, Sep 2, 2025 at 9:47 AM Ian Rogers <irog...@google.com> wrote:
>
> On Mon, Sep 1, 2025 at 10:04 AM Ian Rogers <irog...@google.com> wrote:
> >
> > On Mon, Sep 1, 2025 at 12:57 AM Namhyung Kim <namhy...@kernel.org> wrote:
> > >
> > > Hello,
> > >
> > > On Sun, Aug 31, 2025 at 09:17:34PM -0700, Yonghong Song wrote:
> > > >
> > > >
> > > > On 8/29/25 10:19 PM, Ian Rogers wrote:
> > > > > On Thu, Jun 5, 2025 at 2:00 PM <patchwork-bot+netdev...@kernel.org> wrote:
> > > > > > Hello:
> > > > > >
> > > > > > This series was applied to bpf/bpf-next.git (master)
> > > > > > by Andrii Nakryiko <and...@kernel.org>:
> > > > > >
> > > > > > On Tue,  3 Jun 2025 13:37:00 -0700 you wrote:
> > > > > > > The BTF dumper code currently displays arrays of characters as just that -
> > > > > > > arrays, with each character formatted individually. Sometimes this is what
> > > > > > > makes sense, but it's nice to be able to treat that array as a string.
> > > > > > >
> > > > > > > This change adds a special case to the btf_dump functionality to allow
> > > > > > > 0-terminated arrays of single-byte integer values to be printed as
> > > > > > > character strings. Characters for which isprint() returns false are
> > > > > > > printed as hex-escaped values. This is enabled when the new ".emit_strings"
> > > > > > > is set to 1 in the btf_dump_type_data_opts structure.
> > > > > > >
> > > > > > > [...]
> > > > > > Here is the summary with links:
> > > > > >    - [v3,1/2] libbpf: add support for printing BTF character arrays as strings
> > > > > >      https://git.kernel.org/bpf/bpf-next/c/87c9c79a02b4
> > > > > >    - [v3,2/2] Tests for the ".emit_strings" functionality in the BTF dumper.
> > > > > >      https://git.kernel.org/bpf/bpf-next/c/a570f386f3d1
> > > > > >
> > > > > > You are awesome, thank you!
> > > > > I believe this patch is responsible for segvs occurring in v6.17 in
> > > > > various perf tests when the perf tests run in parallel. There's lots
> > > >
> > > > Could you share the command line to reproduce this failure?
> > > > This will help debugging. Thanks!
> > >
> > > My reproducer is below:
> > >
> > > terminal 1: run perf trace in a loop.
> > >
> > >   $ while true; do sudo ./perf trace true; done
> > >
> > > terminal 2: run perf record in a loop until hit the segfault.
> > >
> > >   $ while true; do sudo ./perf record true || break; done
> > >   ...
> > >   perf: Segmentation fault
> > >       #0 0x560b2db790e4 in dump_stack debug.c:366
> > >       #1 0x560b2db7915a in sighandler_dump_stack debug.c:378
> > >       #2 0x560b2d973b1b in sigsegv_handler builtin-record.c:722
> > >       #3 0x7f975f249df0 in __restore_rt libc_sigaction.c:0
> > >       #4 0x560b2dca1ee6 in snprintf_hex bpf-event.c:39
> > >       #5 0x560b2dca2306 in synthesize_bpf_prog_name bpf-event.c:144
> > >       #6 0x560b2dca2d92 in bpf_metadata_create bpf-event.c:401
> > >       #7 0x560b2dca3838 in perf_event__synthesize_one_bpf_prog bpf-event.c:673
> > >       #8 0x560b2dca3dd5 in perf_event__synthesize_bpf_events bpf-event.c:798
> > >       #9 0x560b2d977ef5 in record__synthesize builtin-record.c:2131
> > >       #10 0x560b2d9797c1 in __cmd_record builtin-record.c:2581
> > >       #11 0x560b2d97db30 in cmd_record builtin-record.c:4376
> > >       #12 0x560b2da0672e in run_builtin perf.c:349
> > >       #13 0x560b2da069c6 in handle_internal_command perf.c:401
> > >       #14 0x560b2da06b1f in run_argv perf.c:448
> > >       #15 0x560b2da06e68 in main perf.c:555
> > >       #16 0x7f975f233ca8 in __libc_start_call_main libc_start_call_main.h:74
> > >       #17 0x7f975f233d65 in __libc_start_main_alias_2 libc-start.c:128
> > >       #18 0x560b2d959b11 in _start perf[4cb11]
> > >
> > >
> > > I manually ran it with gdb to get some more hints.
> > >
> > >   Thread 1 "perf" received signal SIGSEGV, Segmentation fault.
> > >   0x00005555558e8ee6 in snprintf_hex (buf=0x5555562c1d79 "", size=503, data=0x40 <error: Cannot access memory at address 0x40>, len=8) at util/bpf-event.c:39
> > >   39                    ret += snprintf(buf + ret, size - ret, "%02x", data[i]);
> > >
> > > The bad data pointer is bpf_prog_info->prog_tags, and snprintf_hex() is
> > > called from synthesize_bpf_prog_name().
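> > >
> > > For reference, snprintf_hex() is just a loop hex-printing each byte; a
> > > rough reconstruction (not the exact util/bpf-event.c code) shows why
> > > any wild data pointer faults on the very first read:
> > >
> > > ```
> > > #include <stdio.h>
> > > #include <stddef.h>
> > >
> > > /* Rough sketch of util/bpf-event.c's snprintf_hex(): data[i] is
> > >  * dereferenced unconditionally, so a bad pointer like 0x40 faults
> > >  * on the first iteration.
> > >  */
> > > static int snprintf_hex(char *buf, size_t size, unsigned char *data, size_t len)
> > > {
> > > 	int ret = 0;
> > > 	size_t i;
> > >
> > > 	for (i = 0; i < len; i++)
> > > 		ret += snprintf(buf + ret, size - ret, "%02x", data[i]);
> > > 	return ret;
> > > }
> > > ```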
> > >
> > >   (gdb) bt
> > >   #0  0x00005555558e8ee6 in snprintf_hex (buf=0x5555562c1d79 "", size=503, data=0x40 <error: Cannot access memory at address 0x40>, len=8) at util/bpf-event.c:39
> > >   #1  0x00005555558e9306 in synthesize_bpf_prog_name (buf=0x5555562c1d70 "bpf_prog_", size=512, info=0x55555665e400, btf=0x5555562c5630, sub_id=0) at util/bpf-event.c:144
> > >   #2  0x00005555558e9db5 in bpf_metadata_create (info=0x55555665e400) at util/bpf-event.c:403
> > >   #3  0x00005555558ea85b in perf_event__synthesize_one_bpf_prog (session=0x555556178510, process=0x5555555ba7ab <process_synthesized_event>, machine=0x555556178728, fd=25, event=0x5555561b73a0, opts=0x5555560d33a8 <record+328>) at util/bpf-event.c:674
> > >   #4  0x00005555558eadf8 in perf_event__synthesize_bpf_events (session=0x555556178510, process=0x5555555ba7ab <process_synthesized_event>, machine=0x555556178728, opts=0x5555560d33a8 <record+328>) at util/bpf-event.c:799
> > >   #5  0x00005555555beef5 in record__synthesize (rec=0x5555560d3260 <record>, tail=false) at builtin-record.c:2131
> > >   #6  0x00005555555c07c1 in __cmd_record (rec=0x5555560d3260 <record>, argc=1, argv=0x7fffffffe2e0) at builtin-record.c:2581
> > >   #7  0x00005555555c4b30 in cmd_record (argc=1, argv=0x7fffffffe2e0) at builtin-record.c:4376
> > >   #8  0x000055555564d72e in run_builtin (p=0x5555560d63c0 <commands+288>, argc=6, argv=0x7fffffffe2e0) at perf.c:349
> > >   #9  0x000055555564d9c6 in handle_internal_command (argc=6, argv=0x7fffffffe2e0) at perf.c:401
> > >   #10 0x000055555564db1f in run_argv (argcp=0x7fffffffe0dc, argv=0x7fffffffe0d0) at perf.c:445
> > >   #11 0x000055555564de68 in main (argc=6, argv=0x7fffffffe2e0) at perf.c:553
> > >
> > > It seems bpf_prog_info is broken for some reason.
> > >
> > >   (gdb) up
> > >   #1  0x00005555558e9306 in synthesize_bpf_prog_name (buf=0x5555563305b0 "bpf_prog_", size=512, info=0x55555664e1d0, btf=0x55555637ad40, sub_id=0) at util/bpf-event.c:144
> > >   144           name_len += snprintf_hex(buf + name_len, size - name_len,
> > >
> > >   (gdb) p *info
> > >   $1 = {type = 68, id = 80, tag = "\\\000\000\000\214\000\000", jited_prog_len = 152,
> > >     xlated_prog_len = 164, jited_prog_insns = 824633721012, xlated_prog_insns = 1185410973912,
> > >     load_time = 1305670058276, created_by_uid = 352, nr_map_ids = 364, map_ids = 1975684956608,
> > >     name = "\330\001\000\000\350\001\000\000$\002\000\0004\002\000", ifindex = 576,
> > >     gpl_compatible = 0, netns_dev = 2697239462496, netns_ino = 2834678416000,
> > >     nr_jited_ksyms = 756, nr_jited_func_lens = 768, jited_ksyms = 3418793968396,
> > >     jited_func_lens = 3573412791092, btf_id = 844, func_info_rec_size = 880,
> > >     func_info = 3934190044028, nr_func_info = 928, nr_line_info = 952,
> > >     line_info = 4294967296988, jited_line_info = 4449586119680, nr_jited_line_info = 1060,
> > >     line_info_rec_size = 1076, jited_line_info_rec_size = 1092, nr_prog_tags = 1108,
> > >     prog_tags = 4861902980192, run_time_ns = 5085241279632, run_cnt = 5257039971512,
> > >     recursion_misses = 5360119186644, verified_insns = 1264, attach_btf_obj_id = 1288,
> > >     attach_btf_id = 1312}
> >
> > Thanks Namhyung!
> >
> > So it looks like my "fix" was breaking the tools/perf build feature
> > test for btf_dump_type_data_opts' opts.emit_strings, and that was what
> > had been keeping this code path from running.
> >
> > Having terminal 1 run perf trace is going to be loading/unloading a
> > BPF program for system call augmentation. This must be creating the
> > race condition that is causing perf record to segv when it is
> > inspecting the bpf_prog_info.
> >
> > The cast in:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-event.c#n135
> > ```
> > static int synthesize_bpf_prog_name(char *buf, int size,
> >                                     struct bpf_prog_info *info,
> >                                     struct btf *btf,
> >                                     u32 sub_id)
> > {
> >         u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);
> > ```
> > looks concerning given the bad address comes from:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-event.c#n144
> > ```
> >         name_len += snprintf_hex(buf + name_len, size - name_len,
> >                                  prog_tags[sub_id], BPF_TAG_SIZE);
> > ```
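> >
> > For context, bpf_prog_info carries its array members as __u64 values
> > rather than pointers, so consumers have to cast them back. A minimal
> > sketch of that round trip (nth_prog_tag() is a hypothetical helper, not
> > code from the file above):
> > ```
> > #include <stdint.h>
> > #include <linux/bpf.h>
> >
> > /* Hypothetical helper: cast the __u64 back to the 2D array that perf
> >  * indexes into. If the stored value is garbage (e.g. read from freed
> >  * memory), this silently yields a wild pointer like the 0x40 above.
> >  */
> > static const __u8 *nth_prog_tag(const struct bpf_prog_info *info, __u32 n)
> > {
> > 	__u8 (*tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)info->prog_tags;
> >
> > 	return n < info->nr_prog_tags ? tags[n] : NULL;
> > }
> > ```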
> > Checking git blame, this code has existed since 2019, commit
> > 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF
> > programs"):
> > http://lkml.kernel.org/r/20190117161521.1341602-8-songliubrav...@fb.com
> > It was refactored later in 2019 to use a single memory allocation in
> > commit a742258af131 ("perf bpf: Synthesize bpf events with
> > bpf_program__get_prog_info_linear()"):
> > http://lkml.kernel.org/r/20190312053051.2690567-5-songliubrav...@fb.com
> >
> > There seems to be a potential race here:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-utils.c#n123
> > ```
> > /* step 1: get array dimensions */
> > err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
> > ```
> > and later:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-utils.c#n185
> > ```
> > /* step 5: call syscall again to get required arrays */
> > err = bpf_obj_get_info_by_fd(fd, &info_linear->info, &info_len);
> > ```
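> >
> > To make the suspected window concrete, here is a minimal sketch of the
> > two-call pattern (simplified from get_bpf_prog_info_linear(), with the
> > allocation and array setup elided, so treat it as illustrative only):
> > ```
> > #include <errno.h>
> > #include <string.h>
> > #include <bpf/bpf.h>
> > #include <linux/bpf.h>
> >
> > /* Sketch: counts are read first, arrays second. Nothing pins the
> >  * program between the two syscalls, so a concurrent unload/reload
> >  * (e.g. perf trace's syscall-augmentation program) can change the
> >  * counts in between.
> >  */
> > static int read_prog_info_twice(int fd)
> > {
> > 	struct bpf_prog_info info;
> > 	__u32 info_len = sizeof(info);
> > 	int err;
> >
> > 	/* step 1: get array dimensions (no array pointers set) */
> > 	memset(&info, 0, sizeof(info));
> > 	err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
> > 	if (err)
> > 		return -errno;
> >
> > 	/* ... allocate one buffer and set info's array pointers ... */
> >
> > 	/* step 5: same syscall again; counts may no longer match */
> > 	err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
> > 	return err ? -errno : 0;
> > }
> > ```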
> > There's a verification step that looks like it should cover issues from
> > the race. I thought making those warnings fatal might help, but no:
> > ```
> > --- a/tools/perf/util/bpf-utils.c
> > +++ b/tools/perf/util/bpf-utils.c
> > @@ -202,14 +202,20 @@ get_bpf_prog_info_linear(int fd, __u64 arrays)
> >                 v1 = bpf_prog_info_read_offset_u32(&info,
> >                                                    desc->count_offset);
> >                 v2 = bpf_prog_info_read_offset_u32(&info_linear->info,
> >                                                    desc->count_offset);
> > -               if (v1 != v2)
> > +               if (v1 != v2) {
> >                         pr_warning("%s: mismatch in element count\n", __func__);
> > +                       free(info_linear);
> > +                       return ERR_PTR(-EFAULT);
> > +               }
> >
> >                 v1 = bpf_prog_info_read_offset_u32(&info,
> >                                                    desc->size_offset);
> >                 v2 = bpf_prog_info_read_offset_u32(&info_linear->info,
> >                                                    desc->size_offset);
> > -               if (v1 != v2)
> > +               if (v1 != v2) {
> >                         pr_warning("%s: mismatch in rec size\n", __func__);
> > +                       free(info_linear);
> > +                       return ERR_PTR(-EFAULT);
> > +               }
> >         }
> >
> >         /* step 7: update info_len and data_len */
> > ```
> >
> > Fwiw, the address of "data=0x40" in the stack trace makes it look like
> > an offset has been applied to NULL. 0x40 is 64, which corresponds to
> > "name" in a bpf_prog_info by way of pahole:
> > ```
> > struct bpf_prog_info {
> >         __u32                      type;                 /*     0     4 */
> >         /* --- cacheline 1 boundary (64 bytes) --- */
> > ...
> >         char                       name[16];             /*    64    16 */
> > ```
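> >
> > That offset is easy to double-check without pahole; a standalone
> > snippet (assuming the uapi header is available):
> > ```
> > #include <stdio.h>
> > #include <stddef.h>
> > #include <linux/bpf.h>
> >
> > int main(void)
> > {
> > 	/* Expect 64 (0x40), matching the faulting data pointer above. */
> > 	printf("offsetof(struct bpf_prog_info, name) = %zu\n",
> > 	       offsetof(struct bpf_prog_info, name));
> > 	return 0;
> > }
> > ```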
> >
> > I feel we're relatively close to a proper fix for the issue; it would
> > help if others could lend a hand, as I'm not overly familiar with the
> > BPF code. I'm wondering if the second bpf_obj_get_info_by_fd could be
> > filling in offsets relative to NULL rather than returning an error, but
> > that would be (I believe) a kernel issue :-(
>
> It's a use-after-free perf bug:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/bpf-event.c?h=perf-tools-next#n660
> ```
> 	/* save bpf_prog_info to env */
> 	info_node = malloc(sizeof(struct bpf_prog_info_node));
> 	if (!info_node) {
> 		err = -1;
> 		goto out;
> 	}
>
> 	info_node->info_linear = info_linear;
> 	info_node->metadata = NULL;
> 	if (!perf_env__insert_bpf_prog_info(env, info_node)) {
> 		free(info_linear);
> 		free(info_node);
> 	}
> 	info_linear = NULL;
> ...
> 	/* Synthesize PERF_RECORD_BPF_METADATA */
> 	metadata = bpf_metadata_create(info);
> ```
> The insert can fail due to a duplicated BPF program (the sideband data
> thread races with the synthesizing thread in perf), leading to
> info_linear being freed while info still points into it. The later use
> of the freed memory causes the crash.
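>
> A minimal sketch of one possible shape for a fix (my guess here,
> assuming the duplicate-insert path just needs to stop dereferencing the
> freed buffer; the actual patches may differ):
> ```
> 	info_node->info_linear = info_linear;
> 	info_node->metadata = NULL;
> 	if (!perf_env__insert_bpf_prog_info(env, info_node)) {
> 		/* Duplicate program: the sideband thread already inserted
> 		 * it. Free our copy and bail out before anything touches
> 		 * 'info', which points into info_linear.
> 		 */
> 		free(info_linear);
> 		free(info_node);
> 		info_linear = NULL;
> 		goto out;
> 	}
> 	info_linear = NULL;
>
> 	/* Safe now: the env owns info_node->info_linear, so 'info' is
> 	 * still valid when synthesizing PERF_RECORD_BPF_METADATA.
> 	 */
> 	metadata = bpf_metadata_create(info);
> ```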
>
> I hope that's all it is and I'll work on a fix.

That was all. Fixes sent in:
https://lore.kernel.org/all/20250902181713.309797-1-irog...@google.com/

Thanks,
Ian
