Re: consistency for statistics with XDP mode
On Mon, Dec 03, 2018 at 11:30:01AM -0800, David Miller wrote:
> From: David Ahern
> Date: Mon, 3 Dec 2018 08:45:12 -0700
>
> > On 12/1/18 4:22 AM, Jesper Dangaard Brouer wrote:
> >> IMHO XDP_DROP should not be accounted as netdev stats drops, this is a
> >> user installed program like tc/iptables, that can also choose to drop
> >> packets.
> >
> > sure and both tc and iptables have counters that can see the dropped
> > packets. A counter in the driver level stats ("xdp_drop" is fine with
> > me).
>
> Part of the problem I have with this kind of logic is we take the choice
> away from the XDP program.
>
> If I feel that the xdp_drop counter bump is too much overhead during a
> DDoS attack and I want to avoid it, you don't give me a choice in the
> matter.
>
> If I want to represent the statistics for that event differently, you
> also give me no choice about it.
>
> Really, if XDP_DROP is returned, zero resources should be devoted to
> the frame past that point.
>
> I know you want to live in this magical world where XDP stuff behaves
> like the existing stack and give you all of the visibility to events
> and objects.
>
> But that is your choice.
>
> Please give others the choice to not live in that world and allow XDP
> programs to live in their own entirely different environment, with
> custom statistics and complete control over how counters are
> incremented and how objects are used and represented, if they choose
> to do so.
>
> XDP is about choice.

Coming out of the blue...: what about having a "struct xdp_stats" in the
.BTF section of the XDP program's BPF object file? One could query for its
presence and then parse it to figure out what stats, if any, are provided.

/me goes back to tweaking his btf_loader in pahole... :-)

- Arnaldo
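To make the "choice" argument concrete, here is a minimal plain C sketch of a program-owned stats struct. Only the name "struct xdp_stats" comes from the message above; the fields, the demo_action enum and the account() helper are assumptions made up for illustration, and this is ordinary user-space C, not an XDP program.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the thread only proposes the struct name. */
struct xdp_stats {
	uint64_t drop;
	uint64_t pass;
};

/* Stand-ins for the XDP verdicts, named differently to avoid confusion. */
enum demo_action { DEMO_DROP, DEMO_PASS };

/*
 * The program, not the driver, decides whether a verdict is counted:
 * with count_drops == 0 a dropped frame costs no counter update at all.
 */
void account(struct xdp_stats *stats, enum demo_action act, int count_drops)
{
	assert(stats != 0);
	if (act == DEMO_PASS)
		stats->pass++;
	else if (count_drops)
		stats->drop++;
}
```

A tool that found such a struct described in the object's BTF could then learn which counters exist without the driver having to know about them.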
Re: Help with the BPF verifier
On Fri, Nov 02, 2018 at 09:27:52PM, Yonghong Song wrote:
>
>
> On 11/2/18 8:42 AM, Edward Cree wrote:
> > On 02/11/18 15:02, Arnaldo Carvalho de Melo wrote:
> >> Yeah, didn't work as well:
> >
> >> And the -vv in 'perf trace' didn't seem to map to further details in the
> >> output of the verifier debug:
> > Yeah for log_level 2 you probably need to make source-level changes to
> > either perf or libbpf (I think the latter). It's annoying that essentially
> > no tools plumb through an option for that, someone should fix them ;-)
> >
> >> libbpf: -- BEGIN DUMP LOG ---
> >> libbpf:
> >> 0: (bf) r6 = r1
> >> 1: (bf) r1 = r10
> >> 2: (07) r1 += -328
> >> 3: (b7) r7 = 64
> >> 4: (b7) r2 = 64
> >> 5: (bf) r3 = r6
> >> 6: (85) call bpf_probe_read#4
> >> 7: (79) r1 = *(u64 *)(r10 -320)
> >> 8: (15) if r1 == 0x101 goto pc+4
> >> R0=inv(id=0) R1=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
> >> 9: (55) if r1 != 0x2 goto pc+22
> >> R0=inv(id=0) R1=inv2 R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
> >> 10: (bf) r1 = r6
> >> 11: (07) r1 += 16
> >> 12: (05) goto pc+2
> >> 15: (79) r3 = *(u64 *)(r1 +0)
> >> dereference of modified ctx ptr R1 off=16 disallowed
> > Aha, we at least got a different error message this time.
> > And indeed llvm has done that optimisation, rather than the more obvious
> > 11: r3 = *(u64 *)(r1 +16)
> > because it wants to have lots of reads share a single insn. You may be
> > able to defeat that optimisation by adding compiler barriers, idk. Maybe
> > someone with llvm knowledge can figure out how to stop it (ideally, llvm
> > would know when it's generating for bpf backend and not do that).
> > -O0? ¯\_(ツ)_/¯
>
> The optimization mostly looks like this:
>    br1:
>      ...
>      r1 += 16
>      goto merge
>    br2:
>      ...
>      r1 += 20
>      goto merge
>    merge:
>      *(u64 *)(r1 + 0)
>
> The compiler tries to merge common loads. There is no easy way to
> stop this compiler optimization without turning off a lot of other
> optimizations.
> The easiest way is to add barriers
>    __asm__ __volatile__("": : :"memory")
> after the ctx memory access to prevent their down stream merging.

Great, this made it work:

cat /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
#include
#include

/* bpf-output associated map */
struct bpf_map SEC("maps") __augmented_syscalls__ = {
	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(u32),
	.max_entries = __NR_CPUS__,
};

struct syscall_enter_args {
	unsigned long long common_tp_fields;
	long		   syscall_nr;
	unsigned long	   args[6];
};

struct syscall_exit_args {
	unsigned long long common_tp_fields;
	long		   syscall_nr;
	long		   ret;
};

struct augmented_filename {
	unsigned int	size;
	int		reserved;
	char		value[256];
};

#define SYS_OPEN 2
#define SYS_OPENAT 257

SEC("raw_syscalls:sys_enter")
int sys_enter(struct syscall_enter_args *args)
{
	struct {
		struct syscall_enter_args args;
		struct augmented_filename filename;
	} augmented_args;
	unsigned int len = sizeof(augmented_args);
	const void *filename_arg = NULL;

	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);

	switch (augmented_args.args.syscall_nr) {
	case SYS_OPEN:
		filename_arg = (const void *)args->args[0];
		__asm__ __volatile__("": : :"memory");
		break;
	case SYS_OPENAT:
		filename_arg = (const void *)args->args[1];
		break;
	default:
		return 0;
	}

	if (filename_arg != NULL) {
		augmented_args.filename.reserved = 0;
		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
							      sizeof(augmented_args.filename.value),
							      filename_arg);
		if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
			len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
			len &= sizeof(augmented_args.filename.value) - 1;
		}
	} else {
		len = sizeof(augmented_args.args);
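The barrier placement that Yonghong suggests can be tried out in isolation. Below is a minimal user-space sketch of the pattern: each branch loads from a constant offset and an empty asm statement with a "memory" clobber follows the load. The barrier() macro name and the filename_arg() helper are made up for this example, and whether a given clang version would otherwise merge the two loads is not something this sketch can show; it only demonstrates where the barrier goes.

```c
#include <assert.h>

/* Same layout as the syscall_enter_args struct in the message above. */
struct syscall_enter_args {
	unsigned long long common_tp_fields;
	long		   syscall_nr;
	unsigned long	   args[6];
};

#define SYS_OPEN   2
#define SYS_OPENAT 257

/*
 * Empty asm statement with a "memory" clobber: it emits no instructions
 * but the compiler may not move memory accesses across it.
 */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical helper: returns the filename argument, or 0 if there is none. */
unsigned long filename_arg(const struct syscall_enter_args *ctx)
{
	unsigned long arg = 0;

	assert(ctx != 0);
	switch (ctx->syscall_nr) {
	case SYS_OPEN:
		arg = ctx->args[0];	/* load at one constant offset from ctx */
		barrier();		/* keep this load inside its branch */
		break;
	case SYS_OPENAT:
		arg = ctx->args[1];	/* load at a different constant offset */
		barrier();
		break;
	default:
		break;
	}
	return arg;
}
```

The program in the message places the barrier only in the SYS_OPEN branch, which was enough there; putting one in every branch, as here, is a more conservative variant.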
Re: Help with the BPF verifier
On Sat, Nov 03, 2018 at 08:29:34AM -0300, Arnaldo Carvalho de Melo wrote:
> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> Preprocessed source(s) and associated run script(s) are located at:
> clang-7: note: diagnostic msg: /tmp/augmented_raw_syscalls-7444d9.c
> clang-7: note: diagnostic msg: /tmp/augmented_raw_syscalls-7444d9.sh
> clang-7: note: diagnostic msg:
>
>
> ERROR:unable to compile tools/perf/examples/bpf/augmented_raw_syscalls.c
> Hint: Check error message shown above.
> Hint: You can also pre-compile it into .o using:
> clang -target bpf -O2 -c tools/perf/examples/bpf/augmented_raw_syscalls.c
> with proper -I and -D options.
> event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
> \___ Failed to load tools/perf/examples/bpf/augmented_raw_syscalls.c from source: Error when compiling BPF scriptlet
>
> (add -v to see detail)
> Run 'perf list' for a list of valid events
>
> Usage: perf trace [] []
> or: perf trace [] -- []
> or: perf trace record [] []
> or: perf trace record [] -- []
>
> -e, --event  event/syscall selector. use 'perf list' to list available events
> [root@seventh perf]#
>
> Trying with -O1...

-O1 doesn't get clang confused, it's just the verifier that doesn't like the result, i.e.
we're back to that optimization, that isn't disabled with -O1 llvm compiling command : /usr/local/bin/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41300 -I/home/acme/lib/perf/include/bpf -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.19.0-rc8-00014-gc0cff31be705/build -c /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c -target bpf -O1 -o - libbpf: loading object 'tools/perf/examples/bpf/augmented_raw_syscalls.c' from buffer libbpf: section(1) .strtab, size 168, link 0, flags 0, type=3 libbpf: skip section(1) .strtab libbpf: section(2) .text, size 0, link 0, flags 6, type=1 libbpf: skip section(2) .text libbpf: section(3) raw_syscalls:sys_enter, size 344, link 0, flags 6, type=1 libbpf: found program raw_syscalls:sys_enter libbpf: section(4) .relraw_syscalls:sys_enter, size 16, link 10, flags 0, type=9 libbpf: section(5) raw_syscalls:sys_exit, size 16, link 0, flags 6, type=1 libbpf: found program raw_syscalls:sys_exit libbpf: section(6) maps, size 56, link 0, flags 3, type=1 libbpf: section(7) license, size 4, link 0, flags 3, type=1 libbpf: license of tools/perf/examples/bpf/augmented_raw_syscalls.c is GPL libbpf: section(8) version, size 4, link 0, flags 3, type=1 libbpf: kernel version of tools/perf/examples/bpf/augmented_raw_syscalls.c is 41300 libbpf: section(9) .llvm_addrsig, size 6, link 10, flags 8000, type=1879002115 libbpf: skip section(9) .llvm_addrsig libbpf: section(10) .symtab, size 240, link 1, flags 0, type=2 libbpf: maps in tools/perf/examples/bpf/augmented_raw_syscalls.c: 2 maps in 56 bytes libbpf: map 0 is 
"__augmented_syscalls__" libbpf: map 1 is "__bpf_stdout__" libbpf: collecting relocating info for: 'raw_syscalls:sys_enter' libbpf: relo for 4 value 28 name 124 libbpf: relocation: insn_idx=35 libbpf: relocation: find map 1 (__augmented_syscalls__) for insn 35 Added extra kernel map __entry_SYSCALL_64_trampoline fe006000-fe007000 Added extra kernel map __entry_SYSCALL_64_trampoline fe032000-fe033000 Added extra kernel map __entry_SYSCALL_64_trampoline fe05e000-fe05f000 Added extra kernel map __entry_SYSCALL_64_trampoline fe08a000-fe08b000 bpf: config program 'raw_syscalls:sys_enter' bpf: config program 'raw_syscalls:sys_exit' libbpf: create map __bpf_stdout__: fd=3 libbpf: create map __augmented_syscalls__: fd=4 libbpf: load bpf program failed: Permission denied libbpf: -- BEGIN DUMP LOG --- libbpf: 0: (bf) r6 = r1 1: (bf) r1 = r10 2: (07) r1 += -328 3: (b7) r7 = 64 4: (b7) r2 = 64 5: (bf) r3 = r6 6: (85) call bpf_probe_read#4 7: (79) r1 = *(u64 *)(r10 -320) 8: (15) if r1 == 0x101 goto pc+4 R0=inv(id=0) R1=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1 9: (55) if r1 != 0x2 goto pc+22 R0=inv(id=0) R1=inv2 R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1 10: (bf) r1 = r6 11: (07) r1 += 16 12: (05) goto pc+2 15: (79) r3 = *(u64 *)(r1 +0) dereference of modified ctx ptr R1 off=16 disallowed libbpf: -- END LOG -- libbpf: failed to load program 'raw_syscalls:sys_enter' libbpf: failed to load object 'to
Re: Help with the BPF verifier
On Fri, Nov 02, 2018 at 03:42:49PM, Edward Cree wrote:
> On 02/11/18 15:02, Arnaldo Carvalho de Melo wrote:
> > Yeah, didn't work as well:
>
> > And the -vv in 'perf trace' didn't seem to map to further details in the
> > output of the verifier debug:
> Yeah for log_level 2 you probably need to make source-level changes to either
> perf or libbpf (I think the latter). It's annoying that essentially no tools
> plumb through an option for that, someone should fix them ;-)
>
> > libbpf: -- BEGIN DUMP LOG ---
> > libbpf:
> > 0: (bf) r6 = r1
> > 1: (bf) r1 = r10
> > 2: (07) r1 += -328
> > 3: (b7) r7 = 64
> > 4: (b7) r2 = 64
> > 5: (bf) r3 = r6
> > 6: (85) call bpf_probe_read#4
> > 7: (79) r1 = *(u64 *)(r10 -320)
> > 8: (15) if r1 == 0x101 goto pc+4
> > R0=inv(id=0) R1=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
> > 9: (55) if r1 != 0x2 goto pc+22
> > R0=inv(id=0) R1=inv2 R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
> > 10: (bf) r1 = r6
> > 11: (07) r1 += 16
> > 12: (05) goto pc+2
> > 15: (79) r3 = *(u64 *)(r1 +0)
> > dereference of modified ctx ptr R1 off=16 disallowed
> Aha, we at least got a different error message this time.
> And indeed llvm has done that optimisation, rather than the more obvious
> 11: r3 = *(u64 *)(r1 +16)
> because it wants to have lots of reads share a single insn. You may be able
> to defeat that optimisation by adding compiler barriers, idk. Maybe someone
> with llvm knowledge can figure out how to stop it (ideally, llvm would know
> when it's generating for bpf backend and not do that). -O0? ¯\_(ツ)_/¯

set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x41300
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: PERF_BPF_INC_OPTIONS=-I/home/acme/lib/perf/include/bpf
set env: WORKING_DIR=/lib/modules/4.19.0-rc8-00014-gc0cff31be705/build
set env: CLANG_SOURCE=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE
llvm compiling command : /usr/local/bin/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41300 -I/home/acme/lib/perf/include/bpf -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.19.0-rc8-00014-gc0cff31be705/build -c /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c -target bpf -O2 -o -

So it is using -O2, let's try with -O0...

So I added this to my ~/.perfconfig, i.e. the default clang command line
template with -O2 replaced by -O0.

[root@seventh perf]# cat ~/.perfconfig
[llvm]
	clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O0 -o - $LLVM_OPTIONS_PIPE"
	# dump-obj = true
[root@seventh perf]#

And got an explosion:

# trace -vv -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.19.0-rc8-00014-gc0cff31be705/build
set env: KBUILD_DIR=/lib/modules/4.19.0-rc8-00014-gc0cff31be705/build
unset env: KBUILD_OPTS
include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x41300
set env: CLANG_EXE
Re: Help with the BPF verifier
On Thu, Nov 01, 2018 at 08:05:07PM, Edward Cree wrote:
> On 01/11/18 18:52, Arnaldo Carvalho de Melo wrote:
> > R0=inv(id=0) R1=inv6 R2=inv6 R3=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
> > 15: (b7) r2 = 0
> > 16: (63) *(u32 *)(r10 -260) = r2
> > 17: (67) r1 <<= 32
> > 18: (77) r1 >>= 32
> > 19: (67) r1 <<= 3
> > 20: (bf) r2 = r6
> > 21: (0f) r2 += r1
> > 22: (79) r3 = *(u64 *)(r2 +16)
> > R2 invalid mem access 'inv'
> I wonder if you could run this with verifier log level 2? (I'm not sure how
> you would go about plumbing that through the perf tooling.) It seems very
> odd that it ends up with R2=inv, and I'm wondering whether R1 becomes unknown
> during the shifts or whether the addition in insn 21 somehow produces the
> unknown-ness. (I know we used to have a thing[1] where doing ptr += K and
> then also having an offset in the LDX produced an error about
> ptr+const+const, but that seems to have been fixed at some point.)
>
> Note however that even if we get past this, R1 at this point holds 6, so it
> looks like the verifier is walking the impossible path where we're inside the
> 'if' even though filename_arg = 6. This is a (slightly annoying) verifier
> limitation, that it walks paths with impossible combinations of constraints
> (we've previously had cases where assertions in the verifier would blow up
> because of this, e.g. registers with max_val less than min_val). So if the
> check_ctx_access() is going to worry about whether you're off the end of the
> array (I'm not sure what your program type is and thus which is_valid_access
> callback is involved), then it'll complain about this.
> If filename_arg came from some external source you'd have a different
> problem, because then it would have a totally unknown value, that on entering
> the 'if' becomes "unknown but < 6", which is still too variable to have as
> the offset of a ctx access. Those have to be at a known constant offset, so
> that we can determine the type of the returned value.
>
> As a way to fix this, how about [UNTESTED!]:
>     const void *filename_arg = NULL;
>     /* ... */
>     switch (augmented_args.args.syscall_nr) {
>         case SYS_OPEN: filename_arg = args->args[0]; break;
>         case SYS_OPENAT: filename_arg = args->args[1]; break;
>     }
>     /* ... */
>     if (filename_arg) {
>         /* stuff */
>         blah = probe_read_str(/* ... */, filename_arg);
>     } else {
>         /* the other stuff */
>     }
> That way, you're only ever dealing in constant pointers (although judging by
> an old thread I found[1] about ptr+const+const, the compiler might decide to
> make some optimisations that end up looking like your existing code).

Yeah, didn't work as well:

SEC("raw_syscalls:sys_enter")
int sys_enter(struct syscall_enter_args *args)
{
	struct {
		struct syscall_enter_args args;
		struct augmented_filename filename;
	} augmented_args;
	unsigned int len = sizeof(augmented_args);
	const void *filename_arg = NULL;

	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);

	switch (augmented_args.args.syscall_nr) {
	case SYS_OPEN:	 filename_arg = (const void *)args->args[0]; break;
	case SYS_OPENAT: filename_arg = (const void *)args->args[1]; break;
	}

	if (filename_arg != NULL) {
		augmented_args.filename.reserved = 0;
		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
							      sizeof(augmented_args.filename.value),
							      filename_arg);
		if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
			len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
			len &= sizeof(augmented_args.filename.value) - 1;
		}
	} else {
		len = sizeof(augmented_args.args);
	}

	perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
	return 0;
}

And the -vv in 'perf trace' didn't seem to map to further details in the
output of the verifier debug:

# trace -vv -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.19.0-rc8-00014-gc0cff31be705/build
set env: KBUILD_DIR=/lib/modules/4.19.0-rc8-00014-gc0cff31be705/build
unset env: KBUILD_OPTS
include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch
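Edward's two points (an offset that is "unknown but < 6" is still not a constant, and a constant 6 walked into the "<= 5" branch gives an impossible state) can be illustrated with a toy range tracker. This is a deliberately tiny model written for this note, not the kernel verifier's data structures; toy_range, refine_le(), is_constant() and is_impossible() are all made-up names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a tracked scalar: every value in [min, max] is possible. */
struct toy_range {
	unsigned long min;
	unsigned long max;
};

/* Narrow the range for the branch in which "value <= bound" was taken. */
struct toy_range refine_le(struct toy_range r, unsigned long bound)
{
	assert(r.min <= r.max);
	if (r.max > bound)
		r.max = bound;
	return r;
}

/* In this model a ctx offset is acceptable only if it is one fixed value. */
bool is_constant(struct toy_range r)
{
	return r.min == r.max;
}

/* max below min: the branch cannot be reached at run time. */
bool is_impossible(struct toy_range r)
{
	return r.min > r.max;
}
```

Feeding a fully unknown value through refine_le(r, 5) leaves the range 0..5, which is bounded but not constant. Feeding the constant 6 through the same branch yields min 6 and max 5, the kind of contradictory state that a checker which does not prune such paths would still have to walk.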
Re: [PATCH bpf 1/4] bpf: fix partial copy of map_ptr when dst is scalar
On Thu, Nov 01, 2018 at 07:17:29PM, Edward Cree wrote:
> On 31/10/18 23:05, Daniel Borkmann wrote:
> > ALU operations on pointers such as scalar_reg += map_value_ptr are
> > handled in adjust_ptr_min_max_vals(). Problem is however that map_ptr
> > and range in the register state share a union, so transferring state
> > through dst_reg->range = ptr_reg->range is just buggy as any new
> > map_ptr in the dst_reg is then truncated (or null) for subsequent
> > checks. Fix this by adding a raw member and use it for copying state
> > over to dst_reg.
> >
> > Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
> > Signed-off-by: Daniel Borkmann
> > Cc: Edward Cree
> > Acked-by: Alexei Starovoitov
> > ---
> Acked-by: Edward Cree
> (though I apparently missed the 63-minute window to hit the git record...)

Those guys are fast! :-)

- Arnaldo
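The bug described in the quoted changelog is easy to reproduce outside the kernel. The sketch below is a simplified stand-in for the register state, assuming only what the changelog says: range and map_ptr share a union, and a raw member covering the whole slot is used for copying. The demo_* names are made up, and uintptr_t is used for raw here so that the member is pointer-sized on any platform.

```c
#include <assert.h>
#include <stdint.h>

struct demo_map { int id; };

/* Simplified stand-in for the register state: three views of one slot. */
struct demo_reg {
	union {
		uint16_t range;			/* meaningful for packet pointers */
		struct demo_map *map_ptr;	/* meaningful for map value pointers */
		uintptr_t raw;			/* the whole slot, used for copying */
	};
};

/* Buggy pattern: only 16 bits of the slot are transferred. */
void copy_range_only(struct demo_reg *dst, const struct demo_reg *src)
{
	dst->raw = 0;
	dst->range = src->range;
}

/* Fixed pattern: the whole slot is transferred, whichever member is live. */
void copy_raw(struct demo_reg *dst, const struct demo_reg *src)
{
	dst->raw = src->raw;
}
```

When the source register holds a map pointer, copy_range_only() leaves the destination with a truncated pointer, while copy_raw() preserves it.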
Re: Help with the BPF verifier
On Thu, Nov 01, 2018 at 12:10:39PM -0700, David Miller wrote:
> From: Arnaldo Carvalho de Melo
> Date: Thu, 1 Nov 2018 15:52:17 -0300
>
> > 50 unsigned int filename_arg = 6;
> ...
> > --- /wb/augmented_raw_syscalls.c.old  2018-11-01 15:43:55.000394234 -0300
> > +++ /wb/augmented_raw_syscalls.c  2018-11-01 15:44:15.102367838 -0300
> > @@ -67,7 +67,7 @@
> > augmented_args.filename.reserved = 0;
> > augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
> > sizeof(augmented_args.filename.value),
> > - (const void *)args->args[0]);
> > + (const void *)args->args[filename_arg]);
>
> args[] is sized to '6', therefore the last valid index is '5', yet you're
> using '6' here which is one entry past the end of the declared array.

Nope... this is inside an if:

	if (filename_arg <= 5) {
		augmented_args.filename.reserved = 0;
		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
							      sizeof(augmented_args.filename.value),
							      (const void *)args->args[filename_arg]);
		if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
			len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
			len &= sizeof(augmented_args.filename.value) - 1;
		}
	} else {

I use 6 to mean "hey, this syscall doesn't have any string argument, don't
bother with it".

- Arnaldo
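The convention described above, with an index one past the end of the array meaning "no string argument" and a bounds check guarding every use, looks like this as stand-alone C. The macro and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SYSCALL_ARGS 6
/* One past the last valid index: "this syscall has no string argument". */
#define NO_STRING_ARG NR_SYSCALL_ARGS

/* Hypothetical helper: returns the selected argument, or 0 for the sentinel. */
unsigned long pick_string_arg(const unsigned long args[NR_SYSCALL_ARGS],
			      unsigned int idx)
{
	assert(args != NULL);
	if (idx < NR_SYSCALL_ARGS)	/* same test as "idx <= 5" */
		return args[idx];
	return 0;			/* sentinel: args[] is never indexed */
}
```

In ordinary C the out-of-range index is never used to touch the array, so this is well defined; the open question in the thread is how the verifier treats the same pattern.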
Help with the BPF verifier
tl;dr: I seem to be fighting clang optimizations to get the verifier to
accept my proggie.

Hi,

So I'm moving to use raw_syscalls:sys_exit to collect pointer contents,
using maps to tell the bpf program what to copy, how many bytes, filters,
etc.

I'm at the start of it: at this point I need to use an index to get to the
right syscall arg that is a filename, starting with just "open" and
"openat", which have the filename in different args. To get this first part
working I'm doing it directly in the bpf restricted C program; later this
will move to maps, etc.

If I set the index as a constant, just for testing, it works. Look at the
"open" and "openat" calls below; later we'll see why openat is failing to
augment its "filename" arg while "open" works:

[root@seventh perf]# trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
? ( ): sleep/10152 ... [continued]: execve()) = 0
0.045 ( 0.004 ms): sleep/10152 brk() = 0x55ccff356000
0.074 ( 0.007 ms): sleep/10152 access(filename: , mode: R) = -1 ENOENT No such file or directory
0.089 ( 0.006 ms): sleep/10152 openat(dfd: CWD, filename: , flags: CLOEXEC) = 3
0.097 ( 0.003 ms): sleep/10152 fstat(fd: 3, statbuf: 0x7ffecdd283f0) = 0
0.103 ( 0.006 ms): sleep/10152 mmap(len: 103334, prot: READ, flags: PRIVATE, fd: 3) = 0x7f8ffee9c000
0.111 ( 0.002 ms): sleep/10152 close(fd: 3) = 0
0.135 ( 0.007 ms): sleep/10152 openat(dfd: CWD, filename: , flags: CLOEXEC) = 3
0.144 ( 0.003 ms): sleep/10152 read(fd: 3, buf: 0x7ffecdd285b8, count: 832) = 832
0.150 ( 0.002 ms): sleep/10152 fstat(fd: 3, statbuf: 0x7ffecdd28450) = 0
0.155 ( 0.005 ms): sleep/10152 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7f8ffee9a000
0.166 ( 0.007 ms): sleep/10152 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7f8ffe8dc000
0.175 ( 0.010 ms): sleep/10152 mprotect(start: 0x7f8ffea89000, len: 2093056) = 0
0.188 ( 0.010 ms): sleep/10152 mmap(addr: 0x7f8ffec88000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 1753088) = 0x7f8ffec88000
0.204 ( 0.005 ms): sleep/10152 mmap(addr: 0x7f8ffec8e000, len: 14976, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7f8ffec8e000
0.218 ( 0.002 ms): sleep/10152 close(fd: 3) = 0
0.239 ( 0.002 ms): sleep/10152 arch_prctl(option: 4098, arg2: 140256433779968) = 0
0.312 ( 0.009 ms): sleep/10152 mprotect(start: 0x7f8ffec88000, len: 16384, prot: READ) = 0
0.343 ( 0.005 ms): sleep/10152 mprotect(start: 0x55ccff1c6000, len: 4096, prot: READ) = 0
0.354 ( 0.006 ms): sleep/10152 mprotect(start: 0x7f8ffeeb6000, len: 4096, prot: READ) = 0
0.362 ( 0.019 ms): sleep/10152 munmap(addr: 0x7f8ffee9c000, len: 103334) = 0
0.476 ( 0.002 ms): sleep/10152 brk() = 0x55ccff356000
0.480 ( 0.004 ms): sleep/10152 brk(brk: 0x55ccff377000) = 0x55ccff377000
0.487 ( 0.002 ms): sleep/10152 brk() = 0x55ccff377000
0.497 ( 0.008 ms): sleep/10152 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.507 ( 0.002 ms): sleep/10152 fstat(fd: 3, statbuf: 0x7f8ffec8daa0) = 0
0.511 ( 0.006 ms): sleep/10152 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7f8ff7d0d000
0.524 ( 0.002 ms): sleep/10152 close(fd: 3) = 0
0.574 (1000.140 ms): sleep/10152 nanosleep(rqtp: 0x7ffecdd29130) = 0
1000.753 ( 0.007 ms): sleep/10152 close(fd: 1) = 0
1000.767 ( 0.004 ms): sleep/10152 close(fd: 2) = 0
1000.781 ( ): sleep/10152 exit_group()
[root@seventh perf]#

 1 // SPDX-License-Identifier: GPL-2.0
 2 /*
 3  * Augment the raw_syscalls tracepoints with the contents of the pointer arguments.
 4  *
 5  * Test it with:
 6  *
 7  * perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c cat /etc/passwd > /dev/null
 8  *
 9  * This exactly matches what is marshalled into the raw_syscall:sys_enter
10  * payload expected by the 'perf trace' beautifiers.
11  *
12  * For now it just uses the existing tracepoint augmentation code in 'perf
13  * trace', in the next csets we'll hook up these with the sys_enter/sys_exit
14  * code that will combine entry/exit in a strace like way.
15  */
16 #include
17 #include
18 /* bpf-output associated map */
19 struct bpf_map SEC("maps") __augmented_syscalls__ = {
20 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
21 	.key_size = sizeof(int),
22 	.value_size = sizeof(u32),
23 	.max_entries = __NR_CPUS__,
24 };
25 struct syscall_enter_args {
26 	unsigned long long common_tp_fields;
27 	long syscall_nr;
28 	unsigned long args[6];
29 };
30 struct syscall_exit_args {
31 	unsigned long long common_tp_fields;
32
Re: [PATCH bpf] libbpf: Fix compile error in libbpf_attach_type_by_name
On Wed, Oct 31, 2018 at 12:57:18PM -0700, Andrey Ignatov wrote:
> Arnaldo Carvalho de Melo reported build error in libbpf when clang
> version 3.8.1-24 (tags/RELEASE_381/final) is used:
>
> libbpf.c:2201:36: error: comparison of constant -22 with expression of
> type 'const enum bpf_attach_type' is always false
> [-Werror,-Wtautological-constant-out-of-range-compare]
> if (section_names[i].attach_type == -EINVAL)
> ^ ~~~
> 1 error generated.
>
> Fix the error by keeping "is_attachable" property of a program in a
> separate struct field instead of trying to use attach_type itself.

Thanks, now it builds in all the previously failing systems:

# export PERF_TARBALL=http://192.168.86.4/perf/perf-4.19.0.tar.xz
# dm debian:9 fedora:25 fedora:26 fedora:27 ubuntu:16.04 ubuntu:17.10
   1 debian:9     : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516   clang version 3.8.1-24 (tags/RELEASE_381/final)
   2 fedora:25    : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)   clang version 3.9.1 (tags/RELEASE_391/final)
   3 fedora:26    : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)   clang version 4.0.1 (tags/RELEASE_401/final)
   4 fedora:27    : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)   clang version 5.0.2 (tags/RELEASE_502/final)
   5 ubuntu:16.04 : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609   clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
   6 ubuntu:17.10 : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0   clang version 4.0.1-6 (tags/RELEASE_401/final)
#

Tested-by: Arnaldo Carvalho de Melo

I also have it tentatively applied to my perf/urgent branch, which I'll
push upstream soon.

- Arnaldo

> Fixes: commit 956b620fcf0b ("libbpf: Introduce libbpf_attach_type_by_name")
> Reported-by: Arnaldo Carvalho de Melo
> Signed-off-by: Andrey Ignatov
> ---
>  tools/lib/bpf/libbpf.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b607be7236d3..d6e62e90e8d4 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -2084,19 +2084,19 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog,
>  	prog->expected_attach_type = type;
>  }
>
> -#define BPF_PROG_SEC_IMPL(string, ptype, eatype, atype) \
> -	{ string, sizeof(string) - 1, ptype, eatype, atype }
> +#define BPF_PROG_SEC_IMPL(string, ptype, eatype, is_attachable, atype) \
> +	{ string, sizeof(string) - 1, ptype, eatype, is_attachable, atype }
>
>  /* Programs that can NOT be attached. */
> -#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, -EINVAL)
> +#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, 0, 0)
>
>  /* Programs that can be attached. */
>  #define BPF_APROG_SEC(string, ptype, atype) \
> -	BPF_PROG_SEC_IMPL(string, ptype, 0, atype)
> +	BPF_PROG_SEC_IMPL(string, ptype, 0, 1, atype)
>
>  /* Programs that must specify expected attach type at load time. */
>  #define BPF_EAPROG_SEC(string, ptype, eatype) \
> -	BPF_PROG_SEC_IMPL(string, ptype, eatype, eatype)
> +	BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype)
>
>  /* Programs that can be attached but attach type can't be identified by section
>   * name. Kept for backward compatibility.
> @@ -2108,6 +2108,7 @@ static const struct {
>  	size_t len;
>  	enum bpf_prog_type prog_type;
>  	enum bpf_attach_type expected_attach_type;
> +	int is_attachable;
>  	enum bpf_attach_type attach_type;
>  } section_names[] = {
>  	BPF_PROG_SEC("socket", BPF_PROG_TYPE_SOCKET_FILTER),
> @@ -2198,7 +2199,7 @@ int libbpf_attach_type_by_name(const char *name,
>  	for (i = 0; i < ARRAY_SIZE(section_names); i++) {
>  		if (strncmp(name, section_names[i].sec, section_names[i].len))
>  			continue;
> -		if (section_names[i].attach_type == -EINVAL)
> +		if (!section_names[i].is_attachable)
>  			return -EINVAL;
>  		*attach_type = section_names[i].attach_type;
>  		return 0;
> --
> 2.17.1
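The idea behind the patch is to stop smuggling -EINVAL through an enum-typed field, since an enum without negative enumerators may be given an unsigned underlying type, which is why clang calls the comparison always false. A separate flag avoids that, as the cut-down sketch below shows. It mirrors the shape of the patched table but is not libbpf code: the demo_* identifiers and the three table entries are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative enum; not libbpf's real bpf_attach_type. */
enum demo_attach_type { DEMO_ATTACH_INGRESS, DEMO_ATTACH_EGRESS };

struct demo_section {
	const char *sec;
	size_t len;
	int is_attachable;			/* separate flag, no magic enum value */
	enum demo_attach_type attach_type;	/* only valid if is_attachable */
};

#define DEMO_SEC(s, attachable, atype) { s, sizeof(s) - 1, attachable, atype }

static const struct demo_section demo_sections[] = {
	DEMO_SEC("socket", 0, DEMO_ATTACH_INGRESS),	/* attach_type ignored */
	DEMO_SEC("cgroup_skb/ingress", 1, DEMO_ATTACH_INGRESS),
	DEMO_SEC("cgroup_skb/egress", 1, DEMO_ATTACH_EGRESS),
};

/* Prefix-match the section name; 0 on success, -EINVAL otherwise. */
int demo_attach_type_by_name(const char *name, enum demo_attach_type *out)
{
	size_t i;

	assert(name != NULL && out != NULL);
	for (i = 0; i < sizeof(demo_sections) / sizeof(demo_sections[0]); i++) {
		if (strncmp(name, demo_sections[i].sec, demo_sections[i].len))
			continue;
		if (!demo_sections[i].is_attachable)
			return -EINVAL;
		*out = demo_sections[i].attach_type;
		return 0;
	}
	return -EINVAL;
}
```

With the flag in place no enum-typed expression is ever compared against a negative constant, so the tautological-compare warning has nothing to fire on.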
libbpf build failure on debian:9 with clang
  17    40.66 debian:9                      : FAIL gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516

The failure was with clang tho:

clang version 3.8.1-24 (tags/RELEASE_381/final)

With:

gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

it built without any warnings/errors.

  CC       /tmp/build/perf/libbpf.o
libbpf.c:2201:36: error: comparison of constant -22 with expression of type 'const enum bpf_attach_type' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
                if (section_names[i].attach_type == -EINVAL)
                    ^ ~~~
1 error generated.
  CC       /tmp/build/perf/help.o
mv: cannot stat '/tmp/build/perf/.libbpf.o.tmp': No such file or directory
/git/linux/tools/build/Makefile.build:96: recipe for target '/tmp/build/perf/libbpf.o' failed
make[4]: *** [/tmp/build/perf/libbpf.o] Error 1

This is the cset:

commit 956b620fcf0b64de403cd26a56bc41e6e4826ea6
Author: Andrey Ignatov
Date:   Wed Sep 26 15:24:53 2018 -0700

    libbpf: Introduce libbpf_attach_type_by_name

Tests are continuing, so far:

   1    43.53 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2    58.62 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3    51.62 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4    51.68 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5    49.38 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   6    79.07 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   7    63.35 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
   8    59.65 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   9    47.39 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10    50.64 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  11    28.75 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  12    33.26 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  13    43.16 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
  14    73.61 clearlinux:latest             : FAIL gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
  15    45.56 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  16    45.53 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  17    40.66 debian:9                      : FAIL gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  18   113.19 debian:experimental           : Ok   gcc (Debian 8.2.0-8) 8.2.0
  19    41.48 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
  20    41.51 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
  21    40.09 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.1.0-12) 8.1.0
  22    42.17 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
  23    40.02 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  24    45.47 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  25    41.64 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  26    43.60 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  27    44.04 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  28    37.21 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710

The problem with clearlinux is unrelated:

clang-7: error: unknown argument: '-fno-semantic-interposition'
clang-7: error: unsupported argument '4' to option 'flto='
clang-7: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload
Em Wed, Oct 17, 2018 at 07:08:37PM +, Alexei Starovoitov escreveu: > On 10/17/18 11:53 AM, Arnaldo Carvalho de Melo wrote: > > Em Wed, Oct 17, 2018 at 04:36:08PM +, Alexei Starovoitov escreveu: > >> On 10/17/18 8:09 AM, David Ahern wrote: > >>> On 10/16/18 11:43 PM, Song Liu wrote: > >>>> I agree that processing events while recording has significant overhead. > >>>> In this case, perf user space need to know details about the the jited > >>>> BPF > >>>> program. It is impossible to pass all these details to user space through > >>>> the relatively stable ring_buffer API. Therefore, some processing of the > >>>> data is necessary (get bpf prog_id from ring buffer, and then fetch > >>>> program > >>>> details via BPF_OBJ_GET_INFO_BY_FD. > >>>> > >>>> I have some idea on processing important data with relatively low > >>>> overhead. > >>>> Let me try implement it. > >>>> > >>> > >>> As I understand it, you want this series: > >>> > >>> kernel: add event to perf buffer on bpf prog load > >>> > >>> userspace: perf reads the event and grabs information about the program > >>> from the fd > >>> > >>> Is that correct? > >>> > >>> Userpsace is not awakened immediately when an event is added the the > >>> ring. It is awakened once the number of events crosses a watermark. That > >>> means there is an unknown - and potentially long - time window where the > >>> program can be unloaded before perf reads the event. > > > >>> So no matter what you do expecting perf record to be able to process the > >>> event quickly is an unreasonable expectation. > > > >> yes... unless we go with threaded model as Arnaldo suggested and use > >> single event as a watermark to wakeup our perf thread. > >> In such case there is still a race window between user space waking up > >> and doing _single_ bpf_get_fd_from_id() call to hold that prog > >> and some other process trying to instantly unload the prog it > >> just loaded. 
> >> I think such race window is extremely tiny and if perf misses > >> those load/unload events it's a good thing, since there is no chance > >> that normal pmu event samples would be happening during prog execution. > >> The alternative approach with no race window at all is to burden kernel > >> RECORD_* events with _all_ information about bpf prog. Which is jited > >> addresses, jited image itself, info about all subprogs, info about line > >> info, all BTF data, etc. As I said earlier I'm strongly against such > >> RECORD_* bloating. > >> Instead we need to find a way to process new RECORD_BPF events with > >> single prog_id field in perf user space with minimal race > >> and threaded approach sounds like a win to me. > > There is another alternative, I think: put just a content based hash, > > like a git commit id into a PERF_RECORD_MMAP3 new record, and when the > > validator does the jit, etc, it stashes the content that > > BPF_OBJ_GET_INFO_BY_FD would get somewhere, some filesystem populated by > > the kernel right after getting stuff from sys_bpf() and preparing it for > > use, then we know that in (start, end) we have blob foo with content id, > > that we will use to retrieve information that augments what we know with > > just (start, end, id) and allows annotation, etc. > > That stash space for jitted stuff gets garbage collected from time to > > time or is even completely disabled if the user is not interested in > > such augmentation, just like one can do disabling perf's ~/.debug/ > > directory hashed by build-id. > > I think with this we have no races, the PERF_RECORD_MMAP3 gets just what > > is in PERF_RECORD_MMAP2 plus some extra 20 bytes for such content based > > cookie and we solve the other race we already have with kernel modules, > > DSOs, etc. > > I have mentioned this before, there were objections, perhaps this time I > > formulated in a different way that makes it more interesting? > that 'content based hash' we already have. 
> It's called program tag.

But that was calculated by whom? Userspace? It can't do that, it's the kernel that ultimately puts together, from what userspace gave it, what we need to do performance analysis, line numbers, etc.

> and we already taught iovisor/bcc to stash that stuff into
> /var/tmp/bcc/bpf_prog_TAG/ directory.
> Unfortunately that appr
Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload
Em Wed, Oct 17, 2018 at 04:36:08PM +, Alexei Starovoitov escreveu: > On 10/17/18 8:09 AM, David Ahern wrote: > > On 10/16/18 11:43 PM, Song Liu wrote: > >> I agree that processing events while recording has significant overhead. > >> In this case, perf user space need to know details about the the jited BPF > >> program. It is impossible to pass all these details to user space through > >> the relatively stable ring_buffer API. Therefore, some processing of the > >> data is necessary (get bpf prog_id from ring buffer, and then fetch program > >> details via BPF_OBJ_GET_INFO_BY_FD. > >> > >> I have some idea on processing important data with relatively low overhead. > >> Let me try implement it. > >> > > > > As I understand it, you want this series: > > > > kernel: add event to perf buffer on bpf prog load > > > > userspace: perf reads the event and grabs information about the program > > from the fd > > > > Is that correct? > > > > Userpsace is not awakened immediately when an event is added the the > > ring. It is awakened once the number of events crosses a watermark. That > > means there is an unknown - and potentially long - time window where the > > program can be unloaded before perf reads the event. > > So no matter what you do expecting perf record to be able to process the > > event quickly is an unreasonable expectation. > yes... unless we go with threaded model as Arnaldo suggested and use > single event as a watermark to wakeup our perf thread. > In such case there is still a race window between user space waking up > and doing _single_ bpf_get_fd_from_id() call to hold that prog > and some other process trying to instantly unload the prog it > just loaded. > I think such race window is extremely tiny and if perf misses > those load/unload events it's a good thing, since there is no chance > that normal pmu event samples would be happening during prog execution. 
> The alternative approach with no race window at all is to burden kernel > RECORD_* events with _all_ information about bpf prog. Which is jited > addresses, jited image itself, info about all subprogs, info about line > info, all BTF data, etc. As I said earlier I'm strongly against such > RECORD_* bloating. > Instead we need to find a way to process new RECORD_BPF events with > single prog_id field in perf user space with minimal race > and threaded approach sounds like a win to me. There is another alternative, I think: put just a content based hash, like a git commit id into a PERF_RECORD_MMAP3 new record, and when the validator does the jit, etc, it stashes the content that BPF_OBJ_GET_INFO_BY_FD would get somewhere, some filesystem populated by the kernel right after getting stuff from sys_bpf() and preparing it for use, then we know that in (start, end) we have blob foo with content id, that we will use to retrieve information that augments what we know with just (start, end, id) and allows annotation, etc. That stash space for jitted stuff gets garbage collected from time to time or is even completely disabled if the user is not interested in such augmentation, just like one can do disabling perf's ~/.debug/ directory hashed by build-id. I think with this we have no races, the PERF_RECORD_MMAP3 gets just what is in PERF_RECORD_MMAP2 plus some extra 20 bytes for such content based cookie and we solve the other race we already have with kernel modules, DSOs, etc. I have mentioned this before, there were objections, perhaps this time I formulated in a different way that makes it more interesting? - Arnaldo
Re: [PATCH bpf-next 0/3] improve and fix barriers for walking perf rb
On Wed, Oct 17, 2018 at 04:41:53PM +0200, Daniel Borkmann wrote:
> This set first adds smp_* barrier variants to tools infrastructure
> and in a second step updates perf and libbpf to make use of them.
> For details, please see individual patches, thanks!
>
> Arnaldo, if there are no objections, could this be routed via bpf-next
> with Acked-by's due to later dependencies in libbpf? Alternatively,
> I could also get the 2nd patch out during merge window, but perhaps
> it's okay to do in one go as there shouldn't be much conflict in perf.

Right, when updating kernel/events/ring_buffer.c the corresponding code
in tools/ should've been changed :-)

Acked-by: Arnaldo Carvalho de Melo

- Arnaldo

> Thanks!
>
> Daniel Borkmann (3):
>   tools: add smp_* barrier variants to include infrastructure
>   tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
>   bpf, libbpf: use proper barriers in perf ring buffer walk
>
>  tools/arch/arm64/include/asm/barrier.h | 10 ++
>  tools/arch/x86/include/asm/barrier.h   | 9 ++---
>  tools/include/asm/barrier.h            | 11 +++
>  tools/lib/bpf/libbpf.c                 | 25 +++--
>  tools/perf/util/mmap.h                 | 5 +++--
>  5 files changed, 49 insertions(+), 11 deletions(-)
>
> --
> 2.9.5
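To make the ordering requirement concrete for anyone following along, here is a minimal sketch of a head/tail ring walk. It assumes a hypothetical struct ring that only loosely mimics the data_head/data_tail pair of the perf mmap control page, and it uses C11 acquire/release atomics as a stand-in for the kernel-style barriers. It is not the code from the patches in this series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical capacity */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

/* Hypothetical single-producer/single-consumer ring. */
struct ring {
	_Atomic uint64_t head;	/* advanced by the producer */
	_Atomic uint64_t tail;	/* advanced by the consumer */
	uint32_t data[RING_SIZE];
};

/* Producer: store the payload first, then publish it by advancing head. */
int ring_push(struct ring *r, uint32_t value)
{
	uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SIZE)
		return -1; /* full */

	r->data[head & (RING_SIZE - 1)] = value;
	/* Release: the payload store cannot be reordered after this. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}

/* Consumer: acquire head before reading data, release tail when done. */
size_t ring_drain(struct ring *r, uint32_t *out, size_t max)
{
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint64_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	size_t n = 0;

	while (tail != head && n < max) {
		out[n++] = r->data[tail & (RING_SIZE - 1)];
		tail++;
	}
	/* Only after this store may the producer reuse the slots we read. */
	atomic_store_explicit(&r->tail, tail, memory_order_release);
	return n;
}
```

The point of the pairing is that the reader never looks at payload it has not yet been told about, and the writer never overwrites payload the reader has not yet finished with.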
Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload
Em Wed, Oct 17, 2018 at 09:11:40AM -0300, Arnaldo Carvalho de Melo escreveu: > Adding Alexey, Jiri and Namhyung as they worked/are working on > multithreading 'perf record'. > > Em Tue, Oct 16, 2018 at 11:43:11PM -0700, Song Liu escreveu: > > On Tue, Oct 16, 2018 at 4:43 PM David Ahern wrote: > > > On 10/15/18 4:33 PM, Song Liu wrote: > > > > I am working with Alexei on the idea of fetching BPF program > > > > information via > > > > BPF_OBJ_GET_INFO_BY_FD cmd. I added PERF_RECORD_BPF_EVENT > > > > to perf_event_type, and dumped these events to perf event ring buffer. > > > > > I found that perf will not process event until the end of perf-record: > > > > > root@virt-test:~# ~/perf record -ag -- sleep 10 > > > > .. 10 seconds later > > > > [ perf record: Woken up 34 times to write data ] > > > > machine__process_bpf_event: prog_id 6 loaded > > > > machine__process_bpf_event: prog_id 6 unloaded > > > > [ perf record: Captured and wrote 9.337 MB perf.data (93178 samples) ] > > > > > In this example, the bpf program was loaded and then unloaded in > > > > another terminal. When machine__process_bpf_event() processes > > > > the load event, the bpf program is already unloaded. Therefore, > > > > machine__process_bpf_event() will not be able to get information > > > > about the program via BPF_OBJ_GET_INFO_BY_FD cmd. > > > > > To solve this problem, we will need to run BPF_OBJ_GET_INFO_BY_FD > > > > as soon as perf get the event from kernel. I looked around the perf > > > > code for a while. But I haven't found a good example where some > > > > events are processed before the end of perf-record. Could you > > > > please help me with this? > > > > perf record does not process events as they are generated. Its sole job > > > is pushing data from the maps to a file as fast as possible meaning in > > > bulk based on current read and write locations. 
> > >
> > > Adding code to process events will add significant overhead to the
> > > record command and will not really solve your race problem.
> >
> > I agree that processing events while recording has significant overhead.
> > In this case, perf user space need to know details about the jited BPF
> > program. It is impossible to pass all these details to user space through
> > the relatively stable ring_buffer API. Therefore, some processing of the
> > data is necessary (get bpf prog_id from ring buffer, and then fetch program
> > details via BPF_OBJ_GET_INFO_BY_FD).
> >
> > I have some idea on processing important data with relatively low overhead.
> > Let me try implement it.
>
> Well, you could have a separate thread processing just those kinds of
> events, associate it with a dummy event where you only ask for
> PERF_RECORD_BPF_EVENTs.
>
> Here is how to set up the PERF_TYPE_SOFTWARE/PERF_COUNT_SW_DUMMY
> perf_event_attr:
>
> [root@seventh ~]# perf record -vv -e dummy sleep 01
> 
> perf_event_attr:
>   type                             1
>   size                             112
>   config                           0x9
>   { sample_period, sample_freq }   4000
>   sample_type                      IP|TID|TIME|PERIOD
>   disabled                         1
>   inherit                          1

These you would have disabled, no need for
PERF_RECORD_{MMAP*,COMM,FORK,EXIT}, just PERF_RECORD_BPF_EVENT:

>   mmap                             1
>   comm                             1
>   task                             1
>   mmap2                            1
>   comm_exec                        1

>   freq                             1
>   enable_on_exec                   1
>   sample_id_all                    1
>   exclude_guest                    1
> 
> sys_perf_event_open: pid 12046  cpu 0  group_fd -1  flags 0x8 = 4
> sys_perf_event_open: pid 12046  cpu 1  group_fd -1  flags 0x8 = 5
> sys_perf_event_open: pid 12046  cpu 2  group_fd -1  flags 0x8 = 6
> sys_perf_event_open: pid 12046  cpu 3  group_fd -1  flags 0x8 = 8
> mmap size 528384B
> perf event ring buffer mmapped per cpu
> Synthesizing TSC conversion information
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.014 MB perf.data ]
> [root@seventh ~]#
>
> [root@seventh ~]# perf evlist -v
> dummy: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000,
> sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1,
> freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1,
> mmap2: 1, comm_exec: 1
> [root@seventh ~]#
>
> There is work ongoing in dumping one file per cpu and then, at post
> processing time merging all those files to get ordering, so one more
> file, for these VIP events, that require per-event processing would be
> ordered at that time with all the other per-cpu files.
>
> - Arnaldo
Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload
Adding Alexey, Jiri and Namhyung as they worked/are working on multithreading 'perf record'. Em Tue, Oct 16, 2018 at 11:43:11PM -0700, Song Liu escreveu: > On Tue, Oct 16, 2018 at 4:43 PM David Ahern wrote: > > On 10/15/18 4:33 PM, Song Liu wrote: > > > I am working with Alexei on the idea of fetching BPF program information > > > via > > > BPF_OBJ_GET_INFO_BY_FD cmd. I added PERF_RECORD_BPF_EVENT > > > to perf_event_type, and dumped these events to perf event ring buffer. > > > I found that perf will not process event until the end of perf-record: > > > root@virt-test:~# ~/perf record -ag -- sleep 10 > > > .. 10 seconds later > > > [ perf record: Woken up 34 times to write data ] > > > machine__process_bpf_event: prog_id 6 loaded > > > machine__process_bpf_event: prog_id 6 unloaded > > > [ perf record: Captured and wrote 9.337 MB perf.data (93178 samples) ] > > > In this example, the bpf program was loaded and then unloaded in > > > another terminal. When machine__process_bpf_event() processes > > > the load event, the bpf program is already unloaded. Therefore, > > > machine__process_bpf_event() will not be able to get information > > > about the program via BPF_OBJ_GET_INFO_BY_FD cmd. > > > To solve this problem, we will need to run BPF_OBJ_GET_INFO_BY_FD > > > as soon as perf get the event from kernel. I looked around the perf > > > code for a while. But I haven't found a good example where some > > > events are processed before the end of perf-record. Could you > > > please help me with this? > > perf record does not process events as they are generated. Its sole job > > is pushing data from the maps to a file as fast as possible meaning in > > bulk based on current read and write locations. > > Adding code to process events will add significant overhead to the > > record command and will not really solve your race problem. > I agree that processing events while recording has significant overhead. 
> In this case, perf user space need to know details about the jited BPF
> program. It is impossible to pass all these details to user space through
> the relatively stable ring_buffer API. Therefore, some processing of the
> data is necessary (get bpf prog_id from ring buffer, and then fetch program
> details via BPF_OBJ_GET_INFO_BY_FD).

> I have some idea on processing important data with relatively low overhead.
> Let me try implement it.

Well, you could have a separate thread processing just those kinds of
events, associate it with a dummy event where you only ask for
PERF_RECORD_BPF_EVENTs.

Here is how to set up the PERF_TYPE_SOFTWARE/PERF_COUNT_SW_DUMMY
perf_event_attr:

[root@seventh ~]# perf record -vv -e dummy sleep 01

perf_event_attr:
  type                             1
  size                             112
  config                           0x9
  { sample_period, sample_freq }   4000
  sample_type                      IP|TID|TIME|PERIOD
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  freq                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1

sys_perf_event_open: pid 12046  cpu 0  group_fd -1  flags 0x8 = 4
sys_perf_event_open: pid 12046  cpu 1  group_fd -1  flags 0x8 = 5
sys_perf_event_open: pid 12046  cpu 2  group_fd -1  flags 0x8 = 6
sys_perf_event_open: pid 12046  cpu 3  group_fd -1  flags 0x8 = 8
mmap size 528384B
perf event ring buffer mmapped per cpu
Synthesizing TSC conversion information
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.014 MB perf.data ]
[root@seventh ~]#

[root@seventh ~]# perf evlist -v
dummy: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
[root@seventh ~]#

There is work ongoing in dumping one file per cpu and then, at post
processing time merging all those files to get ordering, so one more
file, for these VIP events, that require per-event processing would be
ordered at that time with all the other per-cpu files.
- Arnaldo
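In code, the attr for such a side-band thread could look roughly like the sketch below. The helper name sideband_attr_init() is made up for illustration. The bit that would request the proposed PERF_RECORD_BPF_EVENT is deliberately left out because it is not part of the UAPI headers assumed here, and waking up via a one byte watermark is an assumption about how one would get "a single event as the watermark".

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/* Hypothetical helper: fill an attr for a software "dummy" event that is
 * meant to carry only side-band records for a dedicated reader thread. */
void sideband_attr_init(struct perf_event_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;	/* shows up as config 0x9 */

	/* Ask for TID and TIME to be appended to non-sample records too,
	 * so they can be ordered against the per-cpu data later. */
	attr->sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME;
	attr->sample_id_all = 1;

	/* mmap, comm, task, mmap2 and comm_exec stay 0: those records
	 * belong to the main events, not to this side-band one. */

	/* Assumption: wake the reader as soon as any byte lands in the
	 * ring, instead of waiting for the default watermark. */
	attr->watermark = 1;
	attr->wakeup_watermark = 1;
}
```

The filled attr would then be passed to perf_event_open() per cpu, with the resulting fds polled by the extra thread.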
Re: [PATCH bpf-next] bpf: emit audit messages upon successful prog load and unload
On Fri, Oct 05, 2018 at 11:44:35AM -0700, Alexei Starovoitov wrote:
> On Fri, Oct 05, 2018 at 08:14:09AM +0200, Jiri Olsa wrote:
> > On Thu, Oct 04, 2018 at 03:10:15PM -0700, Alexei Starovoitov wrote:
> > > On Thu, Oct 04, 2018 at 10:22:31PM +0200, Jesper Dangaard Brouer wrote:
> > > > My use-case is to 24/7 collect and keep records in userspace, and have a
> > > > timeline of these notifications, for later retrieval. The idea is that
> > > > our support engineers can look at these records when troubleshooting
> > > > the system. And the plan is also to collect these records as part of
> > > > our sosreport tool, which is part of the support case.

> > > I don't think you're implying that prog load/unload should be spamming dmesg
> > > and auditd not even running...

> > I think the problem Jesper implied is that in order to collect
> > those logs you'll need perf tool running all the time.. which
> > it's not equipped for yet

> I'm not proposing to run 'perf' binary all the time.

I think Jiri just said that one would have to run something all the time
to get all the records, see below.

> Setting up perf ring buffer just for these new bpf prog load/unload events
> and epolling it is simple enough to do from any application including auditd.
> selftests/bpf/ do it for bpf output events.

I think he is talking about the preexisting loaded BPF programs.

We have the same problem with mmaps, where the perf tool will, with
races, enumerate the existing mmaps as PERF_RECORD_MMAP synthesized from
/proc/PIDS/smaps.

There was talk in the past to ask the kernel to emit PERF_RECORD_MMAP
into the ring buffer for those pre-existing entries, reducing a bit the
races, but as there doesn't seem to be a good way of doing it, we
continued with the synthesizing from procfs.

Is there a way for us to synthesize those prog load/unload for
preexisting loaded bpf objects?

- Arnaldo
Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload
Em Thu, Sep 20, 2018 at 08:14:46PM -0700, Alexei Starovoitov escreveu: > On Thu, Sep 20, 2018 at 03:56:51PM +0200, Peter Zijlstra wrote: > > On Thu, Sep 20, 2018 at 10:25:45AM -0300, Arnaldo Carvalho de Melo wrote: > > > PeterZ provided a patch introducing PERF_RECORD_MUNMAP, went nowhere due > > > to having to cope with munmapping parts of existing mmaps, etc. > > > > > > I'm still more in favour of introduce PERF_RECORD_MUNMAP, even if for > > > now it would be used just in this clean case for undoing a > > > PERF_RECORD_MMAP for a BPF program. > > > > > > The ABI is already complicated, starting to use something called > > > PERF_RECORD_MMAP for unmmaping by just using a NULL name... too clever, > > > I think. > > > > Agreed, the PERF_RECORD_MUNMAP patch was fairly trivial, the difficult > > part was getting the perf tool to dtrt for that use-case. But if we need > > unmap events, doing the unmap record now is the right thing. > > Thanks for the pointers! > The answer is a bit long. pls bear with me. Ditto with me :-) > I have considered adding MUNMAP to match existing MMAP, but went > without it because I didn't want to introduce new bit in perf_event_attr > and emit these new events in a misbalanced conditional way for prog > load/unload. > Like old perf is asking kernel for mmap events via mmap bit, so prog load > events By prog load events you mean that old perf, having perf_event_attr.mmap = 1 || perf_event_attr.mmap2 = 1 will cause the new kernel to emit PERF_RECORD_MMAP records for the range of addresses that a BPF program is being loaded on, right? > will be in perf.data, but old perf report won't recognize them anyway. Why not? It should lookup the symbol and find it in the rb_tree of maps, with a DSO name equal to what was in the PERF_RECORD_MMAP emitted by the BPF core, no? It'll be an unresolved symbol, but a resolved map. > Whereas new perf would certainly want to catch bpf events and will set > both mmap and mumap bits. 
new perf with your code will find a symbol, not a map, because your code catches a special case PERF_RECORD_MMAP and instead of creating a 'struct map' will create a 'struct symbol' and insert it in the kallsyms 'struct map', right? > Then if I add MUNMAP event without new bit and emit MMAP/MUNMAP > conditionally based on single mmap bit they will confuse old perf > and it will print warning about 'unknown events'. That is unfortunate and I'll turn that part into a pr_debug() > Both situations are ugly, hence I went with reuse of MMAP event > for both load/unload. So, its doubly odd, i.e. MMAP used for mmap() and for munmap() and the effects in the tooling is not to create or remove a 'struct map', but to alter an existing symbol table for the kallsyms map. > In such case old perf silently ignores them. Which is what I wanted. In theory the old perf should catch the PERF_RECORD_MMAP with a string in the filename part and insert a new map into the kernel mmap rb_tree, and then samples would be resolved to this map, but since there is no backing DSO with a symtab, it would stop at that, just stating that the map is called NAME-OF-BPF-PROGRAM. This is all from memory, possibly there is something in there that makes it ignore this PERF_RECORD_MMAP emitted by the BPF kernel code when loading a new program. > When we upgrade the kernel we cannot synchronize the kernel upgrade > (or downgrade) with user space perf package upgrade. sure > Hence not confusing old perf is important. Thanks for trying to achieve that, and its a pity that that "unknown record" message is a pr_warning or pr_info and not a pr_debug(). > With new kernel new bpf mmap events get into perf.data and > new perf picks them up. > > Few more considerations: > > I consider synthetic perf events to be non-ABI. Meaning they're > emitted by perf user space into perf.data and there is a convention > on names, but it's not a kernel abi. 
Like RECORD_MMAP with > event.filename == "[module_name]" is an indication for perf report > to parse elf/build-id of dso==module_name. > There is no such support in the kernel. Kernel doesn't emit > such events for module load/unload. If in the future > we decide to extend kernel with such events they don't have > to match what user space perf does today. Right, that is another unfortunate state of affairs, kernel module load/unload should already be supported, reported by the kernel via a proper PERF_RECORD_MODULE_LOAD/UNLOAD > Why this is important? To get to next step. > As Arnaldo pointed out this patch set is missing support for > JITed prog annotations and displaying asm code. Absolutely correct. > This set only helps perf to reveal the names of bpf progs that _were_ > running at the tim
Re: [PATCH perf 3/3] tools/perf: recognize and process RECORD_MMAP events for bpf progs
Em Thu, Sep 20, 2018 at 10:36:17AM -0300, Arnaldo Carvalho de Melo escreveu: > Em Wed, Sep 19, 2018 at 03:39:35PM -0700, Alexei Starovoitov escreveu: > > Recognize JITed bpf prog load/unload events. > > Add/remove kernel symbols accordingly. > > > > Signed-off-by: Alexei Starovoitov > > --- > > tools/perf/util/machine.c | 27 +++ > > tools/perf/util/symbol.c | 13 + > > tools/perf/util/symbol.h | 1 + > > 3 files changed, 41 insertions(+) > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > index c4acd2001db0..ae4f8a0fdc7e 100644 > > --- a/tools/perf/util/machine.c > > +++ b/tools/perf/util/machine.c > > @@ -25,6 +25,7 @@ > > #include "sane_ctype.h" > > #include > > #include > > +#include > > > > static void __machine__remove_thread(struct machine *machine, struct > > thread *th, bool lock); > > > > @@ -1460,6 +1461,32 @@ static int machine__process_kernel_mmap_event(struct > > machine *machine, > > enum dso_kernel_type kernel_type; > > bool is_kernel_mmap; > > > > + /* process JITed bpf programs load/unload events */ > > + if (event->mmap.pid == ~0u && event->mmap.tid == BPF_FS_MAGIC) { > > > So, this would be in machine__process_kernel_munmap-event(machine), etc, > no check for BPF_FS_MAGIC would be needed with a PERF_RECORD_MUNMAP. > > > + struct symbol *sym; > > + u64 ip; > > + > > + map = map_groups__find(>kmaps, event->mmap.start); > > + if (!map) { > > + pr_err("No kernel map for IP %lx\n", event->mmap.start); > > + goto out_problem; > > + } > > + ip = event->mmap.start - map->start + map->pgoff; > > + if (event->mmap.filename[0]) { > > + sym = symbol__new(ip, event->mmap.len, 0, 0, > > + event->mmap.filename); > > Humm, so the bpf program would be just one symbol... bpf-to-bpf calls > will be to a different bpf program, right? 
> > /me goes to read https://lwn.net/Articles/741773/ > "[PATCH bpf-next 00/13] bpf: introduce function calls" After reading it, yeah, I think we need some way to access a symbol table for a BPF program, and also its binary so that we can do annotation, static (perf annotate) and live (perf top), was this already considered? I think one can get the binary for a program giving sufficient perms somehow, right? One other thing I need to catch up 8-) - Arnaldo > > + dso__insert_symbol(map->dso, sym); > > + } else { > > + if (symbols__erase(>dso->symbols, ip)) { > > + pr_err("No bpf prog at IP %lx/%lx\n", > > + event->mmap.start, ip); > > + goto out_problem; > > + } > > + dso__reset_find_symbol_cache(map->dso); > > + } > > + return 0; > > + } > > + > > /* If we have maps from kcore then we do not need or want any others */ > > if (machine__uses_kcore(machine)) > > return 0; > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c > > index d188b7588152..0653f313661d 100644 > > --- a/tools/perf/util/symbol.c > > +++ b/tools/perf/util/symbol.c > > @@ -353,6 +353,19 @@ static struct symbol *symbols__find(struct rb_root > > *symbols, u64 ip) > > return NULL; > > } > > > > +int symbols__erase(struct rb_root *symbols, u64 ip) > > +{ > > + struct symbol *s; > > + > > + s = symbols__find(symbols, ip); > > + if (!s) > > + return -ENOENT; > > + > > + rb_erase(>rb_node, symbols); > > + symbol__delete(s); > > + return 0; > > +} > > + > > static struct symbol *symbols__first(struct rb_root *symbols) > > { > > struct rb_node *n = rb_first(symbols); > > diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h > > index f25fae4b5743..92ef31953d9a 100644 > > --- a/tools/perf/util/symbol.h > > +++ b/tools/perf/util/symbol.h > > @@ -310,6 +310,7 @@ char *dso__demangle_sym(struct dso *dso, int kmodule, > > const char *elf_name); > > > > void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool > > kernel); > > void symbols__insert(struct rb_root *symbols, 
struct symbol *sym); > > +int symbols__erase(struct rb_root *symbols, u64 ip); > > void symbols__fixup_duplicate(struct rb_root *symbols); > > void symbols__fixup_end(struct rb_root *symbols); > > void map_groups__fixup_end(struct map_groups *mg); > > -- > > 2.17.1
Re: [PATCH perf 3/3] tools/perf: recognize and process RECORD_MMAP events for bpf progs
On Wed, Sep 19, 2018 at 03:39:35PM -0700, Alexei Starovoitov wrote:
> Recognize JITed bpf prog load/unload events.
> Add/remove kernel symbols accordingly.
>
> Signed-off-by: Alexei Starovoitov
> ---
>  tools/perf/util/machine.c | 27 +++
>  tools/perf/util/symbol.c  | 13 +
>  tools/perf/util/symbol.h  | 1 +
>  3 files changed, 41 insertions(+)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index c4acd2001db0..ae4f8a0fdc7e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -25,6 +25,7 @@
>  #include "sane_ctype.h"
>  #include
>  #include
> +#include
>
>  static void __machine__remove_thread(struct machine *machine, struct thread *th, bool lock);
>
> @@ -1460,6 +1461,32 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
>  	enum dso_kernel_type kernel_type;
>  	bool is_kernel_mmap;
>
> +	/* process JITed bpf programs load/unload events */
> +	if (event->mmap.pid == ~0u && event->mmap.tid == BPF_FS_MAGIC) {

So, this would be in machine__process_kernel_munmap_event(machine), etc,
no check for BPF_FS_MAGIC would be needed with a PERF_RECORD_MUNMAP.

> +		struct symbol *sym;
> +		u64 ip;
> +
> +		map = map_groups__find(&machine->kmaps, event->mmap.start);
> +		if (!map) {
> +			pr_err("No kernel map for IP %lx\n", event->mmap.start);
> +			goto out_problem;
> +		}
> +		ip = event->mmap.start - map->start + map->pgoff;
> +		if (event->mmap.filename[0]) {
> +			sym = symbol__new(ip, event->mmap.len, 0, 0,
> +					  event->mmap.filename);

Humm, so the bpf program would be just one symbol... bpf-to-bpf calls
will be to a different bpf program, right?

/me goes to read https://lwn.net/Articles/741773/
"[PATCH bpf-next 00/13] bpf: introduce function calls"

> +			dso__insert_symbol(map->dso, sym);
> +		} else {
> +			if (symbols__erase(&map->dso->symbols, ip)) {
> +				pr_err("No bpf prog at IP %lx/%lx\n",
> +				       event->mmap.start, ip);
> +				goto out_problem;
> +			}
> +			dso__reset_find_symbol_cache(map->dso);
> +		}
> +		return 0;
> +	}
> +
>  	/* If we have maps from kcore then we do not need or want any others */
>  	if (machine__uses_kcore(machine))
>  		return 0;
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index d188b7588152..0653f313661d 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -353,6 +353,19 @@ static struct symbol *symbols__find(struct rb_root *symbols, u64 ip)
>  	return NULL;
>  }
>
> +int symbols__erase(struct rb_root *symbols, u64 ip)
> +{
> +	struct symbol *s;
> +
> +	s = symbols__find(symbols, ip);
> +	if (!s)
> +		return -ENOENT;
> +
> +	rb_erase(&s->rb_node, symbols);
> +	symbol__delete(s);
> +	return 0;
> +}
> +
>  static struct symbol *symbols__first(struct rb_root *symbols)
>  {
>  	struct rb_node *n = rb_first(symbols);
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index f25fae4b5743..92ef31953d9a 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -310,6 +310,7 @@ char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
>
>  void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
>  void symbols__insert(struct rb_root *symbols, struct symbol *sym);
> +int symbols__erase(struct rb_root *symbols, u64 ip);
>  void symbols__fixup_duplicate(struct rb_root *symbols);
>  void symbols__fixup_end(struct rb_root *symbols);
>  void map_groups__fixup_end(struct map_groups *mg);
> --
> 2.17.1
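Stripped of the perf plumbing, the bookkeeping in the patch above boils down to "a non-empty name inserts a symbol covering [start, start + len), an empty name erases the symbol at that address". A minimal standalone sketch of that convention follows, assuming a hypothetical fixed-size struct symtab in place of the rb-tree used by perf. None of these names exist in tools/perf.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_SYMS 16 /* hypothetical capacity, perf uses an rb-tree instead */

struct sym {
	uint64_t start, len;
	char name[32];
};

struct symtab {
	struct sym syms[MAX_SYMS];
	int nr;
};

/* Return the index of the symbol covering ip, or -1 if there is none. */
int symtab_find(const struct symtab *t, uint64_t ip)
{
	for (int i = 0; i < t->nr; i++)
		if (ip >= t->syms[i].start &&
		    ip - t->syms[i].start < t->syms[i].len)
			return i;
	return -1;
}

/* Same convention as the patch: name[0] != 0 means "prog loaded",
 * an empty name means "prog unloaded". */
int symtab_process(struct symtab *t, uint64_t start, uint64_t len,
		   const char *name)
{
	int i;

	assert(t != NULL && name != NULL);

	if (name[0]) {
		struct sym *s;

		if (t->nr == MAX_SYMS)
			return -ENOMEM;
		s = &t->syms[t->nr++];
		s->start = start;
		s->len = len;
		strncpy(s->name, name, sizeof(s->name) - 1);
		s->name[sizeof(s->name) - 1] = '\0';
		return 0;
	}

	i = symtab_find(t, start);
	if (i < 0)
		return -ENOENT;
	/* Order does not matter in this sketch: move the last entry in. */
	t->syms[i] = t->syms[--t->nr];
	return 0;
}
```

With a dedicated unmap record the name[0] test would go away and the two branches would simply become two handlers.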
Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload
On Thu, Sep 20, 2018 at 10:44:24AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 19, 2018 at 03:39:34PM -0700, Alexei Starovoitov wrote:
> >  void bpf_prog_kallsyms_del(struct bpf_prog *fp)
> >  {
> > +	unsigned long symbol_start, symbol_end;
> > +	/* mmap_record.filename cannot be NULL and has to be u64 aligned */
> > +	char buf[sizeof(u64)] = {};
> > +
> >  	if (!bpf_prog_kallsyms_candidate(fp))
> >  		return;
> >
> >  	spin_lock_bh(&bpf_lock);
> >  	bpf_prog_ksym_node_del(fp->aux);
> >  	spin_unlock_bh(&bpf_lock);
> > +	bpf_get_prog_addr_region(fp, &symbol_start, &symbol_end);
> > +	perf_event_mmap_bpf_prog(symbol_start, symbol_end - symbol_start,
> > +				 buf, sizeof(buf));
> >  }
>
> So perf doesn't normally issue unmap events.. We've talked about doing
> that, but so far it's never really been needed I think.
> It feels a bit weird to start issuing unmap events for this.

For reference, this surfaced here:

  https://lkml.org/lkml/2017/1/27/452

The start of that thread, which involves PostgreSQL, JIT, LLVM and perf, is here:

  https://lkml.org/lkml/2016/12/10/1

PeterZ provided a patch introducing PERF_RECORD_MUNMAP, but it went nowhere because it had to cope with munmapping parts of existing mmaps, etc.

I'm still more in favour of introducing PERF_RECORD_MUNMAP, even if for now it would be used only in this clean case of undoing a PERF_RECORD_MMAP for a BPF program.

The ABI is already complicated; starting to use something called PERF_RECORD_MMAP for unmapping, just by using a NULL name, is too clever, I think.

- Arnaldo
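To make the two options concrete, here is a rough sketch of how a consumer would tell a load from an unload under each scheme. Everything in it is hypothetical: enum toy_record_type, struct toy_record and both helpers are illustration only, and TOY_RECORD_MUNMAP does not correspond to any record type in the current perf ABI.

```c
#include <stdbool.h>

/* Hypothetical record types; TOY_RECORD_MUNMAP is not part of the perf ABI. */
enum toy_record_type {
	TOY_RECORD_MMAP,
	TOY_RECORD_MUNMAP,
};

/* Hypothetical, heavily trimmed view of an mmap-style record. */
struct toy_record {
	enum toy_record_type type;
	const char *filename;	/* never NULL, may be the empty string */
};

/* Overloaded scheme: an MMAP record with an empty name means "unload". */
static bool toy_is_unload_overloaded(const struct toy_record *r)
{
	return r->type == TOY_RECORD_MMAP && r->filename[0] == '\0';
}

/* Explicit scheme: the record type alone carries the meaning. */
static bool toy_is_unload_explicit(const struct toy_record *r)
{
	return r->type == TOY_RECORD_MUNMAP;
}
```

With the explicit type, a tool that knows nothing about BPF still sees an unmap for what it is, while with the overloaded scheme every consumer has to learn the empty-name convention.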
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 14, 2018 at 09:56:43PM -0700, Yonghong Song wrote:
> I really want to get rid of this option as well. To make pahole work
> with the default format, I need to add bpf support to
> libdwfl in elfutils repo. I will work on that.

Right, I haven't looked into this in detail, but perhaps we can do as we do in tools/perf/, where we add a feature test to check whether some function is present in a library (elfutils, even) and, if so, use it; otherwise we use a copy that we carry in pahole.git.

For instance, in tools/perf/util/symbol-elf.c:

  #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
  static int elf_getphdrnum(Elf *elf, size_t *dst)
  {
  	GElf_Ehdr gehdr;
  	GElf_Ehdr *ehdr;

  	ehdr = gelf_getehdr(elf, &gehdr);
  	if (!ehdr)
  		return -1;

  	*dst = ehdr->e_phnum;

  	return 0;
  }
  #endif

And we have a feature test to check whether that is present, a simple one: if it builds and links, we have it. The tools build Makefile magic then ends up defining HAVE_ELF_GETPHDRNUM_SUPPORT and our copy doesn't get included, using what is in elfutils:

  [acme@jouet perf]$ cat tools/build/feature/test-libelf-getphdrnum.c
  // SPDX-License-Identifier: GPL-2.0
  #include <libelf.h>

  int main(void)
  {
  	size_t dst;

  	return elf_getphdrnum(0, &dst);
  }
  [acme@jouet perf]$

  [acme@jouet perf]$ grep elf /tmp/build/perf/FEATURE-DUMP
  feature-libelf=1
  feature-libelf-getphdrnum=1
  feature-libelf-gelf_getnote=1
  feature-libelf-getshdrstrndx=1
  feature-libelf-mmap=1
  [acme@jouet perf]$

This way a new pahole version won't have to wait until the places where it gets built have these new functions, and we stop using our copy as soon as the library gets it.

- Arnaldo
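The same fallback pattern, reduced to something that builds without libelf, could look like the sketch below. The macro HAVE_TOY_BOUNDED_LEN_SUPPORT and the function toy_bounded_len() are made up for the example; the assumption is that a build-time feature test defines the macro when the library already ships the function, and that the local copy is compiled only otherwise.

```c
#include <stddef.h>

/*
 * HAVE_TOY_BOUNDED_LEN_SUPPORT is a hypothetical macro: a feature test in
 * the build system would define it when the library provides the function.
 */
#ifndef HAVE_TOY_BOUNDED_LEN_SUPPORT
/* Local fallback copy: length of s, looking at no more than max bytes. */
static size_t toy_bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;

	return n;
}
#endif
```

Callers use toy_bounded_len() unconditionally; whether it resolves to the library version or to the carried copy is decided entirely at build time.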
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 14, 2018 at 02:47:59PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jun 14, 2018 at 10:21:30AM -0700, Alexei Starovoitov wrote:
> > On 6/14/18 10:18 AM, Arnaldo Carvalho de Melo wrote:
> > > Just out of curiosity, is there any plan to have this as a clang option?
> > I think
> > clang ... -mllvm -mattr=dwarfris
> > should work.
> The message "(LLVM option parsing)" implies what you suggest, but it didn't
> work :-\
>   -mllvm   Additional arguments to forward to LLVM's option processing
> Almost there tho :-\

So I thought that this -mattr=dwarfris would be available only after I set the target, because I tried 'llc -mattr=help' and dwarfris wasn't there:

  [acme@jouet perf]$ llc -mattr=help |& grep dwarf
  [acme@jouet perf]$

Only after I set the arch does it appear:

  [acme@jouet perf]$ llc -march=bpf -mattr=help |& grep dwarf
    dwarfris - Disable MCAsmInfo DwarfUsesRelocationsAcrossSections.
    dwarfris - Disable MCAsmInfo DwarfUsesRelocationsAcrossSections.
    dwarfris - Disable MCAsmInfo DwarfUsesRelocationsAcrossSections.
  [acme@jouet perf]$

But even after moving the '-mllvm -mattr=dwarfris' to after '-target bpf' it still can't grok it :-\

  /usr/local/bin/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100 -g -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc5/build -c /home/acme/bpf/hello.c -target bpf -mllvm -mattr=dwarfris -O2 -o hello.o

So only with 'clang ... -target bpf -emit-llvm -O2 -o - | llc -march=bpf -mattr=dwarfris ...' do things work as we expect.

- Arnaldo
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 14, 2018 at 10:21:30AM -0700, Alexei Starovoitov wrote:
> On 6/14/18 10:18 AM, Arnaldo Carvalho de Melo wrote:
> > Just out of curiosity, is there any plan to have this as a clang option?
> I think
> clang ... -mllvm -mattr=dwarfris
> should work.

  [root@jouet bpf]# cat ~/.perfconfig
  [llvm]
  	dump-obj = true
  	clang-opt = -g -mllvm -mattr=dwarfris
  [root@jouet bpf]# trace -e openat,hello.c touch /tmp/kafai
  clang (LLVM option parsing): Unknown command line argument '-mattr=dwarfris'. Try: 'clang (LLVM option parsing) -help'
  clang (LLVM option parsing): Did you mean '-mxgot=dwarfris'?
  ERROR: unable to compile hello.c
  Hint: Check error message shown above.
  Hint: You can also pre-compile it into .o using:
  	clang -target bpf -O2 -c hello.c
  	with proper -I and -D options.
  event syntax error: 'hello.c'
                       \___ Failed to load hello.c from source: Error when compiling BPF scriptlet

  (add -v to see detail)
  [root@jouet bpf]#

  [root@jouet bpf]# trace -e openat,hello.c touch /tmp/kafai |& grep clang
  clang (LLVM option parsing): Unknown command line argument '-mattr=dwarfris'. Try: 'clang (LLVM option parsing) -help'
  clang (LLVM option parsing): Did you mean '-mxgot=dwarfris'?
  	clang -target bpf -O2 -c hello.c
  [root@jouet bpf]# trace -v -e openat,hello.c touch /tmp/kafai |& grep clang
  set env: CLANG_EXEC=/usr/local/bin/clang
  llvm compiling command : /usr/local/bin/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100 -g -mllvm -mattr=dwarfris -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc5/build -c /home/acme/bpf/hello.c -target bpf -O2 -o -
  clang (LLVM option parsing): Unknown command line argument '-mattr=dwarfris'. Try: 'clang (LLVM option parsing) -help'
  clang (LLVM option parsing): Did you mean '-mxgot=dwarfris'?
  	clang -target bpf -O2 -c hello.c
  [root@jouet bpf]#

The message "(LLVM option parsing)" implies what you suggest, but it didn't work :-\

  -mllvm   Additional arguments to forward to LLVM's option processing

Almost there tho :-\

- Arnaldo
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 14, 2018 at 10:21:30AM -0700, Alexei Starovoitov wrote:
> On 6/14/18 10:18 AM, Arnaldo Carvalho de Melo wrote:
> > Just out of curiosity, is there any plan to have this as a clang option?
>
> I think
> clang ... -mllvm -mattr=dwarfris

Thanks, trying...

- Arnaldo
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 14, 2018 at 09:22:27AM -0700, Martin KaFai Lau wrote:
> On Thu, Jun 14, 2018 at 12:03:34PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > 1. The tools/testing/selftests/bpf/Makefile has the CLANG_FLAGS and
> > > > > >    LLC_FLAGS needed to compile the bpf prog. It requires a new
> > > > > >    "-mattr=dwarf" llc option which was added to the future
> > > > > >    llvm 7.0.
> > [ ... ]
> >
> > I tried it, but it didn't work, see:
> >
> > [root@jouet bpf]# cat hello.c
> > #include "stdio.h"
> >
> > int syscall_enter(openat)(void *ctx)
> > {
> > 	puts("Hello, world\n");
> > 	return 0;
> > }
> > [root@jouet bpf]# trace -e openat,hello.c touch /tmp/kafai
> > clang-6.0: error: unknown argument: '-mattr=dwarf'
> "-mattr=dwarf" is currently a llc only option.
>
> tools/testing/selftests/bpf/Makefile has example on how to pipe clang to llc.
> e.g.:
> clang -g -O2 -target bpf -emit-llvm -c hello.c -o - | llc -march=bpf
> -mcpu=generic -mattr=dwarfris -filetype=obj -o hello.o

Ok, so I'll probably add a llvm.opts .perfconfig entry that, if present, will tell tools/perf/util/llvm-utils.c that the output of clang needs to be piped to llc, so that we can use llvm-specific options.

Probably, for the time being, I'll check for -g in llvm.clang-opt and, if it is there, set up the piping...

Just out of curiosity, is there any plan to have this as a clang option?

Just to finish this thing here, lemme try a slightly modified version of your command line:

  [root@jouet bpf]# clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100 -g -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc5/build -c /home/acme/bpf/hello.c -target bpf -emit-llvm -O2 -o - | llc -march=bpf -mcpu=generic -mattr=dwarfris -filetype=obj -o hello2.o
  [root@jouet bpf]#
  [root@jouet bpf]# file hello2.o
  hello2.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), with debug_info, not stripped
  [root@jouet bpf]# pahole hello2.o
  struct bpf_map_def {
  	unsigned int               type;                 /*     0     4 */
  	unsigned int               key_size;             /*     4     4 */
  	unsigned int               value_size;           /*     8     4 */
  	unsigned int               max_entries;          /*    12     4 */

  	/* size: 16, cachelines: 1, members: 4 */
  	/* last cacheline: 16 bytes */
  };
  [root@jouet bpf]#

Finally works, thanks.

- Arnaldo

> > ERROR: unable to compile hello.c
> > Hint: Check error message shown above.
> > Hint: You can also pre-compile it into .o using:
> > 	clang -target bpf -O2 -c hello.c
> > 	with proper -I and -D options.
> > event syntax error: 'hello.c'
> >                      \___ Failed to load hello.c from source: Error when compiling BPF scriptlet
> >
> > (add -v to see detail)
> > Run 'perf list' for a list of valid events
> >
> > Usage: perf trace [] []
> >    or: perf trace [] -- []
> >    or: perf trace record [] []
> >    or: perf trace record [] -- []
> >
> >     -e, --event   event/syscall selector. use 'perf list' to list available events
> > [root@jouet bpf]#
> >
> > The full command line with that is:
> >
> > [root@jouet bpf]# trace -v -e openat,hello.c touch /tmp/kafai |& grep mattr
> > set env: CLANG_OPTIONS=-g -mattr=dwarf
> > llvm compiling command : /usr/local/bin/clang -D__KERNEL__ -D__NR_CPUS__=4
> > -DLINUX_VERSION_CODE=0x41100 -g -mattr=dwarf -nostdinc -isystem
> > /usr/lib/gcc/x86_64-redhat-linux/7/include
> > -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated
> > -I/home/acme/git/linux/include -I./include
> > -I/home/acme/git/linux/arch/x86/include/uapi
> > -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi
> > -I./include/generated/uapi -include
> > /home/acme/git/linux/include/linux/kconfig.h
> > -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign
> > -working-directory /lib/modules/4.17.0-rc5/build -c /home/acme/bpf/hello.c
> > -target bpf -O2 -o -
> > clan
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Wed, Jun 13, 2018 at 04:26:38PM -0700, Martin KaFai Lau escreveu: > On Tue, Jun 12, 2018 at 05:41:26PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Tue, Jun 12, 2018 at 05:31:24PM -0300, Arnaldo Carvalho de Melo escreveu: > > > Em Thu, Jun 07, 2018 at 01:07:01PM -0700, Martin KaFai Lau escreveu: > > > > On Thu, Jun 07, 2018 at 04:30:29PM -0300, Arnaldo Carvalho de Melo > > > > wrote: > > > > > So this must be available in a newer llvm version? Which one? > > > > I should have put in the details in my last email or > > > > in the commit message, my bad. > > > > 1. The tools/testing/selftests/bpf/Makefile has the CLANG_FLAGS and > > > >LLC_FLAGS needed to compile the bpf prog. It requires a new > > > >"-mattr=dwarf" llc option which was added to the future > > > >llvm 7.0. > > > [root@jouet bpf]# pahole hello.o > > > struct clang version 5.0.1 (tags/RELEASE_501/final) { > > > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > > > (tags/RELEASE_501/final); /* 0 4 */ > > > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > > > (tags/RELEASE_501/final); /* 4 4 */ > > > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > > > (tags/RELEASE_501/final); /* 8 4 */ > > > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > > > (tags/RELEASE_501/final); /*12 4 */ > > > /* size: 16, cachelines: 1, members: 4 */ > > > /* last cacheline: 16 bytes */ > > > }; > > > [root@jouet bpf]# > > > > > > Ok, I guess I saw this case in the llvm/clang git logs, so this one was > > > generated with the older clang, will regenerate and add that > > > "-mattr=dwarf" > > > part. > > [root@jouet bpf]# pahole hello.o > > struct clang version 7.0.0 > > /* size: 16, cachelines: 1, members: 4 */ > > /* last cacheline: 16 bytes */ > > }; > That means the "-mattr=dwarf" is not effective. > Can you share your clang and llc command to create hello.o? 
I tried it, but it didn't work, see: [root@jouet bpf]# cat hello.c #include "stdio.h" int syscall_enter(openat)(void *ctx) { puts("Hello, world\n"); return 0; } [root@jouet bpf]# trace -e openat,hello.c touch /tmp/kafai clang-6.0: error: unknown argument: '-mattr=dwarf' ERROR: unable to compile hello.c Hint: Check error message shown above. Hint: You can also pre-compile it into .o using: clang -target bpf -O2 -c hello.c with proper -I and -D options. event syntax error: 'hello.c' \___ Failed to load hello.c from source: Error when compiling BPF scriptlet (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf trace [] [] or: perf trace [] -- [] or: perf trace record [] [] or: perf trace record [] -- [] -e, --eventevent/syscall selector. use 'perf list' to list available events [root@jouet bpf]# The full command line with that is: [root@jouet bpf]# trace -v -e openat,hello.c touch /tmp/kafai |& grep mattr set env: CLANG_OPTIONS=-g -mattr=dwarf llvm compiling command : /usr/local/bin/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100 -g -mattr=dwarf -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc5/build -c /home/acme/bpf/hello.c -target bpf -O2 -o - clang-6.0: error: unknown argument: '-mattr=dwarf' [root@jouet bpf]# This is with these llvm and clang trees: [root@jouet llvm]# git log --oneline -5 98c78e82f54 (HEAD -> master, origin/master, origin/HEAD) [asan] Instrument comdat globals on COFF targets 6ad988b5998 [DAGCombiner] clean up comments; NFC a735ba5b795 [X86][SSE] Support v8i16/v16i16 rotations 
1503b9f6fe8 [x86] add tests for node-level FMF; NFC 4a49826736f [x86] regenerate test checks; NFC [root@jouet llvm]# [root@jouet llvm]# cd tools/clang/ [root@jouet clang]# git log --oneline -5 8c873daccc (HEAD -> master, origin/master, origin/HEAD) [X86] Add builtins for vpermq/vpermpd instructions to enable target feature checking. a344be6ba4 [X86] Change immediate type for some builtins from char to int. dcdd53793e [CUDA] Fix emission of constant strings in sections a90c85acaf [X86] Add bu
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Tue, Jun 12, 2018 at 05:31:24PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Thu, Jun 07, 2018 at 01:07:01PM -0700, Martin KaFai Lau escreveu: > > On Thu, Jun 07, 2018 at 04:30:29PM -0300, Arnaldo Carvalho de Melo wrote: > > > So this must be available in a newer llvm version? Which one? > > > I should have put in the details in my last email or > > in the commit message, my bad. > > > 1. The tools/testing/selftests/bpf/Makefile has the CLANG_FLAGS and > >LLC_FLAGS needed to compile the bpf prog. It requires a new > >"-mattr=dwarf" llc option which was added to the future > >llvm 7.0. > [root@jouet bpf]# pahole hello.o > struct clang version 5.0.1 (tags/RELEASE_501/final) { > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > (tags/RELEASE_501/final); /* 0 4 */ > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > (tags/RELEASE_501/final); /* 4 4 */ > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > (tags/RELEASE_501/final); /* 8 4 */ > clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 > (tags/RELEASE_501/final); /*12 4 */ > > /* size: 16, cachelines: 1, members: 4 */ > /* last cacheline: 16 bytes */ > }; > [root@jouet bpf]# > > Ok, I guess I saw this case in the llvm/clang git logs, so this one was > generated with the older clang, will regenerate and add that "-mattr=dwarf" > part. 
[root@jouet bpf]# pahole hello.o struct clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78e82f54be8fb0bb5f02e3ca674fbde10ef34) { clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78 clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78e82f54be8fb0bb5f02e3ca674fbde10ef34); /* 0 4 */ clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78 clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78e82f54be8fb0bb5f02e3ca674fbde10ef34); /* 4 4 */ clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78 clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78e82f54be8fb0bb5f02e3ca674fbde10ef34); /* 8 4 */ clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78 clang version 7.0.0 (http://llvm.org/git/clang.git 8c873daccce7ee5339b9fd82c81fe02b73543b65) (http://llvm.org/git/llvm.git 98c78e82f54be8fb0bb5f02e3ca674fbde10ef34); /*12 4 */ /* size: 16, cachelines: 1, members: 4 */ /* last cacheline: 16 bytes */ }; [root@jouet bpf]# Ideas? [root@jouet bpf]# trace -e open*,hello.c clang-6.0: error: unknown argument: '-mattr=dwarf' ERROR: unable to compile hello.c Hint: Check error message shown above. Hint: You can also pre-compile it into .o using: clang -target bpf -O2 -c hello.c with proper -I and -D options. 
event syntax error: 'hello.c' \___ Failed to load hello.c from source: Error when compiling BPF scriptlet (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf trace [] [] or: perf trace [] -- [] or: perf trace record [] [] or: perf trace record [] -- [] -e, --eventevent/syscall selector. use 'perf list' to list available events [root@jouet bpf]# [root@jouet bpf]# trace -v -e open*,hello.c bpf: builtin compilation failed: -95, try external compiler Kernel build dir is set to /lib/modules/4.17.0-rc5/build set env: KBUILD_DIR=/lib/modules/4.17.0-rc5/build unset env: KBUILD_OPTS include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h set env: NR_CPUS=4 set env: LINUX_VERSION_CODE=0x41100 set env: CLANG_EXEC=/usr/local/bin/clang set env: CLANG_OPTIONS=-g -mattr=dwarf set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Thu, Jun 07, 2018 at 01:07:01PM -0700, Martin KaFai Lau escreveu: > On Thu, Jun 07, 2018 at 04:30:29PM -0300, Arnaldo Carvalho de Melo wrote: > > So this must be available in a newer llvm version? Which one? > I should have put in the details in my last email or > in the commit message, my bad. > 1. The tools/testing/selftests/bpf/Makefile has the CLANG_FLAGS and >LLC_FLAGS needed to compile the bpf prog. It requires a new >"-mattr=dwarf" llc option which was added to the future >llvm 7.0. >Hence, I have been using the llvm's master in github which >also has the llvm-objcopy. > 2. The kernel's btf part only focus on the BPF map. >Hence, the testing bpf program should have the map's key >and map's value. e.g. tools/testing/selftests/bpf/test_btf_haskv.c So, with llvm and clang HEAD I get: [root@jouet bpf]# pahole -J hello.o [root@jouet bpf]# file hello.o hello.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), with debug_info, not stripped [root@jouet bpf]# llvm-readelf -s hello.o There are 26 section headers, starting at offset 0xe30: Section Headers: [Nr] Name TypeAddress OffSize ES Flg Lk Inf Al [ 0] NULL 00 00 00 0 0 0 [ 1] .text PROGBITS 40 00 00 AX 0 0 4 [ 2] syscalls:sys_enter_openat PROGBITS 40 88 00 AX 0 0 8 [ 3] license PROGBITS c8 04 00 WA 0 0 1 [ 4] version PROGBITS cc 04 00 WA 0 0 4 [ 5] maps PROGBITS d0 10 00 WA 0 0 4 [ 6] .rodata.str1.1PROGBITS e0 0e 01 AMS 0 0 1 [ 7] .debug_strPROGBITS ee 00010e 01 MS 0 0 1 [ 8] .debug_locPROGBITS 0001fc 23 00 0 0 1 [ 9] .debug_abbrev PROGBITS 00021f e3 00 0 0 1 [10] .debug_info PROGBITS 000302 00015e 00 0 0 1 [11] .debug_ranges PROGBITS 000460 30 00 0 0 1 [12] .debug_macinfoPROGBITS 000490 01 00 0 0 1 [13] .debug_pubnames PROGBITS 000491 6e 00 0 0 1 [14] .debug_pubtypes PROGBITS 0004ff 5a 00 0 0 1 [15] .debug_frame PROGBITS 000560 28 00 0 0 8 [16] .debug_line PROGBITS 000588 6e 00 0 0 1 [17] .symtab SYMTAB 0005f8 000318 18 24 29 8 [18] .relsyscalls:sys_enter_openat REL 000910 10 10 17 2 8 [19] 
.rel.debug_info REL 000920 0001e0 10 17 10 8 [20] .rel.debug_pubnames REL 000b00 10 10 17 13 8 [21] .rel.debug_pubtypes REL 000b10 10 10 17 14 8 [22] .rel.debug_frame REL 000b20 20 10 17 15 8 [23] .rel.debug_line REL 000b40 10 10 17 16 8 [24] .strtab STRTAB 000b50 00018e 00 0 0 1 [25] .BTF PROGBITS 000cde 00014e 00 0 0 1 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings), l (large) I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown) O (extra OS processing required) o (OS specific), p (processor specific) [root@jouet bpf]# [root@jouet bpf]# pahole hello.o struct clang version 5.0.1 (tags/RELEASE_501/final) { clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 (tags/RELEASE_501/final); /* 0 4 */ clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 (tags/RELEASE_501/final); /* 4 4 */ clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 (tags/RELEASE_501/final); /* 8 4 */ clang version 5.0.1 (tags/RELEASE_501/final) clang version 5.0.1 (tags/RELEASE_501/final); /*12 4 */ /* size: 16, cachelines: 1, members: 4 */ /* last cacheline: 16 bytes */ }; [root@jouet bpf]# Ok, I guess I saw this case in the llvm/clang git logs, so this one was generated with the older clang, will regenerate and add that "-mattr=dwarf" part. - Arnaldo
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 07, 2018 at 01:07:01PM -0700, Martin KaFai Lau wrote:
> On Thu, Jun 07, 2018 at 04:30:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > So this must be available in a newer llvm version? Which one?
> I should have put in the details in my last email or
> in the commit message, my bad.
> 1. The tools/testing/selftests/bpf/Makefile has the CLANG_FLAGS and
>    LLC_FLAGS needed to compile the bpf prog. It requires a new
>    "-mattr=dwarf" llc option which was added to the future
>    llvm 7.0.
>    Hence, I have been using the llvm's master in github which
>    also has the llvm-objcopy.
> 2. The kernel's btf part only focus on the BPF map.
>    Hence, the testing bpf program should have the map's key
>    and map's value. e.g. tools/testing/selftests/bpf/test_btf_haskv.c

Thanks for the version required to test this, but where is this test_btf_haskv.c file? Which tree? net-next? Ok, just pulled torvalds/master and there it is. Gotcha.

  struct bpf_map_def SEC("maps") __bpf_stdout__ = {
  	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
  	.key_size = sizeof(int),
  	.value_size = sizeof(u32),
  	.max_entries = __NR_CPUS__,
  };

This map is in the above hello.c example, but I guess it's way too simple :-)

Ok, I'll test this at home on another machine where I have the llvm git repo.

- Arnaldo
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Thu, Jun 07, 2018 at 12:05:10PM -0700, Martin KaFai Lau escreveu: > On Thu, Jun 07, 2018 at 11:03:37AM -0300, Arnaldo Carvalho de Melo wrote: > > Em Thu, Jun 07, 2018 at 10:54:01AM -0300, Arnaldo Carvalho de Melo escreveu: > > > Em Tue, Jun 05, 2018 at 02:25:48PM -0700, Martin KaFai Lau escreveu: > > > > [ btw, the latest commit (1 commit) should be 94a11b59e592 ]. > > So, the commit log message for the pahole patch is non-existent: > > https://github.com/iamkafai/pahole/commit/94a11b59e5920908085bfc8d24c92f95c8ffceaf > > we should do better in describing what is done and how, I'm staring > > with a message you sent to the kernel part: > > -- > > This patch introduces BPF Type Format (BTF). > > BTF (BPF Type Format) is the meta data format which describes > > the data types of BPF program/map. Hence, it basically focus > > on the C programming language which the modern BPF is primary > > using. The first use case is to provide a generic pretty print > > capability for a BPF map. > I will add details in the next github respin/push. Ok, but I can do that if there is nothing else to do in the code at this stage :-) > > Now I'm going to do the step-by-step guide on testing the feature just > > introduced, and will try to convert from dwarf to BTF and back, compare > > the pahole output for types encoded in DWARF and BTF, etc. > > If you have something ressembling this already, please share. > The pahole only has the encoder part. I tested with the verbose output > from the "pahole -V -J". Loading the btf to the kernel is also tested. 
Ok, so here it goes my first stab at testing, using perf's BPF integration: [root@jouet bpf]# cat hello.c #include "stdio.h" int syscall_enter(openat)(void *ctx) { puts("Hello, world\n"); return 0; } [root@jouet bpf]# cat ~/.perfconfig [llvm] dump-obj = true [root@jouet bpf]# perf trace -e open*,hello.c touch /tmp/hello.BTF LLVM: dumping hello.o 0.017 ( ): __bpf_stdout__:Hello, world 0.019 ( 0.011 ms): touch/28147 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 0.053 ( ): __bpf_stdout__:Hello, world 0.055 ( 0.011 ms): touch/28147 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 3 0.354 ( 0.012 ms): touch/28147 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC ) = 3 0.411 ( ): __bpf_stdout__:Hello, world 0.412 ( 0.198 ms): touch/28147 openat(dfd: CWD, filename: /tmp/hello.BTF, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3 [root@jouet bpf]# [root@jouet bpf]# file hello.o hello.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), not stripped [root@jouet bpf]# pahole --btf_encode hello.o pahole: hello.o: No debugging information found [root@jouet bpf]# [root@jouet bpf]# readelf -s hello.o Symbol table '.symtab' contains 5 entries: Num:Value Size TypeBind Vis Ndx Name 0: 0 NOTYPE LOCAL DEFAULT UND 1: 0 NOTYPE GLOBAL DEFAULT7 __bpf_stdout__ 2: 0 NOTYPE GLOBAL DEFAULT5 _license 3: 0 NOTYPE GLOBAL DEFAULT6 _version 4: 0 NOTYPE GLOBAL DEFAULT3 syscall_enter_openat [root@jouet bpf]# [root@jouet bpf]# readelf -SW hello.o There are 10 section headers, starting at offset 0x1f8: Section Headers: [Nr] Name TypeAddress OffSize ES Flg Lk Inf Al [ 0] NULL 00 00 00 0 0 0 [ 1] .strtab STRTAB 000178 7f 00 0 0 1 [ 2] .text PROGBITS 40 00 00 AX 0 0 4 [ 3] syscalls:sys_enter_openat PROGBITS 40 88 00 AX 0 0 8 [ 4] .relsyscalls:sys_enter_openat REL 000168 10 10 9 3 8 [ 5] license PROGBITS c8 04 00 WA 0 0 1 [ 6] version PROGBITS cc 04 00 WA 0 0 4 [ 7] maps PROGBITS d0 10 00 WA 0 0 4 [ 8] .rodata.str1.1PROGBITS e0 0e 
01 AMS 0 0 1 [ 9] .symtab SYMTAB f0 78 18 1 1 8 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings), I (info), L (link order), O (extra OS processing required), G (group), T (TLS), C (compressed), x (unknown), o (OS specific), E (exclude), p (processor specific) [root@jouet bpf]# Humm, lemme try something, add -g to clang-opt: [root@jouet bpf]# cat ~/.perfconfig [llvm] dump-obj = true clang-o
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
On Thu, Jun 07, 2018 at 10:54:01AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Jun 05, 2018 at 02:25:48PM -0700, Martin KaFai Lau wrote:
> > [ btw, the latest commit (1 commit) should be 94a11b59e592 ].

So, the commit log message for the pahole patch is non-existent:

  https://github.com/iamkafai/pahole/commit/94a11b59e5920908085bfc8d24c92f95c8ffceaf

We should do better in describing what is done and how. I'm starting with a message you sent for the kernel part:

--
This patch introduces BPF Type Format (BTF).

BTF (BPF Type Format) is the meta data format which describes
the data types of BPF program/map. Hence, it basically focus
on the C programming language which the modern BPF is primary
using. The first use case is to provide a generic pretty print
capability for a BPF map.
--

Now I'm going to write the step-by-step guide on testing the feature just introduced, and will try to convert from DWARF to BTF and back, compare the pahole output for types encoded in DWARF and BTF, etc.

If you have something resembling this already, please share.

Thanks,

- Arnaldo
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Tue, Jun 05, 2018 at 02:25:48PM -0700, Martin KaFai Lau escreveu: > On Thu, Apr 19, 2018 at 04:40:34PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Wed, Apr 18, 2018 at 03:55:56PM -0700, Martin KaFai Lau escreveu: > > > This patch introduces BPF Type Format (BTF). > > > > > > BTF (BPF Type Format) is the meta data format which describes > > > the data types of BPF program/map. Hence, it basically focus > > > on the C programming language which the modern BPF is primary > > > using. The first use case is to provide a generic pretty print > > > capability for a BPF map. > > > > > > A modified pahole that can convert dwarf to BTF is here: > > > https://github.com/iamkafai/pahole/tree/btf > > > (Arnaldo, there is some BTF_KIND numbering changes on > > > Apr 18th, d61426c1571) > > > > Thanks for letting me know, I'm starting to look at this, > Hi Arnaldo, > > Do you have a chance to take a look and pull it? The kernel > changes will be in 4.18, so it will be handy if it is available in > the pahole repository. > > [ btw, the latest commit (1 commit) should be 94a11b59e592 ]. Yeah, the one I had before had: It also raises the number of types (and functions) limit from 0x7fff to 0x7fff. And on this last one I see that: /* Max # of type identifier */ -#define BTF_MAX_TYPE 0x7fff +#define BTF_MAX_TYPE 0x /* Max offset into the string section */ -#define BTF_MAX_NAME_OFFSET0x7fff +#define BTF_MAX_NAME_OFFSET0x So somehow (still reading) you'll be able to get more space, if we find necessary, to have more types and names, ok. Continuing... - Arnaldo > > > > - Arnaldo > > > > > Please see individual patch for details. > > > > > > v5: > > > - Remove BTF_KIND_FLOAT and BTF_KIND_FUNC which are not > > > currently used. They can be added in the future. > > > Some bpf_df_xxx() are removed together. > > > - Add comment in patch 7 to clarify that the new bpffs_map_fops > > > should not be extended further. 
> > > > > > v4: > > > - Fix warning (remove unneeded semicolon) > > > - Remove a redundant variable (nr_bytes) from btf_int_check_meta() in > > > patch 1. Caught by W=1. > > > > > > v3: > > > - Rebase to bpf-next > > > - Fix sparse warning (by adding static) > > > - Add BTF header logging: btf_verifier_log_hdr() > > > - Fix the alignment test on btf->type_off > > > - Add tests for the BTF header > > > - Lower the max BTF size to 16MB. It should be enough > > > for some time. We could raise it later if it would > > > be needed. > > > > > > v2: > > > - Use kvfree where needed in patch 1 and 2 > > > - Also consider BTF_INT_OFFSET() in the btf_int_check_meta() > > > in patch 1 > > > - Fix an incorrect goto target in map_create() during > > > the btf-error-path in patch 7 > > > - re-org some local vars to keep the rev xmas tree in btf.c > > > > > > Martin KaFai Lau (10): > > > bpf: btf: Introduce BPF Type Format (BTF) > > > bpf: btf: Validate type reference > > > bpf: btf: Check members of struct/union > > > bpf: btf: Add pretty print capability for data with BTF type info > > > bpf: btf: Add BPF_BTF_LOAD command > > > bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd > > > bpf: btf: Add pretty print support to the basic arraymap > > > bpf: btf: Sync bpf.h and btf.h to tools/ > > > bpf: btf: Add BTF support to libbpf > > > bpf: btf: Add BTF tests > > > > > > include/linux/bpf.h | 20 +- > > > include/linux/btf.h | 48 + > > > include/uapi/linux/bpf.h | 12 + > > > include/uapi/linux/btf.h | 130 ++ > > > kernel/bpf/Makefile |1 + > > > kernel/bpf/arraymap.c| 50 + > > > kernel/bpf/btf.c | 2064 > > > ++ > > > kernel/bpf/inode.c | 156 +- > > > kernel/bpf/syscall.c | 51 +- > > > tools/include/uapi/linux/bpf.h | 12 + > > > tools/include/uapi/linux/btf.h | 130 ++ > > > tools/lib/bpf/Build |2 +- > > > tools/lib/bpf/bpf.c | 92 +- > > > tools/lib/bpf/bpf.h
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Tue, Jun 05, 2018 at 02:25:48PM -0700, Martin KaFai Lau escreveu: > On Thu, Apr 19, 2018 at 04:40:34PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Wed, Apr 18, 2018 at 03:55:56PM -0700, Martin KaFai Lau escreveu: > > > This patch introduces BPF Type Format (BTF). > > > > > > BTF (BPF Type Format) is the meta data format which describes > > > the data types of BPF program/map. Hence, it basically focus > > > on the C programming language which the modern BPF is primary > > > using. The first use case is to provide a generic pretty print > > > capability for a BPF map. > > > > > > A modified pahole that can convert dwarf to BTF is here: > > > https://github.com/iamkafai/pahole/tree/btf > > > (Arnaldo, there is some BTF_KIND numbering changes on > > > Apr 18th, d61426c1571) > > > > Thanks for letting me know, I'm starting to look at this, > Hi Arnaldo, > > Do you have a chance to take a look and pull it? The kernel > changes will be in 4.18, so it will be handy if it is available in > the pahole repository. > > [ btw, the latest commit (1 commit) should be 94a11b59e592 ]. Got sidetracked, will get back to it later today. - Arnaldo
[GIT PULL 00/11] perf/core improvements and fixes
Hi Ingo, Please consider pulling, more to come as I go thru Adrian's x86 PTI series and the C++ support improvements to 'perf probe', from Holger, Best Regards, - Arnaldo Test results at the end of this message, as usual. The following changes since commit 291c161f6c65530092903fbea58eb07a62b220ba: Merge remote-tracking branch 'tip/perf/urgent' into perf/core (2018-05-15 10:30:17 -0300) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.18-20180516 for you to fetch changes up to 7a36a287de9fbb1ba906e70573d3f2315f7fd609: perf bpf: Fix NULL return handling in bpf__prepare_load() (2018-05-16 10:01:55 -0300) perf/core improvements and fixes: - Add '-e intel_pt//u' test to the 'parse-events' 'perf test' entry, to help avoiding regressions in the events parser such as one that caused a revert in v4.17-rc (Arnaldo Carvalho de Melo) - Fix NULL return handling in bpf__prepare_load() (YueHaibing) - Warn about 'perf buildid-cache --purge-all' failures (Ravi Bangoria) - Add infrastructure to help in writing eBPF C programs to be used with '-e name.c' type events in tools such as 'record' and 'trace', with headers for common constructs and an examples directory that will get populated as we add more such helpers and the 'perf bpf' branch that Jiri Olsa has been working on (Arnaldo Carvalho de Melo) - Handle uncore event aliases in small groups properly (Kan Liang) - Use the "_stest" symbol to identify the kernel map when loading kcore (Adrian Hunter) Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> Adrian Hunter (1): perf tools: Use the "_stest" symbol to identify the kernel map when loading kcore Arnaldo Carvalho de Melo (7): perf tests parse-events: Add intel_pt parse test perf llvm-utils: Add bpf include path to clang command line perf bpf: Add 'examples' directories perf bpf: Add bpf.h to be used in eBPF proggies perf bpf: Add kprobe example to catch 5s naps perf bpf: Add license(NAME) 
helper perf bpf: Add probe() helper to reduce kprobes boilerplate Kan Liang (1): perf parse-events: Handle uncore event aliases in small groups properly Ravi Bangoria (1): perf buildid-cache: Warn --purge-all failures YueHaibing (1): perf bpf: Fix NULL return handling in bpf__prepare_load() tools/perf/Makefile.config | 14 tools/perf/Makefile.perf | 8 +++ tools/perf/builtin-buildid-cache.c | 8 ++- tools/perf/examples/bpf/5sec.c | 49 ++ tools/perf/examples/bpf/empty.c| 3 + tools/perf/include/bpf/bpf.h | 13 tools/perf/tests/parse-events.c| 13 tools/perf/util/Build | 2 + tools/perf/util/bpf-loader.c | 6 +- tools/perf/util/evsel.h| 1 + tools/perf/util/llvm-utils.c | 19 -- tools/perf/util/parse-events.c | 130 - tools/perf/util/parse-events.h | 7 +- tools/perf/util/parse-events.y | 8 +-- tools/perf/util/symbol.c | 16 ++--- 15 files changed, 270 insertions(+), 27 deletions(-) create mode 100644 tools/perf/examples/bpf/5sec.c create mode 100644 tools/perf/examples/bpf/empty.c create mode 100644 tools/perf/include/bpf/bpf.h Test results: The first ones are container (docker) based builds of tools/perf with and without libelf support. Where clang is available, it is also used to build perf with/without libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang when clang and its devel libraries are installed. The objtool and samples/bpf/ builds are disabled now that I'm switching from using the sources in a local volume to fetching them from a http server to build it inside the container, to make it easier to build in a container cluster. Those will come back later. Several are cross builds, the ones with -x-ARCH and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, available and being used so far on just a few, like debian:experimental-x-{arm64,mipsel}. 
The 'perf test' one will perform a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands with a variety of command line event specifications to then intercept the sys_perf_event syscall to check that the perf_event_attr fields are set up as expected, among a variety of other unit tests. Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/ with a variety of feature sets, exercising the build with an incomplete set of features as well as with a complete one. It is planned to have it run on each of the containers mentioned above, using some container orchestration infrastructure. Ge
[PATCH 11/11] perf bpf: Fix NULL return handling in bpf__prepare_load()
From: YueHaibing <yuehaib...@huawei.com> bpf_object__open()/bpf_object__open_buffer can return error pointer or NULL, check the return values with IS_ERR_OR_NULL() in bpf__prepare_load and bpf__prepare_load_buffer Signed-off-by: YueHaibing <yuehaib...@huawei.com> Acked-by: Daniel Borkmann <dan...@iogearbox.net> Cc: Alexander Shishkin <alexander.shish...@linux.intel.com> Cc: Namhyung Kim <namhy...@kernel.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/n/tip-psf4xwc09n62al2cb9s33...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/perf/util/bpf-loader.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c index af7ad814b2c3..cee658733e2c 100644 --- a/tools/perf/util/bpf-loader.c +++ b/tools/perf/util/bpf-loader.c @@ -66,7 +66,7 @@ bpf__prepare_load_buffer(void *obj_buf, size_t obj_buf_sz, const char *name) } obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, name); - if (IS_ERR(obj)) { + if (IS_ERR_OR_NULL(obj)) { pr_debug("bpf: failed to load buffer\n"); return ERR_PTR(-EINVAL); } @@ -102,14 +102,14 @@ struct bpf_object *bpf__prepare_load(const char *filename, bool source) pr_debug("bpf: successfull builtin compilation\n"); obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, filename); - if (!IS_ERR(obj) && llvm_param.dump_obj) + if (!IS_ERR_OR_NULL(obj) && llvm_param.dump_obj) llvm__dump_obj(filename, obj_buf, obj_buf_sz); free(obj_buf); } else obj = bpf_object__open(filename); - if (IS_ERR(obj)) { + if (IS_ERR_OR_NULL(obj)) { pr_debug("bpf: failed to load %s\n", filename); return obj; } -- 2.14.3
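The difference between the two checks touched by the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the ERR_PTR()/PTR_ERR()/IS_ERR()/IS_ERR_OR_NULL() bodies and the MAX_ERRNO value below are simplified stand-ins written for this sketch, not copies of the helpers perf actually uses, and demo_open() is a hypothetical opener, not a libbpf function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: errors live in the top 4095 addresses. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for encoded errors; NULL is NOT an error here. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* True for encoded errors and for NULL. */
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

static int demo_object;

/*
 * Hypothetical opener with the three outcomes the commit message
 * describes: 0 -> valid object, 1 -> NULL, anything else -> error pointer.
 */
static void *demo_open(int mode)
{
	switch (mode) {
	case 0:
		return &demo_object;
	case 1:
		return NULL;
	default:
		return ERR_PTR(-EINVAL);
	}
}
```

With these stand-ins, IS_ERR(demo_open(1)) is false even though no object was returned, so a caller checking only IS_ERR() would go on to use a NULL pointer; IS_ERR_OR_NULL() catches both failure shapes.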
Re: [PATCH bpf] tools: bpf: fix NULL return handling in bpf__prepare_load
Em Sun, May 13, 2018 at 01:20:22AM +0200, Daniel Borkmann escreveu: > [ +Arnaldo ] > > On 05/11/2018 01:21 PM, YueHaibing wrote: > > bpf_object__open()/bpf_object__open_buffer can return error pointer or NULL, > > check the return values with IS_ERR_OR_NULL() in bpf__prepare_load and > > bpf__prepare_load_buffer > > > > Signed-off-by: YueHaibing> > --- > > tools/perf/util/bpf-loader.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > This should probably be routed via Arnaldo due to the fix in perf itself. If > there's no particular preference on which tree, we could potentially route it > as well via bpf with Acked-by from Arnaldo, but that is up to him. Arnaldo, > any preference? I'm preparing a pull req right now, and working a bit on perf's BPF support, so why not, I'll merge it, thanks, - Arnaldo > > diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c > > index af7ad81..cee6587 100644 > > --- a/tools/perf/util/bpf-loader.c > > +++ b/tools/perf/util/bpf-loader.c > > @@ -66,7 +66,7 @@ bpf__prepare_load_buffer(void *obj_buf, size_t > > obj_buf_sz, const char *name) > > } > > > > obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, name); > > - if (IS_ERR(obj)) { > > + if (IS_ERR_OR_NULL(obj)) { > > pr_debug("bpf: failed to load buffer\n"); > > return ERR_PTR(-EINVAL); > > } > > @@ -102,14 +102,14 @@ struct bpf_object *bpf__prepare_load(const char > > *filename, bool source) > > pr_debug("bpf: successfull builtin compilation\n"); > > obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, filename); > > > > - if (!IS_ERR(obj) && llvm_param.dump_obj) > > + if (!IS_ERR_OR_NULL(obj) && llvm_param.dump_obj) > > llvm__dump_obj(filename, obj_buf, obj_buf_sz); > > > > free(obj_buf); > > } else > > obj = bpf_object__open(filename); > > > > - if (IS_ERR(obj)) { > > + if (IS_ERR_OR_NULL(obj)) { > > pr_debug("bpf: failed to load %s\n", filename); > > return obj; > > } > >
Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
Em Wed, Apr 18, 2018 at 03:55:56PM -0700, Martin KaFai Lau escreveu: > This patch introduces BPF Type Format (BTF). > > BTF (BPF Type Format) is the meta data format which describes > the data types of BPF program/map. Hence, it basically focus > on the C programming language which the modern BPF is primary > using. The first use case is to provide a generic pretty print > capability for a BPF map. > > A modified pahole that can convert dwarf to BTF is here: > https://github.com/iamkafai/pahole/tree/btf > (Arnaldo, there is some BTF_KIND numbering changes on > Apr 18th, d61426c1571) Thanks for letting me know, I'm starting to look at this, - Arnaldo > Please see individual patch for details. > > v5: > - Remove BTF_KIND_FLOAT and BTF_KIND_FUNC which are not > currently used. They can be added in the future. > Some bpf_df_xxx() are removed together. > - Add comment in patch 7 to clarify that the new bpffs_map_fops > should not be extended further. > > v4: > - Fix warning (remove unneeded semicolon) > - Remove a redundant variable (nr_bytes) from btf_int_check_meta() in > patch 1. Caught by W=1. > > v3: > - Rebase to bpf-next > - Fix sparse warning (by adding static) > - Add BTF header logging: btf_verifier_log_hdr() > - Fix the alignment test on btf->type_off > - Add tests for the BTF header > - Lower the max BTF size to 16MB. It should be enough > for some time. We could raise it later if it would > be needed. 
> > v2: > - Use kvfree where needed in patch 1 and 2 > - Also consider BTF_INT_OFFSET() in the btf_int_check_meta() > in patch 1 > - Fix an incorrect goto target in map_create() during > the btf-error-path in patch 7 > - re-org some local vars to keep the rev xmas tree in btf.c > > Martin KaFai Lau (10): > bpf: btf: Introduce BPF Type Format (BTF) > bpf: btf: Validate type reference > bpf: btf: Check members of struct/union > bpf: btf: Add pretty print capability for data with BTF type info > bpf: btf: Add BPF_BTF_LOAD command > bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd > bpf: btf: Add pretty print support to the basic arraymap > bpf: btf: Sync bpf.h and btf.h to tools/ > bpf: btf: Add BTF support to libbpf > bpf: btf: Add BTF tests > > include/linux/bpf.h | 20 +- > include/linux/btf.h | 48 + > include/uapi/linux/bpf.h | 12 + > include/uapi/linux/btf.h | 130 ++ > kernel/bpf/Makefile |1 + > kernel/bpf/arraymap.c| 50 + > kernel/bpf/btf.c | 2064 > ++ > kernel/bpf/inode.c | 156 +- > kernel/bpf/syscall.c | 51 +- > tools/include/uapi/linux/bpf.h | 12 + > tools/include/uapi/linux/btf.h | 130 ++ > tools/lib/bpf/Build |2 +- > tools/lib/bpf/bpf.c | 92 +- > tools/lib/bpf/bpf.h | 16 + > tools/lib/bpf/btf.c | 374 + > tools/lib/bpf/btf.h | 22 + > tools/lib/bpf/libbpf.c | 148 +- > tools/lib/bpf/libbpf.h |3 + > tools/testing/selftests/bpf/Makefile | 26 +- > tools/testing/selftests/bpf/test_btf.c | 1669 + > tools/testing/selftests/bpf/test_btf_haskv.c | 48 + > tools/testing/selftests/bpf/test_btf_nokv.c | 43 + > 22 files changed, 5076 insertions(+), 41 deletions(-) > create mode 100644 include/linux/btf.h > create mode 100644 include/uapi/linux/btf.h > create mode 100644 kernel/bpf/btf.c > create mode 100644 tools/include/uapi/linux/btf.h > create mode 100644 tools/lib/bpf/btf.c > create mode 100644 tools/lib/bpf/btf.h > create mode 100644 tools/testing/selftests/bpf/test_btf.c > create mode 100644 tools/testing/selftests/bpf/test_btf_haskv.c > create mode 100644 
tools/testing/selftests/bpf/test_btf_nokv.c > > -- > 2.9.5
Re: [PATCH bpf-next v3 00/10] BTF: BPF Type Format
Em Mon, Apr 16, 2018 at 12:33:17PM -0700, Martin KaFai Lau escreveu: > This patch introduces BPF Type Format (BTF). > > BTF (BPF Type Format) is the meta data format which describes > the data types of BPF program/map. Hence, it basically focus > on the C programming language which the modern BPF is primary > using. The first use case is to provide a generic pretty print > capability for a BPF map. > > A modified pahole (Cc: Arnaldo) that can convert dwarf to BTF is here: > https://github.com/iamkafai/pahole/tree/btf Thanks for CCing me, no changes since when you posted the pahole patches, I gave it a quick look, seems sane, will try to merge and push a new pahole version out so that distros can pick it, at least fedora will 8-) - Arnaldo > Please see individual patch for details. > > v3: > - Rebase to bpf-next > - Fix sparse warning (by adding static) > - Add BTF header logging: btf_verifier_log_hdr() > - Fix the alignment test on btf->type_off > - Add tests for the BTF header > - Lower the max BTF size to 16MB. It should be enough > for some time. We could raise it later if it would > be needed. 
> > v2: > - Use kvfree where needed in patch 1 and 2 > - Also consider BTF_INT_OFFSET() in the btf_int_check_meta() > in patch 1 > - Fix an incorrect goto target in map_create() during > the btf-error-path in patch 7 > - re-org some local vars to keep the rev xmas tree in btf.c > > Martin KaFai Lau (10): > bpf: btf: Introduce BPF Type Format (BTF) > bpf: btf: Validate type reference > bpf: btf: Check members of struct/union > bpf: btf: Add pretty print capability for data with BTF type info > bpf: btf: Add BPF_BTF_LOAD command > bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd > bpf: btf: Add pretty print support to the basic arraymap > bpf: btf: Sync bpf.h and btf.h to tools/ > bpf: btf: Add BTF support to libbpf > bpf: btf: Add BTF tests > > include/linux/bpf.h | 20 +- > include/linux/btf.h | 48 + > include/uapi/linux/bpf.h | 12 + > include/uapi/linux/btf.h | 132 ++ > kernel/bpf/Makefile |1 + > kernel/bpf/arraymap.c| 50 + > kernel/bpf/btf.c | 2093 > ++ > kernel/bpf/inode.c | 146 +- > kernel/bpf/syscall.c | 51 +- > tools/include/uapi/linux/bpf.h | 13 + > tools/include/uapi/linux/btf.h | 132 ++ > tools/lib/bpf/Build |2 +- > tools/lib/bpf/bpf.c | 92 +- > tools/lib/bpf/bpf.h | 16 + > tools/lib/bpf/btf.c | 377 + > tools/lib/bpf/btf.h | 22 + > tools/lib/bpf/libbpf.c | 148 +- > tools/lib/bpf/libbpf.h |3 + > tools/testing/selftests/bpf/Makefile | 26 +- > tools/testing/selftests/bpf/test_btf.c | 1669 > tools/testing/selftests/bpf/test_btf_haskv.c | 48 + > tools/testing/selftests/bpf/test_btf_nokv.c | 43 + > 22 files changed, 5103 insertions(+), 41 deletions(-) > create mode 100644 include/linux/btf.h > create mode 100644 include/uapi/linux/btf.h > create mode 100644 kernel/bpf/btf.c > create mode 100644 tools/include/uapi/linux/btf.h > create mode 100644 tools/lib/bpf/btf.c > create mode 100644 tools/lib/bpf/btf.h > create mode 100644 tools/testing/selftests/bpf/test_btf.c > create mode 100644 tools/testing/selftests/bpf/test_btf_haskv.c > create mode 100644 
tools/testing/selftests/bpf/test_btf_nokv.c > > -- > 2.9.5
Re: [PATCH bpf-next 00/10] BTF: BPF Type Format
Em Fri, Mar 30, 2018 at 11:26:33AM -0700, Martin KaFai Lau escreveu: > This patch introduces BPF Type Format (BTF). > > BTF (BPF Type Format) is the meta data format which describes > the data types of BPF program/map. Hence, it basically focus > on the C programming language which the modern BPF is primary > using. The first use case is to provide a generic pretty print > capability for a BPF map. > > A modified pahole that can convert dwarf to BTF is here: > https://github.com/iamkafai/pahole/tree/btf Hey, great, I'll try to review this and if all is well, merge this, please consider CCing me in patches to pahole :-) - Arnaldo > Please see individual patch for details. > > Martin KaFai Lau (10): > bpf: btf: Introduce BPF Type Format (BTF) > bpf: btf: Validate type reference > bpf: btf: Check members of struct/union > bpf: btf: Add pretty print capability for data with BTF type info > bpf: btf: Add BPF_BTF_LOAD command > bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd > bpf: btf: Add pretty print support to the basic arraymap > bpf: btf: Sync bpf.h and btf.h to tools/ > bpf: btf: Add BTF support to libbpf > bpf: btf: Add BTF tests > > include/linux/bpf.h | 20 +- > include/linux/btf.h | 48 + > include/uapi/linux/bpf.h | 12 + > include/uapi/linux/btf.h | 132 ++ > kernel/bpf/Makefile |1 + > kernel/bpf/arraymap.c| 50 + > kernel/bpf/btf.c | 2064 > ++ > kernel/bpf/inode.c | 146 +- > kernel/bpf/syscall.c | 51 +- > tools/include/uapi/linux/bpf.h | 13 + > tools/include/uapi/linux/btf.h | 132 ++ > tools/lib/bpf/Build |2 +- > tools/lib/bpf/bpf.c | 92 +- > tools/lib/bpf/bpf.h | 16 + > tools/lib/bpf/btf.c | 377 + > tools/lib/bpf/btf.h | 22 + > tools/lib/bpf/libbpf.c | 148 +- > tools/lib/bpf/libbpf.h |3 + > tools/testing/selftests/bpf/Makefile | 25 +- > tools/testing/selftests/bpf/test_btf.c | 1539 +++ > tools/testing/selftests/bpf/test_btf_haskv.c | 48 + > tools/testing/selftests/bpf/test_btf_nokv.c | 43 + > 22 files changed, 4943 insertions(+), 41 deletions(-) > create 
mode 100644 include/linux/btf.h > create mode 100644 include/uapi/linux/btf.h > create mode 100644 kernel/bpf/btf.c > create mode 100644 tools/include/uapi/linux/btf.h > create mode 100644 tools/lib/bpf/btf.c > create mode 100644 tools/lib/bpf/btf.h > create mode 100644 tools/testing/selftests/bpf/test_btf.c > create mode 100644 tools/testing/selftests/bpf/test_btf_haskv.c > create mode 100644 tools/testing/selftests/bpf/test_btf_nokv.c > > -- > 2.9.5
Re: [PATCH bpf] tools: bpftool: fix compilation with older headers
[ +ingo, jolsa, namhyung ] Em Tue, Mar 06, 2018 at 05:28:33PM +0100, Daniel Borkmann escreveu: > [ +acme ] > > On 03/06/2018 05:00 PM, David Miller wrote: > > From: Jiri Benc> > Date: Tue, 6 Mar 2018 16:03:25 +0100 > > > >> On Tue, 6 Mar 2018 15:39:07 +0100, Daniel Borkmann wrote: > >>> Thanks for the fix, Jiri! The standard approach to resolve such header > >>> dependencies under > >>> tools/ would be to add a copy of magic.h uapi header into > >>> tools/include/uapi/linux/magic.h. > >>> > >>> Both bpftool and libbpf have tools/include/uapi/ in their include path > >>> from their > >>> Makefile, so they would pull this in automatically and it would also > >>> allow to get rid > >>> of the extra ifdef in libbpf then. Could you look into that? > >> > >> That's what I tried at first. But honestly, this is a shortcut to hell. > >> Eventually, we'd end up with most of uapi headers duplicated under > >> tools/include/uapi and hopelessly out of sync. > >> > >> The right approach would be to export uapi headers from the currently > >> built kernel somewhere (a temporary directory, perhaps) and use that to > >> build the tools. We should not have duplicated and out of sync headers > >> in the kernel tree. Just look at the git log for tools/include/uapi to > >> see what I mean by "out of sync". > > > > I understand your frustration. > > > > I'm really puzzled why doing "make headers_install" and then building > > these tools does not pick those in-kernel headers up. That's what > > really should happen. > > Arnaldo, given this came out of tools/perf originally and duplicating/syncing > headers is common practice since about 2014 in kernel git tree, do you have > some context on why the above was/is not considered? So, when tools/perf/ started we tried to use kernel code directly, things like rbtree, list.h, etc. It worked, we were sharing stuff with the kernel, all is well. Then someone changes something in one of these files and tools/ compilation broke. 
Fine for tools/ developers, we knew something like that could happen and would fix things in tools/, life goes on.

Then Linus, IIRC, tried building tools/perf/ when something like that had happened and the build broke for him. He didn't like that and we came up with this copy and check thing: we copy something into tools/include/ and add it to tools/perf/check-headers.sh. When something changes in the kernel, nothing breaks, we get notified, and we check if the change implies changes in tools/perf/, things like improving 'perf trace' to deal with new ioctl or syscall flags, new syscalls, etc.

Sometimes the copy ends up automatically updating the tools, as there are scripts that generate ioctl id->string tables, for instance, automatically from the updated headers, things like what happened in this header update:

http://git.kernel.org/acme/c/1350fb7d1b48
tools include powerpc: Grab a copy of arch/powerpc/include/uapi/asm/unistd.h

With this in place no kernel developer needs to care about what happens in tools/, and tools/ developers don't need to worry about getting in the way of kernel developers' day-to-day activities.
Then we also have: [acme@jouet perf]$ make help | grep perf perf-tar-src-pkg- Build perf-4.16.0-rc4.tar source tarball perf-targz-src-pkg - Build perf-4.16.0-rc4.tar.gz source tarball perf-tarbz2-src-pkg - Build perf-4.16.0-rc4.tar.bz2 source tarball perf-tarxz-src-pkg - Build perf-4.16.0-rc4.tar.xz source tarball [acme@jouet perf]$ Use that, get the resulting tarball, and all you need to build it anywhere should be self contained there, so the tools may use flags, defines, syscall definitions, etc, without ifdefs, and the resulting source code will build in many places, cross compiling, etc, like is done for every pull request I send to Ingo, see for instance the 53 containers where this is all built in a pull req like the one from yesterday: http://lkml.kernel.org/r/20180305142932.16921-1-a...@kernel.org But now there are more tools in tools/ and of course we can and should improve the whole process in a way that satisfies the various projects. So, with this said, I'll try and read the above thread. Ingo may add some other thoughts here, this is what came to my mind now. - Arnaldo > My current understanding is that the general preference would be on copying > the headers into tools/include/ infrastructure once there are dependencies > identified that would be missing on older/local system headers rather than > ifdef'ery of various bit and pieces in the code that need to make use of them. > Would be good to get some clarification on that in any case. > > But that said, I'd also be fine taking the three-liner as is into bpf as a > fix. > > > The kernel tree internally should be self-consistent. > > > > It's one thing for an external tool like iproute2 to duplicate stuff > > like this, but user programs inside the kernel have no excuse for > > requiring things like that just to
Re: [bpf-next V3 PATCH 1/5] bpf: Sync kernel ABI header with tooling header for bpf_common.h
On Thu, Feb 08, 2018 at 12:48:12PM +0100, Jesper Dangaard Brouer wrote:
> I recently fixed up a lot of commits that forgot to keep the tooling
> headers in sync. And then I forgot to do the same thing in commit
> cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes"). Let correct
> that before people notice ;-).
>
> Lawrence did partly fix/sync this for bpf.h in commit d6d4f60c3a09
> ("bpf: add selftest for tcpbpf").
>
> Fixes: cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes")

We don't consider it a bug to forget to update the tooling headers copy of the files, i.e. it's not a strict requirement on kernel developers to care about tools/ :-)

I, for one, like to get the warning, it's an opportunity for me to see that something changed and that I should pay attention to see if something needs to be done on the tooling side.

- Arnaldo

> Signed-off-by: Jesper Dangaard Brouer
> ---
>  tools/include/uapi/linux/bpf_common.h |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tools/include/uapi/linux/bpf_common.h b/tools/include/uapi/linux/bpf_common.h
> index 18be90725ab0..ee97668bdadb 100644
> --- a/tools/include/uapi/linux/bpf_common.h
> +++ b/tools/include/uapi/linux/bpf_common.h
> @@ -15,9 +15,10 @@
>
>  /* ld/ldx fields */
>  #define BPF_SIZE(code)  ((code) & 0x18)
> -#define         BPF_W           0x00
> -#define         BPF_H           0x08
> -#define         BPF_B           0x10
> +#define         BPF_W           0x00 /* 32-bit */
> +#define         BPF_H           0x08 /* 16-bit */
> +#define         BPF_B           0x10 /*  8-bit */
> +/* eBPF         BPF_DW          0x18    64-bit */
>  #define BPF_MODE(code)  ((code) & 0xe0)
>  #define         BPF_IMM         0x00
>  #define         BPF_ABS         0x20
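As a small illustration of what the commented size fields in the hunk above encode, here is a sketch that maps an opcode byte to an access width. The macro values are the ones quoted in the diff; bpf_size_to_bytes() is a hypothetical helper written for this example, not something that exists in bpf_common.h.

```c
#include <stdint.h>

/* Size field of a ld/ldx/st/stx opcode, values as in the quoted hunk. */
#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_W		0x00	/* 32-bit */
#define BPF_H		0x08	/* 16-bit */
#define BPF_B		0x10	/*  8-bit */
#define BPF_DW		0x18	/* 64-bit, eBPF only */

/*
 * Hypothetical helper: return the access width in bytes encoded in
 * the size bits of an opcode byte, or -1 if the bits match no case.
 */
static int bpf_size_to_bytes(uint8_t code)
{
	switch (BPF_SIZE(code)) {
	case BPF_W:
		return 4;
	case BPF_H:
		return 2;
	case BPF_B:
		return 1;
	case BPF_DW:
		return 8;
	}
	return -1;
}
```

For instance, opcode 0x79, which verifier logs print for a `*(u64 *)` load, has both size bits set and therefore decodes as BPF_DW.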
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
Em Mon, Jan 22, 2018 at 10:28:11AM -0800, Yonghong Song escreveu: > The compiler did "40: (bf) r1 = r0" and then uses "r1" for branch > comparison, the original "r0" is left with complete unknown integer value > and later used to calculate the buffer size "55: (bf) r5 = r0" > where "r5" could be negative value and the verifier rightfully > complains. > There is no easy way to fix this in verifier unless verifier starts to track > correlations between registers which is a big task. So your below workaround > is okay. The below workaround should also work: > int len = bpf_probe_read_str(filename.path, sizeof(filename.path), > filename.ptr); > if (len > 0 && len < 256) > bpf_perf_event_output(ctx, _map, BPF_F_CURRENT_CPU, > , (len & 0xff) + sizeof(filename.ptr)); > return 0; Ok, thanks for one more time doing the analysis of the optimizations emitted and suggesting something more compact, that I can confirm works: [root@jouet bpf]# perf trace -a -e open,sys_enter_open.c sleep 0.1 LLVM: dumping sys_enter_open.o 1.212 ( ): __bpf_stdout__:/usr/lib/locale/locale-archive..) 1.218 ( 0.021 ms): sleep/9872 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 2.905 ( ): __bpf_stdout__:..:.F.../usr/lib/locale/locale-archive..) 2.910 ( 0.013 ms): rm/9873 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 7.562 ( ): __bpf_stdout__:..ul/usr/lib/locale/locale-archive..) 7.564 ( 0.013 ms): mv/9874 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 11.275 ( ): __bpf_stdout__:...d/usr/lib/locale/locale-archive..) 11.278 ( 0.012 ms): sh/9875 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 11.945 ( ): __bpf_stdout__:...d/usr/lib64/gconv/gconv-modules.cache) 11.953 ( 0.018 ms): sh/9875 open(filename: /usr/lib64/gconv/gconv-modules.cache) = 3 17.906 ( ): __bpf_stdout__:..T.p.../usr/lib/locale/locale-archive..) 
17.913 ( 0.319 ms): gcc/9877 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 4
18.389 ( ): __bpf_stdout__:...l/usr/share/locale/locale.alias..)
18.394 ( 0.266 ms): gcc/9877 open(filename: /usr/share/locale/locale.alias, flags: CLOEXEC) = 4
18.777 ( ): __bpf_stdout__:@.../usr/share/locale/en_US.UTF-8/LC_MESSAGES/gcc.mo)
18.782 ( 0.318 ms): gcc/9877 open(filename: /usr/share/locale/en_US.UTF-8/LC_MESSAGES/gcc.mo, mode: IFBLK|IFIFO|ISGID|ISVTX|IRUSR|IXUSR|0xb5cc) = -1 ENOENT No such file or directory
[root@jouet bpf]# cat sys_enter_open.c
#include "bpf.h"

SEC("syscalls:sys_enter_open")
int func(void *ctx)
{
	struct {
		char *ptr;
		char path[256];
	} filename = {
		.ptr = *((char **)(ctx + 16)),
	};
	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
	if (len > 0 && len < 256)
		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, (len & 0xff) + sizeof(filename.ptr));

	return 0;
}
[root@jouet bpf]#

- Arnaldo
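Why the `len & 0xff` in the program above helps can be shown outside of BPF. The following is a rough sketch, not verifier code: output_size() is a hypothetical helper that mirrors the size expression passed to perf_event_output(). Whatever value `len` holds, the masked term lies in 0..255, a range that follows from the mask alone, without having to know which register the earlier `len > 0 && len < 256` test was performed on. Inside that already checked range the mask does not change the value.

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the size argument in the program:
 * (len & 0xff) + sizeof(filename.ptr). The mask bounds the first
 * term to 0..255 for any int, including negative values.
 */
static size_t output_size(int len, size_t ptr_size)
{
	return (size_t)(len & 0xff) + ptr_size;
}
```

So output_size(31, 8) is 39, exactly as without the mask, while a negative `len` such as -1 is clamped to 255 + 8 instead of turning into a huge or negative size.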
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
On Wed, Nov 22, 2017 at 10:42:22AM -0800, Gianluca Borello wrote:
> On Tue, Nov 21, 2017 at 2:31 PM, Alexei Starovoitov wrote:
> >
> > yeah sorry about this hack. Gianluca reported this issue as well.
> > Yonghong fixed it for bpf_probe_read only. We will extend
> > the fix to bpf_probe_read_str() and bpf_perf_event_output() asap.
> > The above workaround gets too much into llvm and verifier details
> > we should strive to make bpf program writing as easy as possible.
> >
>
> Hi Arnaldo
>
> With the help of Alexei, Daniel and Yonghong I just submitted a new
> series ("bpf: fix semantics issues with helpers receiving NULL
> arguments") that includes a fix in bpf_perf_event_output. This should
> simplify the way you write your bpf programs, so you shouldn't be
> required to write those convoluted branches anymore (there are a few
> usage examples in the commit log).
>
> In my case it made writing the code much easier, after applying it I
> haven't been surprised by the compiler output in a while, and I hope
> your experience will be improved as well.

Trying to work with this again, and I still need to trick clang into not
doing some optimizations that end up getting the resulting eBPF object
rejected by the kernel verifier:

[root@jouet bpf]# uname -a
Linux jouet 4.15.0-rc8+ #1 SMP Wed Jan 17 11:01:34 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@jouet bpf]# grep -i bpf /lib/modules/`uname -r`/build/.config
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT_ALWAYS_ON=y
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_ACT_BPF is not set
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
# CONFIG_TEST_BPF is not set
[root@jouet bpf]# cat sys_enter_open.c
#include "bpf.h"

SEC("syscalls:sys_enter_open")
int func(void *ctx)
{
	struct {
		char *ptr;
		char path[256];
	} filename = {
		/*
		 * /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format:
		 *
		 * ...
		 * field:const char * filename; offset:16; size:8; signed:0;
		 * ...
		 * ctx + 16 selects 'filename'
		 */
		.ptr = *((char **)(ctx + 16)),
	};
	int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
	if (len > 0 && len < 256)
		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
	return 0;
}
[root@jouet bpf]#
[root@jouet bpf]# perf trace -v -e open,sys_enter_open.c
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.15.0-rc8+/build
set env: KBUILD_DIR=/lib/modules/4.15.0-rc8+/build
unset env: KBUILD_OPTS
include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40f00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: WORKING_DIR=/lib/modules/4.15.0-rc8+/build
set env: CLANG_SOURCE=/home/acme/bpf/sys_enter_open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object 'sys_enter_open.c' from buffer
libbpf: section .strtab, size 101, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section syscalls:sys_enter_open, size 472, link 0, flags 6, type=1
libbpf: found program syscalls:sys_enter_open
libbpf: section .relsyscalls:sys_enter_open, size 16, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of sys_enter_open.c is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of sys_enter_open.c is 40f00
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: maps in sys_enter_open.c: 1 maps in 16 bytes
libbpf: map 0 is "__bpf_stdout__"
libbpf: collecting relocating info for: 'syscalls:sys_enter_open'
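As a side note on reading the log above: the value in `LINUX_VERSION_CODE=0x40f00`, which libbpf echoes as `kernel version of sys_enter_open.c is 40f00`, appears to be the usual `KERNEL_VERSION(a, b, c)` packing. A minimal sketch of decoding it follows; the `struct kver` type and the `decode_kernel_version()` helper are hypothetical names introduced only for this illustration, and the macro is redefined locally rather than taken from a kernel header.

```c
/* Local copy of the classic packing: major in bits 16+, minor in 8..15, patch in 0..7. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper type, not part of perf or libbpf. */
struct kver {
	unsigned major;
	unsigned minor;
	unsigned patch;
};

/* Split a packed version code back into its three components. */
static struct kver decode_kernel_version(unsigned code)
{
	struct kver v = {
		.major = code >> 16,
		.minor = (code >> 8) & 0xff,
		.patch = code & 0xff,
	};
	return v;
}
```

With this packing, `0x40f00` decodes to 4.15.0, which matches the `4.15.0-rc8+` reported by `uname -a` in the same session.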
Re: net-next libbpf broken on prev kernel release
On Thu, Dec 14, 2017 at 10:52:19AM +0100, Daniel Borkmann wrote:
> [ +acme, +ast ]
>
> On 12/14/2017 10:16 AM, Eric Leblond wrote:
> > Hello,
> >
> > It seems that the following patch did break libbpf (in net-next
> > version) which is not able to load anymore a program on a 4.14:
> >
> > tree 5096ddd73981e33a2164606461a45b56a189889c
> > parent ad5b177bd73f5107d97c36f56395c4281fb6f089
> > author Martin KaFai Lau Wed Sep 27 14:37:54 2017 -0700
> > committer David S. Miller Fri Sep 29 06:17:05 2017 +0100
> >
> > bpf: libbpf: Provide basic API support to specify BPF obj name
> >
> > The problem comes from
> >
> > -int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
> > -		     size_t insns_cnt, const char *license,
> > -		     __u32 kern_version, char *log_buf, size_t log_buf_sz)
> > +int bpf_load_program_name(enum bpf_prog_type type, const char *name,
> > +			  const struct bpf_insn *insns,
> > +			  size_t insns_cnt, const char *license,
> > +			  __u32 kern_version, char *log_buf,
> > +			  size_t log_buf_sz)
> >  {
> >  	int fd;
> >  	union bpf_attr attr;
> > +	__u32 name_len = name ? strlen(name) : 0;
> >
> >  	bzero(&attr, sizeof(attr));
> >  	attr.prog_type = type;
> > @@ -130,6 +151,7 @@ int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
> >  	attr.log_size = 0;
> >  	attr.log_level = 0;
> >  	attr.kern_version = kern_version;
> > +	memcpy(attr.prog_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
> >
> > If I comment the memcpy then the eBPF program is loading correctly.
> >
> > Is this a wanted behavior to have libbpf that needs to be in sync with
> > kernel ? or should it be fixed ?
>
> Yeah, this was reported recently here: https://lkml.org/lkml/2017/11/28/1246
>
> I agree that given the policy of perf tool is to try to use new features
> but if they fail on older kernels, then we should try to fallback whenever
> that is feasible. I think for this specific case, we should in fact fallback
> and try w/o map/prog name in order to fix this regression for perf (or
> other lib users).
>
> Also agree that this cannot be done for every possible case like the mentioned
> prog_ifindex field for offloading to NIC in the thread above, but imho
> the prog_ifindex is a slightly different situation given that a user needs
> to specifically ask to offload via some provided API.
>
> I think the fix should be: if a user *specifically* calls bpf_load_program_name()
> or bpf_create_map_name() API from the lib, then the intention is very clear
> that the bpf object should be created *with* name and otherwise fail. So a
> fallback for these APIs to load w/o name would be inappropriate! But for the
> existing code that used to load objects before, e.g. bpf_object__create_maps()
> or bpf_program__load() it should try to use either mentioned bpf_*_name() APIs
> and *iff* they fail, fall-back to the normal ones w/o name attribute. Meaning,
> this kind of fall-back should be done, but not on a sys_bpf() layer but from
> a higher PoV in the lib instead. I guess it would make sense to probe the
> underlying kernel at startup and then based on its capabilities use one out
> of the two APIs when we get there, such that we don't need to uselessly retry
> APIs for each prog load.

tools/perf/ has:

static struct {
	bool sample_id_all;
	bool exclude_guest;
	bool mmap2;
	bool cloexec;
	bool clockid;
	bool clockid_wrong;
	bool lbr_flags;
	bool write_backward;
	bool group_read;
} perf_missing_features;

When the user requests something that needs one of these features we try
using it; if that fails we mark it as missing, so other events will not
needlessly try it again. I.e. we don't probe at program start, we leave
that to when we actually need it, to avoid uselessly probing at startup.

> Arnaldo, will there be a rework of your fix that we could route to bpf tree?

I'm resuming work on it after I get my current batch tested and
submitted, will reboot with an older kernel and follow your suggestions,
that seems to match Alexei's and Martin's, my patch was just an RFC to
show that we need a fallback for older kernels.

I needed to move on, so I updated my machine to a kernel where interlock
of tools/ with the kernel happens and it worked, so I left this to see
if someone else complained or if I was being too picky. :-)

- Arnaldo
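The two ideas in this message, an explicit named-load API that never falls back and a higher-level loader that tries the new feature lazily and remembers a failure, can be sketched as below. This is only an illustration under stated assumptions, not libbpf or perf code: `fake_prog_load()` stands in for the real `sys_bpf()` call, the `-EINVAL` it returns is an arbitrary stand-in error, and `missing_features`, `load_program_name()` and `program_load()` are hypothetical names modelled on `perf_missing_features`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Features found missing on the running kernel; filled in lazily on first failure. */
static struct {
	bool prog_name;
} missing_features;

/* Fake "kernel" used only for this sketch: it rejects any named program. */
static bool fake_kernel_supports_names;
static int fake_kernel_calls;

static int fake_prog_load(const char *name)
{
	fake_kernel_calls++;
	if (name && !fake_kernel_supports_names)
		return -EINVAL;	/* arbitrary stand-in for the old kernel's error */
	return 3;		/* pretend file descriptor */
}

/* Explicit API: the caller asked for a name, so a failure is reported as-is. */
static int load_program_name(const char *name)
{
	return fake_prog_load(name);
}

/*
 * Higher-level loader: try with a name until that is known not to work,
 * then load without one. A real implementation would also have to tell
 * "name not supported" apart from other load failures before falling back.
 */
static int program_load(const char *name)
{
	if (!missing_features.prog_name) {
		int fd = load_program_name(name);

		if (fd >= 0)
			return fd;
		missing_features.prog_name = true;
	}
	return fake_prog_load(NULL);
}
```

On the fake old kernel, the first `program_load()` costs two calls (named attempt, then unnamed retry) and every later one costs a single call, while `load_program_name()` keeps failing, which is the split between the two API layers described above.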
Re: [PATCH net-next 2/2] tools: bpftool: create "uninstall", "doc-uninstall" make targets
On Thu, Dec 07, 2017 at 03:00:18PM -0800, Jakub Kicinski wrote:
> From: Quentin Monnet <quentin.mon...@netronome.com>
>
> Create two targets to remove executable and documentation that would
> have been previously installed with `make install` and `make
> doc-install`.
>
> Also create a "QUIET_UNINST" helper in tools/scripts/Makefile.include.
>
> Do not attempt to remove directories /usr/local/sbin and
> /usr/share/bash-completions/completions, even if they are empty, as
> those specific directories probably already existed on the system before
> we installed the program, and we do not wish to break other makefiles
> that might assume their existence. Do remove /usr/local/share/man/man8
> if empty however, as this directory does not seem to exist by default.
>
> Signed-off-by: Quentin Monnet <quentin.mon...@netronome.com>

Acked-by: Arnaldo Carvalho de Melo <a...@redhat.com>

> ---
> For addition to tools/scripts/Makefile.include:
>
> CC: Arnaldo Carvalho de Melo <a...@redhat.com>
> CC: Masahiro Yamada <yamada.masah...@socionext.com>
>
>  tools/bpf/bpftool/Documentation/Makefile | 8 +++-
>  tools/bpf/bpftool/Makefile | 12 ++--
>  tools/scripts/Makefile.include | 1 +
>  3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/tools/bpf/bpftool/Documentation/Makefile b/tools/bpf/bpftool/Documentation/Makefile
> index 71c17fab4f2f..c462a928e03d 100644
> --- a/tools/bpf/bpftool/Documentation/Makefile
> +++ b/tools/bpf/bpftool/Documentation/Makefile
> @@ -3,6 +3,7 @@ include ../../../scripts/utilities.mak
>
>  INSTALL ?= install
>  RM ?= rm -f
> +RMDIR ?= rmdir --ignore-fail-on-non-empty
>
>  ifeq ($(V),1)
>    Q =
> @@ -34,5 +35,10 @@ install: man
>  	$(Q)$(INSTALL) -d -m 755 $(DESTDIR)$(man8dir)
>  	$(Q)$(INSTALL) -m 644 $(DOC_MAN8) $(DESTDIR)$(man8dir)
>
> -.PHONY: man man8 clean install
> +uninstall:
> +	$(call QUIET_UNINST, Documentation-man)
> +	$(Q)$(RM) $(addprefix $(DESTDIR)$(man8dir)/,$(_DOC_MAN8))
> +	$(Q)$(RMDIR) $(DESTDIR)$(man8dir)
> +
> +.PHONY: man man8 clean install uninstall
>  .DEFAULT_GOAL := man
> diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
> index 203ae2e14fbc..3f17ad317512 100644
> --- a/tools/bpf/bpftool/Makefile
> +++ b/tools/bpf/bpftool/Makefile
> @@ -70,6 +70,11 @@ install: $(OUTPUT)bpftool
>  	$(Q)$(INSTALL) -m 0755 -d $(DESTDIR)$(bash_compdir)
>  	$(Q)$(INSTALL) -m 0644 bash-completion/bpftool $(DESTDIR)$(bash_compdir)
>
> +uninstall:
> +	$(call QUIET_UNINST, bpftool)
> +	$(Q)$(RM) $(DESTDIR)$(prefix)/sbin/bpftool
> +	$(Q)$(RM) $(DESTDIR)$(bash_compdir)/bpftool
> +
>  doc:
>  	$(call descend,Documentation)
>
> @@ -79,8 +84,11 @@ install: $(OUTPUT)bpftool
>  doc-install:
>  	$(call descend,Documentation,install)
>
> +doc-uninstall:
> +	$(call descend,Documentation,uninstall)
> +
>  FORCE:
>
> -.PHONY: all FORCE clean install
> -.PHONY: doc doc-clean doc-install
> +.PHONY: all FORCE clean install uninstall
> +.PHONY: doc doc-clean doc-install doc-uninstall
>  .DEFAULT_GOAL := all
> diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
> index 3fab179b1aba..fcb3ed0be5f8 100644
> --- a/tools/scripts/Makefile.include
> +++ b/tools/scripts/Makefile.include
> @@ -99,5 +99,6 @@ ifneq ($(silent),1)
>
>  	QUIET_CLEAN= @printf ' CLEAN%s\n' $1;
>  	QUIET_INSTALL = @printf ' INSTALL %s\n' $1;
> +	QUIET_UNINST = @printf ' UNINST %s\n' $1;
>    endif
>  endif
> --
> 2.15.1
Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"
On Thu, Nov 30, 2017 at 01:51:15PM -0800, Alexei Starovoitov wrote:
> On 11/30/17 11:00 AM, Arnaldo Carvalho de Melo wrote:
> > > Instead of sinking all future bpf_attr's backward compatibility
> > > requirements to sys_bpf, I would push it up to its own BPF_* command
> > > helper which has a better sense of its bpf_attr, i.e. push it up
> > > to bpf_create_map_node() and bpf_load_program_name() in this case.
> >
> > Humm, we could try that approach, but the one in this patch seemed good
> > enough.
> >
> > And after all, the first syscall() invocation, with the latest kernel
> > and latest tooling, will work, right?
>
> I agree with Martin and I also don't think it will work to push
> logic of all bpf commands into single sys_bpf syscall wrapper.

Sure, that was just a POC, I'll work on something that takes into
account what you guys pointed out.

> This logic will become more and more complex over time.
> Like this case really belongs in bpf_create_map() which is a wrapper
> on top of single BPF_CREATE_MAP command.
> Note it's the first time we're facing this 'new libbpf.a running on
> top of old kernel' issue and should be very careful adding such
> fallback code to the generic bpf library, since all the selftests/bpf/
> are using this lib and relying on expected behavior.

Right, tools/perf/ uses it as well and relies on its continued
functioning.

> We don't want tests that want to test the latest kernel feature all of
> a sudden pass on old kernel that doesn't have it.

Sure, neither do I :-)

> To some degree perf and selftests/bpf needs are diverging here,
> so adding #ifdef to libbpf.a to match testcase expectations may be
> necessary.

But this is not just testcase expectations, the use case is someone
wanting to use a newer tool, with perhaps some new features of interest
that don't depend on changes in the kernel, on an older kernel on a
system where updating it is not possible or desirable.

- Arnaldo
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
On Tue, Nov 14, 2017 at 02:58:24PM -0800, Yonghong Song wrote:
> On 11/14/17 12:25 PM, Daniel Borkmann wrote:
> > Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
> > but do we really want to add this hack everywhere? :( Potentially any function
> > having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
> > helper implementation and it ends up that progs start relying on this runtime
> > check where we won't be able to get rid of it later on anymore.
>
> The compiler actually does the right thing for the below code:
>
>     int ret = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
>     if (ret > 0)
>         bpf_perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU,
>                               filename, ret & (sizeof(filename) - 1));
>
> Just from the above code without consulting bpf_probe_read_str internals, it
> is totally possible that ret = 128, then
> ret & (sizeof(filename) - 1) = 0.
>
> The issue is that the verifier did not set the "ret" initial range as (-inf,
> sizeof(filename) - 1). We could have this information associated with helper
> and feed back to verifier.
>
> If we have this range, later for ret & (sizeof(filename) - 1) with ret >= 1,
> the verifier should be able to conclude
> ret & (sizeof(filename) - 1) >= 1.
>
> To work around the immediate problem, I tested the following hack
> with bcc and it works fine.
>
> BPF_PERF_OUTPUT(events);
> int trace(struct pt_regs *ctx) {
>     char filename[128];
>     int ret = bpf_probe_read_str(filename, sizeof(filename), 0);
>     if (ret > 0) {
>         if (ret == 1)
>             events.perf_submit(ctx, filename, ret);
>         else if (ret < 128)
>             events.perf_submit(ctx, filename, ret);
>     }
>     return 1;
> }
>
> The idea is to make control flow more complex to prevent llvm
> from doing certain optimizations.

So, the hack makes it work for me, using clang 6.0:

set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -

[root@jouet bpf]# perf probe -V do_sys_open
Available variables at do_sys_open
        @
                char* filename
                int dfd
                int flags
                struct open_flags op
                umode_t mode
[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, char *filename_ptr)
{
	char filename[128];
	int len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	if (len > 0) {
		if (len == 1)
			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
		else if (len < 128)
			perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
	}
	return 1;
}
[root@jouet bpf]#
[root@jouet bpf]# perf trace -e *open,open.c touch /tmp/Thanks.Yonghong.Song\!
LLVM: dumping open.o
0.000 ( 0.009 ms): touch/9034 open(filename: 0x5b678e37, flags: CLOEXEC ) ...
0.009 ( ): __bpf_stdout__:/etc/ld.so.cache)
0.011 ( ): perf_bpf_probe:prog:(8f260da0) filename=0x7f805b678e37)
0.000 ( 0.016 ms): touch/9034 ... [continued]: open()) = 3
0.034 ( 0.002 ms): touch/9034 open(filename: 0x5b87c640, flags: CLOEXEC ) ...
0.036 ( ): __bpf_stdout__:/lib64/libc.so.6)
0.037 ( ): perf_bpf_probe:prog:(8f260da0) filename=0x7f805b87c640)
0.034 ( 0.009 ms): touch/9034 ... [continued]: open()) = 3
0.251 ( 0.002 ms): touch/9034 open(filename: 0x5b422c70, flags: CLOEXEC ) ...
0.253 ( ): __bpf_stdout__:/usr/lib/locale/locale-archive..)
0.254 ( ): perf_bpf_probe:prog:(8f260da0) filename=0x7f805b422c70)
0.251 ( 0.009 ms): touch/9034 ... [continued]: open()) = 3
0.296 ( 0.002 ms): touch/9034 open(filename: 0x1d3a00f1, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) ...
0.298 ( ):
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
Em Tue, Nov 14, 2017 at 09:25:17PM +0100, Daniel Borkmann escreveu: > On 11/14/2017 07:15 PM, Yonghong Song wrote: > > On 11/14/17 6:19 AM, Daniel Borkmann wrote: > >> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote: > >>> Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu: > >>>> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote: > >>>>> Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu: > >>>>>> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote: > >>>>>>> libbpf: -- BEGIN DUMP LOG --- > >>>>>>> libbpf: > >>>>>>> 0: (79) r3 = *(u64 *)(r1 +104) > >>>>>>> 1: (b7) r2 = 0 > >>>>>>> 2: (bf) r6 = r1 > >>>>>>> 3: (bf) r1 = r10 > >>>>>>> 4: (07) r1 += -128 > >>>>>>> 5: (b7) r2 = 128 > >>>>>>> 6: (85) call bpf_probe_read_str#45 > >>>>>>> 7: (bf) r1 = r0 > >>>>>>> 8: (07) r1 += -1 > >>>>>>> 9: (67) r1 <<= 32 > >>>>>>> 10: (77) r1 >>= 32 > >>>>>>> 11: (25) if r1 > 0x7f goto pc+11 > >>>>>> > >>>>>> Right, so the compiler is optimizing the two tests into a single one > >>>>>> above, > >>>>>> which means lower bound cannot properly be derived again by the > >>>>>> verifier due > >>>>>> to this and thus you'll get the error. Similar issue was seen recently > >>>>>> [1]. > >>>>>> > >>>>>> Does the below hack work for you? 
> >>>>>> > >>>>>> int prog([...]) > >>>>>> { > >>>>>> char filename[128]; > >>>>>> int ret = bpf_probe_read_str(filename, sizeof(filename), > >>>>>> filename_ptr); > >>>>>> if (ret > 0) > >>>>>> bpf_perf_event_output(ctx, &__bpf_stdout__, > >>>>>> BPF_F_CURRENT_CPU, filename, > >>>>>> ret & (sizeof(filename) - 1)); > >>>>>> return 1; > >>>>>> } > >>>>>> > >>>>>> r0 should keep on tracking bounds here at least: > >>>>>> > >>>>>> prog: > >>>>>> 0: bf 16 00 00 00 00 00 00 r6 = r1 > >>>>>> 1: bf a1 00 00 00 00 00 00 r1 = r10 > >>>>>> 2: 07 01 00 00 80 ff ff ff r1 += -128 > >>>>>> 3: b7 02 00 00 80 00 00 00 r2 = 128 > >>>>>> 4: 85 00 00 00 2d 00 00 00 call 45 > >>>>>> 5: 67 00 00 00 20 00 00 00 r0 <<= 32 > >>>>>> 6: c7 00 00 00 20 00 00 00 r0 s>>= 32 > >>>>>> 7: b7 01 00 00 01 00 00 00 r1 = 1 > >>>>>> 8: 6d 01 0a 00 00 00 00 00 if r1 s> r0 goto 10 > >>>>>> 9: 57 00 00 00 7f 00 00 00 r0 &= 127 > >>>>>> 10: bf a4 00 00 00 00 00 00 r4 = r10 > >>>>>> 11: 07 04 00 00 80 ff ff ff r4 += -128 > >>>>>> 12: bf 61 00 00 00 00 00 00 r1 = r6 > >>>>>> 13: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = > >>>>>> 0ll > >>>>>> 15: 18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 r3 = > >>>>>> 4294967295ll > >>>>>> 17: bf 05 00 00 00 00 00 00 r5 = r0 > >>>>>> 18: 85 00 00 00 19 00 00 00 call 25 > >>>>>> > >>>>>> [1] > >>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211=DwIDaQ=5VD0RTtNlTh3ycd41b3MUw=DA8e1B5r073vIqRrFz7MRA=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY= > >>>>> > >>>>> Not yet: > >>>>> > >>>>> 6: (85) call bpf_probe_read_str#45 > >>>>> 7: (bf) r1 = r0 > >>>>> 8: (67) r1 <<= 32 > >>>>> 9: (77) r1 >>= 32 > >>>>> 10: (15)
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
On Tue, Nov 14, 2017 at 03:19:51PM +0100, Daniel Borkmann wrote:
> On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
> > On Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann wrote:
> >> On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
> >> Currently having a version compiled from the git tree:
> >> # llc --version
> >> LLVM (http://llvm.org/):
> >>   LLVM version 6.0.0git-2d810c2
> >>   Optimized build.
> >>   Default target: x86_64-unknown-linux-gnu
> >>   Host CPU: skylake
> >
> > [root@jouet bpf]# llc --version
> > LLVM (http://llvm.org/):
> >   LLVM version 4.0.0svn
> >
> > Old stuff! ;-) Will change, but improving these messages should be on
> > the radar, I think :-)
>
> Yep, agree, I think we need a generic, better solution for this type of
> issue instead of converting individual helpers to handle 0 min bound and
> then only bailing out in such case; need to brainstorm a bit on that.
>
> I think for the above in your case ...
>
> [...]
> 6: (85) call bpf_probe_read_str#45
> 7: (bf) r1 = r0
> 8: (67) r1 <<= 32
> 9: (77) r1 >>= 32
> 10: (15) if r1 == 0x0 goto pc+10
> R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
> R6=ctx(id=0,off=0,imm=0) R10=fp0
> 11: (57) r0 &= 127
> [...]
>
> ... the shifts on r1 might be due to using 32 bit type, so if you find
> a way to avoid these and have the test on r0 directly, we might get there.
> Perhaps keep using a 64 bit type to avoid them. It would be useful to
> propagate the deduced bound information back to r0 when we know that
> neither r0 nor r1 has changed in the meantime.

I changed len/ret to u64, didn't help, updating clang and llvm to see if
that helps...

Will end up working directly with eBPF bytecode, which is what I really
need in 'perf trace', but let's get this sorted out first.

- Arnaldo
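The `r0 &= 127` step in the log above caps the size at 127, but by itself it says nothing about the lower bound, which is the "0 min bound" concern raised in the reply. A small sketch of the arithmetic follows; `FILENAME_LEN`, `masked_size()` and `min_masked_size()` are hypothetical names used only here, with 128 taken from the `char filename[128]` buffer in open.c, and it assumes a helper return value of 128 is possible as far as the verifier can tell.

```c
#define FILENAME_LEN 128	/* size of char filename[128] in open.c */

/* The size argument produced by `ret & (sizeof(filename) - 1)`. */
static unsigned masked_size(int ret)
{
	return (unsigned)ret & (FILENAME_LEN - 1);
}

/* Smallest masked size over every return value in [lo, hi]. */
static unsigned min_masked_size(int lo, int hi)
{
	unsigned min = FILENAME_LEN;

	for (int ret = lo; ret <= hi; ret++)
		if (masked_size(ret) < min)
			min = masked_size(ret);
	return min;
}
```

For return values restricted to 1..127 the masked size never drops below 1, but as soon as 128 is allowed the minimum becomes 0, which is the zero-sized access the verifier complains about.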
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu: > On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote: > > Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu: > >> On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote: > >>> libbpf: -- BEGIN DUMP LOG --- > >>> libbpf: > >>> 0: (79) r3 = *(u64 *)(r1 +104) > >>> 1: (b7) r2 = 0 > >>> 2: (bf) r6 = r1 > >>> 3: (bf) r1 = r10 > >>> 4: (07) r1 += -128 > >>> 5: (b7) r2 = 128 > >>> 6: (85) call bpf_probe_read_str#45 > >>> 7: (bf) r1 = r0 > >>> 8: (07) r1 += -1 > >>> 9: (67) r1 <<= 32 > >>> 10: (77) r1 >>= 32 > >>> 11: (25) if r1 > 0x7f goto pc+11 > >> > >> Right, so the compiler is optimizing the two tests into a single one above, > >> which means lower bound cannot properly be derived again by the verifier > >> due > >> to this and thus you'll get the error. Similar issue was seen recently [1]. > >> > >> Does the below hack work for you? > >> > >> int prog([...]) > >> { > >> char filename[128]; > >> int ret = bpf_probe_read_str(filename, sizeof(filename), > >> filename_ptr); > >> if (ret > 0) > >> bpf_perf_event_output(ctx, &__bpf_stdout__, > >> BPF_F_CURRENT_CPU, filename, > >> ret & (sizeof(filename) - 1)); > >> return 1; > >> } > >> > >> r0 should keep on tracking bounds here at least: > >> > >> prog: > >>0: bf 16 00 00 00 00 00 00 r6 = r1 > >>1: bf a1 00 00 00 00 00 00 r1 = r10 > >>2: 07 01 00 00 80 ff ff ff r1 += -128 > >>3: b7 02 00 00 80 00 00 00 r2 = 128 > >>4: 85 00 00 00 2d 00 00 00 call 45 > >>5: 67 00 00 00 20 00 00 00 r0 <<= 32 > >>6: c7 00 00 00 20 00 00 00 r0 s>>= 32 > >>7: b7 01 00 00 01 00 00 00 r1 = 1 > >>8: 6d 01 0a 00 00 00 00 00 if r1 s> r0 goto 10 > >>9: 57 00 00 00 7f 00 00 00 r0 &= 127 > >> 10: bf a4 00 00 00 00 00 00 r4 = r10 > >> 11: 07 04 00 00 80 ff ff ff r4 += -128 > >> 12: bf 61 00 00 00 00 00 00 r1 = r6 > >> 13: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0ll > >> 15: 18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 r3 = > >> 4294967295ll > >> 
17: bf 05 00 00 00 00 00 00 r5 = r0 > >> 18: 85 00 00 00 19 00 00 00 call 25 > >> > >> [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211 > > > > Not yet: > > > > 6: (85) call bpf_probe_read_str#45 > > 7: (bf) r1 = r0 > > 8: (67) r1 <<= 32 > > 9: (77) r1 >>= 32 > > 10: (15) if r1 == 0x0 goto pc+10 > > R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0x)) > > R6=ctx(id=0,off=0,imm=0) R10=fp0 > > 11: (57) r0 &= 127 > > 12: (bf) r4 = r10 > > 13: (07) r4 += -128 > > 14: (bf) r1 = r6 > > 15: (18) r2 = 0x92bfc2aba840 > > 17: (18) r3 = 0x > > 19: (bf) r5 = r0 > > 20: (85) call bpf_perf_event_output#25 > > invalid stack type R4 off=-128 access_size=0 > > > > I'll try updating clang/llvm... > > > > Full details: > > > > [root@jouet bpf]# cat open.c > > #include "bpf.h" > > > > SEC("prog=do_sys_open filename") > > int prog(void *ctx, int err, const char __user *filename_ptr) > > { > > char filename[128]; > > const unsigned len = bpf_probe_read_str(filename, sizeof(filename), > > filename_ptr); > > Btw, I was using 'int' here above instead of 'unsigned' as > strncpy_from_unsafe() > could potentially return errors like -EFAULT. I changed to int, didn't help > Currently having a version compiled from the git tree: > > # llc --version > LLVM (http://llvm.org/): > LLVM version 6.0.0git-2d810c2 > Optimized build. > Default target: x86_64-unknown-linux-gnu > Host CPU: skylake [root@jouet bpf]# llc --version LLVM (http://llvm.org/): LLVM version 4.0.0svn Old stuff! ;-) Will change, but improving these messages should be on the radar, I think :-) - Arnaldo > Registered Targets: > bpf- BPF (host endian) > bpfeb - BPF (big endian) > bpfel - BPF (little endian) > x86- 32-bit X86: Pentium-Pro and above > x86-64 - 64-bit X86: EM64T and AMD64 > > > if (len > 0) > > perf_event_output(ctx, &__bpf_stdout__, > > BPF_F_CURRENT_CPU, filename, > > len & (sizeof(filename) - 1)); > > return 1; > > }
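The remark above about using `int` rather than `unsigned` for the helper's return value is about error handling rather than about the verifier (switching the type did not change the verifier result reported here). A minimal sketch of the difference, using hypothetical `guard_unsigned()` and `guard_signed()` helpers that mirror the `len > 0` test with each type:

```c
#include <errno.h>
#include <stdbool.h>

/* `len > 0` as written with `const unsigned len`: a negative errno wraps to a huge value. */
static bool guard_unsigned(int helper_ret)
{
	const unsigned len = helper_ret;

	return len > 0;
}

/* `len > 0` with a signed length: negative errors are filtered out. */
static bool guard_signed(int helper_ret)
{
	int len = helper_ret;

	return len > 0;
}
```

With a return value of `-EFAULT` the unsigned guard lets the call through while the signed one does not; for ordinary positive lengths the two behave the same.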
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu: > On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote: > > libbpf: -- BEGIN DUMP LOG --- > > libbpf: > > 0: (79) r3 = *(u64 *)(r1 +104) > > 1: (b7) r2 = 0 > > 2: (bf) r6 = r1 > > 3: (bf) r1 = r10 > > 4: (07) r1 += -128 > > 5: (b7) r2 = 128 > > 6: (85) call bpf_probe_read_str#45 > > 7: (bf) r1 = r0 > > 8: (07) r1 += -1 > > 9: (67) r1 <<= 32 > > 10: (77) r1 >>= 32 > > 11: (25) if r1 > 0x7f goto pc+11 > > Right, so the compiler is optimizing the two tests into a single one above, > which means lower bound cannot properly be derived again by the verifier due > to this and thus you'll get the error. Similar issue was seen recently [1]. > > Does the below hack work for you? > > int prog([...]) > { > char filename[128]; > int ret = bpf_probe_read_str(filename, sizeof(filename), > filename_ptr); > if (ret > 0) > bpf_perf_event_output(ctx, &__bpf_stdout__, > BPF_F_CURRENT_CPU, filename, > ret & (sizeof(filename) - 1)); > return 1; > } > > r0 should keep on tracking bounds here at least: > > prog: >0: bf 16 00 00 00 00 00 00 r6 = r1 >1: bf a1 00 00 00 00 00 00 r1 = r10 >2: 07 01 00 00 80 ff ff ff r1 += -128 >3: b7 02 00 00 80 00 00 00 r2 = 128 >4: 85 00 00 00 2d 00 00 00 call 45 >5: 67 00 00 00 20 00 00 00 r0 <<= 32 >6: c7 00 00 00 20 00 00 00 r0 s>>= 32 >7: b7 01 00 00 01 00 00 00 r1 = 1 >8: 6d 01 0a 00 00 00 00 00 if r1 s> r0 goto 10 >9: 57 00 00 00 7f 00 00 00 r0 &= 127 > 10: bf a4 00 00 00 00 00 00 r4 = r10 > 11: 07 04 00 00 80 ff ff ff r4 += -128 > 12: bf 61 00 00 00 00 00 00 r1 = r6 > 13: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0ll > 15: 18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 r3 = > 4294967295ll > 17: bf 05 00 00 00 00 00 00 r5 = r0 > 18: 85 00 00 00 19 00 00 00 call 25 > > [1] http://patchwork.ozlabs.org/project/netdev/list/?series=13211 Not yet: 6: (85) call bpf_probe_read_str#45 7: (bf) r1 = r0 8: (67) r1 <<= 32 9: (77) r1 >>= 32 10: (15) if r1 == 0x0 goto pc+10 R0=inv(id=0) 
R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0x)) R6=ctx(id=0,off=0,imm=0) R10=fp0 11: (57) r0 &= 127 12: (bf) r4 = r10 13: (07) r4 += -128 14: (bf) r1 = r6 15: (18) r2 = 0x92bfc2aba840 17: (18) r3 = 0x 19: (bf) r5 = r0 20: (85) call bpf_perf_event_output#25 invalid stack type R4 off=-128 access_size=0 I'll try updating clang/llvm... Full details: [root@jouet bpf]# cat open.c #include "bpf.h" SEC("prog=do_sys_open filename") int prog(void *ctx, int err, const char __user *filename_ptr) { char filename[128]; const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); if (len > 0) perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len & (sizeof(filename) - 1)); return 1; } [root@jouet bpf]# perf trace -v -e *open,open.c usleep 2 bpf: builtin compilation failed: -95, try external compiler Kernel build dir is set to /lib/modules/4.14.0+/build set env: KBUILD_DIR=/lib/modules/4.14.0+/build unset env: KBUILD_OPTS include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h set env: NR_CPUS=4 set env: LINUX_VERSION_CODE=0x40e00 set env: CLANG_EXEC=/usr/local/bin/clang unset env: CLANG_OPTIONS set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h set env: WORKING_DIR=/lib/modules/4.14.0+/build set env: CLANG_SOURCE=/home/acme/bpf/open.c llvm compiling 
command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OP
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
On Mon, Nov 13, 2017 at 03:56:14PM +0100, Daniel Borkmann wrote:
> On 11/13/2017 03:30 PM, Arnaldo Carvalho de Melo wrote:
> > Hi,
> >
> > In a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") you
> > state:
> >
> >    "This is suboptimal because the size of the string needs to be estimated
> >     at compile time, causing more memory to be copied than often necessary,
> >     and can become more problematic if further processing on buf is done,
> >     for example by pushing it to userspace via bpf_perf_event_output(),
> >     since the real length of the string is unknown and the entire buffer
> >     must be copied (and defining an unrolled strnlen() inside the bpf
> >     program is a very inefficient and unfeasible approach)."
> >
> > So I went on to try this with 'perf trace' but it isn't working if I use
> > the return from bpf_probe_read_str(), I must be missing something
> > here...
> >
> > I.e. this works:
> >
> > [root@jouet bpf]# cat open.c
> > #include "bpf.h"
> >
> > SEC("prog=do_sys_open filename")
> > int prog(void *ctx, int err, const char __user *filename_ptr)
> > {
> > 	char filename[128];
> > 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> > 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, 32);
>
> By the way, you can just use BPF_F_CURRENT_CPU flag instead of the helper
> call get_smp_processor_id() to get current CPU.

Thanks, switched to it.

> > But then if I use the return value to push just the string length, it
> > doesn't work:
> >
> > [root@jouet bpf]# cat open.c
> > #include "bpf.h"
> >
> > SEC("prog=do_sys_open filename")
> > int prog(void *ctx, int err, const char __user *filename_ptr)
> > {
> > 	char filename[128];
> > 	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> > 	perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);
>
> The below issue 'invalid stack type R4 off=-128 access_size=0' is basically that
> unsigned len is unknown at verification time, thus unbounded. Can you try the
> following to see if that passes?
>
>     if (len > 0 && len <= sizeof(filename))
>         perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len);

I had it like:

	if (len > 0 && len < 32)

And it didn't help, now I did exactly as you suggested:

[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
	char filename[128];
	const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
	if (len > 0 && len <= sizeof(filename))
		perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
	return 1;
}
[root@jouet bpf]# trace -e open,open.c touch /etc/passwd
bpf: builtin compilation failed: -95, try external compiler
event syntax error: 'open.c'
                     \___ Kernel verifier blocks program loading
[root@jouet bpf]#

The -v output looks the same:

[root@jouet bpf]# trace -v -e open,open.c touch /etc/passwd
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.14.0-rc6+/build
set env: KBUILD_DIR=/lib/modules/4.14.0-rc6+/build
unset env: KBUILD_OPTS
include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: WORKING_DIR=/lib/modules/4.14.0-rc6+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -W
len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
Hi, In a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") you state: "This is suboptimal because the size of the string needs to be estimated at compile time, causing more memory to be copied than often necessary, and can become more problematic if further processing on buf is done, for example by pushing it to userspace via bpf_perf_event_output(), since the real length of the string is unknown and the entire buffer must be copied (and defining an unrolled strnlen() inside the bpf program is a very inefficient and unfeasible approach)." So I went on to try this with 'perf trace' but it isn't working if I use the return from bpf_probe_read_str(), I must be missing something here... I.e. this works: [root@jouet bpf]# cat open.c #include "bpf.h" SEC("prog=do_sys_open filename") int prog(void *ctx, int err, const char __user *filename_ptr) { char filename[128]; const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, 32); return 1; } [root@jouet bpf]# perf trace -e open,open.c touch /etc/passwd bpf: builtin compilation failed: -95, try external compiler 0.000 ( 0.013 ms): touch/14403 open(filename: 0x2ff7ce37, flags: CLOEXEC ) ... 0.013 ( ): __bpf_stdout__:/etc/ld.so.cache..B.) 0.015 ( ): perf_bpf_probe:prog:(b4260ae0) filename=0x7f7a2ff7ce37) 0.000 ( 0.021 ms): touch/14403 ... [continued]: open()) = 3 0.042 ( 0.002 ms): touch/14403 open(filename: 0x30180640, flags: CLOEXEC ) ... 0.044 ( ): __bpf_stdout__:/lib64/libc.so.6.. ...G.) 0.045 ( ): perf_bpf_probe:prog:(b4260ae0) filename=0x7f7a30180640) 0.042 ( 0.010 ms): touch/14403 ... [continued]: open()) = 3 0.301 ( 0.003 ms): touch/14403 open(filename: 0x2fd26c70, flags: CLOEXEC ) ... 0.305 ( ): __bpf_stdout__:/usr/lib/locale/locale-archive..) 0.306 ( ): perf_bpf_probe:prog:(b4260ae0) filename=0x7f7a2fd26c70) 0.301 ( 0.011 ms): touch/14403 ... 
[continued]: open()) = 3 0.360 ( 0.002 ms): touch/14403 open(filename: 0x681f20f3, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) ... 0.362 ( ): __bpf_stdout__:/etc/passwd... ...D.) 0.363 ( ): perf_bpf_probe:prog:(b4260ae0) filename=0x7ffe681f20f3) 0.360 ( 0.010 ms): touch/14403 ... [continued]: open()) = 3 [root@jouet bpf]# That bpf.h will set up the maps, etc, its attached if that may be needed to help figure this out. But then if I use the return value to push just the string lenght, it doesn't work: [root@jouet bpf]# cat open.c #include "bpf.h" SEC("prog=do_sys_open filename") int prog(void *ctx, int err, const char __user *filename_ptr) { char filename[128]; const unsigned len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), filename, len); return 1; } [root@jouet bpf]# perf trace -e open,open.c touch /etc/passwd bpf: builtin compilation failed: -95, try external compiler event syntax error: 'open.c' \___ Kernel verifier blocks program loading (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf trace [] [] or: perf trace [] -- [] or: perf trace record [] [] or: perf trace record [] -- [] -e, --eventevent/syscall selector. 
use 'perf list' to list available events [root@jouet bpf]# When running this with -v we get the tools/lib/libbpf.c debug that may help here: Opening /sys/kernel/debug/tracing//kprobe_events write=1 Writing event: p:perf_bpf_probe/prog _text+2493152 filename=%si:x64 In map_prologue, ntevs=1 mapping[0]=0 libbpf: create map __bpf_stdout__: fd=3 prologue: pass validation prologue: fast path libbpf: load bpf program failed: Permission denied libbpf: -- BEGIN DUMP LOG --- libbpf: 0: (79) r3 = *(u64 *)(r1 +104) 1: (b7) r2 = 0 2: (bf) r6 = r1 3: (bf) r7 = r10 4: (07) r7 += -128 5: (bf) r1 = r7 6: (b7) r2 = 128 7: (85) call bpf_probe_read_str#45 8: (bf) r8 = r0 9: (67) r8 <<= 32 10: (77) r8 >>= 32 11: (85) call bpf_get_smp_processor_id#8 12: (bf) r1 = r6 13: (18) r2 = 0xa0b5958e16c0 15: (bf) r3 = r0 16: (bf) r4 = r7 17: (bf) r5 = r8 18: (85) call bpf_perf_event_output#25 invalid stack type R4 off=-128 access_size=0 libbpf: -- END LOG -- libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed libbpf: failed to load program 'prog=do_sys_open filename' libbpf: failed to load object 'open.c' bpf: load objects failed event syntax error: 'open.c' \___ Kernel verifier blocks program loading I tried adding checks for len to try to somehow make sure its all bounds checked, but
bpf.h drift due to bpf_sk_redirect_map()
Hi John,

Recently the tools/perf/ build system noticed drift in tools/include/uapi/linux/bpf.h from its master copy include/uapi/linux/bpf.h, which comes from changes from you, can you please check this?

[acme@jouet linux]$ diff -u tools/include/uapi/linux/bpf.h include/uapi/linux/bpf.h
--- tools/include/uapi/linux/bpf.h	2017-10-26 08:10:15.980323396 -0300
+++ include/uapi/linux/bpf.h	2017-10-19 14:26:13.859622885 -0300
@@ -569,10 +569,9 @@
  * @flags: reserved for future use
  * Return: 0 on success or negative error code
  *
- * int bpf_sk_redirect_map(skb, map, key, flags)
+ * int bpf_sk_redirect_map(map, key, flags)
  *     Redirect skb to a sock in map using key as a lookup key for the
  *     sock in map.
- *     @skb: pointer to skb
  *     @map: pointer to sockmap
  *     @key: key to lookup sock in map
  *     @flags: reserved for future use
[acme@jouet linux]$

- Arnaldo
Re: [PATCH net-next v2 0/3] tools: add bpftool
On Tue, Oct 03, 2017 at 05:48:22PM -0700, Jakub Kicinski wrote:
> On Tue, 3 Oct 2017 17:19:42 -0300, Arnaldo Carvalho de Melo wrote:
> > Why not call it just 'bpf'?
>
> bpftool was suggested as a better name, I don't really mind either way.

I just thought that 'bpf' isn't used as a command, shorter, less typing, but yeah, if people think having 'tool' in the tool name helps somewhat, so be it.

- Arnaldo
Re: [PATCH net-next v2 0/3] tools: add bpftool
Em Mon, Oct 02, 2017 at 04:11:27PM -0700, Jakub Kicinski escreveu: > Hi! > > This set adds bpftool to the tools/ directory. The first > patch renames tools/net to tools/bpf, the second one adds > the new code, while the third adds simple documentation. > > v2: > - report names, map ids, load time, uid; > - add docs/man pages; > - general cleanups & fixes. > > Thanks to David Beckett for help with docs and testing. Why not call it just 'bpf'? - Arnaldo > Jakub Kicinski (3): > tools: rename tools/net directory to tools/bpf > tools: bpf: add bpftool > tools: bpftool: add documentation > > MAINTAINERS | 3 +- > tools/Makefile | 14 +- > tools/{net => bpf}/Makefile | 18 +- > tools/{net => bpf}/bpf_asm.c | 0 > tools/{net => bpf}/bpf_dbg.c | 0 > tools/{net => bpf}/bpf_exp.l | 0 > tools/{net => bpf}/bpf_exp.y | 0 > tools/{net => bpf}/bpf_jit_disasm.c | 0 > tools/bpf/bpftool/Documentation/Makefile | 34 ++ > tools/bpf/bpftool/Documentation/bpftool-map.txt | 110 > tools/bpf/bpftool/Documentation/bpftool-prog.txt | 81 +++ > tools/bpf/bpftool/Documentation/bpftool.txt | 34 ++ > tools/bpf/bpftool/Makefile | 86 +++ > tools/bpf/bpftool/common.c | 215 +++ > tools/bpf/bpftool/jit_disasm.c | 87 +++ > tools/bpf/bpftool/main.c | 212 +++ > tools/bpf/bpftool/main.h | 99 +++ > tools/bpf/bpftool/map.c | 744 > +++ > tools/bpf/bpftool/prog.c | 427 + > 19 files changed, 2152 insertions(+), 12 deletions(-) > rename tools/{net => bpf}/Makefile (74%) > rename tools/{net => bpf}/bpf_asm.c (100%) > rename tools/{net => bpf}/bpf_dbg.c (100%) > rename tools/{net => bpf}/bpf_exp.l (100%) > rename tools/{net => bpf}/bpf_exp.y (100%) > rename tools/{net => bpf}/bpf_jit_disasm.c (100%) > create mode 100644 tools/bpf/bpftool/Documentation/Makefile > create mode 100644 tools/bpf/bpftool/Documentation/bpftool-map.txt > create mode 100644 tools/bpf/bpftool/Documentation/bpftool-prog.txt > create mode 100644 tools/bpf/bpftool/Documentation/bpftool.txt > create mode 100644 tools/bpf/bpftool/Makefile > create 
mode 100644 tools/bpf/bpftool/common.c > create mode 100644 tools/bpf/bpftool/jit_disasm.c > create mode 100644 tools/bpf/bpftool/main.c > create mode 100644 tools/bpf/bpftool/main.h > create mode 100644 tools/bpf/bpftool/map.c > create mode 100644 tools/bpf/bpftool/prog.c > > -- > 2.14.1
Re: [PATCH net-next] bridge: add tracepoint in br_fdb_update
On Thu, Aug 31, 2017 at 06:20:12PM +0200, Jesper Dangaard Brouer wrote:
> On Thu, 31 Aug 2017 09:30:05 -0600 David Ahern wrote:
> > > On Thu, Aug 31, 2017 at 5:38 AM, Jesper Dangaard Brouer
> > > wrote:
> > > These bridge tracepoints in context are primarily for debugging fdb
> > > updates only, not for every packet and hence not in the performance
> > > path.
> > > In large scale deployments with thousands of bridge ports and fdb
> > > entries, dev->name will definitely make it easier to trouble-shoot.
> > > So, I'd like to leave these with dev->name unless there are strong
> > > objections.
> >
> > +1 for user friendliness for debugging tracepoints. The device name is
> > also more user friendly when adding filters to the data collection.
> >
> > Being able to add bpf everywhere certainly changes the game a bit, but
> > we should not relinquish ease of use and understanding for the potential
> > that someone might want to put a bpf program on the tracepoint and want
> > to maintain high performance.
>
> (Cc. Acme and Peterz)
>
> I wonder if we can create a special perf-tracepoint type for ifindex'es
> and the tool reading (e.g. perf-script) can perform the name lookup in
> userspace (calling if_indextoname(3)) ?
>
> I don't know the perf tools well enough to know if this is possible?

Yeah, there are libtraceevent plugins, and that gets used by trace-cmd and perf script, perf trace.

[root@jouet ~]# ls -la ~/.traceevent/plugins/
total 192
drwxr-xr-x. 2 acme acme  4096 Aug 31 15:29 .
drwxr-xr-x. 3 acme acme  4096 Jan 27  2017 ..
-rwxr-xr-x. 1 acme acme 13744 Aug 31 15:29 plugin_cfg80211.so
-rwxr-xr-x. 1 acme acme 20192 Aug 31 15:29 plugin_function.so
-rwxr-xr-x. 1 acme acme 13680 Aug 31 15:29 plugin_hrtimer.so
-rwxr-xr-x. 1 acme acme 13760 Aug 31 15:29 plugin_jbd2.so
-rwxr-xr-x. 1 acme acme 13704 Aug 31 15:29 plugin_kmem.so
-rwxr-xr-x. 1 acme acme 28568 Aug 31 15:29 plugin_kvm.so
-rwxr-xr-x. 1 acme acme 14184 Aug 31 15:29 plugin_mac80211.so
-rwxr-xr-x. 1 acme acme 14424 Aug 31 15:29 plugin_sched_switch.so
-rwxr-xr-x. 1 acme acme 20136 Aug 31 15:29 plugin_scsi.so
-rwxr-xr-x. 1 acme acme 14504 Aug 31 15:29 plugin_xen.so
[root@jouet ~]#

But... that index is something that is mutable, i.e. you'd have to somehow record all the assignments of an index to an interface and then, when processing the events, get from that state the mapping you want.

So you don't store the device name by doing lookups at each of those high volume tracepoints, you store just the index. But then, when the mapping is established, we collect that as well, and we come up with some infrastructure to get that mapping to a place where a plugin can do the lookups at post processing time.

For instance, the hrtimer plugin will get an address from the kernel and, from state recorded at the same time as the trace file, will look up ELF symbol tables for the kernel or modules and resolve that symbol, etc.

- Arnaldo
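To make the post-processing idea above concrete, here is a minimal sketch in C of resolving an ifindex to the name it had when an event was recorded. Everything in it is an assumption for illustration: struct ifname_rec, ifname_at() and the record layout are hypothetical and are not part of perf or libtraceevent. It only shows why the index-to-name assignments must be stored with timestamps, since an index can be reused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: at time 'ts', ifindex 'idx' was given the name 'name'. */
struct ifname_rec {
	uint64_t ts;
	int idx;
	char name[16]; /* same size as IFNAMSIZ on Linux */
};

/*
 * Return the name that 'idx' had at time 'ts', i.e. the most recent record
 * for that index with rec.ts <= ts. 'recs' must be sorted by ascending ts.
 * Returns NULL when no assignment was recorded before the event.
 */
static const char *ifname_at(const struct ifname_rec *recs, size_t n,
			     int idx, uint64_t ts)
{
	const char *found = NULL;
	size_t i;

	assert(recs != NULL || n == 0);

	for (i = 0; i < n && recs[i].ts <= ts; i++)
		if (recs[i].idx == idx)
			found = recs[i].name;
	return found;
}
```

With such a table saved next to the trace file, a tracepoint only has to store the integer index, and a plugin could call something like ifname_at() per event while printing. A linear scan is used here for brevity; a real tool would probably index the records.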
[PATCH 1/2] dccp: Unlock sock before calling sk_free()
From: Arnaldo Carvalho de Melo <a...@redhat.com>

The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.

This problem stayed there for a long while, till b0691c8ee7c2 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_lock()) were not audited and the
one in dccp_create_openreq_child() remained.

Now in the age of the syzkaller fuzzer, this was finally uncovered, as
reported by Dmitry:

8<

I've got the following report while running syzkaller fuzzer on
86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)")

  [ BUG: held lock freed! ]
  4.10.0+ #234 Not tainted
  -
  syz-executor6/6898 is freeing memory
  88006286cac0-88006286d3b7, with a lock still held there!
   (slock-AF_INET6){+.-...}, at: [] spin_lock include/linux/spinlock.h:299 [inline]
   (slock-AF_INET6){+.-...}, at: [] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
  5 locks held by syz-executor6/6898:
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock include/net/sock.h:1460 [inline]
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
   #1:  (rcu_read_lock){..}, at: [] inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
   #2:  (rcu_read_lock){..}, at: [] __skb_unlink include/linux/skbuff.h:1767 [inline]
   #2:  (rcu_read_lock){..}, at: [] __skb_dequeue include/linux/skbuff.h:1783 [inline]
   #2:  (rcu_read_lock){..}, at: [] process_backlog+0x264/0x730 net/core/dev.c:4835
   #3:  (rcu_read_lock){..}, at: [] ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
   #4:  (slock-AF_INET6){+.-...}, at: [] spin_lock include/linux/spinlock.h:299 [inline]
   #4:  (slock-AF_INET6){+.-...}, at: [] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504

Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling
sk_free()").

Reported-by: Dmitry Vyukov <dvyu...@google.com>
Cc: Cong Wang <xiyou.wangc...@gmail.com>
Cc: Eric Dumazet <eduma...@google.com>
Cc: Gerrit Renker <ger...@erg.abdn.ac.uk>
Cc: Thomas Gleixner <t...@linutronix.de>
Link: http://lkml.kernel.org/r/20170301153510.ge15...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 net/dccp/minisocks.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index 53eddf99e4f6..d20d948a98ed 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
 			/* It is still raw copy of parent, so invalidate
 			 * destructor and make plain sk_free() */
 			newsk->sk_destruct = NULL;
+			bh_unlock_sock(newsk);
 			sk_free(newsk);
 			return NULL;
 		}
-- 
2.9.3
[PATCH 2/2] net: Introduce sk_clone_lock() error path routine
From: Arnaldo Carvalho de Melo <a...@redhat.com>

When handling problems in cloning a socket with the sk_clone_lock()
function we need to perform several steps that were open coded in it and
its callers, so introduce a routine to avoid this duplication:
sk_free_unlock_clone().

Cc: Cong Wang <xiyou.wangc...@gmail.com>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: Eric Dumazet <eduma...@google.com>
Cc: Gerrit Renker <ger...@erg.abdn.ac.uk>
Cc: Thomas Gleixner <t...@linutronix.de>
Link: http://lkml.kernel.org/n/net-ui6laqkotycunhtmqryl9...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 include/net/sock.h   |  1 +
 net/core/sock.c      | 16 +++-
 net/dccp/minisocks.c |  6 +-
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index c4f5e6fca17c..93d1160bcd32 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1520,6 +1520,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
 void sk_free(struct sock *sk);
 void sk_destruct(struct sock *sk);
 struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority);
+void sk_free_unlock_clone(struct sock *sk);
 
 struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force,
 			     gfp_t priority);
diff --git a/net/core/sock.c b/net/core/sock.c
index 4eca27dc5c94..a3d9bb20f65d 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1540,11 +1540,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 			is_charged = sk_filter_charge(newsk, filter);
 
 		if (unlikely(!is_charged || xfrm_sk_clone_policy(newsk, sk))) {
-			/* It is still raw copy of parent, so invalidate
-			 * destructor and make plain sk_free() */
-			newsk->sk_destruct = NULL;
-			bh_unlock_sock(newsk);
-			sk_free(newsk);
+			sk_free_unlock_clone(newsk);
 			newsk = NULL;
 			goto out;
 		}
@@ -1593,6 +1589,16 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 }
 EXPORT_SYMBOL_GPL(sk_clone_lock);
 
+void sk_free_unlock_clone(struct sock *sk)
+{
+	/* It is still raw copy of parent, so invalidate
+	 * destructor and make plain sk_free() */
+	sk->sk_destruct = NULL;
+	bh_unlock_sock(sk);
+	sk_free(sk);
+}
+EXPORT_SYMBOL_GPL(sk_free_unlock_clone);
+
 void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
 {
 	u32 max_segs = 1;
diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index d20d948a98ed..e267e6f4c9a5 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -119,11 +119,7 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
 		 * Activate features: initialise CCIDs, sequence windows etc.
 		 */
 		if (dccp_feat_activate_values(newsk, >dreq_featneg)) {
-			/* It is still raw copy of parent, so invalidate
-			 * destructor and make plain sk_free() */
-			newsk->sk_destruct = NULL;
-			bh_unlock_sock(newsk);
-			sk_free(newsk);
+			sk_free_unlock_clone(newsk);
 			return NULL;
 		}
 		dccp_init_xmit_timers(newsk);
-- 
2.9.3
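For readers who want to play with the error path outside the kernel, the pattern in the two patches above can be modelled in plain user-space C. This is only an analogy under loud assumptions: struct obj and all obj_*() names are hypothetical, an int flag stands in for the socket lock, and the assert() in obj_free() plays the role of the "held lock freed" lockdep check. It is not the kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sock; all names here are illustrative. */
struct obj {
	int lock_held;                  /* models the socket lock */
	void (*destruct)(struct obj *); /* models sk->sk_destruct */
	int value;
};

/* Models sk_free(): freeing an object whose lock is still held is the bug. */
static void obj_free(struct obj *o)
{
	assert(!o->lock_held);
	if (o->destruct)
		o->destruct(o);
	free(o);
}

/* Models sk_clone_lock(): the copy comes back with its lock held. */
static struct obj *obj_clone_lock(const struct obj *parent)
{
	struct obj *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	*copy = *parent;
	copy->lock_held = 1;
	return copy;
}

/* Models sk_free_unlock_clone(): one shared routine for the error path. */
static void obj_free_unlock_clone(struct obj *copy)
{
	copy->destruct = NULL; /* still a raw copy, so skip the destructor */
	copy->lock_held = 0;   /* unlock before freeing */
	obj_free(copy);
}

/* A caller whose setup step can fail after the clone was taken. */
static struct obj *obj_create_child(const struct obj *parent, int setup_ok)
{
	struct obj *child = obj_clone_lock(parent);

	if (!child)
		return NULL;
	if (!setup_ok) {
		obj_free_unlock_clone(child);
		return NULL;
	}
	return child; /* handed back still locked in this model */
}
```

Replacing the obj_free_unlock_clone() call in obj_create_child() with a bare obj_free() trips the assertion, which is the user-space equivalent of the report Dmitry posted.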
Re: net/dccp: dccp_create_openreq_child freed held lock
On Wed, Mar 01, 2017 at 10:38:54AM +0100, Dmitry Vyukov wrote:
> Hello,
>
> I've got the following report while running syzkaller fuzzer on
> 86292b33d4b79ee03e2f43ea0381ef85f077c760:
>
>
> It seems that dccp_create_openreq_child needs to unlock the sock if
> dccp_feat_activate_values fails.

Yeah, can you please use the patch below, that mimics the error paths in sk_clone_new(), from where I think even the comment about it being a raw copy came, but the bh_unlock_sock() didn't?

- Arnaldo

diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index 53eddf99e4f6..d20d948a98ed 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
 			/* It is still raw copy of parent, so invalidate
 			 * destructor and make plain sk_free() */
 			newsk->sk_destruct = NULL;
+			bh_unlock_sock(newsk);
 			sk_free(newsk);
 			return NULL;
 		}
Re: net/dccp: dccp_create_openreq_child freed held lock
On Wed, Mar 01, 2017 at 12:35:10PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Mar 01, 2017 at 10:38:54AM +0100, Dmitry Vyukov wrote:
> > Hello,
> >
> > I've got the following report while running syzkaller fuzzer on
> > 86292b33d4b79ee03e2f43ea0381ef85f077c760:
> >
> >
> > It seems that dccp_create_openreq_child needs to unlock the sock if
> > dccp_feat_activate_values fails.
>
> Yeah, can you please use the patch below, that mimics the error paths in
> sk_clone_new(), from where I think even the comment about it being a raw

Argh, s/sk_clone_new()/sk_clone_lock()/g

- Arnaldo

> copy came, but the bh_unlock_sock() didn't?
>
> - Arnaldo
>
> diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
> index 53eddf99e4f6..d20d948a98ed 100644
> --- a/net/dccp/minisocks.c
> +++ b/net/dccp/minisocks.c
> @@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
>  			/* It is still raw copy of parent, so invalidate
>  			 * destructor and make plain sk_free() */
>  			newsk->sk_destruct = NULL;
> +			bh_unlock_sock(newsk);
>  			sk_free(newsk);
>  			return NULL;
>  		}
Re: [PATCH] net/dccp: fix use after free in tw_timer_handler()
Em Tue, Feb 21, 2017 at 02:27:40PM +0300, Andrey Ryabinin escreveu: > DCCP doesn't purge timewait sockets on network namespace shutdown. > So, after net namespace destroyed we could still have an active timer > which will trigger use after free in tw_timer_handler(): > > > Add .exit_batch hook to dccp_v4_ops()/dccp_v6_ops() which will purge > timewait sockets on net namespace destruction and prevent above issue. Please add this, to help stable kernels to pick this up Fixes: b099ce2602d8 ("net: Batch inet_twsk_purge") Cc: Eric W. Biederman <ebied...@xmission.com> [acme@jouet linux]$ git describe b099ce2602d8 v2.6.32-rc8-1977-gb099ce2602d8 This one added the pernet operations related to network namespaces, but then the one above got missed. commit 72a2d6138224298a576bcdc33d7d0004de604856 Author: Pavel Emelyanov <xe...@openvz.org> Date: Sun Apr 13 22:29:13 2008 -0700 [NETNS][DCCPV4]: Add dummy per-net operations. -- It looks ok, so please consider adding my: Acked-by: Arnaldo Carvalho de Melo <a...@redhat.com> - Arnaldo > Reported-by: Dmitry Vyukov <dvyu...@google.com> > Signed-off-by: Andrey Ryabinin <aryabi...@virtuozzo.com> > --- > net/dccp/ipv4.c | 6 ++ > net/dccp/ipv6.c | 6 ++ > 2 files changed, 12 insertions(+) > > diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c > index d859a5c..da7cb16 100644 > --- a/net/dccp/ipv4.c > +++ b/net/dccp/ipv4.c > @@ -1018,9 +1018,15 @@ static void __net_exit dccp_v4_exit_net(struct net > *net) > inet_ctl_sock_destroy(net->dccp.v4_ctl_sk); > } > > +static void __net_exit dccp_v4_exit_batch(struct list_head *net_exit_list) > +{ > + inet_twsk_purge(_hashinfo, _death_row, AF_INET); > +} > + > static struct pernet_operations dccp_v4_ops = { > .init = dccp_v4_init_net, > .exit = dccp_v4_exit_net, > + .exit_batch = dccp_v4_exit_batch, > }; > > static int __init dccp_v4_init(void) > diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c > index c4e879c..f3d8f92 100644 > --- a/net/dccp/ipv6.c > +++ b/net/dccp/ipv6.c > @@ -1077,9 +1077,15 @@ static 
void __net_exit dccp_v6_exit_net(struct net > *net) > inet_ctl_sock_destroy(net->dccp.v6_ctl_sk); > } > > +static void __net_exit dccp_v6_exit_batch(struct list_head *net_exit_list) > +{ > + inet_twsk_purge(_hashinfo, _death_row, AF_INET6); > +} > + > static struct pernet_operations dccp_v6_ops = { > .init = dccp_v6_init_net, > .exit = dccp_v6_exit_net, > + .exit_batch = dccp_v6_exit_batch, > }; > > static int __init dccp_v6_init(void) > -- > 2.10.2 > > -- > To unsubscribe from this list: send the line "unsubscribe dccp" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: build failure after merge of the net tree
Em Tue, Feb 14, 2017 at 02:23:26PM +0100, Jiri Olsa escreveu: > On Tue, Feb 14, 2017 at 09:50:20AM -0300, Arnaldo Carvalho de Melo wrote: > > SNIP > > > > > What I think Ingo meant with dependency at the build system level is to > > somehow state that if file A gets changed, then tool B must be rebuilt. > > > > Now that samples/bpf and tools/perf/ depend on tools/lib/bpf/ I _always_ > > build both, ditto for tools/objtool, that shares a different library > > with tools/perf/, tools/lib/subcmd/: > > > > ENTRYPOINT make -C /git/linux/tools/perf O=/tmp/build/perf && \ > >rm -rf /tmp/build/perf/{.[^.]*,*} && \ > >make NO_LIBELF=1 -C /git/linux/tools/perf O=/tmp/build/perf && \ > >make -C /git/linux/tools/objtool O=/tmp/build/objtool && \ > >make -C /git/linux O=/tmp/build/linux allmodconfig && \ > >make -C /git/linux O=/tmp/build/linux headers_install && \ > >make -C /git/linux O=/tmp/build/linux samples/bpf/ > > > > This is the default action for my > > docker.io/acmel/linux-perf-tools-build-fedora:rawhide container. > > > > It is published, so a: > > > >docker pull docker.io/acmel/linux-perf-tools-build-fedora:rawhide > > > > And then run it before pushing things upstream would catch these kinds > > of errors. > > > > But that would possibly disrupt too much people's workflow, that is why > > using the Kbuild originated tools/build/ we have to somehow express that > > when a change is made in a file then a tool that uses that file needs to > > be rebuilt. > > we already have the check in the check-headers.sh script, > an AFAICS there's no 'rebuild' option here.. just warn or fail > because the headers update needs to be done manualy ... when needed. And that will only be detected if you try to build tools using what is in tools/include/linux/bpf.h Tools using tools/lib/bpf/ _must_ use what is in tools/include/. 
So lemme see if my reasoning is right: tools/lib/bpf/bpf.c has: #include Now, samples/bpf/ will build tools/lib/bpf/bpf.o: # Libbpf dependencies LIBBPF := ../../tools/lib/bpf/bpf.o HOSTCFLAGS += -I$(objtree)/usr/include HOSTCFLAGS += -I$(srctree)/tools/lib/ HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/ HOSTCFLAGS += -I$(srctree)/tools/lib/ -I$(srctree)/tools/include HOSTCFLAGS += -I$(srctree)/tools/perf HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable So it will never include tools/include/uapi/linux/bpf.h, which it should. Because the workflow people working on sample/bpf/ is to first install the new headers using a variation of: make headers_install So they will get the new bpf.h, not use tools/include/uapi/linux/bpf.h, b00m. They should use tools/include/uapi/linux/bpf.h, which is the one we know builds well with tools/lib/bpf/bpf.c, since we tested it last time we made the copy. > > Makefile rules probably would be enough, but then it would have to be > > done at the tools/build/ level and all tools using shared components > > would have to use it to trigger the rebuild. > we can move/invoke the check-headers.sh script in some upper dir Most of the time I just ignore that warning, only when I find spare time I go look if the changes in the kernel copy, i.e. upstream, should trigger changes in the tools using its copy in tools/include/. - Arnaldo
Re: linux-next: build failure after merge of the net tree
Em Tue, Feb 14, 2017 at 10:19:37AM +0100, Jiri Olsa escreveu: > On Tue, Feb 14, 2017 at 07:42:21AM +0100, Ingo Molnar wrote: > > * Stephen Rothwellwrote: > > > Unfortunately, the perf header files are kept separate from the kernel > > > header files proper and are not automatically copied over :-( > > No, that's wrong, the problem is not that headers were not shared, the > > problem is > > that a tooling interdependency was not properly tested *and* that the > > dependency > > was not properly implemented in the build system either. > > Note that we had similar build breakages when include headers _were_ shared > > as > > well, so sharing the headers would only have worked around this particular > > bug and > > would have introduced fragility in other places... > > The best, most robust solution in this particular case would be to fix the > > (tooling) build system to express the dependency, that would have shown the > > build > > failure right when the modification was done. > so we have the warning now: > Warning: tools/include/uapi/linux/bpf.h differs from kernel > do you want to change it into the build failure? No. Differences in the copy are not always problematic, the problem here lies elsewhere. Please run: make -C tools all To build all tools when you touch something in tools/include and/or tools/lib/ - Arnaldo Bored? Here is what I first wrote ;-) Simply using the kernel original would require kernel hackers to build all tools using that file, something we long decided not to do. What I think Ingo meant with dependency at the build system level is to somehow state that if file A gets changed, then tool B must be rebuilt. 
Now that samples/bpf and tools/perf/ depend on tools/lib/bpf/ I _always_ build both, ditto for tools/objtool, that shares a different library with tools/perf/, tools/lib/subcmd/: ENTRYPOINT make -C /git/linux/tools/perf O=/tmp/build/perf && \ rm -rf /tmp/build/perf/{.[^.]*,*} && \ make NO_LIBELF=1 -C /git/linux/tools/perf O=/tmp/build/perf && \ make -C /git/linux/tools/objtool O=/tmp/build/objtool && \ make -C /git/linux O=/tmp/build/linux allmodconfig && \ make -C /git/linux O=/tmp/build/linux headers_install && \ make -C /git/linux O=/tmp/build/linux samples/bpf/ This is the default action for my docker.io/acmel/linux-perf-tools-build-fedora:rawhide container. It is published, so a: docker pull docker.io/acmel/linux-perf-tools-build-fedora:rawhide And then run it before pushing things upstream would catch these kinds of errors. But that would possibly disrupt too much people's workflow, that is why using the Kbuild originated tools/build/ we have to somehow express that when a change is made in a file then a tool that uses that file needs to be rebuilt. Makefile rules probably would be enough, but then it would have to be done at the tools/build/ level and all tools using shared components would have to use it to trigger the rebuild. - Arnaldo
[GIT PULL 00/15] perf/core improvements and fixes
Hi Ingo, Please consider pulling, - Arnaldo Test results at the end of this message, as usual. The following changes since commit f2029b1e47b607619d1dd2cb0bbb77f64ec6b7c2: perf/x86/intel: Add Kaby Lake support (2017-02-11 21:28:23 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170213 for you to fetch changes up to a734fb5d60067a73dd7099a58756847c07f9cd68: samples/bpf: Reset global variables (2017-02-13 17:22:53 -0300) perf/core improvements and fixes: New feature: - Introduce the 'delta-abs' 'perf diff' compute method, that orders the histogram entries by the absolute value of the percentage delta for a function in two perf.data files, i.e. the functions that changed the most (increase or decrease in samples) comes first (Namhyung Kim) User visible: - Improve message about tweaking the kernel.perf_event_paranoid setting, telling how to make the change permanent by editing /etc/sysctl.conf (Ingo Molnar) Infrastructure: - Introduce linux/compiler-gcc.h as a counterpart to the kernel's, initially containing the definition of __fallthrough, more to come (__maybe_unused, etc) (Arnaldo Carvalho de Melo) - Fixes for problems uncovered by building tools/perf with clang, such as always true tests of arrays against NULL and variables that sometimes were used without being initialized (Arnaldo Carvalho de Melo, Steven Rostedt) - Before loading a new ELF, clear global variables set by the samples/bpf loader (Mickaël Salaün) - Ignore already processed ELF sections in the samples/bpf loader (Mickaël Salaün) - Fix compile error in the scripting code with some perl5 versions (Wang YanQing) Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> ---- Arnaldo Carvalho de Melo (6): tools include: Introduce linux/compiler-gcc.h tools lib traceevent plugin function: Initialize 'index' variable perf evsel: Inform how to make a sysctl setting permanent perf symbols: No need to check if 
sym->name is NULL perf tests record: No need to test an array against NULL perf symbols: dso->name is an array, no need to check it against NULL Mickaël Salaün (3): samples/bpf: Add missing header samples/bpf: Ignore already processed ELF sections samples/bpf: Reset global variables Namhyung Kim (4): perf diff: Add 'delta-abs' compute method perf diff: Add diff.order config option perf diff: Add diff.compute config option perf diff: Change default setting to "delta-abs" Steven Rostedt (VMware) (1): tools lib traceevent: Initialize lenght on OLD_RING_BUFFER_TYPE_TIME_STAMP Wang YanQing (1): perf scripting perl: Fix compile error with some perl5 versions samples/bpf/bpf_load.c | 7 ++ samples/bpf/tracex5_kern.c | 1 + tools/include/linux/compiler-gcc.h | 14 tools/include/linux/compiler.h | 10 +-- tools/lib/traceevent/kbuffer-parse.c | 1 + tools/lib/traceevent/plugin_function.c | 2 +- tools/perf/Documentation/perf-config.txt | 12 tools/perf/Documentation/perf-diff.txt | 15 - tools/perf/MANIFEST| 1 + tools/perf/builtin-diff.c | 78 -- tools/perf/builtin-kmem.c | 4 +- tools/perf/builtin-record.c| 2 +- tools/perf/builtin-sched.c | 2 +- tools/perf/builtin-stat.c | 2 +- tools/perf/builtin-top.c | 2 +- tools/perf/tests/perf-record.c | 2 +- tools/perf/util/evsel.c| 4 +- tools/perf/util/evsel_fprintf.c| 1 - tools/perf/util/machine.c | 2 +- tools/perf/util/map.c | 4 +- tools/perf/util/scripting-engines/Build| 2 +- .../perf/util/scripting-engines/trace-event-perl.c | 4 +- tools/perf/util/symbol_fprintf.c | 2 +- 23 files changed, 145 insertions(+), 29 deletions(-) create mode 100644 tools/include/linux/compiler-gcc.h Test results: The first ones are container (docker) based builds of tools/perf with and without libelf support, objtool where it is supported and samples/bpf/, ditto. 
Several are cross builds, the ones with -x-ARCH, and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, available and being used so far on just a few, like debian:experimental-x-{arm64,mipsel}. The 'perf test' one will perform a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc
[PATCH 13/15] samples/bpf: Add missing header
From: Mickaël Salaün <m...@digikod.net> Include unistd.h to define __NR_getuid and __NR_getsid. Signed-off-by: Mickaël Salaün <m...@digikod.net> Acked-by: Joe Stringer <j...@ovn.org> Acked-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: David S. Miller <da...@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170208202744.16274-4-...@digikod.net Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- samples/bpf/tracex5_kern.c | 1 + 1 file changed, 1 insertion(+) diff --git a/samples/bpf/tracex5_kern.c b/samples/bpf/tracex5_kern.c index fd12d7154d42..7e4cf74553ff 100644 --- a/samples/bpf/tracex5_kern.c +++ b/samples/bpf/tracex5_kern.c @@ -8,6 +8,7 @@ #include #include #include +#include #include "bpf_helpers.h" #define PROG(F) SEC("kprobe/"__stringify(F)) int bpf_func_##F -- 2.9.3
[PATCH 14/15] samples/bpf: Ignore already processed ELF sections
From: Mickaël Salaün <m...@digikod.net> Add a missing check for the map fixup loop. Signed-off-by: Mickaël Salaün <m...@digikod.net> Acked-by: Joe Stringer <j...@ovn.org> Acked-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: David S. Miller <da...@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170208202744.16274-2-...@digikod.net Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- samples/bpf/bpf_load.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c index 396e204888b3..e04fe09d7c2e 100644 --- a/samples/bpf/bpf_load.c +++ b/samples/bpf/bpf_load.c @@ -328,6 +328,8 @@ int load_bpf_file(char *path) /* load programs that need map fixup (relocations) */ for (i = 1; i < ehdr.e_shnum; i++) { + if (processed_sec[i]) + continue; if (get_sec(elf, i, , , , )) continue; -- 2.9.3
[PATCH 15/15] samples/bpf: Reset global variables
From: Mickaël Salaün <m...@digikod.net> Before loading a new ELF, clean previous kernel version, license and processed sections. Signed-off-by: Mickaël Salaün <m...@digikod.net> Acked-by: Joe Stringer <j...@ovn.org> Acked-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: David S. Miller <da...@davemloft.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170208202744.16274-3-...@digikod.net Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- samples/bpf/bpf_load.c | 5 + 1 file changed, 5 insertions(+) diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c index e04fe09d7c2e..b86ee54da2d1 100644 --- a/samples/bpf/bpf_load.c +++ b/samples/bpf/bpf_load.c @@ -277,6 +277,11 @@ int load_bpf_file(char *path) Elf_Data *data, *data_prog, *symbols = NULL; char *shname, *shname_prog; + /* reset global variables */ + kern_version = 0; + memset(license, 0, sizeof(license)); + memset(processed_sec, 0, sizeof(processed_sec)); + if (elf_version(EV_CURRENT) == EV_NONE) return 1; -- 2.9.3
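The problem this patch addresses can be illustrated outside of samples/bpf. What follows is only a minimal sketch, assuming a hypothetical loader that, like load_bpf_file(), keeps per-file state in file-scope variables. The names fake_load, seen, version and license_buf are made up for illustration and are not taken from bpf_load.c.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SECS 8

/* Hypothetical file-scope state, standing in for the loader's
 * kernel version, license and processed-section bookkeeping. */
static int version;
static char license_buf[16];
static bool seen[MAX_SECS];

/* Hypothetical loader: processes up to 'nr_secs' sections and returns
 * how many of them it actually handled in this call. */
static int fake_load(int nr_secs, int file_version, const char *file_license)
{
	int i, handled = 0;

	/* Reset whatever a previous call left behind. */
	version = 0;
	memset(license_buf, 0, sizeof(license_buf));
	memset(seen, 0, sizeof(seen));

	version = file_version;
	strncpy(license_buf, file_license, sizeof(license_buf) - 1);

	for (i = 0; i < nr_secs && i < MAX_SECS; i++) {
		if (seen[i])
			continue;	/* already processed, skip it */
		seen[i] = true;
		handled++;
	}
	return handled;
}
```

Without the three reset statements, a second call for another file would find every seen[] slot still set and handle no sections at all, which is the kind of stale state the patch clears.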
Re: [PATCH v4 0/3] Miscellaneous fixes for BPF (perf tree)
On Mon, Feb 13, 2017 at 09:42:31AM +0800, Wangnan (F) wrote:
> On 2017/2/9 4:27, Mickaël Salaün wrote:
> >Mickaël Salaün (3):
> >   samples/bpf: Ignore already processed ELF sections
> >   samples/bpf: Reset global variables
> >   samples/bpf: Add missing header
> >
> >  samples/bpf/bpf_load.c     | 7 +++
> >  samples/bpf/tracex5_kern.c | 1 +
> >  2 files changed, 8 insertions(+)
> >
> Looks good to me.
>
> Thank you.

Thanks, applied, added Acked-by tags for you and Joe.

- Arnaldo
[PATCH 1/1] MAINTAINERS: Remove old e-mail address
The ghostprotocols.net domain is not working, remove it from CREDITS and MAINTAINERS, and change the status to "Odd fixes", and since I haven't been maintaining those, remove my address from there. Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- CREDITS | 5 ++--- MAINTAINERS | 15 ++- 2 files changed, 8 insertions(+), 12 deletions(-) diff --git a/CREDITS b/CREDITS index c58560701d13..c5626bf06264 100644 --- a/CREDITS +++ b/CREDITS @@ -2478,12 +2478,11 @@ S: D-90453 Nuernberg S: Germany N: Arnaldo Carvalho de Melo -E: a...@ghostprotocols.net +E: a...@kernel.org E: arnaldo.m...@gmail.com E: a...@redhat.com -W: http://oops.ghostprotocols.net:81/blog/ P: 1024D/9224DF01 D5DF E3BB E3C8 BCBB F8AD 841A B6AB 4681 9224 DF01 -D: IPX, LLC, DCCP, cyc2x, wl3501_cs, net/ hacks +D: tools/, IPX, LLC, DCCP, cyc2x, wl3501_cs, net/ hacks S: Brazil N: Karsten Merker diff --git a/MAINTAINERS b/MAINTAINERS index 3960e7faaa99..b781db49d363 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -877,8 +877,8 @@ S: Odd fixes F: drivers/hwmon/applesmc.c APPLETALK NETWORK LAYER -M: Arnaldo Carvalho de Melo <a...@ghostprotocols.net> -S: Maintained +L: netdev@vger.kernel.org +S: Odd fixes F: drivers/net/appletalk/ F: net/appletalk/ @@ -6727,9 +6727,8 @@ S:Odd Fixes F: drivers/tty/ipwireless/ IPX NETWORK LAYER -M: Arnaldo Carvalho de Melo <a...@ghostprotocols.net> L: netdev@vger.kernel.org -S: Maintained +S: Odd fixes F: include/net/ipx.h F: include/uapi/linux/ipx.h F: net/ipx/ @@ -7501,8 +7500,8 @@ S:Maintained F: drivers/misc/lkdtm* LLC (802.2) -M: Arnaldo Carvalho de Melo <a...@ghostprotocols.net> -S: Maintained +L: netdev@vger.kernel.org +S: Odd fixes F: include/linux/llc.h F: include/uapi/linux/llc.h F: include/net/llc* @@ -13373,10 +13372,8 @@ S: Maintained F: drivers/input/misc/wistron_btns.c WL3501 WIRELESS PCMCIA CARD DRIVER -M: Arnaldo Carvalho de Melo <a...@ghostprotocols.net> L: linux-wirel...@vger.kernel.org -W: http://oops.ghostprotocols.net:81/blog -S: Maintained +S: Odd fixes 
F: drivers/net/wireless/wl3501* WOLFSON MICROELECTRONICS DRIVERS -- 2.9.3
Re: [PATCH v4 0/3] Miscellaneous fixes for BPF (perf tree)
On Wed, Feb 08, 2017 at 09:27:41PM +0100, Mickaël Salaün wrote:
> This series brings some fixes and small improvements to the BPF samples.
>
> This is intended for the perf tree and applies on 7a5980f9c006 ("tools lib bpf:
> Add missing header to the library").

Wang, are you ok with this series? Joe?

- Arnaldo

> Changes since v3:
> * remove applied patch 1/5
> * remove patch 2/5 on bpf_load_program() as requested by Wang Nan
>
> Changes since v2:
> * add this cover letter
>
> Changes since v1:
> * exclude patches not intended for the perf tree
>
> Regards,
>
> Mickaël Salaün (3):
>   samples/bpf: Ignore already processed ELF sections
>   samples/bpf: Reset global variables
>   samples/bpf: Add missing header
>
>  samples/bpf/bpf_load.c     | 7 +++
>  samples/bpf/tracex5_kern.c | 1 +
>  2 files changed, 8 insertions(+)
>
> --
> 2.11.0
Re: [PATCH net-next v3 04/11] bpf: Use bpf_load_program() from the library
On Tue, Feb 07, 2017 at 03:17:43PM -0800, Alexei Starovoitov wrote:
> On 2/7/17 1:44 PM, Mickaël Salaün wrote:
> >-union bpf_attr attr;
> >+union bpf_attr attr = {};
> >
> >-bzero(&attr, sizeof(attr));
>
> I think somebody mentioned that there are compilers out there
> that don't do it correctly, hence it was done with explicit bzero.
> Arnaldo, Wang, do you remember the details?

https://www.spinics.net/lists/netdev/msg411144.html

But that was for the case where some named initializers are used in a
union with unnamed members like 'union bpf_attr'; I am unsure whether it
would also break in the case above, where no named initializers are used.

Having said that, the above is gratuitous: the code that is being
replaced is not related to the patch at hand, and conceptually the end
result should be the same.

So, please, just leave it as is, i.e. using bzero(), and make your patch
a bit smaller. Remember, small is good, smaller is even better ;-)

- Arnaldo
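To make the bzero() point concrete, here is a minimal sketch of clearing every byte of a union whose members differ in size, so that no byte depends on how a given compiler treats an initializer. The type union demo_attr and both helpers are hypothetical stand-ins invented for this note; this is not the real 'union bpf_attr'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a union with unnamed struct members of
 * different sizes; NOT the real 'union bpf_attr'. */
union demo_attr {
	struct {
		unsigned int map_type;
		unsigned int key_size;
	};
	struct {
		unsigned int prog_type;
		unsigned int insn_cnt;
		unsigned long long insns;
		unsigned long long license;
	};
};

/* Clear every byte of the union, whichever member is used afterwards. */
static void demo_attr_clear(union demo_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
}

/* Return 1 if all bytes of the union are zero, 0 otherwise. */
static int demo_attr_is_clear(const union demo_attr *attr)
{
	const unsigned char *p = (const unsigned char *)attr;
	size_t i;

	for (i = 0; i < sizeof(*attr); i++)
		if (p[i])
			return 0;
	return 1;
}
```

The explicit memset() (or bzero()) covers the bytes of the largest member as well as any padding, which is the property the existing code relies on before handing the union to the kernel.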
Re: [PATCH net-next v3 04/11] bpf: Use bpf_load_program() from the library
On Tue, Feb 07, 2017 at 03:17:43PM -0800, Alexei Starovoitov wrote:
> On 2/7/17 1:44 PM, Mickaël Salaün wrote:
> >-union bpf_attr attr;
> >+union bpf_attr attr = {};
> >
> >-bzero(&attr, sizeof(attr));
>
> I think somebody mentioned that there are compilers out there
> that don't do it correctly, hence it was done with explicit bzero.
> Arnaldo, Wang, do you remember the details?

Yeah, lemme dig it...

- Arnaldo
Re: [PATCH v3 1/5] bpf: Add missing header to the library
On Wed, Feb 08, 2017 at 10:47:18AM +0800, Wangnan (F) wrote:
> >+++ b/tools/lib/bpf/bpf.h
> >@@ -22,6 +22,7 @@
> > #define __BPF_BPF_H
> > #include
> >+#include
> > int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
> >		      int max_entries, __u32 map_flags);
> Looks good to me.
>
> Thank you.

Applied, took the "Thank you" as an "Acked-by: Wang",

Regards,

- Arnaldo
[PATCH 09/14] tools lib bpf: Add bpf_object__pin()
From: Joe Stringer <j...@ovn.org>

Add a new API to pin a BPF object to the filesystem. The user can
specify the path within a BPF filesystem to pin the object.

Programs will be pinned under a subdirectory named the same as the
program, with each instance appearing as a numbered file under that
directory, and maps will be pinned under the path using the name of
the map as the file basename.

For example, with the directory '/sys/fs/bpf/foo' and a BPF object which
contains two instances of a program named 'bar', and a map named 'baz':

  /sys/fs/bpf/foo/bar/0
  /sys/fs/bpf/foo/bar/1
  /sys/fs/bpf/foo/baz

Signed-off-by: Joe Stringer <j...@ovn.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wangn...@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-4-...@ovn.org
[ Check snprintf >= for truncation, as snprintf(bf, size, ...) == size also means truncation ]
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/lib/bpf/libbpf.c | 53 ++
 tools/lib/bpf/libbpf.h | 1 +
 2 files changed, 54 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 6a8c8beeb291..ac6eb863b2a4 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1379,6 +1379,59 @@ int bpf_map__pin(struct bpf_map *map, const char *path)
 	return 0;
 }
 
+int bpf_object__pin(struct bpf_object *obj, const char *path)
+{
+	struct bpf_program *prog;
+	struct bpf_map *map;
+	int err;
+
+	if (!obj)
+		return -ENOENT;
+
+	if (!obj->loaded) {
+		pr_warning("object not yet loaded; load it first\n");
+		return -ENOENT;
+	}
+
+	err = make_dir(path);
+	if (err)
+		return err;
+
+	bpf_map__for_each(map, obj) {
+		char buf[PATH_MAX];
+		int len;
+
+		len = snprintf(buf, PATH_MAX, "%s/%s", path,
+			       bpf_map__name(map));
+		if (len < 0)
+			return -EINVAL;
+		else if (len >= PATH_MAX)
+			return -ENAMETOOLONG;
+
+		err = bpf_map__pin(map, buf);
+		if (err)
+			return err;
+	}
+
+	bpf_object__for_each_program(prog, obj) {
+		char buf[PATH_MAX];
+		int len;
+
+		len = snprintf(buf, PATH_MAX, "%s/%s", path,
+			       prog->section_name);
+		if (len < 0)
+			return -EINVAL;
+		else if (len >= PATH_MAX)
+			return -ENAMETOOLONG;
+
+		err = bpf_program__pin(prog, buf);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 void bpf_object__close(struct bpf_object *obj)
 {
 	size_t i;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 2addf9d5b13c..b30394f9947a 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -65,6 +65,7 @@ struct bpf_object *bpf_object__open(const char *path);
 struct bpf_object *bpf_object__open_buffer(void *obj_buf,
 					   size_t obj_buf_sz,
 					   const char *name);
+int bpf_object__pin(struct bpf_object *object, const char *path);
 void bpf_object__close(struct bpf_object *object);
 
 /* Load/unload object into/from kernel */
--
2.9.3
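The directory layout described in the changelog above can be sketched in isolation. This is only an illustration, assuming two hypothetical helpers (pin_path and pin_instance_path are not libbpf functions) that build the names the way the patch does with snprintf(), including the >= check for truncation.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: build "<dir>/<name>", the form used for maps
 * and for the per-program directory. */
static int pin_path(char *buf, size_t size, const char *dir, const char *name)
{
	int len = snprintf(buf, size, "%s/%s", dir, name);

	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= size)	/* == size also means truncated */
		return -ENAMETOOLONG;
	return 0;
}

/* Hypothetical helper: build "<dir>/<prog>/<instance>", the form used
 * for each instance of a program. */
static int pin_instance_path(char *buf, size_t size, const char *dir,
			     const char *prog, int instance)
{
	int len = snprintf(buf, size, "%s/%s/%d", dir, prog, instance);

	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= size)
		return -ENAMETOOLONG;
	return 0;
}
```

With dir set to "/sys/fs/bpf/foo", the map 'baz' yields "/sys/fs/bpf/foo/baz" and instance 1 of program 'bar' yields "/sys/fs/bpf/foo/bar/1", matching the example in the changelog.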
[PATCH 10/14] tools perf util: Make rm_rf(path) argument const
From: Joe Stringer <j...@ovn.org> rm_rf() doesn't modify its path argument, and a future caller will pass a string constant into it to delete. Signed-off-by: Joe Stringer <j...@ovn.org> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Wang Nan <wangn...@huawei.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170126212001.14103-5-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/perf/util/util.c | 2 +- tools/perf/util/util.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c index bf29aed16bd6..d8b45cea54d0 100644 --- a/tools/perf/util/util.c +++ b/tools/perf/util/util.c @@ -85,7 +85,7 @@ int mkdir_p(char *path, mode_t mode) return (stat(path, ) && mkdir(path, mode)) ? -1 : 0; } -int rm_rf(char *path) +int rm_rf(const char *path) { DIR *dir; int ret = 0; diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h index 6e8be174ec0b..c74708da8571 100644 --- a/tools/perf/util/util.h +++ b/tools/perf/util/util.h @@ -209,7 +209,7 @@ static inline int sane_case(int x, int high) } int mkdir_p(char *path, mode_t mode); -int rm_rf(char *path); +int rm_rf(const char *path); struct strlist *lsdir(const char *name, bool (*filter)(const char *, struct dirent *)); bool lsdir_no_dot_filter(const char *name, struct dirent *d); int copyfile(const char *from, const char *to); -- 2.9.3
[PATCH 11/14] tools lib api fs: Add bpf_fs filesystem detector
From: Joe Stringer <j...@ovn.org> Allow mounting of the BPF filesystem at /sys/fs/bpf. Signed-off-by: Joe Stringer <j...@ovn.org> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Wang Nan <wangn...@huawei.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170126212001.14103-6-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/lib/api/fs/fs.c | 16 tools/lib/api/fs/fs.h | 1 + 2 files changed, 17 insertions(+) diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c index f99f49e4a31e..4b6bfc43cccf 100644 --- a/tools/lib/api/fs/fs.c +++ b/tools/lib/api/fs/fs.c @@ -38,6 +38,10 @@ #define HUGETLBFS_MAGIC0x958458f6 #endif +#ifndef BPF_FS_MAGIC +#define BPF_FS_MAGIC 0xcafe4a11 +#endif + static const char * const sysfs__fs_known_mountpoints[] = { "/sys", 0, @@ -75,6 +79,11 @@ static const char * const hugetlbfs__known_mountpoints[] = { 0, }; +static const char * const bpf_fs__known_mountpoints[] = { + "/sys/fs/bpf", + 0, +}; + struct fs { const char *name; const char * const *mounts; @@ -89,6 +98,7 @@ enum { FS__DEBUGFS = 2, FS__TRACEFS = 3, FS__HUGETLBFS = 4, + FS__BPF_FS = 5, }; #ifndef TRACEFS_MAGIC @@ -121,6 +131,11 @@ static struct fs fs__entries[] = { .mounts = hugetlbfs__known_mountpoints, .magic = HUGETLBFS_MAGIC, }, + [FS__BPF_FS] = { + .name = "bpf", + .mounts = bpf_fs__known_mountpoints, + .magic = BPF_FS_MAGIC, + }, }; static bool fs__read_mounts(struct fs *fs) @@ -280,6 +295,7 @@ FS(procfs, FS__PROCFS); FS(debugfs, FS__DEBUGFS); FS(tracefs, FS__TRACEFS); FS(hugetlbfs, FS__HUGETLBFS); +FS(bpf_fs, FS__BPF_FS); int filename__read_int(const char *filename, int *value) { diff --git a/tools/lib/api/fs/fs.h b/tools/lib/api/fs/fs.h index a63269f5d20c..6b332dc74498 100644 --- a/tools/lib/api/fs/fs.h +++ b/tools/lib/api/fs/fs.h @@ -22,6 +22,7 @@ FS(procfs) FS(debugfs) FS(tracefs) FS(hugetlbfs) +FS(bpf_fs) #undef FS -- 2.9.3
[PATCH 08/14] tools lib bpf: Add bpf_map__pin()
From: Joe Stringer <j...@ovn.org>

Add a new API to pin a BPF map to the filesystem. The user can specify
the full path within a BPF filesystem to pin the map.

Signed-off-by: Joe Stringer <j...@ovn.org>
Cc: Alexei Starovoitov <a...@fb.com>
Cc: Daniel Borkmann <dan...@iogearbox.net>
Cc: Wang Nan <wangn...@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-3-...@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/lib/bpf/libbpf.c | 22 ++
 tools/lib/bpf/libbpf.h | 1 +
 2 files changed, 23 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index c4465b2fddf6..6a8c8beeb291 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1357,6 +1357,28 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
 	return 0;
 }
 
+int bpf_map__pin(struct bpf_map *map, const char *path)
+{
+	int err;
+
+	err = check_path(path);
+	if (err)
+		return err;
+
+	if (map == NULL) {
+		pr_warning("invalid map pointer\n");
+		return -EINVAL;
+	}
+
+	if (bpf_obj_pin(map->fd, path)) {
+		pr_warning("failed to pin map: %s\n", strerror(errno));
+		return -errno;
+	}
+
+	pr_debug("pinned map '%s'\n", path);
+	return 0;
+}
+
 void bpf_object__close(struct bpf_object *obj)
 {
 	size_t i;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 9f8aa63b95f4..2addf9d5b13c 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -236,6 +236,7 @@ typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
 int bpf_map__set_priv(struct bpf_map *map, void *priv,
 		      bpf_map_clear_priv_t clear_priv);
 void *bpf_map__priv(struct bpf_map *map);
+int bpf_map__pin(struct bpf_map *map, const char *path);
 
 long libbpf_get_error(const void *ptr);
--
2.9.3
[PATCH 12/14] perf test: Add libbpf pinning test
From: Joe Stringer <j...@ovn.org> Add a test for the newly added BPF object pinning functionality. For example: # tools/perf/perf test 37 37: BPF filter : 37.1: Basic BPF filtering : Ok 37.2: BPF pinning : Ok 37.3: BPF prologue generation : Ok 37.4: BPF relocation checker : Ok # tools/perf/perf test 37 -v 2>&1 | grep pinned libbpf: pinned map '/sys/fs/bpf/perf_test/flip_table' libbpf: pinned program '/sys/fs/bpf/perf_test/func=SyS_epoll_wait/0' Signed-off-by: Joe Stringer <j...@ovn.org> Requested-and-Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Wang Nan <wangn...@huawei.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170126212001.14103-7-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/perf/tests/bpf.c | 42 +- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c index 92343f43e44a..1a04fe77487d 100644 --- a/tools/perf/tests/bpf.c +++ b/tools/perf/tests/bpf.c @@ -5,11 +5,13 @@ #include #include #include +#include #include #include "tests.h" #include "llvm.h" #include "debug.h" #define NR_ITERS 111 +#define PERF_TEST_BPF_PATH "/sys/fs/bpf/perf_test" #ifdef HAVE_LIBBPF_SUPPORT @@ -54,6 +56,7 @@ static struct { const char *msg_load_fail; int (*target_func)(void); int expect_result; + boolpin; } bpf_testcase_table[] = { { LLVM_TESTCASE_BASE, @@ -63,6 +66,17 @@ static struct { "load bpf object failed", _wait_loop, (NR_ITERS + 1) / 2, + false, + }, + { + LLVM_TESTCASE_BASE, + "BPF pinning", + "[bpf_pinning]", + "fix kbuild first", + "check your vmlinux setting?", + _wait_loop, + (NR_ITERS + 1) / 2, + true, }, #ifdef HAVE_BPF_PROLOGUE { @@ -73,6 +87,7 @@ static struct { "check your vmlinux setting?", _loop, (NR_ITERS + 1) / 4, + false, }, #endif { @@ -83,6 +98,7 @@ static struct { "libbpf error when dealing with relocation", NULL, 0, + false, }, }; @@ -226,10 +242,34 @@ static 
int __test__bpf(int idx) goto out; } - if (obj) + if (obj) { ret = do_test(obj, bpf_testcase_table[idx].target_func, bpf_testcase_table[idx].expect_result); + if (ret != TEST_OK) + goto out; + if (bpf_testcase_table[idx].pin) { + int err; + + if (!bpf_fs__mount()) { + pr_debug("BPF filesystem not mounted\n"); + ret = TEST_FAIL; + goto out; + } + err = mkdir(PERF_TEST_BPF_PATH, 0777); + if (err && errno != EEXIST) { + pr_debug("Failed to make perf_test dir: %s\n", +strerror(errno)); + ret = TEST_FAIL; + goto out; + } + if (bpf_object__pin(obj, PERF_TEST_BPF_PATH)) + ret = TEST_FAIL; + if (rm_rf(PERF_TEST_BPF_PATH)) + ret = TEST_FAIL; + } + } + out: bpf__clear(); return ret; -- 2.9.3
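The test above creates its scratch directory with mkdir() and treats EEXIST as success. A minimal sketch of that idiom follows; ensure_dir is a hypothetical name chosen for this note, not a helper from perf or libbpf.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create 'path', treating "already exists" as
 * success. Note that mkdir() also reports EEXIST when 'path' exists
 * but is not a directory; this sketch does not tell the two apart. */
static int ensure_dir(const char *path, mode_t mode)
{
	if (mkdir(path, mode) && errno != EEXIST)
		return -errno;
	return 0;
}
```

Calling it twice on the same path succeeds both times, while a missing parent directory still surfaces as -ENOENT, so the caller only sees errors it cannot ignore.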
[PATCH 07/14] tools lib bpf: Add BPF program pinning APIs
From: Joe Stringer <j...@ovn.org> Add new APIs to pin a BPF program (or specific instances) to the filesystem. The user can specify the path full path within a BPF filesystem to pin the program. bpf_program__pin_instance(prog, path, n) will pin the nth instance of 'prog' to the specified path. bpf_program__pin(prog, path) will create the directory 'path' (if it does not exist) and pin each instance within that directory. For instance, path/0, path/1, path/2. Committer notes: - Add missing headers for mkdir() - Check strdup() for failure - Check snprintf >= size, not >, as == also means truncated, see 'man snprintf', return value. - Conditionally define BPF_FS_MAGIC, as it isn't in magic.h in older systems and we're not yet having a tools/include/uapi/linux/magic.h copy. - Do not include linux/magic.h, not present in older distros. Signed-off-by: Joe Stringer <j...@ovn.org> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Wang Nan <wangn...@huawei.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170126212001.14103-2-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/lib/bpf/libbpf.c | 120 + tools/lib/bpf/libbpf.h | 3 ++ 2 files changed, 123 insertions(+) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e6cd62b1264b..c4465b2fddf6 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4,6 +4,7 @@ * Copyright (C) 2013-2015 Alexei Starovoitov <a...@kernel.org> * Copyright (C) 2015 Wang Nan <wangn...@huawei.com> * Copyright (C) 2015 Huawei Inc. + * Copyright (C) 2017 Nicira, Inc. 
* * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -22,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -32,6 +34,10 @@ #include #include #include +#include +#include +#include +#include #include #include @@ -42,6 +48,10 @@ #define EM_BPF 247 #endif +#ifndef BPF_FS_MAGIC +#define BPF_FS_MAGIC 0xcafe4a11 +#endif + #define __printf(a, b) __attribute__((format(printf, a, b))) __printf(1, 2) @@ -1237,6 +1247,116 @@ int bpf_object__load(struct bpf_object *obj) return err; } +static int check_path(const char *path) +{ + struct statfs st_fs; + char *dname, *dir; + int err = 0; + + if (path == NULL) + return -EINVAL; + + dname = strdup(path); + if (dname == NULL) + return -ENOMEM; + + dir = dirname(dname); + if (statfs(dir, _fs)) { + pr_warning("failed to statfs %s: %s\n", dir, strerror(errno)); + err = -errno; + } + free(dname); + + if (!err && st_fs.f_type != BPF_FS_MAGIC) { + pr_warning("specified path %s is not on BPF FS\n", path); + err = -EINVAL; + } + + return err; +} + +int bpf_program__pin_instance(struct bpf_program *prog, const char *path, + int instance) +{ + int err; + + err = check_path(path); + if (err) + return err; + + if (prog == NULL) { + pr_warning("invalid program pointer\n"); + return -EINVAL; + } + + if (instance < 0 || instance >= prog->instances.nr) { + pr_warning("invalid prog instance %d of prog %s (max %d)\n", + instance, prog->section_name, prog->instances.nr); + return -EINVAL; + } + + if (bpf_obj_pin(prog->instances.fds[instance], path)) { + pr_warning("failed to pin program: %s\n", strerror(errno)); + return -errno; + } + pr_debug("pinned program '%s'\n", path); + + return 0; +} + +static int make_dir(const char *path) +{ + int err = 0; + + if (mkdir(path, 0700) && errno != EEXIST) + err = -errno; + + if (err) + pr_warning("failed to mkdir %s: %s\n", path, strerror(-err)); + return err; +} + +int bpf_program__pin(struct bpf_program 
*prog, const char *path) +{ + int i, err; + + err = check_path(path); + if (err) + return err; + + if (prog == NULL) { + pr_warning("invalid program pointer\n"); + return -EINVAL; + } + + if (prog->instances.nr <= 0) { + pr_warning("no instances of prog %s to pin\n", + prog->section_name); + return -EINVAL; + } + + err = make_dir(path); + if (err) + return err; + + for (i = 0; i < prog->instances.nr; i++) { + char buf[PATH_MAX]; +
Re: [PATCHv3 perf/core 1/6] tools lib bpf: Add BPF program pinning APIs.
On Tue, Jan 31, 2017 at 01:13:20PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Jan 31, 2017 at 01:08:27PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Jan 30, 2017 at 09:58:05PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, Jan 30, 2017 at 01:16:18PM -0800, Joe Stringer wrote:
> > > > On 30 January 2017 at 12:28, Arnaldo Carvalho de Melo <a...@kernel.org> wrote:
> > > > > ---
> > > > > Thus, a return value of size or more means that the output was
> > > > > truncated.
> > > > > ---
> > > >
> > > > Good spotting, I looked over the committed versions and tested them,
> > > > they seem good to me. Thanks!
> > >
> > > Thanks for checking, will push Ingo's way after a battery of extra
> > > tests, tomorrow,
> >
> > Which failed for centos:5, centos:6, centos:7, debian:7, debian:8,
> > debian:experimental and others, I stopped the test at this point,
> > working on fixing it.
> >
> > All seems related to:
> >
> > libbpf.c:1267: error: 'BPF_FS_MAGIC' undeclared (first use in this function)
> > libbpf.c:1267: error: (Each undeclared identifier is reported only once
> > libbpf.c:1267: error: for each function it appears in.)
>
> We need to carry a tools/include/uapi/linux/magic.h copy, check if it
> drifts, remove the ifdefs for _FS_MAGIC defines from tools/ and use that
> instead, etc, till then I'll just add the ifdef to libbpf.c.

After also removing that #include line, which is not used anywhere else in
tools/{perf,include,lib}/, it is going further:

[root@jouet ~]# time dm
 1 83.120412349 alpine:3.4: Ok
 2 35.486456929 android-ndk:r12b-arm: Ok
 3 85.384259996 archlinux:latest: Ok
 4 49.518031326 centos:5: Ok
 5 70.417375831 centos:6: Ok
 6 87.033156092 centos:7: Ok

31 more to go :-)

- Arnaldo
Re: [PATCHv3 perf/core 1/6] tools lib bpf: Add BPF program pinning APIs.
On Tue, Jan 31, 2017 at 01:08:27PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Jan 30, 2017 at 09:58:05PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Jan 30, 2017 at 01:16:18PM -0800, Joe Stringer wrote:
> > > On 30 January 2017 at 12:28, Arnaldo Carvalho de Melo <a...@kernel.org> wrote:
> > > > ---
> > > > Thus, a return value of size or more means that the output was
> > > > truncated.
> > > > ---
> > >
> > > Good spotting, I looked over the committed versions and tested them,
> > > they seem good to me. Thanks!
> >
> > Thanks for checking, will push Ingo's way after a battery of extra
> > tests, tomorrow,
>
> Which failed for centos:5, centos:6, centos:7, debian:7, debian:8,
> debian:experimental and others, I stopped the test at this point,
> working on fixing it.
>
> All seems related to:
>
> libbpf.c:1267: error: 'BPF_FS_MAGIC' undeclared (first use in this function)
> libbpf.c:1267: error: (Each undeclared identifier is reported only once
> libbpf.c:1267: error: for each function it appears in.)

We need to carry a tools/include/uapi/linux/magic.h copy, check if it
drifts, remove the ifdefs for _FS_MAGIC defines from tools/ and use that
instead, etc, till then I'll just add the ifdef to libbpf.c.

[acme@jouet linux]$ grep BPF_FS_MAGIC /usr/include/*/*.h
/usr/include/linux/magic.h:#define BPF_FS_MAGIC 0xcafe4a11
[acme@jouet linux]$ rpm -qf /usr/include/linux/magic.h
kernel-headers-4.9.6-200.fc25.x86_64
[acme@jouet linux]$ cat /etc/fedora-release
Fedora release 25 (Twenty Five)
[acme@jouet linux]$

But those other distros don't have it.

- Arnaldo
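For reference, the interim workaround amounts to a guarded define plus a statfs() comparison. The following is a minimal Linux-only sketch under that assumption; dir_is_on_bpf_fs is a hypothetical name invented here, not a function from libbpf or tools/lib/api.

```c
#include <errno.h>
#include <sys/vfs.h>

/* Fallback for systems whose installed headers do not provide it. */
#ifndef BPF_FS_MAGIC
#define BPF_FS_MAGIC 0xcafe4a11
#endif

/* Hypothetical helper: returns 1 if 'dir' lives on a BPF filesystem,
 * 0 if it is on some other filesystem, -errno if statfs() fails. */
static int dir_is_on_bpf_fs(const char *dir)
{
	struct statfs st_fs;

	if (statfs(dir, &st_fs))
		return -errno;
	return st_fs.f_type == BPF_FS_MAGIC;
}
```

Because the define is guarded, the same source builds both on a distro whose headers already carry BPF_FS_MAGIC and on the older ones listed above that do not.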
Re: [PATCHv3 perf/core 1/6] tools lib bpf: Add BPF program pinning APIs.
On Mon, Jan 30, 2017 at 09:58:05PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Jan 30, 2017 at 01:16:18PM -0800, Joe Stringer wrote:
> > On 30 January 2017 at 12:28, Arnaldo Carvalho de Melo <a...@kernel.org> wrote:
> > > ---
> > > Thus, a return value of size or more means that the output was
> > > truncated.
> > > ---
> >
> > Good spotting, I looked over the committed versions and tested them,
> > they seem good to me. Thanks!
>
> Thanks for checking, will push Ingo's way after a battery of extra
> tests, tomorrow,

Which failed for centos:5, centos:6, centos:7, debian:7, debian:8,
debian:experimental and others, I stopped the test at this point,
working on fixing it.

All seems related to:

libbpf.c:1267: error: 'BPF_FS_MAGIC' undeclared (first use in this function)
libbpf.c:1267: error: (Each undeclared identifier is reported only once
libbpf.c:1267: error: for each function it appears in.)
Re: [PATCHv3 perf/core 1/6] tools lib bpf: Add BPF program pinning APIs.
On Mon, Jan 30, 2017 at 01:16:18PM -0800, Joe Stringer wrote:
> On 30 January 2017 at 12:28, Arnaldo Carvalho de Melo <a...@kernel.org> wrote:
> > ---
> > Thus, a return value of size or more means that the output was
> > truncated.
> > ---
>
> Good spotting, I looked over the committed versions and tested them,
> they seem good to me. Thanks!

Thanks for checking, will push Ingo's way after a battery of extra
tests, tomorrow,

- Arnaldo
Re: [PATCHv3 perf/core 0/6] Libbpf object pinning
On Thu, Jan 26, 2017 at 01:19:55PM -0800, Joe Stringer wrote:
> This series adds pinning functionality for maps, programs, and objects.
> Library users may call bpf_map__pin(map, path) or bpf_program__pin(prog, path)
> to pin maps and programs separately, or use bpf_object__pin(obj, path) to
> pin all maps and programs from the BPF object to the path. The map and program
> variations require a path where it will be pinned in the filesystem,
> and the object variation will create named directories for each program with
> instances within, and mount the maps by name under the path.
>
> For example, with the directory '/sys/fs/bpf/foo' and a BPF object which
> contains two instances of a program named 'bar', and a map named 'baz':
> /sys/fs/bpf/foo/bar/0
> /sys/fs/bpf/foo/bar/1
> /sys/fs/bpf/foo/baz

Thanks, applied, after some minor fixes.

- Arnaldo

> ---
> v3: Split out bpf_program__pin_instance().
>     Change the paths from PATH/{maps,progs}/foo to the above.
>     Drop the patches that were applied.
>     Add a perf test to check that pinning works.
> v2: Wang Nan provided improvements to patch 1.
>     Dropped patch 2 from v1.
>     Added acks for acked patches.
>     Split the bpf_obj__pin() to also provide map / program pinning APIs.
>     Allow users to provide full filesystem path (don't autodetect/mount
>     BPFFS).
> v1: Initial post.
>
> Joe Stringer (6):
>   tools lib bpf: Add BPF program pinning APIs.
>   tools lib bpf: Add bpf_map__pin()
>   tools lib bpf: Add bpf_object__pin()
>   tools perf util: Make rm_rf(path) argument const
>   tools lib api fs: Add bpf_fs filesystem detector
>   perf test: Add libbpf pinning test
>
>  tools/lib/api/fs/fs.c  | 16 +
>  tools/lib/api/fs/fs.h  | 1 +
>  tools/lib/bpf/libbpf.c | 188 +
>  tools/lib/bpf/libbpf.h | 5 ++
>  tools/perf/tests/bpf.c | 42 ++-
>  tools/perf/util/util.c | 2 +-
>  tools/perf/util/util.h | 2 +-
>  7 files changed, 253 insertions(+), 3 deletions(-)
>
> --
> 2.11.0
Re: [PATCHv3 perf/core 1/6] tools lib bpf: Add BPF program pinning APIs.
Em Mon, Jan 30, 2017 at 05:25:06PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Thu, Jan 26, 2017 at 01:19:56PM -0800, Joe Stringer escreveu: > > Add new APIs to pin a BPF program (or specific instances) to the filesystem. > > The user can specify the path full path within a BPF filesystem to pin the > > program. > > > > bpf_program__pin_instance(prog, path, n) will pin the nth instance of > > 'prog' to the specified path. > > bpf_program__pin(prog, path) will create the directory 'path' (if it > > does not exist) and pin each instance within that directory. For > > instance, path/0, path/1, path/2. > > > > Signed-off-by: Joe Stringer <j...@ovn.org> > > make: Entering directory '/home/acme/git/linux/tools/perf' > BUILD: Doing 'make -j4' parallel build > CC /tmp/build/perf/builtin-record.o > CC /tmp/build/perf/libbpf.o > CC /tmp/build/perf/util/parse-events.o > INSTALL trace_plugins > libbpf.c: In function ‘make_dir’: > libbpf.c:1303:6: error: implicit declaration of function ‘mkdir’ > [-Werror=implicit-function-declaration] > if (mkdir(path, 0700) && errno != EEXIST) > ^ > libbpf.c:1303:2: error: nested extern declaration of ‘mkdir’ > [-Werror=nested-externs] > if (mkdir(path, 0700) && errno != EEXIST) > ^~ > cc1: all warnings being treated as errors > mv: cannot stat '/tmp/build/perf/.libbpf.o.tmp': No such file or directory > /home/acme/git/linux/tools/build/Makefile.build:101: recipe for target > '/tmp/build/perf/libbpf.o' failed > > > And strdup() is not checked for failure, I'm fixing those, > > +++ b/tools/lib/bpf/libbpf.c > @@ -36,6 +36,8 @@ > #include > #include > #include > +#include > +#include > #include This as well: @@ -1338,7 +1343,7 @@ int bpf_program__pin(struct bpf_program *prog, const char *path) len = snprintf(buf, PATH_MAX, "%s/%d", path, i); if (len < 0) return -EINVAL; - else if (len > PATH_MAX) + else if (len >= PATH_MAX) return -ENAMETOOLONG; See 'man snprintf', return value: --- Thus, a return value of size or more means that the output 
was truncated. ---
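A minimal sketch of the truncation check discussed above, for readers who want to try it outside libbpf. The helper name format_pin_path() is hypothetical and not part of libbpf; only the snprintf() return-value rule quoted from the man page is taken from the thread.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a libbpf API): format "<dir>/<instance>" into
 * buf and report truncation the same way the corrected hunk does.
 */
static int format_pin_path(char *buf, size_t size, const char *dir,
			   int instance)
{
	int len = snprintf(buf, size, "%s/%d", dir, instance);

	if (len < 0)
		return -EINVAL;
	/*
	 * snprintf() returns the length the full string would have had,
	 * not counting the terminating NUL. len == size therefore already
	 * means the last character was cut off, hence ">=" and not ">".
	 */
	if ((size_t)len >= size)
		return -ENAMETOOLONG;
	return 0;
}
```

With an 8-byte buffer, "/a/bcd/0" needs 8 characters plus the NUL, so snprintf() returns 8; the original "len > PATH_MAX" style of test would have accepted that truncated result, while ">=" rejects it.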
Re: [PATCHv3 perf/core 1/6] tools lib bpf: Add BPF program pinning APIs.
Em Thu, Jan 26, 2017 at 01:19:56PM -0800, Joe Stringer escreveu: > Add new APIs to pin a BPF program (or specific instances) to the filesystem. > The user can specify the path full path within a BPF filesystem to pin the > program. > > bpf_program__pin_instance(prog, path, n) will pin the nth instance of > 'prog' to the specified path. > bpf_program__pin(prog, path) will create the directory 'path' (if it > does not exist) and pin each instance within that directory. For > instance, path/0, path/1, path/2. > > Signed-off-by: Joe Stringer <j...@ovn.org> make: Entering directory '/home/acme/git/linux/tools/perf' BUILD: Doing 'make -j4' parallel build CC /tmp/build/perf/builtin-record.o CC /tmp/build/perf/libbpf.o CC /tmp/build/perf/util/parse-events.o INSTALL trace_plugins libbpf.c: In function ‘make_dir’: libbpf.c:1303:6: error: implicit declaration of function ‘mkdir’ [-Werror=implicit-function-declaration] if (mkdir(path, 0700) && errno != EEXIST) ^ libbpf.c:1303:2: error: nested extern declaration of ‘mkdir’ [-Werror=nested-externs] if (mkdir(path, 0700) && errno != EEXIST) ^~ cc1: all warnings being treated as errors mv: cannot stat '/tmp/build/perf/.libbpf.o.tmp': No such file or directory /home/acme/git/linux/tools/build/Makefile.build:101: recipe for target '/tmp/build/perf/libbpf.o' failed And strdup() is not checked for failure, I'm fixing those, +++ b/tools/lib/bpf/libbpf.c @@ -36,6 +36,8 @@ #include #include #include +#include +#include #include - Arnaldo > --- > v3: Add per-instance pinning. > Use path for bpf_program__pin() as directory. > v2: Don't automount BPF filesystem > Split program, map, object pinning into separate APIs and separate > patches. 
> --- > tools/lib/bpf/libbpf.c | 112 > + > tools/lib/bpf/libbpf.h | 3 ++ > 2 files changed, 115 insertions(+) > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index e6cd62b1264b..d1d7638b7c21 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -4,6 +4,7 @@ > * Copyright (C) 2013-2015 Alexei Starovoitov > * Copyright (C) 2015 Wang Nan > * Copyright (C) 2015 Huawei Inc. > + * Copyright (C) 2017 Nicira, Inc. > * > * This program is free software; you can redistribute it and/or > * modify it under the terms of the GNU Lesser General Public > @@ -22,6 +23,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -31,7 +33,10 @@ > #include > #include > #include > +#include > #include > +#include > +#include > #include > #include > > @@ -1237,6 +1242,113 @@ int bpf_object__load(struct bpf_object *obj) > return err; > } > > +static int check_path(const char *path) > +{ > + struct statfs st_fs; > + char *dname, *dir; > + int err = 0; > + > + if (path == NULL) > + return -EINVAL; > + > + dname = strdup(path); > + dir = dirname(dname); > + if (statfs(dir, &st_fs)) { > + pr_warning("failed to statfs %s: %s\n", dir, strerror(errno)); > + err = -errno; > + } > + free(dname); > + > + if (!err && st_fs.f_type != BPF_FS_MAGIC) { > + pr_warning("specified path %s is not on BPF FS\n", path); > + err = -EINVAL; > + } > + > + return err; > +} > + > +int bpf_program__pin_instance(struct bpf_program *prog, const char *path, > + int instance) > +{ > + int err; > + > + err = check_path(path); > + if (err) > + return err; > + > + if (prog == NULL) { > + pr_warning("invalid program pointer\n"); > + return -EINVAL; > + } > + > + if (instance < 0 || instance >= prog->instances.nr) { > + pr_warning("invalid prog instance %d of prog %s (max %d)\n", > +instance, prog->section_name, prog->instances.nr); > + return -EINVAL; > + } > + > + if (bpf_obj_pin(prog->instances.fds[instance], path)) { > + pr_warning("failed to pin program: 
%s\n", strerror(errno)); > + return -errno; > + } > + pr_debug("pinned program '%s'\n", path); > + > + return 0; > +} > + > +static int make_dir(const char *path) > +{ > + int err = 0; > + > + if (mkdir(path, 0700) && errno != EEXIST) > + err = -errno; > + > + if (err) > + pr_warning("failed to mkdir %s: %s\n", path, strerror(-err)); > + return err; > +} > + > +int bpf_program__pin(struct bpf_program *prog, const char *path) > +{ > + int i, err; > + > + err = check_path(path); > + if (err) > + return err; > + > + if (prog == NULL) { > + pr_warning("invalid program pointer\n"); > + return -EINVAL; > + } > + > + if (prog->instances.nr <= 0) { > + pr_warning("no instances of
Re: [PATCH net-next 1/3] trace: add variant without spacing in trace_print_hex_seq
Em Wed, Jan 25, 2017 at 02:28:16AM +0100, Daniel Borkmann escreveu: > For upcoming tracepoint support for BPF, we want to dump the program's > tag. Format should be similar to __print_hex(), but without spacing. > Add a __print_hex_str() variant for exactly that purpose that reuses > trace_print_hex_seq(). Steven should be back to his side of the wall soon, will wait for his Ack, ok? - Arnaldo > Signed-off-by: Daniel Borkmann <dan...@iogearbox.net> > Cc: Steven Rostedt <rost...@goodmis.org> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com> > --- > include/linux/trace_events.h | 3 ++- > include/trace/trace_events.h | 8 +++- > kernel/trace/trace_output.c | 7 --- > 3 files changed, 13 insertions(+), 5 deletions(-) > > diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h > index be00761..cfa475a 100644 > --- a/include/linux/trace_events.h > +++ b/include/linux/trace_events.h > @@ -33,7 +33,8 @@ const char *trace_print_bitmask_seq(struct trace_seq *p, > void *bitmask_ptr, > unsigned int bitmask_size); > > const char *trace_print_hex_seq(struct trace_seq *p, > - const unsigned char *buf, int len); > + const unsigned char *buf, int len, > + bool spacing); > > const char *trace_print_array_seq(struct trace_seq *p, > const void *buf, int count, > diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h > index 467e12f..9f68462 100644 > --- a/include/trace/trace_events.h > +++ b/include/trace/trace_events.h > @@ -297,7 +297,12 @@ > #endif > > #undef __print_hex > -#define __print_hex(buf, buf_len) trace_print_hex_seq(p, buf, buf_len) > +#define __print_hex(buf, buf_len)\ > + trace_print_hex_seq(p, buf, buf_len, true) > + > +#undef __print_hex_str > +#define __print_hex_str(buf, buf_len) > \ > + trace_print_hex_seq(p, buf, buf_len, false) > > #undef __print_array > #define __print_array(array, count, el_size) \ > @@ -711,6 +716,7 @@ > #undef __print_flags > #undef __print_symbolic > #undef __print_hex > +#undef __print_hex_str > #undef 
__get_dynamic_array > #undef __get_dynamic_array_len > #undef __get_str > diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c > index 5d33a73..30a144b1 100644 > --- a/kernel/trace/trace_output.c > +++ b/kernel/trace/trace_output.c > @@ -163,14 +163,15 @@ enum print_line_t trace_print_printk_msg_only(struct > trace_iterator *iter) > EXPORT_SYMBOL_GPL(trace_print_bitmask_seq); > > const char * > -trace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int > buf_len) > +trace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int > buf_len, > + bool spacing) > { > int i; > const char *ret = trace_seq_buffer_ptr(p); > > for (i = 0; i < buf_len; i++) > - trace_seq_printf(p, "%s%2.2x", i == 0 ? "" : " ", buf[i]); > - > + trace_seq_printf(p, "%s%2.2x", !spacing || i == 0 ? "" : " ", > + buf[i]); > trace_seq_putc(p, 0); > > return ret; > -- > 1.9.3
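To illustrate the spacing flag added by the patch above, here is a userspace sketch that mirrors the loop in trace_print_hex_seq() but writes into a plain character buffer instead of a struct trace_seq. The function name hex_dump() and its buffer handling are assumptions made for this example and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for trace_print_hex_seq(): formats
 * buf as two-digit hex bytes, separated by spaces only when 'spacing'
 * is true. Returns the number of characters written, or -1 if the
 * output buffer is too small.
 */
static int hex_dump(char *out, size_t out_sz, const unsigned char *buf,
		    int buf_len, bool spacing)
{
	size_t used = 0;
	int i;

	if (out_sz == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < buf_len; i++) {
		/* Same separator rule as the patched kernel loop. */
		int n = snprintf(out + used, out_sz - used, "%s%2.2x",
				 !spacing || i == 0 ? "" : " ",
				 (unsigned int)buf[i]);

		if (n < 0 || (size_t)n >= out_sz - used)
			return -1;	/* output would be truncated */
		used += (size_t)n;
	}
	return (int)used;
}
```

For the bytes de ad 0b, spacing=true corresponds to __print_hex() and yields "de ad 0b", while spacing=false corresponds to the new __print_hex_str() and yields "dead0b", the compact form wanted for a program tag.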
Re: [PATCHv2 perf/core 5/7] tools lib bpf: Add bpf_program__pin()
Em Wed, Jan 25, 2017 at 10:18:22AM +0800, Wangnan (F) escreveu: > On 2017/1/25 9:16, Joe Stringer wrote: > > On 24 January 2017 at 17:06, Wangnan (F) wrote: > > > On 2017/1/25 9:04, Wangnan (F) wrote: > > > Is it possible to use directory tree instead? > > > %s/object/mapname > > > %s/object/prog/instance > > I don't think objects have names, so let's assume an object with two > > program instances named foo, and one map named bar. > > A call of bpf_object__pin(obj, "/sys/fs/bpf/myobj") would mount with > > the following files and directories: > > /sys/fs/bpf/myobj/foo/1 > > /sys/fs/bpf/myobj/foo/2 > > /sys/fs/bpf/myobj/bar > > Alternatively, if you want to control exactly where you want the > > progs/maps to be pinned, you can call eg > > bpf_program__pin_instance(prog, "/sys/fs/bpf/wherever", 0) and that > > instance will be mounted to /sys/fs/bpf/wherever, or alternatively > > bpf_program__pin(prog, "/sys/fs/bpf/foo"), and you will end up with > > /sys/fs/bpf/foo/{0,1}. > > This looks pretty reasonable to me. > It looks good to me. Ok, please continue from perf/core, Ingo merged the first patch of this patchset today, - Arnaldo
[PATCH 16/23] tools lib bpf: Add set/is helpers for all prog types
From: Joe Stringer <j...@ovn.org> These bpf_prog_types were exposed in the uapi but there were no corresponding functions to set these types for programs in libbpf. Signed-off-by: Joe Stringer <j...@ovn.org> Acked-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170123011128.26534-4-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/lib/bpf/libbpf.c | 5 + tools/lib/bpf/libbpf.h | 10 ++ 2 files changed, 15 insertions(+) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 371cb40a2304..406838fa9c4f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1448,8 +1448,13 @@ bool bpf_program__is_##NAME(struct bpf_program *prog) \ return bpf_program__is_type(prog, TYPE);\ } \ +BPF_PROG_TYPE_FNS(socket_filter, BPF_PROG_TYPE_SOCKET_FILTER); BPF_PROG_TYPE_FNS(kprobe, BPF_PROG_TYPE_KPROBE); +BPF_PROG_TYPE_FNS(sched_cls, BPF_PROG_TYPE_SCHED_CLS); +BPF_PROG_TYPE_FNS(sched_act, BPF_PROG_TYPE_SCHED_ACT); BPF_PROG_TYPE_FNS(tracepoint, BPF_PROG_TYPE_TRACEPOINT); +BPF_PROG_TYPE_FNS(xdp, BPF_PROG_TYPE_XDP); +BPF_PROG_TYPE_FNS(perf_event, BPF_PROG_TYPE_PERF_EVENT); int bpf_map__fd(struct bpf_map *map) { diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index a5a8b86a06fe..2188ccdc0e2d 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -174,11 +174,21 @@ int bpf_program__nth_fd(struct bpf_program *prog, int n); /* * Adjust type of bpf program. Default is kprobe. 
*/ +int bpf_program__set_socket_filter(struct bpf_program *prog); int bpf_program__set_tracepoint(struct bpf_program *prog); int bpf_program__set_kprobe(struct bpf_program *prog); +int bpf_program__set_sched_cls(struct bpf_program *prog); +int bpf_program__set_sched_act(struct bpf_program *prog); +int bpf_program__set_xdp(struct bpf_program *prog); +int bpf_program__set_perf_event(struct bpf_program *prog); +bool bpf_program__is_socket_filter(struct bpf_program *prog); bool bpf_program__is_tracepoint(struct bpf_program *prog); bool bpf_program__is_kprobe(struct bpf_program *prog); +bool bpf_program__is_sched_cls(struct bpf_program *prog); +bool bpf_program__is_sched_act(struct bpf_program *prog); +bool bpf_program__is_xdp(struct bpf_program *prog); +bool bpf_program__is_perf_event(struct bpf_program *prog); /* * We don't need __attribute__((packed)) now since it is -- 2.9.3
[PATCH 15/23] tools lib bpf: Define prog_type fns with macro
From: Joe Stringer <j...@ovn.org> Turning this into a macro allows future prog types to be added with a single line per type. Signed-off-by: Joe Stringer <j...@ovn.org> Acked-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170123011128.26534-3-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/lib/bpf/libbpf.c | 41 - 1 file changed, 16 insertions(+), 25 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 671d5ad07cf1..371cb40a2304 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1428,37 +1428,28 @@ static void bpf_program__set_type(struct bpf_program *prog, prog->type = type; } -int bpf_program__set_tracepoint(struct bpf_program *prog) -{ - if (!prog) - return -EINVAL; - bpf_program__set_type(prog, BPF_PROG_TYPE_TRACEPOINT); - return 0; -} - -int bpf_program__set_kprobe(struct bpf_program *prog) -{ - if (!prog) - return -EINVAL; - bpf_program__set_type(prog, BPF_PROG_TYPE_KPROBE); - return 0; -} - static bool bpf_program__is_type(struct bpf_program *prog, enum bpf_prog_type type) { return prog ? (prog->type == type) : false; } -bool bpf_program__is_tracepoint(struct bpf_program *prog) -{ - return bpf_program__is_type(prog, BPF_PROG_TYPE_TRACEPOINT); -} - -bool bpf_program__is_kprobe(struct bpf_program *prog) -{ - return bpf_program__is_type(prog, BPF_PROG_TYPE_KPROBE); -} +#define BPF_PROG_TYPE_FNS(NAME, TYPE) \ +int bpf_program__set_##NAME(struct bpf_program *prog) \ +{ \ + if (!prog) \ + return -EINVAL; \ + bpf_program__set_type(prog, TYPE); \ + return 0; \ +} \ + \ +bool bpf_program__is_##NAME(struct bpf_program *prog) \ +{ \ + return bpf_program__is_type(prog, TYPE);\ +} \ + +BPF_PROG_TYPE_FNS(kprobe, BPF_PROG_TYPE_KPROBE); +BPF_PROG_TYPE_FNS(tracepoint, BPF_PROG_TYPE_TRACEPOINT); int bpf_map__fd(struct bpf_map *map) { -- 2.9.3
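The macro pattern in the patch above can be tried in isolation. The sketch below uses simplified stand-in types (struct program, enum prog_type) and a renamed macro PROG_TYPE_FNS; these names are assumptions for illustration and are not the libbpf definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for enum bpf_prog_type and struct bpf_program. */
enum prog_type { PROG_TYPE_UNSPEC, PROG_TYPE_KPROBE, PROG_TYPE_TRACEPOINT };

struct program {
	enum prog_type type;
};

/*
 * One invocation generates a setter and a predicate for a program
 * type, so supporting a new type costs a single line.
 */
#define PROG_TYPE_FNS(NAME, TYPE)				\
static int program__set_##NAME(struct program *prog)		\
{								\
	if (!prog)						\
		return -EINVAL;					\
	prog->type = TYPE;					\
	return 0;						\
}								\
								\
static bool program__is_##NAME(const struct program *prog)	\
{								\
	return prog ? prog->type == TYPE : false;		\
}

PROG_TYPE_FNS(kprobe, PROG_TYPE_KPROBE)
PROG_TYPE_FNS(tracepoint, PROG_TYPE_TRACEPOINT)
```

The ## operator pastes NAME into the function identifiers, which is why the follow-up patch in this series can add the remaining program types with one BPF_PROG_TYPE_FNS() line each.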
[PATCH 17/23] tools lib bpf: Add libbpf_get_error()
From: Joe Stringer <j...@ovn.org> This function will turn a libbpf pointer into a standard error code (or 0 if the pointer is valid). This also allows removal of the dependency on linux/err.h in the public header file, which causes problems in userspace programs built against libbpf. Signed-off-by: Joe Stringer <j...@ovn.org> Acked-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170123011128.26534-5-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/lib/bpf/libbpf.c | 8 tools/lib/bpf/libbpf.h | 4 +++- tools/perf/tests/llvm.c | 2 +- 3 files changed, 12 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 406838fa9c4f..e6cd62b1264b 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include #include @@ -1542,3 +1543,10 @@ bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset) } return ERR_PTR(-ENOENT); } + +long libbpf_get_error(const void *ptr) +{ + if (IS_ERR(ptr)) + return PTR_ERR(ptr); + return 0; +} diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 2188ccdc0e2d..4014d1ba5e3d 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -22,8 +22,8 @@ #define __BPF_LIBBPF_H #include +#include #include -#include #include // for size_t enum libbpf_errno { @@ -234,4 +234,6 @@ int bpf_map__set_priv(struct bpf_map *map, void *priv, bpf_map_clear_priv_t clear_priv); void *bpf_map__priv(struct bpf_map *map); +long libbpf_get_error(const void *ptr); + #endif diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c index 02a33ebcd992..d357dab72e68 100644 --- a/tools/perf/tests/llvm.c +++ b/tools/perf/tests/llvm.c @@ -13,7 +13,7 @@ static int test__bpf_parsing(void *obj_buf, size_t obj_buf_sz) struct bpf_object *obj; obj = bpf_object__open_buffer(obj_buf, 
obj_buf_sz, NULL); - if (IS_ERR(obj)) + if (libbpf_get_error(obj)) return TEST_FAIL; bpf_object__close(obj); return TEST_OK; -- 2.9.3
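A rough userspace sketch of the idea behind libbpf_get_error(): an error code is carried in the top 4095 values of the pointer range and decoded on demand. The helpers err_ptr(), is_err() and get_error() are hypothetical names written for this example; they imitate the kernel-style ERR_PTR()/IS_ERR()/PTR_ERR() convention rather than reproduce the libbpf source.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* largest errno value that is encoded this way */

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* A pointer in the last MAX_ERRNO values of the address range is an error. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Return the encoded error code, or 0 for an ordinary pointer. */
static long get_error(const void *ptr)
{
	return is_err(ptr) ? (long)(intptr_t)ptr : 0;
}
```

A caller then only needs a plain integer test, as in the tools/perf/tests/llvm.c hunk above, and no longer has to include linux/err.h to interpret the return value of bpf_object__open_buffer().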
[PATCH 14/23] tools lib bpf: Fix map offsets in relocation
From: Joe Stringer <j...@ovn.org> Commit 4708bbda5cb2 ("tools lib bpf: Fix maps resolution") attempted to fix map resolution by identifying the number of symbols that point to maps, and using this number to resolve each of the maps. However, during relocation the original definition of the map size was still in use. For up to two maps, the calculation was correct if there was a small difference in size between the map definition in libbpf and the one that the client library uses. However if the difference was large, particularly if more than two maps were used in the BPF program, the relocation would fail. For example, when using a map definition with size 28, with three maps, map relocation would count: (sym_offset / sizeof(struct bpf_map_def) => map_idx) (0 / 16 => 0), ie map_idx = 0 (28 / 16 => 1), ie map_idx = 1 (56 / 16 => 3), ie map_idx = 3 So, libbpf reports: libbpf: bpf relocation: map_idx 3 large than 2 Fix map relocation by checking the exact offset of maps when doing relocation. 
Signed-off-by: Joe Stringer <j...@ovn.org> [Allow different map size in an object] Signed-off-by: Wang Nan <wangn...@huawei.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: netdev@vger.kernel.org Fixes: 4708bbda5cb2 ("tools lib bpf: Fix maps resolution") Link: http://lkml.kernel.org/r/20170123011128.26534-2-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- tools/lib/bpf/libbpf.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 84e6b35da4bd..671d5ad07cf1 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -779,7 +779,7 @@ static int bpf_program__collect_reloc(struct bpf_program *prog, size_t nr_maps, GElf_Shdr *shdr, Elf_Data *data, Elf_Data *symbols, - int maps_shndx) + int maps_shndx, struct bpf_map *maps) { int i, nrels; @@ -829,7 +829,15 @@ bpf_program__collect_reloc(struct bpf_program *prog, return -LIBBPF_ERRNO__RELOC; } - map_idx = sym.st_value / sizeof(struct bpf_map_def); + /* TODO: 'maps' is sorted. We can use bsearch to make it faster. */ + for (map_idx = 0; map_idx < nr_maps; map_idx++) { + if (maps[map_idx].offset == sym.st_value) { + pr_debug("relocation: find map %zd (%s) for insn %u\n", +map_idx, maps[map_idx].name, insn_idx); + break; + } + } + if (map_idx >= nr_maps) { pr_warning("bpf relocation: map_idx %d large than %d\n", (int)map_idx, (int)nr_maps - 1); @@ -953,7 +961,8 @@ static int bpf_object__collect_reloc(struct bpf_object *obj) err = bpf_program__collect_reloc(prog, nr_maps, shdr, data, obj->efile.symbols, -obj->efile.maps_shndx); +obj->efile.maps_shndx, +obj->maps); if (err) return err; } -- 2.9.3
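The arithmetic in the commit message can be reproduced with a small sketch. struct map_entry and both function names are stand-ins invented for this example; the sizes 16 and 28 are the ones used in the commit message.

```c
#include <stddef.h>

/* Stand-in for the part of struct bpf_map that matters here. */
struct map_entry {
	size_t offset;		/* offset of the map inside the maps section */
	const char *name;
};

/* Old behaviour: assume every map occupies libbpf's own definition size. */
static size_t map_idx_by_division(size_t sym_offset, size_t libbpf_def_size)
{
	return sym_offset / libbpf_def_size;
}

/*
 * Fixed behaviour: look for the map whose recorded offset matches the
 * symbol value exactly. Returns nr_maps when nothing matches, which the
 * caller treats as an error.
 */
static size_t map_idx_by_offset(const struct map_entry *maps, size_t nr_maps,
				size_t sym_offset)
{
	size_t i;

	for (i = 0; i < nr_maps; i++) {
		if (maps[i].offset == sym_offset)
			break;
	}
	return i;
}
```

With three 28-byte definitions at offsets 0, 28 and 56, dividing by 16 gives indices 0, 1 and 3, so the third map is rejected as out of range, while the exact-offset search gives 0, 1 and 2.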
Re: [PATCHv2 perf/core 0/7] Libbpf improvements
Em Sun, Jan 22, 2017 at 05:11:21PM -0800, Joe Stringer escreveu: > Patch 1 fixes an issue when using drastically different BPF map definitions > inside ELFs from a client using libbpf, vs the map definition libbpf uses. > > Patches 2-4 add some simple, useful helper functions for setting prog type > and retrieving libbpf errors without depending on kernel headers from > userspace programs. > > Patches 5-7 add a new pinning functionality for maps, programs, and objects. > Library users may call bpf_map__pin(map, path) or bpf_program__pin(prog, path) > to pin maps and programs separately, or use bpf_object__pin(obj, path) to > pin all maps and programs from the BPF object to the path. The map and program > variations require a full path where it will be pinned in the filesystem, > and the object variation will create directories "maps/" and "progs/" under > the specified path, then mount each map and program under those > subdirectories. Merged the ones either acked by Wang or adjusted by you to address Wang's remarks, the last ones introducing those __pin() methods, please provide users together with those APIs, preferably entries for 'perf test', - Arnaldo > --- > v1: Initial post. > v2: Wang Nan provided improvements to patch 1. > Dropped patch 2 from v1. > Added acks for acked patches. > Split the bpf_obj__pin() to also provide map / program pinning APIs. > Allow users to provide full filesystem path (don't autodetect/mount > BPFFS). > > Joe Stringer (7): > tools lib bpf: Fix map offsets in relocation > tools lib bpf: Define prog_type fns with macro > tools lib bpf: Add set/is helpers for all prog types > tools lib bpf: Add libbpf_get_error() > tools lib bpf: Add bpf_program__pin() > tools lib bpf: Add bpf_map__pin() > tools lib bpf: Add bpf_object__pin() > > tools/lib/bpf/libbpf.c | 240 > ++-- > tools/lib/bpf/libbpf.h | 17 +++- > tools/perf/tests/llvm.c | 2 +- > 3 files changed, 229 insertions(+), 30 deletions(-) > > -- > 2.11.0
Re: [patch] samples/bpf: silence shift wrapping warning
Em Mon, Jan 23, 2017 at 10:44:34PM -0800, Alexei Starovoitov escreveu: > On Mon, Jan 23, 2017 at 5:27 AM, Arnaldo Carvalho de Melo > <arnaldo.m...@gmail.com> wrote: > > Em Sun, Jan 22, 2017 at 02:51:25PM -0800, Alexei Starovoitov escreveu: > >> On Sat, Jan 21, 2017 at 07:51:43AM +0300, Dan Carpenter wrote: > >> > max_key is a value in the 0-63 range, so on 32 bit systems the shift > >> > could wrap. > >> > > >> > Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com> > >> > >> Looks fine. I think 'net-next' is ok. > > > > I could process these patches, if that would help, > > Thanks for the offer. > I don't think there will be conflicts with all the work happening in net-next, > but it's best to avoid even possibility of it when we can. Okay sir, I'll let you know when/if the tests I perform building samples/bpf/ in my containers catch something, - Arnaldo > Dan, > can you please resend the patch cc-ing Dave and netdev ? > please mention [PATCH net-next] in the subject. > > > - Arnaldo > > > >> Acked-by: Alexei Starovoitov <a...@kernel.org> > > > >> > diff --git a/samples/bpf/lwt_len_hist_user.c > >> > b/samples/bpf/lwt_len_hist_user.c > >> > index ec8f3bb..bd06eef 100644 > >> > --- a/samples/bpf/lwt_len_hist_user.c > >> > +++ b/samples/bpf/lwt_len_hist_user.c > >> > @@ -68,7 +68,7 @@ int main(int argc, char **argv) > >> > for (i = 1; i <= max_key + 1; i++) { > >> > stars(starstr, data[i - 1], max_value, MAX_STARS); > >> > printf("%8ld -> %-8ld : %-8ld |%-*s|\n", > >> > - (1l << i) >> 1, (1l << i) - 1, data[i - 1], > >> > + (1ULL << i) >> 1, (1ULL << i) - 1, data[i - 1], > >> >MAX_STARS, starstr); > >> > } > >> >
Re: [patch] samples/bpf: silence shift wrapping warning
Em Sun, Jan 22, 2017 at 02:51:25PM -0800, Alexei Starovoitov escreveu: > On Sat, Jan 21, 2017 at 07:51:43AM +0300, Dan Carpenter wrote: > > max_key is a value in the 0-63 range, so on 32 bit systems the shift > > could wrap. > > > > Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com> > > Looks fine. I think 'net-next' is ok. I could process these patches, if that would help, - Arnaldo > Acked-by: Alexei Starovoitov <a...@kernel.org> > > diff --git a/samples/bpf/lwt_len_hist_user.c > > b/samples/bpf/lwt_len_hist_user.c > > index ec8f3bb..bd06eef 100644 > > --- a/samples/bpf/lwt_len_hist_user.c > > +++ b/samples/bpf/lwt_len_hist_user.c > > @@ -68,7 +68,7 @@ int main(int argc, char **argv) > > for (i = 1; i <= max_key + 1; i++) { > > stars(starstr, data[i - 1], max_value, MAX_STARS); > > printf("%8ld -> %-8ld : %-8ld |%-*s|\n", > > - (1l << i) >> 1, (1l << i) - 1, data[i - 1], > > + (1ULL << i) >> 1, (1ULL << i) - 1, data[i - 1], > >MAX_STARS, starstr); > > } > >
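A small sketch of the bucket-bound computation touched by this patch, using a 64-bit constant so the shift does not depend on the width of long. The function names bucket_low() and bucket_high() are hypothetical, and the sketch deliberately restricts i to 1..63, since a shift by 64 is undefined even for a 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Lower bound of histogram bucket i, computed as in the sample:
 * (1 << i) >> 1. 1ULL is at least 64 bits wide on every platform,
 * whereas 1l is only 32 bits wide where long is 32 bits.
 */
static uint64_t bucket_low(unsigned int i)
{
	assert(i >= 1 && i < 64);	/* assumption of this sketch */
	return (1ULL << i) >> 1;
}

/* Upper bound of histogram bucket i: (1 << i) - 1. */
static uint64_t bucket_high(unsigned int i)
{
	assert(i >= 1 && i < 64);	/* assumption of this sketch */
	return (1ULL << i) - 1;
}
```

When printing such values portably, a matching conversion such as PRIu64 from inttypes.h would be needed in place of %ld.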
Re: [PATCH perf/core REBASE 3/5] tools lib bpf: Add bpf_prog_{attach,detach}
Em Tue, Dec 20, 2016 at 10:50:22AM -0800, Joe Stringer escreveu: > On 20 December 2016 at 06:32, Arnaldo Carvalho de Melo <a...@kernel.org> > wrote: > > Em Tue, Dec 20, 2016 at 11:18:51AM -0300, Arnaldo Carvalho de Melo escreveu: > >> This one makes it fail for CentOS 5 and 6, others may fail as well, > >> still building, investigating... > > > > Ok, fixed it by making it follow the model of the other sys_bpf wrappers > > setting up that bpf_attr union wrt initializing unnamed struct members: > > - union bpf_attr attr = { > > - .target_fd = target_fd, > > - }; > > + union bpf_attr attr; > > + > > + bzero(&attr, sizeof(attr)); > > + attr.target_fd = target_fd; > Ah, I just shifted these across originally so the delta would be > minimal but now I know why this code is like this. Thanks. np, making sure this code works in all those environments requires automation, I'd say it's impossible otherwise, too many details :-\ Fixed, pushed, merged, should hit 4.10 soon :-) - Arnaldo
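The initialization style chosen in the fix above can be sketched with a reduced stand-in for union bpf_attr. union demo_attr and demo_attr_init_attach() are hypothetical names for this example, and memset() is used here in place of the bzero() call from the thread; the effect is the same.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reduced stand-in for union bpf_attr: a union whose alternatives are
 * anonymous structs, so their fields are reached directly through the
 * union (requires C11 or the GNU extension).
 */
union demo_attr {
	struct {
		uint32_t map_fd;
		uint64_t key;
	};
	struct {
		uint32_t target_fd;
		uint32_t attach_bpf_fd;
		uint32_t attach_type;
	};
};

/*
 * Zero the whole union first, then assign individual fields. This
 * avoids designated initializers for members of the anonymous structs,
 * which is the construct the thread reports as breaking the build on
 * CentOS 5 and 6.
 */
static void demo_attr_init_attach(union demo_attr *attr, uint32_t target_fd)
{
	memset(attr, 0, sizeof(*attr));
	attr->target_fd = target_fd;
}
```

Clearing the full union also guarantees that bytes belonging to the other alternatives are zero before the structure is handed to a system call wrapper.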
[PATCH 19/29] samples/bpf: Make samples more libbpf-centric
From: Joe Stringer <j...@ovn.org> Switch all of the sample code to use the function names from tools/lib/bpf so that they're consistent with that, and to declare their own log buffers. This allow the next commit to be purely devoted to getting rid of the duplicate library in samples/bpf. Committer notes: Testing it: On a fedora rawhide container, with clang/llvm 3.9, sharing the host linux kernel git tree: # make O=/tmp/build/linux/ headers_install # make O=/tmp/build/linux -C samples/bpf/ Since I forgot to make it privileged, just tested it outside the container, using what it generated: # uname -a Linux jouet 4.9.0-rc8+ #1 SMP Mon Dec 12 11:20:49 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux # cd /var/lib/docker/devicemapper/mnt/c43e09a53ff56c86a07baf79847f00e2cc2a17a1e2220e1adbf8cbc62734feda/rootfs/tmp/build/linux/samples/bpf/ # ls -la offwaketime -rwxr-xr-x. 1 root root 24200 Dec 15 12:19 offwaketime # file offwaketime offwaketime: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=c940d3f127d5e66cdd680e42d885cb0b64f8a0e4, not stripped # readelf -SW offwaketime_kern.o | grep PROGBITS [ 2] .text PROGBITS 40 00 00 AX 0 0 4 [ 3] kprobe/try_to_wake_up PROGBITS 40 d8 00 AX 0 0 8 [ 5] tracepoint/sched/sched_switch PROGBITS 000118 000318 00 AX 0 0 8 [ 7] maps PROGBITS 000430 50 00 WA 0 0 4 [ 8] license PROGBITS 000480 04 00 WA 0 0 1 [ 9] version PROGBITS 000484 04 00 WA 0 0 4 # ./offwaketime | head -5 swapper/1;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 106 CPU 
0/KVM;entry_SYSCALL_64_fastpath;sys_ioctl;do_vfs_ioctl;kvm_vcpu_ioctl;kvm_arch_vcpu_ioctl_run;kvm_vcpu_block;schedule;__schedule;-;try_to_wake_up;swake_up_locked;swake_up;apic_timer_expired;apic_timer_fn;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary;;swapper/3 2 Compositor;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;futex_requeue;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;SoftwareVsyncTh 5 firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 13 JS Helper;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;firefox 2 # Signed-off-by: Joe Stringer <j...@ovn.org> Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Wang Nan <wangn...@huawei.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20161214224342.12858-2-...@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- samples/bpf/bpf_load.c| 17 +--- samples/bpf/bpf_load.h| 3 +++ samples/bpf/fds_example.c | 9 --- samples/bpf/lathist_user.c| 2 +- samples/bpf/libbpf.c | 23 samples/bpf/libbpf.h | 18 ++--- samples/bpf/lwt_len_hist_user.c | 6 +++-- samples/bpf/offwaketime_user.c| 8 +++--- samples/bpf/sampleip_user.c | 4 +-- samples/bpf/sock_example.c| 12 + samples/bpf/sockex1_user.c| 6 ++--- samples/bpf/sockex2_user.c| 4 +-- samples/bpf/sockex3_user.c| 4 +-- samples/bpf/spintest_user.c | 8 +++--- samples/bpf/tc_l2_redirect_user.c | 4 +-- samples/bpf/test_cgrp2_array_pin.c| 4 
+-- samples/bpf/test_cgrp2_attach.c | 11 +--- samples/bpf/test_cgrp2_attach2.c | 7 +++-- samples/bpf/test_cgrp2_sock.c | 6 +++-- samples/bpf/test_current_task_under_cgroup_user.c | 8 +++--- samples/bpf/test_lru_dist.c | 32 +++ samples/bpf/test_probe_write_user_user.c | 2 +- samples/bpf/trace_event_user.c| 14 +- samples/bpf/trace_output_user.c | 2 +- samples/bpf/tracex2_user.c
[PATCH 25/29] samples/bpf: Switch over to libbpf
PROGBITS 000484 04 00 WA 0 0 4 [10] .symtab SYMTAB 000488 000120 18 1 4 8 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings) I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown) O (extra OS processing required) o (OS specific), p (processor specific) [root@jouet bpf]# ./offwaketime | head -3 qemu-system-x86;entry_SYSCALL_64_fastpath;sys_ppoll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;hrtimer_wakeup;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel;start_cpu;;swapper/0 4 firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 1 swapper/2;start_cpu;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 61 [root@jouet bpf]# Signed-off-by: Joe Stringer <j...@ovn.org> Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com> Cc: Alexei Starovoitov <a...@fb.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Wang Nan <wangn...@huawei.com> Cc: netdev@vger.kernel.org Link: https://github.com/joestringer/linux/commit/5c40f54a52b1f437123c81e21873f4b4b1f9bd55.patch Link: http://lkml.kernel.org/n/tip-xr8twtx7sjh5821g8qw47...@git.kernel.org [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as noticed by Wang Nan ] Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com> --- samples/bpf/Makefile | 68 +--- samples/bpf/README.rst | 4 +- samples/bpf/bpf_load.c | 3 +- samples/bpf/fds_example.c| 3 +- samples/bpf/libbpf.c | 111 
---
 samples/bpf/libbpf.h             | 19 +--
 samples/bpf/sock_example.c       |  3 +-
 samples/bpf/test_cgrp2_attach.c  |  3 +-
 samples/bpf/test_cgrp2_attach2.c |  3 +-
 samples/bpf/test_cgrp2_sock.c    |  3 +-
 10 files changed, 52 insertions(+), 168 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index f2219c1489e5..81b0ef2f7994 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -35,40 +35,43 @@ hostprogs-y += tc_l2_redirect
 hostprogs-y += lwt_len_hist
 hostprogs-y += xdp_tx_iptunnel
 
-test_lru_dist-objs := test_lru_dist.o libbpf.o
-sock_example-objs := sock_example.o libbpf.o
-fds_example-objs := bpf_load.o libbpf.o fds_example.o
-sockex1-objs := bpf_load.o libbpf.o sockex1_user.o
-sockex2-objs := bpf_load.o libbpf.o sockex2_user.o
-sockex3-objs := bpf_load.o libbpf.o sockex3_user.o
-tracex1-objs := bpf_load.o libbpf.o tracex1_user.o
-tracex2-objs := bpf_load.o libbpf.o tracex2_user.o
-tracex3-objs := bpf_load.o libbpf.o tracex3_user.o
-tracex4-objs := bpf_load.o libbpf.o tracex4_user.o
-tracex5-objs := bpf_load.o libbpf.o tracex5_user.o
-tracex6-objs := bpf_load.o libbpf.o tracex6_user.o
-test_probe_write_user-objs := bpf_load.o libbpf.o test_probe_write_user_user.o
-trace_output-objs := bpf_load.o libbpf.o trace_output_user.o
-lathist-objs := bpf_load.o libbpf.o lathist_user.o
-offwaketime-objs := bpf_load.o libbpf.o offwaketime_user.o
-spintest-objs := bpf_load.o libbpf.o spintest_user.o
-map_perf_test-objs := bpf_load.o libbpf.o map_perf_test_user.o
-test_overhead-objs := bpf_load.o libbpf.o test_overhead_user.o
-test_cgrp2_array_pin-objs := libbpf.o test_cgrp2_array_pin.o
-test_cgrp2_attach-objs := libbpf.o test_cgrp2_attach.o
-test_cgrp2_attach2-objs := libbpf.o test_cgrp2_attach2.o cgroup_helpers.o
-test_cgrp2_sock-objs := libbpf.o test_cgrp2_sock.o
-test_cgrp2_sock2-objs := bpf_load.o libbpf.o test_cgrp2_sock2.o
-xdp1-objs := bpf_load.o libbpf.o xdp1_user.o
+# Libbpf dependencies
+LIBBPF := libbpf.o ../../tools/lib/bpf/bpf.o
+
+test_lru_dist-objs := test_lru_dist.o $(LIBBPF)
+sock_example-objs := sock_example.o $(LIBBPF)
+fds_example-objs := bpf_load.o $(LIBBPF) fds_example.o
+sockex1-objs := bpf_load.o $(LIBBPF) sockex1_user.o
+sockex2-objs := bpf_load.o $(LIBBPF) sockex2_user.o
+sockex3-objs := bpf_load.o $(LIBBPF) sockex3_user.o
+tracex1-objs := bpf_load.o $(LIBBPF) tracex1_user.o
+tracex2-objs := bpf_load.o $(LIBBPF) tracex2_user.o
+tracex3-objs := bpf_load.o $(LIBBPF) tracex3_user.o
+tracex4-objs := bpf_load.o $(LIBBPF) tracex4_user.o
+tracex5-objs := bpf_load.o $(LIBBPF) tracex5_user.o
+tracex6-objs := bpf_load.o $(LIBBPF) tracex6_user.o
+test_probe_write_user-objs := bpf_load.o $(LIBBPF) test_probe_wr
[GIT PULL 00/29] perf/core improvements and fixes
Hi Ingo,

Please consider pulling. I had most of this queued before your first pull
request to Linus for 4.10. Most are fixes, with 'perf sched timehist --idle'
as a follow-up new feature to the 'perf sched timehist' command introduced in
this window.

One other thing that delayed this was the samples/bpf/ switch to
tools/lib/bpf/, which involved fixing up merge clashes with net.git and
testing it properly. That took more rounds than anticipated, but all seems OK
now, and it would be good to get these merge issues past us ASAP.

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit e7aa8c2eb11ba69b1b69099c3c7bd6be3087b0ba:

  Merge tag 'docs-4.10' of git://git.lwn.net/linux (2016-12-12 21:58:13 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161220

for you to fetch changes up to 9899694a7f67714216665b87318eb367e2c5c901:

  samples/bpf: Move open_raw_sock to separate header (2016-12-20 12:00:40 -0300)

perf/core improvements and fixes:

New features:

- Introduce 'perf sched timehist --idle', to analyse processes going to/from
  idle state (Namhyung Kim)

Fixes:

- Allow 'perf record -u user' to continue when facing races with threads
  going away after having scanned them via /proc (Jiri Olsa)

- Fix 'perf mem' --all-user/--all-kernel options (Jiri Olsa)

- Support jumps with multiple arguments (Ravi Bangoria)

- Fix jumps to before the function where they are located (Ravi Bangoria)

- Fix lock-pi help string (Davidlohr Bueso)

- Fix build of 'perf trace' in odd systems such as a RHEL PPC one (Jiri Olsa)

- Do not overwrite valid build id in 'perf diff' (Kan Liang)

- Don't throw error for zero length symbols, allowing the use of the TUI in
  PowerPC, where such symbols became more common recently (Ravi Bangoria)

Infrastructure:

- Switch of samples/bpf/ to use tools/lib/bpf, removing libbpf duplication
  (Joe Stringer)

- Move headers check into bash script (Jiri Olsa)
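For readers unfamiliar with the 'perf record -u user' race mentioned in the fixes above, here is a minimal sketch of the general idea: a thread found while scanning /proc may have exited by the time a counter is opened for it, so that one failure is skipped instead of aborting the whole session. This is not perf's code. The names open_all_ignore_missing(), open_fn and fake_open() are hypothetical, and the use of -ESRCH as the "thread is gone" signal is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback: returns an fd >= 0, or a negative errno value. */
typedef int (*open_fn)(pid_t tid);

/*
 * Try to open one counter per thread id. A thread that vanished after the
 * scan (assumed here to show up as -ESRCH) is dropped from the arrays
 * rather than treated as fatal. Any other error is returned to the caller,
 * which in real code would also close the descriptors opened so far.
 *
 * Returns the number of threads kept, or a negative errno value.
 */
int open_all_ignore_missing(pid_t *tids, int *fds, size_t nr, open_fn open_one)
{
	size_t kept = 0;

	assert(tids && fds && open_one);

	for (size_t i = 0; i < nr; i++) {
		int fd = open_one(tids[i]);

		if (fd == -ESRCH)
			continue;	/* thread exited after the scan: skip it */
		if (fd < 0)
			return fd;	/* genuine failure: report it */

		/* Compact the arrays so only live threads remain. */
		tids[kept] = tids[i];
		fds[kept] = fd;
		kept++;
	}
	return (int)kept;
}

/* Stand-in opener for demonstration only: odd tids pretend to be gone. */
int fake_open(pid_t tid)
{
	return (tid % 2) ? -ESRCH : (int)tid + 100;
}
```

Calling open_all_ignore_missing() with fake_open on the ids 10, 11, 12, 13 keeps only the two even ids. Compacting the arrays in place keeps the caller's bookkeeping simple, at the cost of modifying its input.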
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>

----
Arnaldo Carvalho de Melo (3):
      perf tools: Remove some needless __maybe_unused
      samples/bpf: Make perf_event_read() static
      samples/bpf: Be consistent with bpf_load_program bpf_insn parameter

Davidlohr Bueso (1):
      perf bench futex: Fix lock-pi help string

Jiri Olsa (7):
      perf tools: Move headers check into bash script
      perf mem: Fix --all-user/--all-kernel options
      perf evsel: Use variable instead of repeating lengthy FD macro
      perf thread_map: Add thread_map__remove function
      perf evsel: Allow to ignore missing pid
      perf record: Force ignore_missing_thread for uid option
      perf trace: Check if MAP_32BIT is defined (again)

Joe Stringer (8):
      tools lib bpf: Sync {tools,}/include/uapi/linux/bpf.h
      tools lib bpf: use __u32 from linux/types.h
      tools lib bpf: Add flags to bpf_create_map()
      samples/bpf: Make samples more libbpf-centric
      samples/bpf: Switch over to libbpf
      tools lib bpf: Add bpf_prog_{attach,detach}
      samples/bpf: Remove perf_event_open() declaration
      samples/bpf: Move open_raw_sock to separate header

Kan Liang (1):
      perf diff: Do not overwrite valid build id

Namhyung Kim (6):
      perf sched timehist: Split is_idle_sample()
      perf sched timehist: Introduce struct idle_time_data
      perf sched timehist: Save callchain when entering idle
      perf sched timehist: Skip non-idle events when necessary
      perf sched timehist: Add -I/--idle-hist option
      perf sched timehist: Show callchains for idle stat

Ravi Bangoria (3):
      perf annotate: Support jump instruction with target as second operand
      perf annotate: Fix jump target outside of function address range
      perf annotate: Don't throw error for zero length symbols

 samples/bpf/Makefile            |  70 +--
 samples/bpf/README.rst          |   4 +-
 samples/bpf/bpf_load.c          |  21 +-
 samples/bpf/bpf_load.h          |   3 +
 samples/bpf/fds_example.c       |  13 +-
 samples/bpf/lathist_user.c      |   2 +-
 samples/bpf/libbpf.c            | 176 ---
 samples/bpf/libbpf.h            |  28 +-
 samples/bpf/lwt_len_hist_user.c |   6 +-
 samples/bpf/offwaketime_user.c  |   8 +-
 samples/bpf/sampleip_user.c     |   7 +-
 samples/bpf/sock_example.c      |  14 +-
 samples/bpf/sock_example.h      |  35 ++
 samples/bpf/sockex1_user.c      |   7 +-
 samples/bpf/sockex2_user.c      |   5 +-
 samples/bpf/sockex3_user.c