[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts
yonghong-song added a comment. @rsmith @eli.friedman Thanks for your comments. I fully agree that it seems awkward that we have both GEP and intrinsic generation. I will try to do some experiments here to only have intrinsic generation. My only concern is possible performance degradation. I will post here once I got concrete implementation and numbers. @rsmith regarding to your concerns about struct size change. Yes, the structure size may indeed change. Currently, we plan only to support field fields (non-struct/union type with 1/2/4/8 bytes, and strings). These are most common use cases. Let me explain a little bit more. Here, we are targeting the bpf_probe_read (https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h#L532-L537) int bpf_probe_read(void *dst, u32 size, const void *src) which tries to copy some data from kernel (src pointer) to the stack of the bpf program. Here, if "src" points to a structure, and user intends to copy the whole structure out of kernel, then "size" may need to be different for different kernels if different kernels indeed have different structure size. But first, in all our use cases, users typically do not read the whole structure. Second, if "size" does change, we request users to use multiple versioning (using patchable global variables) with possibly different "size" values for different kernel versions. Note that "size" must be constant for bpf verifier to succeed. (https://github.com/torvalds/linux/blob/master/kernel/trace/bpf_trace.c#L138-L156) Just gave another two examples here about reading kernel memories in iovisor/bcc. Note that bcc will transform certain pointer access "b->c" to "bpf_probe_read(, size, >c)" internally. The bcc tool biosnoop.py contain several bpf probe read: https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L110 data.len = req->__data_len; https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L117 data.len = req->__data_len; https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L118 data.sector = req->__sector; https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L120 struct gendisk *rq_disk = req->rq_disk; https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L121-L122 bpf_probe_read(_name, sizeof(data.disk_name), rq_disk->disk_name); They are all primitive types and strings. Another example for bcc tool tcpconnect.py https://github.com/iovisor/bcc/blob/master/tools/tcpconnect.py#L128 u16 dport = skp->__sk_common.skc_dport; https://github.com/iovisor/bcc/blob/master/tools/tcpconnect.py#L136-L137 data4.saddr = skp->__sk_common.skc_rcv_saddr; data4.daddr = skp->__sk_common.skc_daddr; . In summary, right now, the to-be-read kernel memory size (through a specific bpf_probe_read() call) won't change between different kernel versions. So we do not need to handle structure allocation size change here. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61809/new/ https://reviews.llvm.org/D61809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts
efriedma added a comment. > The size you allocate here will presumably need to vary as the struct layout > changes, and you have no way of knowing which allocas will need their sizes > to be changed. Your example is just a pointer; the size of a pointer won't change. That said, yes, address computation isn't the only place C cares about the layout of a struct. In particular variables (local or global) of struct type, sizeof(), and offsetof() need different handling. But I'm not sure how much any of those matter for BPF. > If this call is supposed to somehow represent that the %6 GEP is computed > from %5 by way of a struct field access, this will not be robust against any > kind of optimization: optimizations will happily rewrite uses of %6 to use %5 > or some other form directly, and you will have lost track of the pointers you > need to update. I think actually, the idea is that the intrinsic represents the computation of some additional adjustment, and so the following instructions use %7, the result of the call, not %6. So I think the existing formulation is sound. That said, it's awkward to have both an instruction and an intrinsic representing parts of the same offset computation; it probably makes sense to emit just the intrinsic, instead of both the intrinsic and the GEP, like you suggest. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61809/new/ https://reviews.llvm.org/D61809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts
rsmith added a comment. If I understand correctly, you want to be able to compile a program against some `struct` and `union` layouts, and then at load time "update" the program to cope with the actual layouts for those types being something else, but containing (at least) the set of members you know about. And your proposed approach is to add intrinsics into the IR that identify the GEPs that we emitted to model struct field accesses, but to otherwise not change the emitted IR. If so, I don't see how this approach to that problem can work. There seem to be a couple of problems: > %2 = alloca %struct.sk_buff*, align 8 The size you allocate here will presumably need to vary as the struct layout changes, and you have no way of knowing which `alloca`s will need their sizes to be changed. > %6 = getelementptr inbounds %struct.sk_buff, %struct.sk_buff* %5, i32 0, > i32 2, !dbg !54 > %7 = call [10 x %union.anon]* > > @llvm.preserve.di.access.index.p0a10s_union.anons.p0a10s_union.anons.p0s_struct.sk_buffs( > [10 x %union.anon]* %6, %struct.sk_buff* %5, > i8* getelementptr inbounds ([8 x i8], [8 x i8]* @0, i32 0, i32 0), i32 > 3), !dbg !54 If this call is supposed to somehow represent that the `%6` GEP is computed from `%5` by way of a struct field access, this will not be robust against any kind of optimization: optimizations will happily rewrite uses of `%6` to use `%5` or some other form directly, and you will have lost track of the pointers you need to update. It would seem better to me to represent a relocateable GEP directly: that is, instead of emitting a regular GEP and some fixup intrinsic, you could emit an intrinsic call //instead of// the GEP, and have the intrinsic compute the field offset. (And you'll need to do something about your `alloca`s, such as using an intrinsic to get the struct size and emitting a dynamic alloca of that size.) For example, instead of the above, you could emit > %6 = call i8* @llvm.preserve.di.access.gep.2(i8* %5, i32 0, i32 2, > !typeinfo) (where `!typeinfo` is whatever information you need to identify `struct sk_buff` later on, when generating your relocations, and the indices you pass in are whichever ones you need to identify the subobject within that type information). Does that make sense? Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61809/new/ https://reviews.llvm.org/D61809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts
yonghong-song added a comment. @lebedev.ri Thanks for the comment. This patch is not ready to land yet. Yes, tests are missing and I am going to add tests later. More importantly, I want to get a sense whether what I am implementing here is the right direction or not. The following two other patches are also related to describe our end goal: https://reviews.llvm.org/D61810 https://reviews.llvm.org/D61524 Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61809/new/ https://reviews.llvm.org/D61809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts
lebedev.ri added a comment. Tests seems to be missing? Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61809/new/ https://reviews.llvm.org/D61809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts
yonghong-song created this revision. yonghong-song added reviewers: eli.friedman, ast. Herald added subscribers: cfe-commits, arphaman, kosarev. Herald added a project: clang. For background of BPF CO-RE project, please refer to http://vger.kernel.org/bpfconf2019.html In summary, BPF CO-RE intends to compile bpf programs adjustable on struct/union layout change so the same program can run on multiple kernels with adjustment before loading based on native kernel structures. In order to do this, we need keep track of GEP(getelementptr) instruction base and result debuginfo types, so we can adjust on the host based on kernel BTF info. Capturing such information as an IR optimization is hard as various optimization may have tweaked GEP and also union is replaced by structure it is impossible to track fieldindex for union member accesses. An intrinsic function, preserve_di_access_index, is introducted. naddr = preserve_di_access_index(addr, base, type_name, access_index) here, addr: the previous getelementptr result, used as the result value do program semantics are kept intact. base: the base of previous getelementptr, used for later code generation with new relocatable access offset. type_name: the struct/union type name if available, can be used to match corresponding types in debuginfo. access_index: the access index based on user/debuginfo types. naddr: the result, having the same type as "addr". For example, for the following example, $ cat test.c struct sk_buff { int i; int b1:1; int b2:2; union { struct { int o1; int o2; } o; struct { char flags; char dev_id; } dev; int netid; } u[10]; }; static int (*bpf_probe_read)(void *dst, int size, void *unsafe_ptr) = (void *) 4; int bpf_prog(struct sk_buff *ctx) { char dev_id; bpf_probe_read(_id, sizeof(char), >u[5].dev.dev_id); return dev_id; } $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all test.c >& log The generated IR looks like below: ... define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 { %2 = alloca %struct.sk_buff*, align 8 %3 = alloca i8, align 1 store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45 call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49 call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50 call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51 %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45 %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45 %6 = getelementptr inbounds %struct.sk_buff, %struct.sk_buff* %5, i32 0, i32 2, !dbg !54 %7 = call [10 x %union.anon]* @llvm.preserve.di.access.index.p0a10s_union.anons.p0a10s_union.anons.p0s_struct.sk_buffs( [10 x %union.anon]* %6, %struct.sk_buff* %5, i8* getelementptr inbounds ([8 x i8], [8 x i8]* @0, i32 0, i32 0), i32 3), !dbg !54 %8 = getelementptr inbounds [10 x %union.anon], [10 x %union.anon]* %7, i64 0, i64 5, !dbg !53 %9 = call %union.anon* @llvm.preserve.di.access.index.p0s_union.anons.p0s_union.anons.p0a10s_union.anons( %union.anon* %8, [10 x %union.anon]* %7, i8* getelementptr inbounds ([1 x i8], [1 x i8]* @1, i32 0, i32 0), i32 5), !dbg !53 %10 = call %union.anon* @llvm.preserve.di.access.index.p0s_union.anons.p0s_union.anons.p0s_union.anons( %union.anon* %9, %union.anon* %9, i8* getelementptr inbounds ([1 x i8], [1 x i8]* @2, i32 0, i32 0), i32 1), !dbg !55 %11 = bitcast %union.anon* %10 to %struct.anon.0*, !dbg !55 %12 = getelementptr inbounds %struct.anon.0, %struct.anon.0* %11, i32 0, i32 1, !dbg !56 %13 = call i8* @llvm.preserve.di.access.index.p0i8.p0i8.p0s_struct.anon.0s(i8* %12, %struct.anon.0* %11, i8* getelementptr inbounds ([1 x i8], [1 x i8]* @3, i32 0, i32 0), i32 1), !dbg !56 %14 = call i32 %4(i8* %3, i32 1, i8* %13), !dbg !52 %15 = load i8, i8* %3, align 1, !dbg !57, !tbaa !58 %16 = sext i8 %15 to i32, !dbg !57 call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !59 ret i32 %16, !dbg !60 } For >u[5].dev.dev_id, . The first getelementptr (%6 = ...) has index 2 based on IR layout, and subsequent preserve_di_access_index (%7 = ...) has index 3 which reflects the debuginfo type layout. . The second getelementptr (%8 = ...) has index 5 which is the same as preserve_di_access_index (%9 = ...) for array subscript. . The instruction "%10 ..." is a call to preserve_di_access_index, which encodes the union member access index "1". Such information is lost in the original IR. . The third getelementptr (%12 = ...) has index 1 anonymous struct member "dev_id". The subsequent preserve_di_access_index also has the index "1". Basically,