[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts

2019-05-13 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song added a comment.

@rsmith @eli.friedman Thanks for your comments. I fully agree that it seems 
awkward that we have both GEP and intrinsic generation. I will try to do some 
experiments here to only have intrinsic generation. My only concern is possible 
performance degradation. I will post here once I got concrete implementation 
and numbers.

@rsmith regarding to your concerns about struct size change. Yes, the structure 
size may indeed change. Currently, we plan only to support field fields 
(non-struct/union type with 1/2/4/8 bytes, and strings). These are most common 
use cases. Let me explain a little bit more.

Here, we are targeting the bpf_probe_read 
(https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h#L532-L537)

  int bpf_probe_read(void *dst, u32 size, const void *src)

which tries to copy some data from kernel (src pointer) to the stack of the bpf 
program.

Here, if "src" points to a structure, and user intends to copy the whole 
structure out of kernel, then "size" may need to be different for different 
kernels if different kernels indeed have different structure size. 
But first, in all our use cases, users typically do not read the whole 
structure. Second, if "size" does change, we request users to use multiple 
versioning (using patchable global variables) with possibly different "size" 
values for different kernel versions. 
Note that "size" must be constant for bpf verifier to succeed.
(https://github.com/torvalds/linux/blob/master/kernel/trace/bpf_trace.c#L138-L156)

Just gave another two examples here about reading kernel memories in 
iovisor/bcc. Note that
bcc will transform certain pointer access "b->c" to "bpf_probe_read(, 
size, >c)" internally.

The bcc tool biosnoop.py contain several bpf probe read:

  https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L110
data.len = req->__data_len;
  https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L117
data.len = req->__data_len;
  https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L118
data.sector = req->__sector;
  https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L120
struct gendisk *rq_disk = req->rq_disk;
  https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L121-L122
   bpf_probe_read(_name, sizeof(data.disk_name),
   rq_disk->disk_name);

They are all primitive types and strings.

Another example for bcc tool tcpconnect.py

  https://github.com/iovisor/bcc/blob/master/tools/tcpconnect.py#L128
u16 dport = skp->__sk_common.skc_dport;
  https://github.com/iovisor/bcc/blob/master/tools/tcpconnect.py#L136-L137
data4.saddr = skp->__sk_common.skc_rcv_saddr;
data4.daddr = skp->__sk_common.skc_daddr;

.

In summary, right now, the to-be-read kernel memory size (through a specific 
bpf_probe_read() call)
won't change between different kernel versions. So we do not need to handle 
structure allocation size change here.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61809/new/

https://reviews.llvm.org/D61809



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts

2019-05-10 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

> The size you allocate here will presumably need to vary as the struct layout 
> changes, and you have no way of knowing which allocas will need their sizes 
> to be changed.

Your example is just a pointer; the size of a pointer won't change.

That said, yes, address computation isn't the only place C cares about the 
layout of a struct.  In particular variables (local or global) of struct type, 
sizeof(), and offsetof() need different handling.  But I'm not sure how much 
any of those matter for BPF.

> If this call is supposed to somehow represent that the %6 GEP is computed 
> from %5 by way of a struct field access, this will not be robust against any 
> kind of optimization: optimizations will happily rewrite uses of %6 to use %5 
> or some other form directly, and you will have lost track of the pointers you 
> need to update.

I think actually, the idea is that the intrinsic represents the computation of 
some additional adjustment, and so the following instructions use %7, the 
result of the call, not %6.  So I think the existing formulation is sound.

That said, it's awkward to have both an instruction and an intrinsic 
representing parts of the same offset computation; it probably makes sense to 
emit just the intrinsic, instead of both the intrinsic and the GEP, like you 
suggest.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61809/new/

https://reviews.llvm.org/D61809



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts

2019-05-10 Thread Richard Smith - zygoloid via Phabricator via cfe-commits
rsmith added a comment.

If I understand correctly, you want to be able to compile a program against 
some `struct` and `union` layouts, and then at load time "update" the program 
to cope with the actual layouts for those types being something else, but 
containing (at least) the set of members you know about. And your proposed 
approach is to add intrinsics into the IR that identify the GEPs that we 
emitted to model struct field accesses, but to otherwise not change the emitted 
IR. If so, I don't see how this approach to that problem can work. There seem 
to be a couple of problems:

>   %2 = alloca %struct.sk_buff*, align 8

The size you allocate here will presumably need to vary as the struct layout 
changes, and you have no way of knowing which `alloca`s will need their sizes 
to be changed.

>   %6 = getelementptr inbounds %struct.sk_buff, %struct.sk_buff* %5, i32 0, 
> i32 2, !dbg !54
>   %7 = call [10 x %union.anon]*
>   
> @llvm.preserve.di.access.index.p0a10s_union.anons.p0a10s_union.anons.p0s_struct.sk_buffs(
>   [10 x %union.anon]* %6, %struct.sk_buff* %5,
>   i8* getelementptr inbounds ([8 x i8], [8 x i8]* @0, i32 0, i32 0), i32 
> 3), !dbg !54

If this call is supposed to somehow represent that the `%6` GEP is computed 
from `%5` by way of a struct field access, this will not be robust against any 
kind of optimization: optimizations will happily rewrite uses of `%6` to use 
`%5` or some other form directly, and you will have lost track of the pointers 
you need to update.

It would seem better to me to represent a relocateable GEP directly: that is, 
instead of emitting a regular GEP and some fixup intrinsic, you could emit an 
intrinsic call //instead of// the GEP, and have the intrinsic compute the field 
offset. (And you'll need to do something about your `alloca`s, such as using an 
intrinsic to get the struct size and emitting a dynamic alloca of that size.) 
For example, instead of the above, you could emit

>   %6 = call i8* @llvm.preserve.di.access.gep.2(i8* %5, i32 0, i32 2, 
> !typeinfo)

(where `!typeinfo` is whatever information you need to identify `struct 
sk_buff` later on, when generating your relocations, and the indices you pass 
in are whichever ones you need to identify the subobject within that type 
information). Does that make sense?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61809/new/

https://reviews.llvm.org/D61809



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts

2019-05-10 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song added a comment.

@lebedev.ri Thanks for the comment. This patch is not ready to land yet. Yes, 
tests are missing and I am going to add tests later.
More importantly, I want to get a sense whether what I am implementing here is 
the right direction or not.
The following two other patches are also related to describe our end goal:

  https://reviews.llvm.org/D61810
  https://reviews.llvm.org/D61524  


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61809/new/

https://reviews.llvm.org/D61809



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts

2019-05-10 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Tests seems to be missing?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61809/new/

https://reviews.llvm.org/D61809



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D61809: [BPF] Preserve original struct/union type name/access index and array subscripts

2019-05-10 Thread Yonghong Song via Phabricator via cfe-commits
yonghong-song created this revision.
yonghong-song added reviewers: eli.friedman, ast.
Herald added subscribers: cfe-commits, arphaman, kosarev.
Herald added a project: clang.

For background of BPF CO-RE project, please refer to

  http://vger.kernel.org/bpfconf2019.html

In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.

In order to do this, we need keep track of GEP(getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard
as various optimization may have tweaked GEP and also
union is replaced by structure it is impossible to track
fieldindex for union member accesses.

An intrinsic function, preserve_di_access_index, is introducted.

  naddr = preserve_di_access_index(addr, base, type_name, access_index)

here,

  addr: the previous getelementptr result, used as the result value do
program semantics are kept intact.
  base: the base of previous getelementptr, used for later
code generation with new relocatable access offset.
  type_name: the struct/union type name if available, can be used to
match corresponding types in debuginfo.
  access_index: the access index based on user/debuginfo types.
  naddr: the result, having the same type as "addr".

For example, for the following example,

  $ cat test.c
  struct sk_buff {
int i;
int b1:1;
int b2:2;
union {
  struct {
int o1;
int o2;
  } o;
  struct {
char flags;
char dev_id;
  } dev;
  int netid;
} u[10];
  };
  
  static int (*bpf_probe_read)(void *dst, int size, void *unsafe_ptr)
  = (void *) 4;
  
  int bpf_prog(struct sk_buff *ctx) {
char dev_id;
bpf_probe_read(_id, sizeof(char), >u[5].dev.dev_id);
return dev_id;
  }
  
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all test.c >& 
log

The generated IR looks like below:

  ...

define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {

  %2 = alloca %struct.sk_buff*, align 8
  %3 = alloca i8, align 1
  store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
  call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, 
metadata !DIExpression()), !dbg !49
  call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
  call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata 
!DIExpression()), !dbg !51
  %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 
8, !dbg !52, !tbaa !45
  %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
  %6 = getelementptr inbounds %struct.sk_buff, %struct.sk_buff* %5, i32 0, i32 
2, !dbg !54
  %7 = call [10 x %union.anon]*
   
@llvm.preserve.di.access.index.p0a10s_union.anons.p0a10s_union.anons.p0s_struct.sk_buffs(
   [10 x %union.anon]* %6, %struct.sk_buff* %5,
   i8* getelementptr inbounds ([8 x i8], [8 x i8]* @0, i32 0, i32 0), i32 
3), !dbg !54
  %8 = getelementptr inbounds [10 x %union.anon], [10 x %union.anon]* %7, i64 
0, i64 5, !dbg !53
  %9 = call %union.anon* 
@llvm.preserve.di.access.index.p0s_union.anons.p0s_union.anons.p0a10s_union.anons(
   %union.anon* %8, [10 x %union.anon]* %7,
   i8* getelementptr inbounds ([1 x i8], [1 x i8]* @1, i32 0, i32 0), i32 
5), !dbg !53
  %10 = call %union.anon* 
@llvm.preserve.di.access.index.p0s_union.anons.p0s_union.anons.p0s_union.anons(
%union.anon* %9, %union.anon* %9, i8* getelementptr inbounds ([1 x i8], 
[1 x i8]* @2, i32 0, i32 0), i32 1), !dbg !55
  %11 = bitcast %union.anon* %10 to %struct.anon.0*, !dbg !55
  %12 = getelementptr inbounds %struct.anon.0, %struct.anon.0* %11, i32 0, i32 
1, !dbg !56
  %13 = call i8* 
@llvm.preserve.di.access.index.p0i8.p0i8.p0s_struct.anon.0s(i8* %12, 
%struct.anon.0* %11,
i8* getelementptr inbounds ([1 x i8], [1 x i8]* @3, i32 0, i32 0), i32 
1), !dbg !56
  %14 = call i32 %4(i8* %3, i32 1, i8* %13), !dbg !52
  %15 = load i8, i8* %3, align 1, !dbg !57, !tbaa !58
  %16 = sext i8 %15 to i32, !dbg !57
  call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !59
  ret i32 %16, !dbg !60

}

For >u[5].dev.dev_id,

  . The first getelementptr (%6 = ...) has index 2 based on IR layout, and 
subsequent
preserve_di_access_index (%7 = ...) has index 3 which reflects the 
debuginfo type layout.
  . The second getelementptr (%8 = ...) has index 5 which is the same as 
preserve_di_access_index
(%9 = ...) for array subscript.
  . The instruction "%10 ..." is a call to preserve_di_access_index, which 
encodes the union member
access index "1". Such information is lost in the original IR.
  . The third getelementptr (%12 = ...) has index 1 anonymous struct member 
"dev_id". The
subsequent preserve_di_access_index also has the index "1".

Basically,