"Tristan Mayfield" <[email protected]> writes:

> Toke, thanks for the quick response!
>
> Yes, I was checking the bpf_probe_read return values, and was reading
> the number of bytes expected, so nothing wrong there!

Right, in that case that's probably just because the struct in question
is next to some other valid memory (not sure where tracepoints keep
their data, but if it's on the stack, for instance, you'll have no
problem reading past it).

> Now that you mention CO-RE, it does actually make sense that these
> sorts of errors could be shifted to load time rather than attach time
> (that the right phrase?). I've fiddled with CO-RE a bit but I haven't
> adopted it for a few reasons (which I could certainly be mistaken
> about).

I'm by no means the leading authority on CO-RE, but I can take a shot at
answering; hopefully someone will chime in to correct me if I'm wrong :)

> I don't have control over kernel versions or compilation flags for the
> kernel on the systems I'm targeting and I've had significant
> difficulty trying to compile CO-RE programs (e.g. from the BCC repo's
> libbpf-tools) on Linux <5.4 because I've had a hard time getting the
> vmlinux. I can't remember if I used bpftool though (this was about a
> year ago that I last played with CO-RE), so perhaps I'll give it
> another shot.

Yeah, getting all your ducks in a row when compiling can be a bit of an
issue. However, I don't think you need anything special from the kernel
at compile-time if you just compile your own programs with a vmlinux.h
file you generated on a kernel that has been compiled with BTF.

> I've also been very unclear, and have gotten many different answers
> regarding the target systems and whether they need to be custom
> compiled with BTF enabled for CO-RE programs to run on them, or if you
> can put a CO-RE program onto a generic kernel build and it "just
> works?" From your answer, the answer seems to be that
> /sys/kernel/btf/vmlinux needs to be on the target system, so it must
> have that BTF_ENABLE flag set?

Well, you'll need the BTF information of the running kernel. It doesn't
*have* to come from /sys/kernel/btf/vmlinux; libbpf will look for it in
a few other locations as well:

https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4583
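
To illustrate the kind of lookup that code does, here is a rough sketch
in plain C. It is not libbpf's implementation: the demo_* names are made
up for this example, and the path list is only a subset of what I
believe libbpf checks, so treat the linked source as authoritative.

```c
#include <stdio.h>
#include <stddef.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Candidate locations for kernel BTF; %s is the kernel release string.
 * Assumption: a subset of the paths libbpf probes, listed from memory. */
static const char *const demo_btf_paths[] = {
    "/sys/kernel/btf/vmlinux",
    "/boot/vmlinux-%s",
    "/lib/modules/%s/build/vmlinux",
    "/usr/lib/debug/boot/vmlinux-%s",
};

#define DEMO_NUM_PATHS (sizeof(demo_btf_paths) / sizeof(demo_btf_paths[0]))

/* Expand candidate number idx for the given release into buf.
 * Returns 0 on success, -1 on a bad index or if buf is too small. */
int demo_format_btf_path(char *buf, size_t len, size_t idx,
                         const char *release)
{
    int n;

    if (idx >= DEMO_NUM_PATHS)
        return -1;
    n = snprintf(buf, len, demo_btf_paths[idx], release);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return the index of the first readable candidate for the running
 * kernel (leaving its path in buf), or -1 if none is readable. */
int demo_find_kernel_btf(char *buf, size_t len)
{
    struct utsname uts;
    size_t i;

    if (uname(&uts) != 0)
        return -1;
    for (i = 0; i < DEMO_NUM_PATHS; i++) {
        if (demo_format_btf_path(buf, len, i, uts.release) == 0 &&
            access(buf, R_OK) == 0)
            return (int)i;
    }
    return -1;
}
```

Calling demo_find_kernel_btf() with a buffer of a few hundred bytes on
the target machine tells you whether any of these files is readable
there, which is a quick way to check if a box can run CO-RE programs.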

Distros have gotten pretty good about enabling BTF in their kernel
builds, though, so it's getting increasingly feasible to rely on it. It
should certainly be available on RHEL8 (and thus CentOS 8).

> If that's set, do you also need a vmlinux.h file as well? A coworker
> was recently messing with CO-RE and seemed to think that deploying a
> CO-RE program required shipping the vmlinux.h file and I think he
> mentioned that file was about 1Gb big, which is certainly a no-go for
> our position.

No, you don't need to ship the vmlinux.h file. That's just a regular
header file with an unusually large number of definitions in it, and it
is only used at compile time. It can be useful to include a copy of it in
your source code repository, though, as mentioned above. That's what BCC
does, for instance:
https://github.com/iovisor/bcc/tree/master/libbpf-tools/x86

And no, it's not 1GB in size. Maybe that size was from before BTF
de-duplication got implemented? The one linked above is 2.7M.

> In addition to that, I've been unclear in the role of BTF in BPF
> generally. When I began tinkering with BPF I was under the impression
> that BTF was *only* something used for CO-RE programs (something I
> actually might've gotten from the article referenced and written by
> Andrii), but I've periodically seen errors arise that cite BTF reasons
> for erroring.

One common cause for this has been when loading 'tc' programs with
iproute2, because the iproute2 loader doesn't understand BTF and will
complain about it. That is usually harmless, but I agree it's quite
annoying. Fortunately, iproute2 has recently gained support for using
libbpf for its BPF loading, so hopefully that particular error should go
away before too long.

> Unfortunately I haven't saved any of these errors and
> can't remember the causes specifically, but something like the
> "updated" maps declarations, i.e.
>
> struct {
> __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> __uint(key_size, sizeof(u32));
> __uint(value_size, sizeof(u32));
> } events SEC(".maps");
>
> I've learned does use BTF?

Yes, the new-style map definitions use BTF. While BTF is ostensibly a
type format (i.e., something that describes C data types), Andrii
figured out that it is also possible to use it as a general purpose
key/value store. You do this by being a bit clever about how you
represent your data, which is what the __uint() macro in the above is
doing (it's encoding the integer value as the size of an array, which
becomes part of the type and thus embedded in the BTF). When loading,
libbpf will parse this data back out of the BTF data and use it when
creating the map. So you'll need BTF support in your compiler and in
libbpf to use this style of map definitions.
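
To make the array-size trick concrete, here is a small sketch in plain
C. The __uint() definition is, as far as I know, the same as the one in
libbpf's bpf_helpers.h; everything named demo_* or DEMO_* is made up for
illustration, and the decoding is done with sizeof here, whereas libbpf
reads the array length out of the BTF data instead.

```c
#include <stddef.h>

/* The member is a pointer to an int array whose length is the value,
 * so the value ends up as part of the member's type. */
#define __uint(name, val) int (*name)[val]

/* Hypothetical stand-in for BPF_MAP_TYPE_PERF_EVENT_ARRAY. */
enum { DEMO_MAP_TYPE = 4 };

struct demo_map_def {
    __uint(type, DEMO_MAP_TYPE);
    __uint(key_size, sizeof(unsigned int));
    __uint(value_size, sizeof(unsigned int));
};

/* Recover the encoded value from the type alone: the size of the
 * pointed-to array divided by the size of one element. Nothing is
 * dereferenced, because sizeof does not evaluate its operand. */
#define DEMO_DECODE(def_type, member) \
    (sizeof(*((def_type *)0)->member) / sizeof(int))

size_t demo_map_type(void)
{
    return DEMO_DECODE(struct demo_map_def, type);
}

size_t demo_key_size(void)
{
    return DEMO_DECODE(struct demo_map_def, key_size);
}

size_t demo_value_size(void)
{
    return DEMO_DECODE(struct demo_map_def, value_size);
}
```

Note that the struct never needs to hold any data at run time; all of
the information lives in the types, which is why it survives into the
BTF section of the object file.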

> Am I misunderstanding what BTF is and the role it plays in BPF? Or
> maybe has libbpf development moved so far toward CO-RE that non-CO-RE
> development gets similar or the same error messages that just aren't
> as clear for it?

Hmm, no, CO-RE is the specific feature that does relocations of struct
fields based on member names. This relies on BTF, but it's not the only
thing BTF is used for. The map definition is another, as you discovered,
and there are some program types that cannot work without BTF
information at all. Also, things like bpftool being able to print out
the struct layout of map values rely on BTF. So you're certainly right
that the BPF ecosystem in general is moving towards using BTF in more
and more places. And I guess you're also right that this leads to some
cryptic error messages sometimes... :)

-Toke


