2018-11-21 09:28 UTC-0800 ~ Stanislav Fomichev <s...@fomichev.me>
On 11/21, Quentin Monnet wrote:
2018-11-20 15:26 UTC-0800 ~ Stanislav Fomichev <s...@fomichev.me>
On 11/20, Alexei Starovoitov wrote:
On Wed, Nov 21, 2018 at 12:18:57AM +0100, Daniel Borkmann wrote:
On 11/21/2018 12:04 AM, Alexei Starovoitov wrote:
On Tue, Nov 20, 2018 at 01:19:05PM -0800, Stanislav Fomichev wrote:
On 11/20, Alexei Starovoitov wrote:
On Mon, Nov 19, 2018 at 04:46:25PM -0800, Stanislav Fomichev wrote:
[Recent commit 23499442c319 ("bpf: libbpf: retry map creation without
the name") fixed this issue for maps, let's do the same for programs.]

Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support
to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name
for programs. Pre v4.14 kernels don't know about programs names and
return an error about unexpected non-zero data. Retry sys_bpf without
a program name to cover older kernels.

Signed-off-by: Stanislav Fomichev <s...@google.com>
---
  tools/lib/bpf/bpf.c | 10 ++++++++++
  1 file changed, 10 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 961e1b9fc592..cbe9d757c646 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -212,6 +212,16 @@ int bpf_load_program_xattr(const struct 
bpf_load_program_attr *load_attr,
        if (fd >= 0 || !log_buf || !log_buf_sz)
                return fd;
+ if (fd < 0 && errno == E2BIG && load_attr->name) {
+               /* Retry the same syscall, but without the name.
+                * Pre v4.14 kernels don't support prog names.
+                */

I'm afraid that will put unnecessary stress on the kernel.
This check needs to be tighter.
Like E2BIG and anything in the log_buf probably means that
E2BIG came from the verifier and nothing to do with prog_name.
Asking kernel to repeat is an unnecessary work.

In general we need to think beyond this single prog_name field.
There are bunch of other fields in bpf_load_program_xattr() and older kernels
won't support them. Are we going to zero them out one by one
and retry? I don't think that would be practical.
I general, we don't want to zero anything out. However,
for this particular problem the rationale is the following:
In commit 88cda1c9da02 we started unconditionally setting {prog,map}->name
from the 'higher' libbpfc layer which breaks users on the older kernels.

Also libbpf silently ignoring prog_name is not great for debugging.
A warning is needed.
But it cannot be done out of lib/bpf/bpf.c, since it's a set of syscall
wrappers.
Imo such "old kernel -> lets retry" feature should probably be done
at lib/bpf/libbpf.c level. inside load_program().
For maps bpftools calls bpf_create_map_xattr directly, that's why
for maps I did the retry on the lower level (and why for programs I initially
thought about doing the same). However, in this case maybe asking
user to omit 'name' argument might be a better option.

For program names, I agree, we might think about doing it on the higher
level (although I'm not sure whether we want to have different API
expectations, i.e. bpf_create_map_xattr ignoring the name and
bpf_load_program_xattr not ignoring the name).

So given that rationale above, what do you think is the best way to
move forward?
1. Same patch, but tighten the retry check inside bpf_load_program_xattr ?
2. Move this retry logic into load_program and have different handling
    for bpf_create_map_xattr vs bpf_load_program_xattr ?
3. Do 2 and move the retry check for maps from bpf_create_map_xattr
    into bpf_object__create_maps ?

(I'm slightly leaning towards #3)

me too. I think it's cleaner for maps to do it in
bpf_object__create_maps().
Originally bpf.c was envisioned to be a thin layer on top of bpf syscall.
Whereas 'smart bits' would go into libbpf.c

Can't we create in bpf_object__load() a small helper bpf_object__probe_caps()
which would figure this out _once_ upon start with a few things to probe for
availability in the underlying kernel for maps and programs? E.g. programs
it could try to inject a tiny 'r0 = 0; exit' snippet where we figure out
things like prog name support etc. Given underlying kernel doesn't change, we
would only try this once and it doesn't require fallback every time.

+1. great idea!
Sounds good, let me try to do it.

It sounds more like a recent LPC proposal/idea to have some sys_bpf option
to query BPF features. This new bpf_object__probe_caps can probably query
that in the future if we eventually add support for it.


Hi,

LPC proposal indeed. I've been working on implementing this kind of
probes in bpftool. I don't probe name support for now (but I can
certainly add it), but I detect supported program types, map types,
header functions, and a couple of other parameters. The idea (initially
from Daniel) was to dump "#define" declarations that could later be
included in a header file and used for a BPF project (or alternatively,
JSON output).
Oh, nice, I didn't know someone was already working on it!

I felt like bpftool was maybe a better place to do it, as the set of
probes may grow large (all types, helpers, etc). It might have
consequences on the running system: for example, if I don't raise the
rlimit in bpftool before starting the probes, a good half of them fail
because of consecutive program creation and reclaim delay for locked
memory usage.
Should we aim for something like build system feature checks? Where we
preserve these probe results between bpftool/libbpf invocations so we
don't re-run them again?

So should we start adding probes to libbpf and should I move mine to the
lib as well, or should the one in the v3 of this series be directed to
something like bpftool instead?
We need them (well, at least the name checks) for libbpf because that's
what we link against and what we use to load the programs, bpftool is
less of an issue right now. But my patch was mostly a hackish solution
until we get the real feature checks :-)

Hi Stanislav,
Apologies for the delayed answer, I have been travelling.

I don't know if the probes should be "shared" with libbpf. What I had in mind was rather to run them with bpftool and to produce something like a header containing the results of the probes. Then this header could be included in whatever software wants to manage BPF objects, and that software can then decide to call (or not to call) libbpf functions in one way or another, depending on the features supported by the system. It means you have to recompile your program for the target system, or at least to load probe results as a configuration file somehow.

I see it as the best approach for supported program or map types for example, because I do not believe we should prevent a user to attempt to load a given program type with libbpf if they want to, even if probes told us this program type is not supported. But on the other hand, fixing up the calls as you did for program names is probably easier if there are probes in libbpf, so it can just be adjusted at runtime... I just feel concerned that they might grow too big in time, your patch means that we now have two more programs to load each time we call bpf_object__load().

I'll try to post a bpftool patch with my probes next week (and CC you), so that we can discuss further about this.

Quentin

Reply via email to