Hi Daniel, Alexei, and many thanks for your answers,

2016-04-15 (11:44 UTC-0700) ~ Alexei Starovoitov:
> On Fri, Apr 15, 2016 at 12:41:05PM +0200, Daniel Borkmann wrote:
>> Hi Quentin,
>>
>> On 04/15/2016 12:07 PM, Quentin Monnet wrote:
>>> When a new BPF traffic control filter or action is set up with tc, the
>>> bytecode is sent back to userspace through a netlink socket for cBPF, but
>>> not for eBPF (the file descriptor pointing to the object file containing
>>> the bytecode is sent instead).
>>>
>>> This patch makes cls_bpf and act_bpf modules send the bytecode for eBPF as
>>> well (in addition to the file descriptor).
>>> […]
>>
>> Thanks for working on this, but it's unfortunately not that easy. Let
>> me ask, what would be the intended use-case to dump the insns?
>
> +1
>
>> I'm asking because if you dump them as-is, then a reinject at a later
>> time of that bytecode back into the kernel will most likely be rejected
>> by the verifier.
>>
>> This is because on load time, verifier does rewrites/expansion on some
>> of the insns (f.e. map pointers, helper functions, ctx access etc, see
>> also appendix in [1]), so the code as seen in the kernel would need to
>> be sanitized first.
>
> +1
> we had similar discussion about this in seccomp context and decided that
> the only sensible way is to keep original instructions, but it's wasteful
> to do unconditionally and snapshotting of maps is not possible,
> so there was no use for such dumping facility other than debugging.
> Is it what the patch after?
> We need to discuss it in the proper context.

I am experimenting with BPF, and so far I was just trying to dump the
bytecode sent from tc to the kernel. I had not realized that the verifier
would change the instructions. And I agree that a more comprehensive
debugging solution could be obtained if I could find some way to get a
snapshot of the maps.

>> Also, how would you make sense/transform maps into a meaningful
>> representation (probably possible to find a scheme when they are pinned)?
>>
>> Another possibility is that such programs need to be pinned (can be done
>> easily by tc in the background) and then implement a CRIU facility into
>> the bpf(2) syscall to retrieve them. tc could make use of this w/o too
>> much effort, and at the same time it would help CRIU folks, too. It
>> also seems cleaner to have only one central api (bpf(2)) to dump them,
>> but needs a bit of thought.
>
> +1
> any debugging or criu needs to be done in a centralized way via syscall
> and/or bpffs.

Maintaining a central API around bpf() makes sense to me.

I have been looking at the BPF filesystem to see what information I can
obtain from it, but I did not understand it well. I read the log of
Daniel's commit b2197755b263 (“bpf: add support for persistent
maps/progs”), but I am unsure how I could use it to gather data about the
maps and programs (if this is possible at all). I tried to set up some BPF
filters working with maps, but I could not find any file under
/sys/fs/bpf/tc.

Would you have a pointer to some documentation about this filesystem? Or
is there only the kernel code?
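
PS: To check that I understand the map pointer rewrite Daniel mentions,
here is a rough sketch in C (userspace only, not kernel code) that counts
the instructions I would expect the verifier to touch for map references.
The struct mirrors the layout of struct bpf_insn from <linux/bpf.h>; the
names insn, LD_IMM64, PSEUDO_MAP_FD and count_map_fd_loads() are made up
for this example. It only looks at map fd loads and ignores helper calls
and ctx accesses.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct bpf_insn (hypothetical name, same field layout). */
struct insn {
	uint8_t code;        /* opcode */
	uint8_t dst_reg:4;   /* destination register */
	uint8_t src_reg:4;   /* source register */
	int16_t off;         /* signed offset */
	int32_t imm;         /* signed immediate */
};

#define LD_IMM64      0x18 /* BPF_LD | BPF_IMM | BPF_DW */
#define PSEUDO_MAP_FD 1    /* BPF_PSEUDO_MAP_FD: imm holds a map fd */

/*
 * Count the 64-bit immediate loads that reference a map by file
 * descriptor. As far as I understand, these are the loads whose
 * immediate is replaced by a kernel map pointer at load time, so a raw
 * dump of the in-kernel program would not contain the original fd.
 */
static size_t count_map_fd_loads(const struct insn *prog, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		if (prog[i].code != LD_IMM64)
			continue;
		if (prog[i].src_reg == PSEUDO_MAP_FD)
			n++;
		i++; /* a 64-bit immediate load takes two instruction slots */
	}
	return n;
}
```

If this is right, any dumped program for which this count is non-zero
would need those immediates turned back into something meaningful before
it could be loaded again.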
