Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 5, 2016 at 7:20 PM, Florian Westphalwrote: > Willem de Bruijn wrote: >> While we're discussing the patch, another question, about revisions: I >> tested both modified and original iptables binaries on both standard >> and modified kernels. It all works as expected, except for the case >> where both binaries are used on a single kernel. For instance: >> >> iptables -A OUTPUT -m bpf --bytecode "`./nfbpf_compile RAW 'udp port >> 8000'`" -j LOG >> ./iptables.new -L >> >> Here the new binary will interpret the object as xt_bpf_match_v1, but >> iptables has inserted xt_bpf_match. The same problem happens the other >> way around. A new binary can be made robust to detect old structs, but >> not the other way around. Specific to bpf, the existing xt_bpf code >> has an unfortunate bug that it always prints at least one line of >> code, even if ->bpf_program_num_elems == 0. >> >> I notice that other extensions also do not necessarily only extend >> struct vN in vN+1. Is the above a known issue? > > Yes, I guess noone ever bothered to fix this. > > The kernel blob should contain the match/target revision number, > so userspace can in fact see that 'this is bpf v42', but iirc > the netfilter userspace just loads the highest userspace revision > supported by the kernel (which is then different for the 2 iptables > binaries). We can fall back on not parsing contents on mismatch: diff --git a/iptables/iptables.c b/iptables/iptables.c index 540d111..ada7c94 100644 --- a/iptables/iptables.c +++ b/iptables/iptables.c @@ -504,7 +504,8 @@ print_match(const struct xt_entry_match *m, xtables_find_match(m->u.user.name, XTF_TRY_LOAD, NULL); if (match) { - if (match->print) + if (match->print && + m->u.user.revision == match->revision) match->print(ip, m, numeric); else printf("%s ", match->name); > But we *could* display message like 'kernel uses revision 2 but I can > only find 0 and 1' or fall back to the lower supported revision without > guess-the-struct-by-size games. That's a good idea. A special case printf() with a notice, then. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
Willem de Bruijnwrote: > While we're discussing the patch, another question, about revisions: I > tested both modified and original iptables binaries on both standard > and modified kernels. It all works as expected, except for the case > where both binaries are used on a single kernel. For instance: > > iptables -A OUTPUT -m bpf --bytecode "`./nfbpf_compile RAW 'udp port > 8000'`" -j LOG > ./iptables.new -L > > Here the new binary will interpret the object as xt_bpf_match_v1, but > iptables has inserted xt_bpf_match. The same problem happens the other > way around. A new binary can be made robust to detect old structs, but > not the other way around. Specific to bpf, the existing xt_bpf code > has an unfortunate bug that it always prints at least one line of > code, even if ->bpf_program_num_elems == 0. > > I notice that other extensions also do not necessarily only extend > struct vN in vN+1. Is the above a known issue? Yes, I guess noone ever bothered to fix this. The kernel blob should contain the match/target revision number, so userspace can in fact see that 'this is bpf v42', but iirc the netfilter userspace just loads the highest userspace revision supported by the kernel (which is then different for the 2 iptables binaries). But we *could* display message like 'kernel uses revision 2 but I can only find 0 and 1' or fall back to the lower supported revision without guess-the-struct-by-size games. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 5, 2016 at 6:29 PM, Willem de Bruijnwrote: > On Mon, Dec 5, 2016 at 6:22 PM, Pablo Neira Ayuso wrote: >> On Mon, Dec 05, 2016 at 06:06:05PM -0500, Willem de Bruijn wrote: >> [...] >>> Eric also suggests a private variable to avoid being subject to >>> changes to PATH_MAX. Then we can indeed also choose an arbitrary lower >>> length than current PATH_MAX. >> >> Good. >> >>> FWIW, there is a workaround for users with deeply nested paths: the >>> path passed does not have to be absolute. It is literally what is >>> passed on the command line to iptables right now, including relative >>> addresses. >> >> If iptables userspace always expects to have the bpf file repository >> in some given location (suggesting to have a directory that we specify >> at ./configure time, similar to what we do with connlabel.conf), then >> I think we can rely on relative paths. Would this be flexible enough >> for your usecase? > > As long as it accepts relative paths, I think it will always work. > Worst case, a user has to cd. No need for hardcoding the bpf mount > point at compile time. > > I have the matching iptables patch for pinned objects, btw. Not for > elf objects, which requires linking to libelf and parsing the object, > which is more work (and perhaps best punted on by expanding libbpf in > bcc to include this functionality. it already exists under samples/bpf > and iproute2). While we're discussing the patch, another question, about revisions: I tested both modified and original iptables binaries on both standard and modified kernels. It all works as expected, except for the case where both binaries are used on a single kernel. For instance: iptables -A OUTPUT -m bpf --bytecode "`./nfbpf_compile RAW 'udp port 8000'`" -j LOG ./iptables.new -L Here the new binary will interpret the object as xt_bpf_match_v1, but iptables has inserted xt_bpf_match. The same problem happens the other way around. A new binary can be made robust to detect old structs, but not the other way around. Specific to bpf, the existing xt_bpf code has an unfortunate bug that it always prints at least one line of code, even if ->bpf_program_num_elems == 0. I notice that other extensions also do not necessarily only extend struct vN in vN+1. Is the above a known issue? -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 5, 2016 at 6:22 PM, Pablo Neira Ayusowrote: > On Mon, Dec 05, 2016 at 06:06:05PM -0500, Willem de Bruijn wrote: > [...] >> Eric also suggests a private variable to avoid being subject to >> changes to PATH_MAX. Then we can indeed also choose an arbitrary lower >> length than current PATH_MAX. > > Good. > >> FWIW, there is a workaround for users with deeply nested paths: the >> path passed does not have to be absolute. It is literally what is >> passed on the command line to iptables right now, including relative >> addresses. > > If iptables userspace always expects to have the bpf file repository > in some given location (suggesting to have a directory that we specify > at ./configure time, similar to what we do with connlabel.conf), then > I think we can rely on relative paths. Would this be flexible enough > for your usecase? As long as it accepts relative paths, I think it will always work. Worst case, a user has to cd. No need for hardcoding the bpf mount point at compile time. I have the matching iptables patch for pinned objects, btw. Not for elf objects, which requires linking to libelf and parsing the object, which is more work (and perhaps best punted on by expanding libbpf in bcc to include this functionality. it already exists under samples/bpf and iproute2). -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 05, 2016 at 06:06:05PM -0500, Willem de Bruijn wrote: [...] > Eric also suggests a private variable to avoid being subject to > changes to PATH_MAX. Then we can indeed also choose an arbitrary lower > length than current PATH_MAX. Good. > FWIW, there is a workaround for users with deeply nested paths: the > path passed does not have to be absolute. It is literally what is > passed on the command line to iptables right now, including relative > addresses. If iptables userspace always expects to have the bpf file repository in some given location (suggesting to have a directory that we specify at ./configure time, similar to what we do with connlabel.conf), then I think we can rely on relative paths. Would this be flexible enough for your usecase? -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 5, 2016 at 6:00 PM, Pablo Neira Ayusowrote: > On Mon, Dec 05, 2016 at 11:34:15PM +0100, Pablo Neira Ayuso wrote: >> On Mon, Dec 05, 2016 at 10:30:01PM +0100, Florian Westphal wrote: >> > Eric Dumazet wrote: >> > > On Mon, 2016-12-05 at 15:28 -0500, Willem de Bruijn wrote: >> > > > From: Willem de Bruijn >> > > > >> > > > Add support for attaching an eBPF object by file descriptor. >> > > > >> > > > The iptables binary can be called with a path to an elf object or a >> > > > pinned bpf object. Also pass the mode and path to the kernel to be >> > > > able to return it later for iptables dump and save. >> > > > >> > > > Signed-off-by: Willem de Bruijn >> > > > --- >> > > >> > > Assuming there is no simple way to get variable matchsize in iptables, >> > > this looks good to me, thanks. >> > >> > It should be possible by setting kernel .matchsize to ~0 which >> > suppresses strict size enforcement. >> > >> > Its currently only used by ebt_among, but this should work for any xtables >> > module. >> >> This is likely going to trigger a large rewrite of the core userspace >> iptables codebase, and likely going to pull part of the mess we have >> in ebtables into iptables. So I'd prefer not to follow this path. > > So this variable path is there to annotate what userspace claims that > is the file that contains the bpf blob that was loaded, actually this > is irrelevant to the kernel, so this is just there to dump it back > when iptables-save it is called. Just a side note, one could set > anything there from userspace, point somewhere else actually... > > Well anyway, going back to the path problem to keep it simple: Why > don't just trim this down to something smaller, are you really > expecting to reach PATH_MAX in your usecase? Not often. Module-specific limitations that differ from global definitions are just a pain when they bite. This module also has an arbitrary low limit on the length of the cBPF program passed, for instance. Eric also suggests a private variable to avoid being subject to changes to PATH_MAX. Then we can indeed also choose an arbitrary lower length than current PATH_MAX. FWIW, there is a workaround for users with deeply nested paths: the path passed does not have to be absolute. It is literally what is passed on the command line to iptables right now, including relative addresses. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 05, 2016 at 02:59:09PM -0800, Eric Dumazet wrote: > On Mon, 2016-12-05 at 23:40 +0100, Florian Westphal wrote: > > > Fair enough, I have no objections to the patch. > > An additional question is about PATH_MAX : > > Is it guaranteed to stay at 4096 forever ? > > To be safe, maybe we should use a constant of our own. Right, this reminds me we have to fix something else. So constant of our own plus something smaller, if possible, would be good to go. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, Dec 05, 2016 at 11:34:15PM +0100, Pablo Neira Ayuso wrote: > On Mon, Dec 05, 2016 at 10:30:01PM +0100, Florian Westphal wrote: > > Eric Dumazetwrote: > > > On Mon, 2016-12-05 at 15:28 -0500, Willem de Bruijn wrote: > > > > From: Willem de Bruijn > > > > > > > > Add support for attaching an eBPF object by file descriptor. > > > > > > > > The iptables binary can be called with a path to an elf object or a > > > > pinned bpf object. Also pass the mode and path to the kernel to be > > > > able to return it later for iptables dump and save. > > > > > > > > Signed-off-by: Willem de Bruijn > > > > --- > > > > > > Assuming there is no simple way to get variable matchsize in iptables, > > > this looks good to me, thanks. > > > > It should be possible by setting kernel .matchsize to ~0 which > > suppresses strict size enforcement. > > > > Its currently only used by ebt_among, but this should work for any xtables > > module. > > This is likely going to trigger a large rewrite of the core userspace > iptables codebase, and likely going to pull part of the mess we have > in ebtables into iptables. So I'd prefer not to follow this path. So this variable path is there to annotate what userspace claims that is the file that contains the bpf blob that was loaded, actually this is irrelevant to the kernel, so this is just there to dump it back when iptables-save it is called. Just a side note, one could set anything there from userspace, point somewhere else actually... Well anyway, going back to the path problem to keep it simple: Why don't just trim this down to something smaller, are you really expecting to reach PATH_MAX in your usecase? -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
Hi Willem, On 12/05/2016 09:28 PM, Willem de Bruijn wrote: From: Willem de BruijnAdd support for attaching an eBPF object by file descriptor. The iptables binary can be called with a path to an elf object or a pinned bpf object. Also pass the mode and path to the kernel to be able to return it later for iptables dump and save. Signed-off-by: Willem de Bruijn just out of pure curiosity, use case is for android guys wrt accounting, or anything specific that cls_bpf on tc ingress + egress cannot do already? Thanks, Daniel -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
Pablo Neira Ayusowrote: > On Mon, Dec 05, 2016 at 10:30:01PM +0100, Florian Westphal wrote: > > Eric Dumazet wrote: > > > On Mon, 2016-12-05 at 15:28 -0500, Willem de Bruijn wrote: > > > > From: Willem de Bruijn > > > > > > > > Add support for attaching an eBPF object by file descriptor. > > > > > > > > The iptables binary can be called with a path to an elf object or a > > > > pinned bpf object. Also pass the mode and path to the kernel to be > > > > able to return it later for iptables dump and save. > > > > > > > > Signed-off-by: Willem de Bruijn > > > > --- > > > > > > Assuming there is no simple way to get variable matchsize in iptables, > > > this looks good to me, thanks. > > > > It should be possible by setting kernel .matchsize to ~0 which > > suppresses strict size enforcement. > > > > Its currently only used by ebt_among, but this should work for any xtables > > module. > > This is likely going to trigger a large rewrite of the core userspace > iptables codebase, and likely going to pull part of the mess we have > in ebtables into iptables. So I'd prefer not to follow this path. Fair enough, I have no objections to the patch. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
Eric Dumazetwrote: > On Mon, 2016-12-05 at 15:28 -0500, Willem de Bruijn wrote: > > From: Willem de Bruijn > > > > Add support for attaching an eBPF object by file descriptor. > > > > The iptables binary can be called with a path to an elf object or a > > pinned bpf object. Also pass the mode and path to the kernel to be > > able to return it later for iptables dump and save. > > > > Signed-off-by: Willem de Bruijn > > --- > > Assuming there is no simple way to get variable matchsize in iptables, > this looks good to me, thanks. It should be possible by setting kernel .matchsize to ~0 which suppresses strict size enforcement. Its currently only used by ebt_among, but this should work for any xtables module. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH nf-next] netfilter: xt_bpf: support ebpf
On Mon, 2016-12-05 at 15:28 -0500, Willem de Bruijn wrote: > From: Willem de Bruijn> > Add support for attaching an eBPF object by file descriptor. > > The iptables binary can be called with a path to an elf object or a > pinned bpf object. Also pass the mode and path to the kernel to be > able to return it later for iptables dump and save. > > Signed-off-by: Willem de Bruijn > --- Assuming there is no simple way to get variable matchsize in iptables, this looks good to me, thanks. Reviewed-by: Eric Dumazet -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html