Re: [PATCH RFC PoC 0/3] nftables meets bpf
On Wed, 21 Feb 2018 16:30:07 -0800, Florian Fainelli wrote:
> On 02/21/2018 03:46 PM, Jakub Kicinski wrote:
> > On Tue, 20 Feb 2018 11:58:22 +0100, Pablo Neira Ayuso wrote:
> >> We also have a large range of TCAM based hardware offload out there
> >> that will _not_ work with your BPF HW offload infrastructure. What
> >> this bpf infrastructure pushes into the kernel is just a blob
> >> expressing things in a very low-level instruction-set: trying to find
> >> a mapping of that to typical HW intermediate representations in the
> >> TCAM based HW offload world will be simply crazy.
> >
> > I'm not sure where the TCAM talk is coming from. Think much smaller -
> > cellular modems/phone SoCs, 32bit ARM/MIPS router box CPUs. The
> > information the verifier is gathering will be crucial for optimizing
> > those. Please don't discount the value of being able to use
> > heterogeneous processing units by the networking stack.
>
> The only use case that we have a good answer for is when there is no HW
> offload capability available, because there, we know that eBPF is our
> best possible solution for a software fast path, in large part because
> of all the efforts that went into making it both safe and fast.

I was trying to point out that JITing eBPF for the host on 32 bit
systems is already a pain. Jiong Wang is leading an effort to improve
this both from LLVM and verifier angles, IOW running through the
verifier may become useful even for host JITs :)

> When there is offloading HW available, there does not appear to be a
> perfect answer to this problem of, given a standard Linux utility that
> can express any sort of match + action, be it ethtool::rxnfc,
> tc/cls_{u32,flower}, nftables, how do I transform that into what makes
> most sense to my HW? You could:
>
> - have hardware that understands BPF bytecode directly, great, then you
>   don't have to do anything, just pass it up the driver baby, oh wait,
>   it's not that simple, the NFP driver is not small

True, it's not the largest, but fair point. IMHO we should be trying to
push for sharing as much code between drivers as possible, and on all
fronts, but that's a topic for another time...

> - transform BPF back into something that your hardware understands, does
>   that belong in the kernel? Maybe, maybe not

Personally, I think there is a non-zero probability of AMP CPUs/systems
becoming more common. NFP is very powerful and fast, but a less advanced
solution may just use an off-the-shelf MIPS/ARM/Andes core. Taking it
slightly further from home to the cellular/WiFi wake up problem which
was mentioned by Android folks at one of netdevs - if we have a
MIPS/ARM/Andes *host* JIT in the kernel, and the NIC processor is built
on one of those, all the driver needs to provide is some glue and we can
offload filtering to the MCU on the NIC/modem!

> - use a completely different intermediate representation like P4,
>   brainfuck, I don't know
>
> Maybe first things first, we have at least 3 different programming
> interfaces, if not more: ethtool::rxnfc, tc/cls_{u32,flower}, nftables
> that are all capable of programming TCAMs and hardware capable of match
> + action, how about we start with having some sort of common library
> code that:
>
> - validates input parameters against HW capabilities

This one may be quite hard.

> - does the adequate transformation from any of these interfaces into a
>   generic set of input parameters
> - defines what the appropriate behavior is when programming through all
>   of these 3 interfaces that ultimately access the same shared piece of
>   HW, and therefore need to manage resource allocation?

That would be great! :) Flower stands out today as the most feature
rich and a go-to for TCAM offloads.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC PoC 0/3] nftables meets bpf
On 02/21/2018 03:46 PM, Jakub Kicinski wrote:
> On Tue, 20 Feb 2018 11:58:22 +0100, Pablo Neira Ayuso wrote:
>> We also have a large range of TCAM based hardware offload out there
>> that will _not_ work with your BPF HW offload infrastructure. What
>> this bpf infrastructure pushes into the kernel is just a blob
>> expressing things in a very low-level instruction-set: trying to find
>> a mapping of that to typical HW intermediate representations in the
>> TCAM based HW offload world will be simply crazy.
>
> I'm not sure where the TCAM talk is coming from. Think much smaller -
> cellular modems/phone SoCs, 32bit ARM/MIPS router box CPUs. The
> information the verifier is gathering will be crucial for optimizing
> those. Please don't discount the value of being able to use
> heterogeneous processing units by the networking stack.

The only use case that we have a good answer for is when there is no HW
offload capability available, because there, we know that eBPF is our
best possible solution for a software fast path, in large part because
of all the efforts that went into making it both safe and fast.

When there is offloading HW available, there does not appear to be a
perfect answer to this problem of, given a standard Linux utility that
can express any sort of match + action, be it ethtool::rxnfc,
tc/cls_{u32,flower}, nftables, how do I transform that into what makes
most sense to my HW? You could:

- have hardware that understands BPF bytecode directly, great, then you
  don't have to do anything, just pass it up the driver baby, oh wait,
  it's not that simple, the NFP driver is not small
- transform BPF back into something that your hardware understands, does
  that belong in the kernel? Maybe, maybe not
- use a completely different intermediate representation like P4,
  brainfuck, I don't know

Maybe first things first, we have at least 3 different programming
interfaces, if not more: ethtool::rxnfc, tc/cls_{u32,flower}, nftables
that are all capable of programming TCAMs and hardware capable of match
+ action, how about we start with having some sort of common library
code that:

- validates input parameters against HW capabilities
- does the adequate transformation from any of these interfaces into a
  generic set of input parameters
- defines what the appropriate behavior is when programming through all
  of these 3 interfaces that ultimately access the same shared piece of
  HW, and therefore need to manage resource allocation?

--
Florian
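The three library duties listed above (validate against HW capabilities,
translate each front end into generic parameters, share resource
accounting) could be sketched roughly as follows. This is only an
illustration under assumptions: every type and function name here
(flow_rule, hw_caps, flow_rule_validate, flow_rule_from_dport,
flow_rule_install) is hypothetical and does not correspond to an
existing kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical, for illustration only. */

enum flow_field {
    FLOW_F_IPV4_SRC = 1u << 0,
    FLOW_F_IPV4_DST = 1u << 1,
    FLOW_F_L4_DPORT = 1u << 2,
};

enum flow_action {
    FLOW_ACT_DROP = 0,
    FLOW_ACT_REDIRECT = 1,
};

enum flow_status {
    FLOW_OK = 0,
    FLOW_ERR_FIELD,   /* rule matches on a field the HW cannot parse */
    FLOW_ERR_ACTION,  /* HW cannot perform the requested action */
    FLOW_ERR_FULL,    /* shared rule table is exhausted */
};

/* Generic rule that every front end would be translated into. */
struct flow_rule {
    uint32_t fields;          /* bitmask of enum flow_field */
    uint32_t ipv4_src;
    uint32_t ipv4_dst;
    uint16_t l4_dport;
    enum flow_action action;
};

/* What a driver would advertise about its match + action engine. */
struct hw_caps {
    uint32_t fields;          /* supported enum flow_field bits */
    uint32_t actions;         /* bit n set means action n is supported */
    unsigned int max_rules;
    unsigned int used_rules;  /* one counter shared by all front ends */
};

/* Validate a generic rule against the advertised HW capabilities. */
static enum flow_status flow_rule_validate(const struct flow_rule *rule,
                                           const struct hw_caps *caps)
{
    assert(rule != NULL && caps != NULL);

    if (rule->fields & ~caps->fields)
        return FLOW_ERR_FIELD;
    if (!(caps->actions & (1u << rule->action)))
        return FLOW_ERR_ACTION;
    if (caps->used_rules >= caps->max_rules)
        return FLOW_ERR_FULL;
    return FLOW_OK;
}

/* Example translation: a front end that only knows "dport + action". */
static struct flow_rule flow_rule_from_dport(uint16_t dport,
                                             enum flow_action action)
{
    struct flow_rule rule = {
        .fields = FLOW_F_L4_DPORT,
        .l4_dport = dport,
        .action = action,
    };
    return rule;
}

/* Single accounting point, so all front ends see the same table usage. */
static enum flow_status flow_rule_install(const struct flow_rule *rule,
                                          struct hw_caps *caps)
{
    enum flow_status st = flow_rule_validate(rule, caps);

    if (st == FLOW_OK)
        caps->used_rules++;
    return st;
}
```

A caller would build a rule with a front-end specific translator such as
flow_rule_from_dport() and pass it to flow_rule_install(); a non-zero
status tells the front end why the HW cannot take the rule, which it
could use to fall back to a software path.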
Re: [PATCH RFC PoC 0/3] nftables meets bpf
On Tue, 20 Feb 2018 11:58:22 +0100, Pablo Neira Ayuso wrote:
> We also have a large range of TCAM based hardware offload outthere
> that will _not_ work with your BPF HW offload infrastructure. What
> this bpf infrastructure pushes into the kernel is just a blob
> expressing things in a very low-level instruction-set: trying to find
> a mapping of that to typical HW intermediate representations in the
> TCAM based HW offload world will be simply crazy.

I'm not sure where the TCAM talk is coming from. Think much smaller -
cellular modems/phone SoCs, 32bit ARM/MIPS router box CPUs. The
information the verifier is gathering will be crucial for optimizing
those. Please don't discount the value of being able to use
heterogeneous processing units by the networking stack.
Re: [PATCH RFC PoC 0/3] nftables meets bpf
Hi Pablo,

On 02/20/2018 11:58 AM, Pablo Neira Ayuso wrote:
> On Mon, Feb 19, 2018 at 08:57:39PM +0100, Daniel Borkmann wrote:
>> On 02/19/2018 05:37 PM, Pablo Neira Ayuso wrote:
>> [...]
>>> * Simplified infrastructure: We don't need the ebpf verifier complexity
>>>   either given we trust the code we generate from the kernel. We don't
>>>   need any complex userspace tooling either, just libnftnl and nft
>>>   userspace binaries.
>>>
>>> * Hardware offload: We can use this to offload rulesets to the only
>>>   smartnic driver that we have in the tree that already implements bpf
>>>   offload, hence, we can reuse this work already in place.
>>
>> In addition to Dave's points, regarding the above two, this will also only
>> work behind the verifier since NIC offloading piggy-backs on the verifier's
>> program analysis to prepare and generate a dev specific JITed BPF
>> prog, so it's not the same as normal host JITs (and there, the cBPF ->
>> eBPF in kernel migration adds a lot of headaches already due to
>> different underlying assumptions coming from the two flavors, even
>> if both are eBPF insns in the end), and given this, offloading will
>> also only work for eBPF and not cBPF.
>
> We also have a large range of TCAM based hardware offload out there
> that will _not_ work with your BPF HW offload infrastructure. What
> this bpf infrastructure pushes into the kernel is just a blob
> expressing things in a very low-level instruction-set: trying to find
> a mapping of that to typical HW intermediate representations in the
> TCAM based HW offload world will be simply crazy.

Sure, and I think that's fine; possible ways to address this were
proposed at the last netdev conference, for example adding hints [0] in
a programmable way as meta data in front of the packet as one option to
accelerate. Other than that, for fully pushing into hardware people will
get a SmartNIC, and there are multiple big vendors in that area working
on them. Potentially in a few years from now they're more and more
becoming a commodity in DCs, let's see. Maybe we'll be programming them
similarly as is the case with graphics cards today. :-)

  [0] https://www.netdevconf.org/2.2/session.html?waskiewicz-xdpacceleration-talk

>> There's a lot more the verifier is doing internally, like performing
>> various different program rewrites from the context, for helpers
>> (e.g. inlining), and for internal insn mappings that are not exposed
>> (e.g. in calls), so we definitely need to go through it.
>
> If we need to call the verifier from the kernel for the code that we
> generate there for this initial stage, that should not be an issue.
>
> The BPF interface is lacking many of the features and flexibility we
> have in netlink these days, and it is only allowing for monolithic
> ruleset replacement. This approach also loses internal rule stateful

That only depends on how you partition your program; a partial
reconfiguration is definitely possible and done so today, for example
as talked about in the LB use case where the packet processing is staged
e.g. into sampling, DDoS mitigation, and encap + redirect phases, where
each of the components can be replaced atomically during runtime. So
there is definitely flexibility available.

Thanks,
Daniel

> information that we're doing in the packet path when updating the
> ruleset. So it's taking us back to exactly the same mistakes we made
> in iptables back in the 90s as it's been mentioned already.
>
> So I just wish I can count on your help in this process, we can get
> the best of the two worlds by providing a subsystem that allows users
> to configure packet classification through one single interface, no
> matter if the policy representation ends up being in software or HW
> offloads, either TCAM or smartnic.
>
> Thanks.
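The staged design described above (sampling, DDoS mitigation, encap +
redirect, each replaceable at runtime) can be modelled in plain
userspace C. In BPF this kind of staging is commonly built from tail
calls through a BPF_MAP_TYPE_PROG_ARRAY map; the sketch below is not BPF
code, only a model of the idea that one slot can be swapped atomically
while the other stages stay in place. All names (pipeline, stage_replace,
pipeline_run, the stage functions) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model only; names are hypothetical, not a kernel interface. */

enum verdict { VERDICT_PASS, VERDICT_DROP };

struct pkt {
    unsigned int len;
    unsigned int src;
};

typedef enum verdict (*stage_fn)(const struct pkt *);

enum stage_id { STAGE_SAMPLE, STAGE_DDOS, STAGE_ENCAP, STAGE_MAX };

/* One atomically updatable slot per stage, playing the prog array role. */
static _Atomic(stage_fn) pipeline[STAGE_MAX];

/* Swap a single stage; returns the previous handler (NULL if unset). */
static stage_fn stage_replace(enum stage_id id, stage_fn fn)
{
    assert(id < STAGE_MAX);
    return atomic_exchange(&pipeline[id], fn);
}

/* Run the stages in order; an empty slot is skipped, a DROP ends the walk. */
static enum verdict pipeline_run(const struct pkt *p)
{
    for (int i = 0; i < STAGE_MAX; i++) {
        stage_fn fn = atomic_load(&pipeline[i]);

        if (fn != NULL && fn(p) == VERDICT_DROP)
            return VERDICT_DROP;
    }
    return VERDICT_PASS;
}

/* Two example handlers for the DDoS slot. */
static enum verdict pass_all(const struct pkt *p)
{
    (void)p;
    return VERDICT_PASS;
}

static enum verdict drop_src_42(const struct pkt *p)
{
    return p->src == 42 ? VERDICT_DROP : VERDICT_PASS;
}
```

Replacing only STAGE_DDOS with drop_src_42 changes the DDoS policy
without touching the sampling or encap slots, which is the partial
reconfiguration being argued for; any per-stage state would live outside
the swapped handler.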
Re: [PATCH RFC PoC 0/3] nftables meets bpf
Hi Daniel,

On Mon, Feb 19, 2018 at 08:57:39PM +0100, Daniel Borkmann wrote:
> On 02/19/2018 05:37 PM, Pablo Neira Ayuso wrote:
> [...]
> > * Simplified infrastructure: We don't need the ebpf verifier complexity
> >   either given we trust the code we generate from the kernel. We don't
> >   need any complex userspace tooling either, just libnftnl and nft
> >   userspace binaries.
> >
> > * Hardware offload: We can use this to offload rulesets to the only
> >   smartnic driver that we have in the tree that already implements bpf
> >   offload, hence, we can reuse this work already in place.
>
> In addition to Dave's points, regarding the above two, this will also only
> work behind the verifier since NIC offloading piggy-backs on the verifier's
> program analysis to prepare and generate a dev specific JITed BPF
> prog, so it's not the same as normal host JITs (and there, the cBPF ->
> eBPF in kernel migration adds a lot of headaches already due to
> different underlying assumptions coming from the two flavors, even
> if both are eBPF insns in the end), and given this, offloading will
> also only work for eBPF and not cBPF.

We also have a large range of TCAM based hardware offload out there
that will _not_ work with your BPF HW offload infrastructure. What
this bpf infrastructure pushes into the kernel is just a blob
expressing things in a very low-level instruction-set: trying to find
a mapping of that to typical HW intermediate representations in the
TCAM based HW offload world will be simply crazy.

> There's a lot more the verifier is doing internally, like performing
> various different program rewrites from the context, for helpers
> (e.g. inlining), and for internal insn mappings that are not exposed
> (e.g. in calls), so we definitely need to go through it.

If we need to call the verifier from the kernel for the code that we
generate there for this initial stage, that should not be an issue.

The BPF interface is lacking many of the features and flexibility we
have in netlink these days, and it is only allowing for monolithic
ruleset replacement. This approach also loses internal rule stateful
information that we're doing in the packet path when updating the
ruleset. So it's taking us back to exactly the same mistakes we made
in iptables back in the 90s as it's been mentioned already.

So I just wish I can count on your help in this process, we can get
the best of the two worlds by providing a subsystem that allows users
to configure packet classification through one single interface, no
matter if the policy representation ends up being in software or HW
offloads, either TCAM or smartnic.

Thanks.
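For readers less familiar with the TCAM side of this argument, a
minimal software model of a ternary match table is sketched below: each
entry is a value/mask pair plus an action, and the first matching entry
in priority order wins. The struct layout, the function name tcam_lookup
and the action constants are assumptions made for illustration, not any
driver's real representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action codes for the example. */
enum { ACT_MISS = -1, ACT_DROP = 0, ACT_ACCEPT = 1 };

/* One ternary entry: mask bits set to 1 are compared, 0 bits are ignored. */
struct tcam_entry {
    uint32_t value;
    uint32_t mask;
    int action;
};

/*
 * Software model of a TCAM lookup: entries are ordered by priority and
 * the first one whose masked value equals the masked key wins. Real
 * hardware compares all entries in parallel; the loop only models the
 * result, not the timing.
 */
static int tcam_lookup(const struct tcam_entry *tbl, size_t n,
                       uint32_t key, int miss_action)
{
    assert(tbl != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if ((key & tbl[i].mask) == (tbl[i].value & tbl[i].mask))
            return tbl[i].action;
    }
    return miss_action;
}
```

A rule expressed as "field, value, mask, action" maps onto such an entry
directly, whereas a program expressed as a sequence of low-level
instructions would first have to be analysed to recover that structure,
which is the mapping difficulty raised above.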
Re: [PATCH RFC PoC 0/3] nftables meets bpf
On 02/19/2018 05:37 PM, Pablo Neira Ayuso wrote:
[...]
> * Simplified infrastructure: We don't need the ebpf verifier complexity
>   either given we trust the code we generate from the kernel. We don't
>   need any complex userspace tooling either, just libnftnl and nft
>   userspace binaries.
>
> * Hardware offload: We can use this to offload rulesets to the only
>   smartnic driver that we have in the tree that already implements bpf
>   offload, hence, we can reuse this work already in place.

In addition to Dave's points, regarding the above two, this will also only
work behind the verifier since NIC offloading piggy-backs on the verifier's
program analysis to prepare and generate a dev specific JITed BPF
prog, so it's not the same as normal host JITs (and there, the cBPF ->
eBPF in kernel migration adds a lot of headaches already due to
different underlying assumptions coming from the two flavors, even
if both are eBPF insns in the end), and given this, offloading will
also only work for eBPF and not cBPF.

There's a lot more the verifier is doing internally, like performing
various different program rewrites from the context, for helpers
(e.g. inlining), and for internal insn mappings that are not exposed
(e.g. in calls), so we definitely need to go through it.

Thanks,
Daniel