Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver
On 5/4/23 6:47 AM, Daniel P. Berrangé wrote: On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: This patch series enables libvirt to use nftables rules rather than iptables *when setting up virtual networks* (it does *not* add nftables support to the nwfilter driver). It accomplishes this by abstracting several iptables functions (from viriptables.[ch] called by the virtual network driver into a rudimentary "virNetfilter API" (in virnetfilter.[ch], having the virtual network driver call the virNetFilter API rather than calling the existing iptables functions directly, and then finally adding an equivalent virNftables backend that can be used instead of iptables (selected manually via a network.conf setting, or automatically if iptables isn't found on the host). A first look at the result may have you thinking that it's filled with a lot of bad decisions. While I would agree with that in many cases, I think that overall they are the "least bad" decisions, or at least "bad within acceptable limits / no worse than something else", and point out that it's been done in a way that minimizes (actually eliminates) the need for immediate changes to nwfilter (the other consumer of iptables, which *also* needs to be updated to use native nftables), and makes it much easier to change our mind about the details in the future. When I first started on this (long, protracted, repeatedly interrupted for extended periods - many of these patches are > a year old) task, I considered doing an all-at-once complete replacement of iptables with nftables, since all the Linux distros we support have had nftables for several years, and I'm pretty sure nobody has it disabled (not even sure if it's possible to disable nftables while still enabling iptables, since they both use xtables in the kernel). But due to libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit fd5b15ff all the way back in July 2010 for details) which has no equivalent in nftables rules (and we don't *want* it to!!), and the desire to be able to easily switch back to iptables in case of an unforeseen regression, we decided that both iptables and nftables need to be supported (for now), with the default (for now) remaining as iptables. Just allowing for dual backends complicated matters, since it means that we have to have a config file, a setting, detection of which backends are available, and of course some sort of concept of an abstracted frontend that can use either backend based on the config setting (and/or auto-detection). Combining that with the fact that it would just be "too big" of a project to switch over nwfilter's iptables usage at the same time means that we have to keep around a lot of existing code for compatibility's sake rather than just wiping it all away and starting over. So, what I've ended up with is: 1) a network.conf file (didn't exist before) with a single setting "firewall_backend". If unset, the network driver tries to use iptables on the backend, and if that's missing, then tries to use nftables. When testing your git branch active-nft-10 leavnig it unset didn't work: Running './src/libvirtd'... 2023-05-04 10:16:11.447+: 115377: info : libvirt version: 9.3.0 2023-05-04 10:16:11.447+: 115377: info : hostname: localhost.localdomain 2023-05-04 10:16:11.447+: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.473+: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected Ugh :-( I guess I did brak something with all the pre-posting cleanups after my last round of testing, as that was working just fine as of the end of last week.. I'll see what I broke, and if it's simple I'll update the branch on gitlab so that you can try it out if you want (even though it's obvious I need to change some things)
Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver
On 5/3/23 11:40 AM, Daniel P. Berrangé wrote: On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: When I first started on this (long, protracted, repeatedly interrupted for extended periods - many of these patches are > a year old) task, I considered doing an all-at-once complete replacement of iptables with nftables, since all the Linux distros we support have had nftables for several years, and I'm pretty sure nobody has it disabled (not even sure if it's possible to disable nftables while still enabling iptables, since they both use xtables in the kernel). But due to libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit fd5b15ff all the way back in July 2010 for details) which has no equivalent in nftables rules (and we don't *want* it to!!), and the desire to be able to easily switch back to iptables in case of an unforeseen regression, we decided that both iptables and nftables need to be supported (for now), with the default (for now) remaining as iptables. What happens if you use iptables -j CHECKSUM --checksum-fill, and this is just the iptables compat shim that is secretly talking to NFT in the kernel ? Is the checksum stuff just ignored entirely then ? If so, then the key decision factor for us, is whether any of the supported distros still officially support use of the iptables KMOD instead of nft KMOD. So, what I've ended up with is: 1) a network.conf file (didn't exist before) with a single setting "firewall_backend". If unset, the network driver tries to use iptables on the backend, and if that's missing, then tries to use nftables. For most distros I feel like nftables should be the default and not require a user opt in. That raises the question of how do we deal with the upgrade path. Historically when starting libvirt we'll blow away our existing iptables rules and create them fresh. For an upgrade path we'll need a variant which blows away iptables rules and instead creates nftables rules. If that works, then we could default to nftables without user config. Since my patches do that, I would be totally fine with preferring nftables over iptables when no config is present. It would definitely give a better feeling of "modernizing" if we could do that. (Actually, I was considering that when I tweak Eric Garver's patches for native firewalld networks, I would prefer firewalld over nftables if firewalld.service is running. In that case, the order of preference (still overrideable, of course) would be firewalld->nftables->iptables, which from a functionality standpoint makes the most sense.) We had spent some time in IRC discussing different ways of using new functionality available in nftables to make a more efficient/performant implemention of the desired filtering, and there are some really great possibilities that need to be explored, but in the end there were too many details up in the air, and I decided that it would be more "accomplishable" (coined a new word there!) to first replicate existing behavior with nftables, but do it inside a framework that makes it easy to modify the details in the future (in particular making it painless to switch back and forth between builds with differing filter models at runtime) - this way we'll be able to separate the infrastructure work from the details of the rules (which we can then more easily work on and experiment with). (This implies that the main objective right now is "get rid of iptables dependencies", not "make the filtering faster and more efficient"). I feel like filtering wrt virtual networks isn't our main performance bottleneck. Most machines only have a couple of virtual networks, so most host traffic only has to match a few rules. With nwfilter though, if we have 1000 VMs we'll have 1000 top level rules that need evaluating for every packet. IOW, the performance optimizations allowed by nftables are much more relevant to nwfilter. That's a good point, and makes me feel better about not trying to immediately optimize the rules created here. (although I do recall someone once a long time ago who filed a bug about poor performance when they had > 500 virtual networks :-P) (OTOH, I've been reading through your response to 18/28, and what you're suggesting there, in the name of simpler removal rather than runtime performance, is changing what rules are generated in the initial version. I'm still wrapping my head around that - I had confused "chain" with "hook" in a strange way, and thought that the behavior was different than it apparently is). * allows switching between iptables/nftables backends without rebooting or restarting networks/guests. Because the commands required to remove a network's filter rules are now saved in the network status XML, each time libvirtd (or virtnetworkd) is restarted, it will execute exactly the commands needed to remove the filter rules that had been added by the previous libvirtd/virtnetworkd (rather than just
Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver
On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: > This patch series enables libvirt to use nftables rules rather than > iptables *when setting up virtual networks* (it does *not* add > nftables support to the nwfilter driver). It accomplishes this by > abstracting several iptables functions (from viriptables.[ch] called > by the virtual network driver into a rudimentary "virNetfilter API" > (in virnetfilter.[ch], having the virtual network driver call the > virNetFilter API rather than calling the existing iptables functions > directly, and then finally adding an equivalent virNftables backend > that can be used instead of iptables (selected manually via a > network.conf setting, or automatically if iptables isn't found on the > host). > > A first look at the result may have you thinking that it's filled with > a lot of bad decisions. While I would agree with that in many cases, I > think that overall they are the "least bad" decisions, or at least > "bad within acceptable limits / no worse than something else", and > point out that it's been done in a way that minimizes (actually > eliminates) the need for immediate changes to nwfilter (the other > consumer of iptables, which *also* needs to be updated to use native > nftables), and makes it much easier to change our mind about the > details in the future. > > When I first started on this (long, protracted, repeatedly interrupted > for extended periods - many of these patches are > a year old) task, I > considered doing an all-at-once complete replacement of iptables with > nftables, since all the Linux distros we support have had nftables for > several years, and I'm pretty sure nobody has it disabled (not even > sure if it's possible to disable nftables while still enabling > iptables, since they both use xtables in the kernel). But due to > libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit > fd5b15ff all the way back in July 2010 for details) which has no > equivalent in nftables rules (and we don't *want* it to!!), and the > desire to be able to easily switch back to iptables in case of an > unforeseen regression, we decided that both iptables and nftables need > to be supported (for now), with the default (for now) remaining as > iptables. > > Just allowing for dual backends complicated matters, since it means > that we have to have a config file, a setting, detection of which > backends are available, and of course some sort of concept of an > abstracted frontend that can use either backend based on the config > setting (and/or auto-detection). Combining that with the fact that it > would just be "too big" of a project to switch over nwfilter's > iptables usage at the same time means that we have to keep around a > lot of existing code for compatibility's sake rather than just wiping > it all away and starting over. > > So, what I've ended up with is: > > 1) a network.conf file (didn't exist before) with a single setting > "firewall_backend". If unset, the network driver tries to use iptables > on the backend, and if that's missing, then tries to use nftables. When testing your git branch active-nft-10 leavnig it unset didn't work: Running './src/libvirtd'... 2023-05-04 10:16:11.447+: 115377: info : libvirt version: 9.3.0 2023-05-04 10:16:11.447+: 115377: info : hostname: localhost.localdomain 2023-05-04 10:16:11.447+: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.473+: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver
On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: > When I first started on this (long, protracted, repeatedly interrupted > for extended periods - many of these patches are > a year old) task, I > considered doing an all-at-once complete replacement of iptables with > nftables, since all the Linux distros we support have had nftables for > several years, and I'm pretty sure nobody has it disabled (not even > sure if it's possible to disable nftables while still enabling > iptables, since they both use xtables in the kernel). But due to > libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit > fd5b15ff all the way back in July 2010 for details) which has no > equivalent in nftables rules (and we don't *want* it to!!), and the > desire to be able to easily switch back to iptables in case of an > unforeseen regression, we decided that both iptables and nftables need > to be supported (for now), with the default (for now) remaining as > iptables. What happens if you use iptables -j CHECKSUM --checksum-fill, and this is just the iptables compat shim that is secretly talking to NFT in the kernel ? Is the checksum stuff just ignored entirely then ? If so, then the key decision factor for us, is whether any of the supported distros still officially support use of the iptables KMOD instead of nft KMOD. > So, what I've ended up with is: > > 1) a network.conf file (didn't exist before) with a single setting > "firewall_backend". If unset, the network driver tries to use iptables > on the backend, and if that's missing, then tries to use nftables. For most distros I feel like nftables should be the default and not require a user opt in. That raises the question of how do we deal with the upgrade path. Historically when starting libvirt we'll blow away our existing iptables rules and create them fresh. For an upgrade path we'll need a variant which blows away iptables rules and instead creates nftables rules. If that works, then we could default to nftables without user config. > We had spent some time in IRC discussing different ways of using new > functionality available in nftables to make a more > efficient/performant implemention of the desired filtering, and there > are some really great possibilities that need to be explored, but in > the end there were too many details up in the air, and I decided that > it would be more "accomplishable" (coined a new word there!) to first > replicate existing behavior with nftables, but do it inside a > framework that makes it easy to modify the details in the future (in > particular making it painless to switch back and forth between builds > with differing filter models at runtime) - this way we'll be able to > separate the infrastructure work from the details of the rules (which > we can then more easily work on and experiment with). (This implies > that the main objective right now is "get rid of iptables > dependencies", not "make the filtering faster and more efficient"). I feel like filtering wrt virtual networks isn't our main performance bottleneck. Most machines only have a couple of virtual networks, so most host traffic only has to match a few rules. With nwfilter though, if we have 1000 VMs we'll have 1000 top level rules that need evaluating for every packet. IOW, the performance optimizations allowed by nftables are much more relevant to nwfilter. > * allows switching between iptables/nftables backends without > rebooting or restarting networks/guests. > > Because the commands required to remove a network's filter rules are > now saved in the network status XML, each time libvirtd (or > virtnetworkd) is restarted, it will execute exactly the commands > needed to remove the filter rules that had been added by the > previous libvirtd/virtnetworkd (rather than just making a guess, as > we've always done up until now), and then add new rules using the > current backend+binary's set of rules (while also saving the info > needed for future removal of these new rules back into the network's > status XML). Ok that's answering my question about upgrades then. It looks like we should be able to default to nftables out of the box, if we assume that absence of info in the network status XML implies iptables. > * virFirewall does *not* provide a backend-agnostic interface [this is fine] > > * We need to maintain a backward-compatible API for virFirewall so > that we don't have to touch nwfilter code. Trying to make its API > backend-agnostic would require individually considering/changing > every nwfilter use of virFirewall. > > * instead virFirewall objects are just a way to build a collection > of commands to execute to build a firewall, then execute them > while collecting info for and building a collection of commands > that will tear down that firewall in the future. > > Do I want to "fix" this in the future by making virFirewall a higher > level interface that accepts
Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver
On a Sunday in 2023, Laine Stump wrote: This patch series enables libvirt to use nftables rules rather than iptables *when setting up virtual networks* (it does *not* add nftables support to the nwfilter driver). It accomplishes this by getting these patches in. [... 150 lines delted ...] Laine Stump (28): util: add -w/--concurrent when applying the rule rather than when building it util: new virFirewallRuleGet*() APIs util: determine ignoreErrors value when creating rule, not when applying util: rename iptables helpers that will become the frontend for ip util: move backend-agnostic virNetfilter*() functions to their own file util: make netfilter action a proper typedefed (virFirewall) enum util: #define the names used for private packet filter chains util: move/rename virFirewallApplyRuleDirect to virIptablesApplyFirewallRule util/network: reintroduce virFirewallBackend, but different network: add (empty) network.conf file to distribution files network: allow setting firewallBackend from network.conf network: do not add DHCP checksum mangle rule unless using iptables network: call backend agnostic function to init private filter chains util: setup functions in virnetfilter which will call appropriate backend build: add nft to the list of binaries we attempt to locate util: add nftables backend to virnetfilter API used by network driver tests: test cases for nftables backend util: new functions to support adding individual rollback rules util: check for 0 args when applying iptables rule util: implement rollback rule autosave for iptables backend util: implement rollback rule autosave for nftables backend network: turn on auto-rollback for the rules added for virtual networks util: new function virFirewallNewFromRollback() util: new functions virFirewallParseXML() and virFirewallFormat() conf: add a virFirewall object to virNetworkObj network: use previously saved list of firewall rules when removing network: save network status when firewall rules are reloaded network: improve log message when reloading virtual network firewall rules libvirt.spec.in | 5 + meson.build | 1 + po/POTFILES | 2 + src/conf/virnetworkobj.c | 40 + src/conf/virnetworkobj.h | 11 + src/libvirt_private.syms | 68 +- src/network/bridge_driver.c | 40 +- src/network/bridge_driver_conf.c | 44 + src/network/bridge_driver_conf.h | 3 + src/network/bridge_driver_linux.c | 241 +++-- src/network/bridge_driver_nop.c | 6 +- src/network/bridge_driver_platform.h | 6 +- src/network/libvirtd_network.aug | 39 + src/network/meson.build | 11 + src/network/network.conf | 24 + src/network/test_libvirtd_network.aug.in | 5 + src/nwfilter/nwfilter_ebiptables_driver.c | 16 +- src/util/meson.build | 2 + src/util/virebtables.c| 4 +- src/util/virfirewall.c| 490 -- src/util/virfirewall.h| 51 +- src/util/viriptables.c| 762 --- src/util/viriptables.h| 222 ++--- src/util/virnetfilter.c | 892 ++ src/util/virnetfilter.h | 159 src/util/virnftables.c| 698 ++ src/util/virnftables.h| 118 +++ .../{base.args => base.iptables} | 0 tests/networkxml2firewalldata/base.nftables | 256 + ...-linux.args => nat-default-linux.iptables} | 0 .../nat-default-linux.nftables| 248 + ...pv6-linux.args => nat-ipv6-linux.iptables} | 0 .../nat-ipv6-linux.nftables | 384 ...rgs => nat-ipv6-masquerade-linux.iptables} | 0 .../nat-ipv6-masquerade-linux.nftables| 456 + ...linux.args => nat-many-ips-linux.iptables} | 0 .../nat-many-ips-linux.nftables | 472 + ...-linux.args => nat-no-dhcp-linux.iptables} | 0 .../nat-no-dhcp-linux.nftables| 384 ...ftp-linux.args => nat-tftp-linux.iptables} | 0 .../nat-tftp-linux.nftables | 274 ++ ...inux.args => route-default-linux.iptables} | 0 .../route-default-linux.nftables | 162 tests/networkxml2firewalltest.c | 56 +- tests/virfirewalltest.c | 20 +- 45 files changed, 5718 insertions(+), 954 deletions(-) Reviewed-by: Ján Tomko Jano signature.asc Description: PGP signature
Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver
On 5/1/23 05:19, Laine Stump wrote: > > 45 files changed, 5718 insertions(+), 954 deletions(-) Reviewed-by: Michal Privoznik Michal