Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

2023-05-05 Thread Laine Stump

On 5/4/23 6:47 AM, Daniel P. Berrangé wrote:

On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote:

This patch series enables libvirt to use nftables rules rather than
iptables *when setting up virtual networks* (it does *not* add
nftables support to the nwfilter driver). It accomplishes this by
abstracting several iptables functions (from viriptables.[ch] called
by the virtual network driver into a rudimentary "virNetfilter API"
(in virnetfilter.[ch], having the virtual network driver call the
virNetFilter API rather than calling the existing iptables functions
directly, and then finally adding an equivalent virNftables backend
that can be used instead of iptables (selected manually via a
network.conf setting, or automatically if iptables isn't found on the
host).

A first look at the result may have you thinking that it's filled with
a lot of bad decisions. While I would agree with that in many cases, I
think that overall they are the "least bad" decisions, or at least
"bad within acceptable limits / no worse than something else", and
point out that it's been done in a way that minimizes (actually
eliminates) the need for immediate changes to nwfilter (the other
consumer of iptables, which *also* needs to be updated to use native
nftables), and makes it much easier to change our mind about the
details in the future.

When I first started on this (long, protracted, repeatedly interrupted
for extended periods - many of these patches are > a year old) task, I
considered doing an all-at-once complete replacement of iptables with
nftables, since all the Linux distros we support have had nftables for
several years, and I'm pretty sure nobody has it disabled (not even
sure if it's possible to disable nftables while still enabling
iptables, since they both use xtables in the kernel). But due to
libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit
fd5b15ff all the way back in July 2010 for details) which has no
equivalent in nftables rules (and we don't *want* it to!!), and the
desire to be able to easily switch back to iptables in case of an
unforeseen regression, we decided that both iptables and nftables need
to be supported (for now), with the default (for now) remaining as
iptables.

Just allowing for dual backends complicated matters, since it means
that we have to have a config file, a setting, detection of which
backends are available, and of course some sort of concept of an
abstracted frontend that can use either backend based on the config
setting (and/or auto-detection). Combining that with the fact that it
would just be "too big" of a project to switch over nwfilter's
iptables usage at the same time means that we have to keep around a
lot of existing code for compatibility's sake rather than just wiping
it all away and starting over.

So, what I've ended up with is:

1) a network.conf file (didn't exist before) with a single setting
"firewall_backend". If unset, the network driver tries to use iptables
on the backend, and if that's missing, then tries to use nftables.


When testing your git branch active-nft-10 leavnig it unset didn't
work:

Running './src/libvirtd'...
2023-05-04 10:16:11.447+: 115377: info : libvirt version: 9.3.0
2023-05-04 10:16:11.447+: 115377: info : hostname: localhost.localdomain
2023-05-04 10:16:11.447+: 115377: error : virFirewallNew:118 : internal 
error: firewall_backend wasn't set, and no usable setting could be auto-detected
2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 
: internal error: firewall_backend wasn't set, and no usable setting could be 
auto-detected
2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 
: internal error: firewall_backend wasn't set, and no usable setting could be 
auto-detected
2023-05-04 10:16:11.473+: 115377: error : virFirewallNew:118 : internal 
error: firewall_backend wasn't set, and no usable setting could be auto-detected


Ugh :-(

I guess I did brak something with all the pre-posting cleanups after my 
last round of testing, as that was working just fine as of the end of 
last week.. I'll see what I broke, and if it's simple I'll update the 
branch on gitlab so that you can try it out if you want (even though 
it's obvious I need to change some things)




Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

2023-05-05 Thread Laine Stump

On 5/3/23 11:40 AM, Daniel P. Berrangé wrote:

On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote:

When I first started on this (long, protracted, repeatedly interrupted
for extended periods - many of these patches are > a year old) task, I
considered doing an all-at-once complete replacement of iptables with
nftables, since all the Linux distros we support have had nftables for
several years, and I'm pretty sure nobody has it disabled (not even
sure if it's possible to disable nftables while still enabling
iptables, since they both use xtables in the kernel). But due to
libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit
fd5b15ff all the way back in July 2010 for details) which has no
equivalent in nftables rules (and we don't *want* it to!!), and the
desire to be able to easily switch back to iptables in case of an
unforeseen regression, we decided that both iptables and nftables need
to be supported (for now), with the default (for now) remaining as
iptables.


What happens if you use iptables -j CHECKSUM --checksum-fill, and
this is just the iptables compat shim that is secretly talking to
NFT in the kernel ?  Is the checksum stuff just ignored entirely
then ?

If so, then the key decision factor for us, is whether any of the
supported distros still officially support use of the iptables KMOD
instead of nft KMOD.


So, what I've ended up with is:

1) a network.conf file (didn't exist before) with a single setting
"firewall_backend". If unset, the network driver tries to use iptables
on the backend, and if that's missing, then tries to use nftables.


For most distros I feel like nftables should be the default
and not require a user opt in.  That raises the question of
how do we deal with the upgrade path.

Historically when starting libvirt we'll blow away our existing
iptables rules and create them fresh. For an upgrade path we'll
need a variant which blows away iptables rules and instead creates
nftables rules. If that works, then we could default to nftables
without user config.


Since my patches do that, I would be totally fine with preferring 
nftables over iptables when no config is present. It would definitely 
give a better feeling of "modernizing" if we could do that.


(Actually, I was considering that when I tweak Eric Garver's patches for 
native firewalld networks, I would prefer firewalld over nftables if 
firewalld.service is running. In that case, the order of preference 
(still overrideable, of course) would be firewalld->nftables->iptables, 
which from a functionality standpoint makes the most sense.)







We had spent some time in IRC discussing different ways of using new
functionality available in nftables to make a more
efficient/performant implemention of the desired filtering, and there
are some really great possibilities that need to be explored, but in
the end there were too many details up in the air, and I decided that
it would be more "accomplishable" (coined a new word there!) to first
replicate existing behavior with nftables, but do it inside a
framework that makes it easy to modify the details in the future (in
particular making it painless to switch back and forth between builds
with differing filter models at runtime) - this way we'll be able to
separate the infrastructure work from the details of the rules (which
we can then more easily work on and experiment with). (This implies
that the main objective right now is "get rid of iptables
dependencies", not "make the filtering faster and more efficient").


I feel like filtering wrt virtual networks isn't our main
performance bottleneck. Most machines only have a couple
of virtual networks, so most host traffic only has to
match a few rules.

With nwfilter though, if we have 1000 VMs we'll have
1000 top level rules that need evaluating for every
packet. IOW, the performance optimizations allowed
by nftables are much more relevant to nwfilter.


That's a good point, and makes me feel better about not trying to 
immediately optimize the rules created here. (although I do recall 
someone once a long time ago who filed a bug about poor performance when 
they had > 500 virtual networks :-P)


(OTOH, I've been reading through your response to 18/28, and what you're 
suggesting there, in the name of simpler removal rather than runtime 
performance, is changing what rules are generated in the initial 
version. I'm still wrapping my head around that - I had confused "chain" 
with "hook" in a strange way, and thought that the behavior was 
different than it apparently is).





* allows switching between iptables/nftables backends without
   rebooting or restarting networks/guests.

   Because the commands required to remove a network's filter rules are
   now saved in the network status XML, each time libvirtd (or
   virtnetworkd) is restarted, it will execute exactly the commands
   needed to remove the filter rules that had been added by the
   previous libvirtd/virtnetworkd (rather than just 

Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

2023-05-04 Thread Daniel P . Berrangé
On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote:
> This patch series enables libvirt to use nftables rules rather than
> iptables *when setting up virtual networks* (it does *not* add
> nftables support to the nwfilter driver). It accomplishes this by
> abstracting several iptables functions (from viriptables.[ch] called
> by the virtual network driver into a rudimentary "virNetfilter API"
> (in virnetfilter.[ch], having the virtual network driver call the
> virNetFilter API rather than calling the existing iptables functions
> directly, and then finally adding an equivalent virNftables backend
> that can be used instead of iptables (selected manually via a
> network.conf setting, or automatically if iptables isn't found on the
> host).
> 
> A first look at the result may have you thinking that it's filled with
> a lot of bad decisions. While I would agree with that in many cases, I
> think that overall they are the "least bad" decisions, or at least
> "bad within acceptable limits / no worse than something else", and
> point out that it's been done in a way that minimizes (actually
> eliminates) the need for immediate changes to nwfilter (the other
> consumer of iptables, which *also* needs to be updated to use native
> nftables), and makes it much easier to change our mind about the
> details in the future.
> 
> When I first started on this (long, protracted, repeatedly interrupted
> for extended periods - many of these patches are > a year old) task, I
> considered doing an all-at-once complete replacement of iptables with
> nftables, since all the Linux distros we support have had nftables for
> several years, and I'm pretty sure nobody has it disabled (not even
> sure if it's possible to disable nftables while still enabling
> iptables, since they both use xtables in the kernel). But due to
> libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit
> fd5b15ff all the way back in July 2010 for details) which has no
> equivalent in nftables rules (and we don't *want* it to!!), and the
> desire to be able to easily switch back to iptables in case of an
> unforeseen regression, we decided that both iptables and nftables need
> to be supported (for now), with the default (for now) remaining as
> iptables.
> 
> Just allowing for dual backends complicated matters, since it means
> that we have to have a config file, a setting, detection of which
> backends are available, and of course some sort of concept of an
> abstracted frontend that can use either backend based on the config
> setting (and/or auto-detection). Combining that with the fact that it
> would just be "too big" of a project to switch over nwfilter's
> iptables usage at the same time means that we have to keep around a
> lot of existing code for compatibility's sake rather than just wiping
> it all away and starting over.
> 
> So, what I've ended up with is:
> 
> 1) a network.conf file (didn't exist before) with a single setting
> "firewall_backend". If unset, the network driver tries to use iptables
> on the backend, and if that's missing, then tries to use nftables.

When testing your git branch active-nft-10 leavnig it unset didn't
work:

Running './src/libvirtd'...
2023-05-04 10:16:11.447+: 115377: info : libvirt version: 9.3.0
2023-05-04 10:16:11.447+: 115377: info : hostname: localhost.localdomain
2023-05-04 10:16:11.447+: 115377: error : virFirewallNew:118 : internal 
error: firewall_backend wasn't set, and no usable setting could be auto-detected
2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 
: internal error: firewall_backend wasn't set, and no usable setting could be 
auto-detected
2023-05-04 10:16:11.447+: 115377: error : virNetFilterBackendUnsetError:51 
: internal error: firewall_backend wasn't set, and no usable setting could be 
auto-detected
2023-05-04 10:16:11.473+: 115377: error : virFirewallNew:118 : internal 
error: firewall_backend wasn't set, and no usable setting could be auto-detected


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

2023-05-03 Thread Daniel P . Berrangé
On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote:
> When I first started on this (long, protracted, repeatedly interrupted
> for extended periods - many of these patches are > a year old) task, I
> considered doing an all-at-once complete replacement of iptables with
> nftables, since all the Linux distros we support have had nftables for
> several years, and I'm pretty sure nobody has it disabled (not even
> sure if it's possible to disable nftables while still enabling
> iptables, since they both use xtables in the kernel). But due to
> libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit
> fd5b15ff all the way back in July 2010 for details) which has no
> equivalent in nftables rules (and we don't *want* it to!!), and the
> desire to be able to easily switch back to iptables in case of an
> unforeseen regression, we decided that both iptables and nftables need
> to be supported (for now), with the default (for now) remaining as
> iptables.

What happens if you use iptables -j CHECKSUM --checksum-fill, and
this is just the iptables compat shim that is secretly talking to
NFT in the kernel ?  Is the checksum stuff just ignored entirely
then ?

If so, then the key decision factor for us, is whether any of the
supported distros still officially support use of the iptables KMOD
instead of nft KMOD.

> So, what I've ended up with is:
> 
> 1) a network.conf file (didn't exist before) with a single setting
> "firewall_backend". If unset, the network driver tries to use iptables
> on the backend, and if that's missing, then tries to use nftables.

For most distros I feel like nftables should be the default
and not require a user opt in.  That raises the question of
how do we deal with the upgrade path.

Historically when starting libvirt we'll blow away our existing
iptables rules and create them fresh. For an upgrade path we'll
need a variant which blows away iptables rules and instead creates
nftables rules. If that works, then we could default to nftables
without user config.


> We had spent some time in IRC discussing different ways of using new
> functionality available in nftables to make a more
> efficient/performant implemention of the desired filtering, and there
> are some really great possibilities that need to be explored, but in
> the end there were too many details up in the air, and I decided that
> it would be more "accomplishable" (coined a new word there!) to first
> replicate existing behavior with nftables, but do it inside a
> framework that makes it easy to modify the details in the future (in
> particular making it painless to switch back and forth between builds
> with differing filter models at runtime) - this way we'll be able to
> separate the infrastructure work from the details of the rules (which
> we can then more easily work on and experiment with). (This implies
> that the main objective right now is "get rid of iptables
> dependencies", not "make the filtering faster and more efficient").

I feel like filtering wrt virtual networks isn't our main
performance bottleneck. Most machines only have a couple
of virtual networks, so most host traffic only has to
match a few rules.

With nwfilter though, if we have 1000 VMs we'll have
1000 top level rules that need evaluating for every
packet. IOW, the performance optimizations allowed
by nftables are much more relevant to nwfilter.

> * allows switching between iptables/nftables backends without
>   rebooting or restarting networks/guests.
> 
>   Because the commands required to remove a network's filter rules are
>   now saved in the network status XML, each time libvirtd (or
>   virtnetworkd) is restarted, it will execute exactly the commands
>   needed to remove the filter rules that had been added by the
>   previous libvirtd/virtnetworkd (rather than just making a guess, as
>   we've always done up until now), and then add new rules using the
>   current backend+binary's set of rules (while also saving the info
>   needed for future removal of these new rules back into the network's
>   status XML).

Ok that's answering my question about upgrades then. It looks like we
should be able to default to nftables out of the box, if we assume
that absence of info in the network status XML implies iptables.

> * virFirewall does *not* provide a backend-agnostic interface [this is fine]
> 
>   * We need to maintain a backward-compatible API for virFirewall so
> that we don't have to touch nwfilter code. Trying to make its API
> backend-agnostic would require individually considering/changing
> every nwfilter use of virFirewall.
> 
>   * instead virFirewall objects are just a way to build a collection
> of commands to execute to build a firewall, then execute them
> while collecting info for and building a collection of commands
> that will tear down that firewall in the future.
> 
>   Do I want to "fix" this in the future by making virFirewall a higher
>   level interface that accepts 

Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

2023-05-03 Thread Ján Tomko

On a Sunday in 2023, Laine Stump wrote:

This patch series enables libvirt to use nftables rules rather than
iptables *when setting up virtual networks* (it does *not* add
nftables support to the nwfilter driver). It accomplishes this by
 getting these patches in.


[... 150 lines delted ...]


Laine Stump (28):
 util: add -w/--concurrent when applying the rule rather than when
   building it
 util: new virFirewallRuleGet*() APIs
 util: determine ignoreErrors value when creating rule, not when
   applying
 util: rename iptables helpers that will become the frontend for
   ip
 util: move backend-agnostic virNetfilter*() functions to their own
   file
 util: make netfilter action a proper typedefed (virFirewall) enum
 util: #define the names used for private packet filter chains
 util: move/rename virFirewallApplyRuleDirect to
   virIptablesApplyFirewallRule
 util/network: reintroduce virFirewallBackend, but different
 network: add (empty) network.conf file to distribution files
 network: allow setting firewallBackend from network.conf
 network: do not add DHCP checksum mangle rule unless using iptables
 network: call backend agnostic function to init private filter chains
 util: setup functions in virnetfilter which will call appropriate
   backend
 build: add nft to the list of binaries we attempt to locate
 util: add nftables backend to virnetfilter API used by network driver
 tests: test cases for nftables backend
 util: new functions to support adding individual rollback rules
 util: check for 0 args when applying iptables rule
 util: implement rollback rule autosave for iptables backend
 util: implement rollback rule autosave for nftables backend
 network: turn on auto-rollback for the rules added for virtual
   networks
 util: new function virFirewallNewFromRollback()
 util: new functions virFirewallParseXML() and virFirewallFormat()
 conf: add a virFirewall object to virNetworkObj
 network: use previously saved list of firewall rules when removing
 network: save network status when firewall rules are reloaded
 network: improve log message when reloading virtual network firewall
   rules

libvirt.spec.in   |   5 +
meson.build   |   1 +
po/POTFILES   |   2 +
src/conf/virnetworkobj.c  |  40 +
src/conf/virnetworkobj.h  |  11 +
src/libvirt_private.syms  |  68 +-
src/network/bridge_driver.c   |  40 +-
src/network/bridge_driver_conf.c  |  44 +
src/network/bridge_driver_conf.h  |   3 +
src/network/bridge_driver_linux.c | 241 +++--
src/network/bridge_driver_nop.c   |   6 +-
src/network/bridge_driver_platform.h  |   6 +-
src/network/libvirtd_network.aug  |  39 +
src/network/meson.build   |  11 +
src/network/network.conf  |  24 +
src/network/test_libvirtd_network.aug.in  |   5 +
src/nwfilter/nwfilter_ebiptables_driver.c |  16 +-
src/util/meson.build  |   2 +
src/util/virebtables.c|   4 +-
src/util/virfirewall.c| 490 --
src/util/virfirewall.h|  51 +-
src/util/viriptables.c| 762 ---
src/util/viriptables.h| 222 ++---
src/util/virnetfilter.c   | 892 ++
src/util/virnetfilter.h   | 159 
src/util/virnftables.c| 698 ++
src/util/virnftables.h| 118 +++
.../{base.args => base.iptables}  |   0
tests/networkxml2firewalldata/base.nftables   | 256 +
...-linux.args => nat-default-linux.iptables} |   0
.../nat-default-linux.nftables| 248 +
...pv6-linux.args => nat-ipv6-linux.iptables} |   0
.../nat-ipv6-linux.nftables   | 384 
...rgs => nat-ipv6-masquerade-linux.iptables} |   0
.../nat-ipv6-masquerade-linux.nftables| 456 +
...linux.args => nat-many-ips-linux.iptables} |   0
.../nat-many-ips-linux.nftables   | 472 +
...-linux.args => nat-no-dhcp-linux.iptables} |   0
.../nat-no-dhcp-linux.nftables| 384 
...ftp-linux.args => nat-tftp-linux.iptables} |   0
.../nat-tftp-linux.nftables   | 274 ++
...inux.args => route-default-linux.iptables} |   0
.../route-default-linux.nftables  | 162 
tests/networkxml2firewalltest.c   |  56 +-
tests/virfirewalltest.c   |  20 +-
45 files changed, 5718 insertions(+), 954 deletions(-)


Reviewed-by: Ján Tomko 

Jano


signature.asc
Description: PGP signature


Re: [libvirt PATCH 00/28] native support for nftables in virtual network driver

2023-05-03 Thread Michal Prívozník
On 5/1/23 05:19, Laine Stump wrote:
>
>  45 files changed, 5718 insertions(+), 954 deletions(-)
Reviewed-by: Michal Privoznik 

Michal