Re: [ovs-dev] [PATCH net-next v2] net: openvswitch: fix uAPI incompatibility with existing user space
Le 09/03/2022 à 23:20, Ilya Maximets a écrit : > Few years ago OVS user space made a strange choice in the commit [1] > to define types only valid for the user space inside the copy of a > kernel uAPI header. '#ifndef __KERNEL__' and another attribute was > added later. > > This leads to the inevitable clash between user space and kernel types > when the kernel uAPI is extended. The issue was unveiled with the > addition of a new type for IPv6 extension header in kernel uAPI. > > When kernel provides the OVS_KEY_ATTR_IPV6_EXTHDRS attribute to the > older user space application, application tries to parse it as > OVS_KEY_ATTR_PACKET_TYPE and discards the whole netlink message as > malformed. Since OVS_KEY_ATTR_IPV6_EXTHDRS is supplied along with > every IPv6 packet that goes to the user space, IPv6 support is fully > broken. > > Fixing that by bringing these user space attributes to the kernel > uAPI to avoid the clash. Strictly speaking this is not the problem > of the kernel uAPI, but changing it is the only way to avoid breakage > of the older user space applications at this point. > > These 2 types are explicitly rejected now since they should not be > passed to the kernel. Additionally, OVS_KEY_ATTR_TUNNEL_INFO moved > out from the '#ifdef __KERNEL__' as there is no good reason to hide > it from the userspace. And it's also explicitly rejected now, because > it's for in-kernel use only. > > Comments with warnings were added to avoid the problem coming back. > > (1 << type) converted to (1ULL << type) to avoid integer overflow on > OVS_KEY_ATTR_IPV6_EXTHDRS, since it equals 32 now. > > [1] beb75a40fdc2 ("userspace: Switching of L3 packets in L2 pipeline") > > Fixes: 28a3f0601727 ("net: openvswitch: IPv6: Add IPv6 extension header > support") > Link: > https://lore.kernel.org/netdev/3adf00c7-fe65-3ef4-b6d7-6d8a0cad8...@nvidia.com > Link: > https://github.com/openvswitch/ovs/commit/beb75a40fdc295bfd6521b0068b4cd12f6de507c > Reported-by: Roi Dayan > Signed-off-by: Ilya Maximets Thanks for finally doing this. I also suggest it months ago: https://lore.kernel.org/lkml/a4894aef-b82a-8224-611d-07be229f5...@6wind.com/ Acked-by: Nicolas Dichtel ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v7] net: openvswitch: IPv6: Add IPv6 extension header support
Le 15/10/2021 à 15:56, Ilya Maximets a écrit : [snip] > Not a full review, but, I think, that we should not add paddings, and > define OVS_KEY_ATTR_IPV6_EXTHDRS before the OVS_KEY_ATTR_TUNNEL_INFO > instead. See my comments for v6: +1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support
Le 01/10/2021 à 22:42, Cpp Code a écrit : > On Fri, Oct 1, 2021 at 12:21 AM Nicolas Dichtel > wrote: >> >> Le 30/09/2021 à 18:11, Cpp Code a écrit : >>> On Wed, Sep 29, 2021 at 6:19 AM Jakub Kicinski wrote: >>>> >>>> On Wed, 29 Sep 2021 08:19:05 +0200 Nicolas Dichtel wrote: >>>>>> /* Insert a kernel only KEY_ATTR */ >>>>>> #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX >>>>>> #undef OVS_KEY_ATTR_MAX >>>>>> #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX >>>>> Following the other thread [1], this will break if a new app runs over an >>>>> old >>>>> kernel. >>>> >>>> Good point. >>>> >>>>> Why not simply expose this attribute to userspace and throw an error if a >>>>> userspace app uses it? >>>> >>>> Does it matter if it's exposed or not? Either way the parsing policy >>>> for attrs coming from user space should have a reject for the value. >>>> (I say that not having looked at the code, so maybe I shouldn't...) >>> >>> To remove some confusion, there are some architectural nuances if we >>> want to extend code without large refactor. >>> The ovs_key_attr is defined only in kernel side. Userspace side is >>> generated from this file. As well the code can be built without kernel >>> modules. >>> The code inside OVS repository and net-next is not identical, but I >>> try to keep some consistency. >> I didn't get why OVS_KEY_ATTR_TUNNEL_INFO cannot be exposed to userspace. > > OVS_KEY_ATTR_TUNNEL_INFO is compressed version of OVS_KEY_ATTR_TUNNEL > and for clarity purposes its not exposed to userspace as it will never > use it. > I would say it's a coding style as it would not brake anything if exposed. In fact, it's the best way to keep the compatibility in the long term. You can define it like this: OVS_KEY_ATTR_TUNNEL_INFO, /* struct ip_tunnel_info, reserved for kernel use */ > >> >>> >>> JFYI This is the file responsible for generating userspace part: >>> https://github.com/openvswitch/ovs/blob/master/build-aux/extract-odp-netlink-h >>> This is the how corresponding file for ovs_key_attr looks inside OVS: >>> https://github.com/openvswitch/ovs/blob/master/datapath/linux/compat/include/linux/openvswitch.h >>> one can see there are more values than in net-next version. >> There are still some '#ifdef __KERNEL__'. The standard 'make headers_install' >> filters them. Why not using this standard mechanism? > > Could you elaborate on this, I don't quite understand the idea!? Which > ifdef you are referring, the one along OVS_KEY_ATTR_TUNNEL_INFO or > some other? My understanding is that this file is used for the userland third party, thus, theoretically, there should be no '#ifdef __KERNEL__'. uapi headers generated with 'make headers_install' are filtered to remove them. > >> >> In this file, there are two attributes (OVS_KEY_ATTR_PACKET_TYPE and >> OVS_KEY_ATTR_ND_EXTENSIONS) that doesn't exist in the kernel. >> This will also breaks if an old app runs over a new kernel. I don't see how >> it >> is possible to keep the compat between {old|new} {kernel|app}. > > Looks like this most likely is a bug while working on multiple > versions of code. Need to do add more padding. As said above, just define the same uapi for everybody and the problem is gone forever. Regards, Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support
Le 30/09/2021 à 18:11, Cpp Code a écrit : > On Wed, Sep 29, 2021 at 6:19 AM Jakub Kicinski wrote: >> >> On Wed, 29 Sep 2021 08:19:05 +0200 Nicolas Dichtel wrote: >>>> /* Insert a kernel only KEY_ATTR */ >>>> #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX >>>> #undef OVS_KEY_ATTR_MAX >>>> #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX >>> Following the other thread [1], this will break if a new app runs over an >>> old >>> kernel. >> >> Good point. >> >>> Why not simply expose this attribute to userspace and throw an error if a >>> userspace app uses it? >> >> Does it matter if it's exposed or not? Either way the parsing policy >> for attrs coming from user space should have a reject for the value. >> (I say that not having looked at the code, so maybe I shouldn't...) > > To remove some confusion, there are some architectural nuances if we > want to extend code without large refactor. > The ovs_key_attr is defined only in kernel side. Userspace side is > generated from this file. As well the code can be built without kernel > modules. > The code inside OVS repository and net-next is not identical, but I > try to keep some consistency. I didn't get why OVS_KEY_ATTR_TUNNEL_INFO cannot be exposed to userspace. > > JFYI This is the file responsible for generating userspace part: > https://github.com/openvswitch/ovs/blob/master/build-aux/extract-odp-netlink-h > This is the how corresponding file for ovs_key_attr looks inside OVS: > https://github.com/openvswitch/ovs/blob/master/datapath/linux/compat/include/linux/openvswitch.h > one can see there are more values than in net-next version. There are still some '#ifdef __KERNEL__'. The standard 'make headers_install' filters them. Why not using this standard mechanism? In this file, there are two attributes (OVS_KEY_ATTR_PACKET_TYPE and OVS_KEY_ATTR_ND_EXTENSIONS) that doesn't exist in the kernel. This will also breaks if an old app runs over a new kernel. I don't see how it is possible to keep the compat between {old|new} {kernel|app}. Regards, Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support
Le 29/09/2021 à 02:48, Jakub Kicinski a écrit : > On Tue, 28 Sep 2021 12:47:27 -0700 Toms Atteka wrote: >> diff --git a/include/uapi/linux/openvswitch.h >> b/include/uapi/linux/openvswitch.h >> index a87b44cd5590..dc6eb5f6399f 100644 >> --- a/include/uapi/linux/openvswitch.h >> +++ b/include/uapi/linux/openvswitch.h >> @@ -346,6 +346,13 @@ enum ovs_key_attr { >> #ifdef __KERNEL__ >> OVS_KEY_ATTR_TUNNEL_INFO, /* struct ip_tunnel_info */ >> #endif >> + >> +#ifndef __KERNEL__ > > #else > >> +PADDING, /* Padding so kernel and non kernel field count would match */ > > The name PADDING seems rather risky, collisions will be likely. > OVS_KEY_ATTR_PADDING maybe? > > But maybe we don't need to define this special value and bake it into > the uAPI, why can't we add something like this to the kernel header > (i.e. include/linux/openvswitch.h): > > /* Insert a kernel only KEY_ATTR */ > #define OVS_KEY_ATTR_TUNNEL_INFO __OVS_KEY_ATTR_MAX > #undef OVS_KEY_ATTR_MAX > #define OVS_KEY_ATTR_MAX __OVS_KEY_ATTR_MAX Following the other thread [1], this will break if a new app runs over an old kernel. Why not simply expose this attribute to userspace and throw an error if a userspace app uses it? [1] https://lore.kernel.org/lkml/CAASuNyUWoZ1wToEUYbdehux=yVnWQ=sukdyrkqfrd-72dol...@mail.gmail.com/ > >> +#endif ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v5] net: openvswitch: IPv6: Add IPv6 extension header support
Le 27/09/2021 à 21:12, Cpp Code a écrit : > To use this code there is a part of code in the userspace. We want to > keep compatibility when we only update userspace part code or only > kernel part code. This means we should have same values for constants > and we can only add new ones at the end of list. All attributes after OVS_KEY_ATTR_CT_STATE (ie 7 attributes) were added before OVS_KEY_ATTR_TUNNEL_INFO. Why is it not possible anymore? Regards, Nicolas > > Best, > Tom > > On Wed, Sep 22, 2021 at 11:02 PM Nicolas Dichtel > wrote: >> >> Le 20/09/2021 à 20:20, Toms Atteka a écrit : >>> This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and >>> packets can be filtered using ipv6_ext flag. >>> >>> Signed-off-by: Toms Atteka >>> --- >>> include/uapi/linux/openvswitch.h | 12 +++ >>> net/openvswitch/flow.c | 140 +++ >>> net/openvswitch/flow.h | 14 >>> net/openvswitch/flow_netlink.c | 24 +- >>> 4 files changed, 189 insertions(+), 1 deletion(-) >>> >>> diff --git a/include/uapi/linux/openvswitch.h >>> b/include/uapi/linux/openvswitch.h >>> index a87b44cd5590..dc6eb5f6399f 100644 >>> --- a/include/uapi/linux/openvswitch.h >>> +++ b/include/uapi/linux/openvswitch.h >>> @@ -346,6 +346,13 @@ enum ovs_key_attr { >>> #ifdef __KERNEL__ >>> OVS_KEY_ATTR_TUNNEL_INFO, /* struct ip_tunnel_info */ >>> #endif >>> + >>> +#ifndef __KERNEL__ >>> + PADDING, /* Padding so kernel and non kernel field count would match >>> */ >>> +#endif >>> + >>> + OVS_KEY_ATTR_IPV6_EXTHDRS, /* struct ovs_key_ipv6_exthdr */ >> Naive question, why not moving OVS_KEY_ATTR_IPV6_EXTHDRS above >> OVS_KEY_ATTR_TUNNEL_INFO? >> >> >> >> Regards, >> Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v5] net: openvswitch: IPv6: Add IPv6 extension header support
Le 20/09/2021 à 20:20, Toms Atteka a écrit : > This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and > packets can be filtered using ipv6_ext flag. > > Signed-off-by: Toms Atteka > --- > include/uapi/linux/openvswitch.h | 12 +++ > net/openvswitch/flow.c | 140 +++ > net/openvswitch/flow.h | 14 > net/openvswitch/flow_netlink.c | 24 +- > 4 files changed, 189 insertions(+), 1 deletion(-) > > diff --git a/include/uapi/linux/openvswitch.h > b/include/uapi/linux/openvswitch.h > index a87b44cd5590..dc6eb5f6399f 100644 > --- a/include/uapi/linux/openvswitch.h > +++ b/include/uapi/linux/openvswitch.h > @@ -346,6 +346,13 @@ enum ovs_key_attr { > #ifdef __KERNEL__ > OVS_KEY_ATTR_TUNNEL_INFO, /* struct ip_tunnel_info */ > #endif > + > +#ifndef __KERNEL__ > + PADDING, /* Padding so kernel and non kernel field count would match */ > +#endif > + > + OVS_KEY_ATTR_IPV6_EXTHDRS, /* struct ovs_key_ipv6_exthdr */ Naive question, why not moving OVS_KEY_ATTR_IPV6_EXTHDRS above OVS_KEY_ATTR_TUNNEL_INFO? Regards, Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net-next v3] openvswitch: add TTL decrement action
Le 16/01/2020 à 08:21, Pravin Shelar a écrit : > On Wed, Jan 15, 2020 at 8:40 AM Matteo Croce wrote: [snip] >> @@ -1050,4 +1051,5 @@ struct ovs_zone_limit { >> __u32 count; >> }; >> >> +#define OVS_DEC_TTL_ATTR_EXEC 0 > > I am not sure if we need this, But if you want the nested attribute > then lets define enum with this single attribute and have actions as > part of its data. This would be optional argument, so userspace can > skip it, and in that case datapath can drop the packet. And note that 0 is a reserved value and should not be used. Look at other attributes. Regards, Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net 1/2] openvswitch: support asymmetric conntrack
Le 18/11/2019 à 22:19, Aaron Conole a écrit : > Nicolas Dichtel writes: > >> Le 08/11/2019 à 22:07, Aaron Conole a écrit : >>> The openvswitch module shares a common conntrack and NAT infrastructure >>> exposed via netfilter. It's possible that a packet needs both SNAT and >>> DNAT manipulation, due to e.g. tuple collision. Netfilter can support >>> this because it runs through the NAT table twice - once on ingress and >>> again after egress. The openvswitch module doesn't have such capability. >>> >>> Like netfilter hook infrastructure, we should run through NAT twice to >>> keep the symmetry. >>> >>> Fixes: 05752523e565 ("openvswitch: Interface with NAT.") >>> Signed-off-by: Aaron Conole >> In this case, ovs_ct_find_existing() won't be able to find the >> conntrack, right? > > vswitchd normally won't allow both actions to get programmed. Even the > kernel module won't allow it, so this really will only happen when the > connection gets established via the nf_hook path, and then needs to be > processed via openvswitch. In those cases, the tuple lookup should be > correct, because the nf_nat table should contain the correct tuple data, > and the skbuff should have the correct tuples in the packet data to > begin with. > >> Inverting the tuple to find the conntrack doesn't work anymore with double >> NAT. >> Am I wrong? > > I think since the packet was double-NAT on the way out (via nf_hook > path), then the incoming reply will have the correct NAT tuples and the > lookup will happen just fine. Just that during processing, both > transformations aren't applied. Ok, I didn't look deeply, thank you for the explanation. Regards, Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net 1/2] openvswitch: support asymmetric conntrack
Le 08/11/2019 à 22:07, Aaron Conole a écrit : > The openvswitch module shares a common conntrack and NAT infrastructure > exposed via netfilter. It's possible that a packet needs both SNAT and > DNAT manipulation, due to e.g. tuple collision. Netfilter can support > this because it runs through the NAT table twice - once on ingress and > again after egress. The openvswitch module doesn't have such capability. > > Like netfilter hook infrastructure, we should run through NAT twice to > keep the symmetry. > > Fixes: 05752523e565 ("openvswitch: Interface with NAT.") > Signed-off-by: Aaron Conole In this case, ovs_ct_find_existing() won't be able to find the conntrack, right? Inverting the tuple to find the conntrack doesn't work anymore with double NAT. Am I wrong? Regards, Nicolas ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH net] vxlan: fix ovs support
The required changes in the function vxlan_dev_create() were missing in commit 8bcdc4f3a20b. The vxlan device is not registered anymore after this patch and the error path causes an stack dump: WARNING: CPU: 3 PID: 1498 at net/core/dev.c:6713 rollback_registered_many+0x9d/0x3f0 Fixes: 8bcdc4f3a20b ("vxlan: add changelink support") CC: Roopa Prabhu <ro...@cumulusnetworks.com> Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com> --- drivers/net/vxlan.c | 73 + 1 file changed, 40 insertions(+), 33 deletions(-) diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index e375560cc74e..bdb6ae16d4a8 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -2976,6 +2976,44 @@ static int vxlan_dev_configure(struct net *src_net, struct net_device *dev, return 0; } +static int __vxlan_dev_create(struct net *net, struct net_device *dev, + struct vxlan_config *conf) +{ + struct vxlan_net *vn = net_generic(net, vxlan_net_id); + struct vxlan_dev *vxlan = netdev_priv(dev); + int err; + + err = vxlan_dev_configure(net, dev, conf, false); + if (err) + return err; + + dev->ethtool_ops = _ethtool_ops; + + /* create an fdb entry for a valid default destination */ + if (!vxlan_addr_any(>default_dst.remote_ip)) { + err = vxlan_fdb_create(vxlan, all_zeros_mac, + >default_dst.remote_ip, + NUD_REACHABLE | NUD_PERMANENT, + NLM_F_EXCL | NLM_F_CREATE, + vxlan->cfg.dst_port, + vxlan->default_dst.remote_vni, + vxlan->default_dst.remote_vni, + vxlan->default_dst.remote_ifindex, + NTF_SELF); + if (err) + return err; + } + + err = register_netdevice(dev); + if (err) { + vxlan_fdb_delete_default(vxlan, vxlan->default_dst.remote_vni); + return err; + } + + list_add(>next, >vxlan_list); + return 0; +} + static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[], struct net_device *dev, struct vxlan_config *conf, bool changelink) @@ -3172,8 +3210,6 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[], static int vxlan_newlink(struct net *src_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[]) { - struct vxlan_net *vn = net_generic(src_net, vxlan_net_id); - struct vxlan_dev *vxlan = netdev_priv(dev); struct vxlan_config conf; int err; @@ -3181,36 +3217,7 @@ static int vxlan_newlink(struct net *src_net, struct net_device *dev, if (err) return err; - err = vxlan_dev_configure(src_net, dev, , false); - if (err) - return err; - - dev->ethtool_ops = _ethtool_ops; - - /* create an fdb entry for a valid default destination */ - if (!vxlan_addr_any(>default_dst.remote_ip)) { - err = vxlan_fdb_create(vxlan, all_zeros_mac, - >default_dst.remote_ip, - NUD_REACHABLE | NUD_PERMANENT, - NLM_F_EXCL | NLM_F_CREATE, - vxlan->cfg.dst_port, - vxlan->default_dst.remote_vni, - vxlan->default_dst.remote_vni, - vxlan->default_dst.remote_ifindex, - NTF_SELF); - if (err) - return err; - } - - err = register_netdevice(dev); - if (err) { - vxlan_fdb_delete_default(vxlan, vxlan->default_dst.remote_vni); - return err; - } - - list_add(>next, >vxlan_list); - - return 0; + return __vxlan_dev_create(src_net, dev, ); } static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], @@ -3440,7 +3447,7 @@ struct net_device *vxlan_dev_create(struct net *net, const char *name, if (IS_ERR(dev)) return dev; - err = vxlan_dev_configure(net, dev, conf, false); + err = __vxlan_dev_create(net, dev, conf); if (err < 0) { free_netdev(dev); return ERR_PTR(err); -- 2.8.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev