Re: [ovs-dev] [PATCH net-next v2] net: openvswitch: fix uAPI incompatibility with existing user space

2022-03-10 Thread Nicolas Dichtel

Le 09/03/2022 à 23:20, Ilya Maximets a écrit :
> Few years ago OVS user space made a strange choice in the commit [1]
> to define types only valid for the user space inside the copy of a
> kernel uAPI header.  '#ifndef __KERNEL__' and another attribute was
> added later.
> 
> This leads to the inevitable clash between user space and kernel types
> when the kernel uAPI is extended.  The issue was unveiled with the
> addition of a new type for IPv6 extension header in kernel uAPI.
> 
> When kernel provides the OVS_KEY_ATTR_IPV6_EXTHDRS attribute to the
> older user space application, application tries to parse it as
> OVS_KEY_ATTR_PACKET_TYPE and discards the whole netlink message as
> malformed.  Since OVS_KEY_ATTR_IPV6_EXTHDRS is supplied along with
> every IPv6 packet that goes to the user space, IPv6 support is fully
> broken.
> 
> Fixing that by bringing these user space attributes to the kernel
> uAPI to avoid the clash.  Strictly speaking this is not the problem
> of the kernel uAPI, but changing it is the only way to avoid breakage
> of the older user space applications at this point.
> 
> These 2 types are explicitly rejected now since they should not be
> passed to the kernel.  Additionally, OVS_KEY_ATTR_TUNNEL_INFO moved
> out from the '#ifdef __KERNEL__' as there is no good reason to hide
> it from the userspace.  And it's also explicitly rejected now, because
> it's for in-kernel use only.
> 
> Comments with warnings were added to avoid the problem coming back.
> 
> (1 << type) converted to (1ULL << type) to avoid integer overflow on
> OVS_KEY_ATTR_IPV6_EXTHDRS, since it equals 32 now.
> 
>  [1] beb75a40fdc2 ("userspace: Switching of L3 packets in L2 pipeline")
> 
> Fixes: 28a3f0601727 ("net: openvswitch: IPv6: Add IPv6 extension header 
> support")
> Link: 
> https://lore.kernel.org/netdev/3adf00c7-fe65-3ef4-b6d7-6d8a0cad8...@nvidia.com
> Link: 
> https://github.com/openvswitch/ovs/commit/beb75a40fdc295bfd6521b0068b4cd12f6de507c
> Reported-by: Roi Dayan 
> Signed-off-by: Ilya Maximets 
Thanks for finally doing this. I also suggest it months ago:
https://lore.kernel.org/lkml/a4894aef-b82a-8224-611d-07be229f5...@6wind.com/

Acked-by: Nicolas Dichtel 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v7] net: openvswitch: IPv6: Add IPv6 extension header support

2021-10-15 Thread Nicolas Dichtel
Le 15/10/2021 à 15:56, Ilya Maximets a écrit :
[snip]
> Not a full review, but, I think, that we should not add paddings, and
> define OVS_KEY_ATTR_IPV6_EXTHDRS before the OVS_KEY_ATTR_TUNNEL_INFO
> instead.  See my comments for v6:
+1
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support

2021-10-05 Thread Nicolas Dichtel
Le 01/10/2021 à 22:42, Cpp Code a écrit :
> On Fri, Oct 1, 2021 at 12:21 AM Nicolas Dichtel
>  wrote:
>>
>> Le 30/09/2021 à 18:11, Cpp Code a écrit :
>>> On Wed, Sep 29, 2021 at 6:19 AM Jakub Kicinski  wrote:
>>>>
>>>> On Wed, 29 Sep 2021 08:19:05 +0200 Nicolas Dichtel wrote:
>>>>>> /* Insert a kernel only KEY_ATTR */
>>>>>> #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX
>>>>>> #undef OVS_KEY_ATTR_MAX
>>>>>> #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX
>>>>> Following the other thread [1], this will break if a new app runs over an 
>>>>> old
>>>>> kernel.
>>>>
>>>> Good point.
>>>>
>>>>> Why not simply expose this attribute to userspace and throw an error if a
>>>>> userspace app uses it?
>>>>
>>>> Does it matter if it's exposed or not? Either way the parsing policy
>>>> for attrs coming from user space should have a reject for the value.
>>>> (I say that not having looked at the code, so maybe I shouldn't...)
>>>
>>> To remove some confusion, there are some architectural nuances if we
>>> want to extend code without large refactor.
>>> The ovs_key_attr is defined only in kernel side. Userspace side is
>>> generated from this file. As well the code can be built without kernel
>>> modules.
>>> The code inside OVS repository and net-next is not identical, but I
>>> try to keep some consistency.
>> I didn't get why OVS_KEY_ATTR_TUNNEL_INFO cannot be exposed to userspace.
> 
> OVS_KEY_ATTR_TUNNEL_INFO is compressed version of OVS_KEY_ATTR_TUNNEL
> and for clarity purposes its not exposed to userspace as it will never
> use it.
> I would say it's a coding style as it would not brake anything if exposed.
In fact, it's the best way to keep the compatibility in the long term.
You can define it like this:
OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info, reserved for kernel use */

> 
>>
>>>
>>> JFYI This is the file responsible for generating userspace part:
>>> https://github.com/openvswitch/ovs/blob/master/build-aux/extract-odp-netlink-h
>>> This is the how corresponding file for ovs_key_attr looks inside OVS:
>>> https://github.com/openvswitch/ovs/blob/master/datapath/linux/compat/include/linux/openvswitch.h
>>> one can see there are more values than in net-next version.
>> There are still some '#ifdef __KERNEL__'. The standard 'make headers_install'
>> filters them. Why not using this standard mechanism?
> 
> Could you elaborate on this, I don't quite understand the idea!? Which
> ifdef you are referring, the one along OVS_KEY_ATTR_TUNNEL_INFO or
> some other?
My understanding is that this file is used for the userland third party, thus,
theoretically, there should be no '#ifdef __KERNEL__'. uapi headers generated
with 'make headers_install' are filtered to remove them.

> 
>>
>> In this file, there are two attributes (OVS_KEY_ATTR_PACKET_TYPE and
>> OVS_KEY_ATTR_ND_EXTENSIONS) that doesn't exist in the kernel.
>> This will also breaks if an old app runs over a new kernel. I don't see how 
>> it
>> is possible to keep the compat between {old|new} {kernel|app}.
> 
> Looks like this most likely is a bug while working on multiple
> versions of code.  Need to do add more padding.
As said above, just define the same uapi for everybody and the problem is gone
forever.


Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support

2021-10-01 Thread Nicolas Dichtel
Le 30/09/2021 à 18:11, Cpp Code a écrit :
> On Wed, Sep 29, 2021 at 6:19 AM Jakub Kicinski  wrote:
>>
>> On Wed, 29 Sep 2021 08:19:05 +0200 Nicolas Dichtel wrote:
>>>> /* Insert a kernel only KEY_ATTR */
>>>> #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX
>>>> #undef OVS_KEY_ATTR_MAX
>>>> #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX
>>> Following the other thread [1], this will break if a new app runs over an 
>>> old
>>> kernel.
>>
>> Good point.
>>
>>> Why not simply expose this attribute to userspace and throw an error if a
>>> userspace app uses it?
>>
>> Does it matter if it's exposed or not? Either way the parsing policy
>> for attrs coming from user space should have a reject for the value.
>> (I say that not having looked at the code, so maybe I shouldn't...)
> 
> To remove some confusion, there are some architectural nuances if we
> want to extend code without large refactor.
> The ovs_key_attr is defined only in kernel side. Userspace side is
> generated from this file. As well the code can be built without kernel
> modules.
> The code inside OVS repository and net-next is not identical, but I
> try to keep some consistency.
I didn't get why OVS_KEY_ATTR_TUNNEL_INFO cannot be exposed to userspace.

> 
> JFYI This is the file responsible for generating userspace part:
> https://github.com/openvswitch/ovs/blob/master/build-aux/extract-odp-netlink-h
> This is the how corresponding file for ovs_key_attr looks inside OVS:
> https://github.com/openvswitch/ovs/blob/master/datapath/linux/compat/include/linux/openvswitch.h
> one can see there are more values than in net-next version.
There are still some '#ifdef __KERNEL__'. The standard 'make headers_install'
filters them. Why not using this standard mechanism?

In this file, there are two attributes (OVS_KEY_ATTR_PACKET_TYPE and
OVS_KEY_ATTR_ND_EXTENSIONS) that doesn't exist in the kernel.
This will also breaks if an old app runs over a new kernel. I don't see how it
is possible to keep the compat between {old|new} {kernel|app}.


Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support

2021-09-29 Thread Nicolas Dichtel
Le 29/09/2021 à 02:48, Jakub Kicinski a écrit :
> On Tue, 28 Sep 2021 12:47:27 -0700 Toms Atteka wrote:
>> diff --git a/include/uapi/linux/openvswitch.h 
>> b/include/uapi/linux/openvswitch.h
>> index a87b44cd5590..dc6eb5f6399f 100644
>> --- a/include/uapi/linux/openvswitch.h
>> +++ b/include/uapi/linux/openvswitch.h
>> @@ -346,6 +346,13 @@ enum ovs_key_attr {
>>  #ifdef __KERNEL__
>>  OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info */
>>  #endif
>> +
>> +#ifndef __KERNEL__
> 
> #else
> 
>> +PADDING,  /* Padding so kernel and non kernel field count would match */
> 
> The name PADDING seems rather risky, collisions will be likely.
> OVS_KEY_ATTR_PADDING maybe?
> 
> But maybe we don't need to define this special value and bake it into
> the uAPI, why can't we add something like this to the kernel header
> (i.e. include/linux/openvswitch.h):
> 
> /* Insert a kernel only KEY_ATTR */
> #define OVS_KEY_ATTR_TUNNEL_INFO  __OVS_KEY_ATTR_MAX
> #undef OVS_KEY_ATTR_MAX
> #define OVS_KEY_ATTR_MAX  __OVS_KEY_ATTR_MAX
Following the other thread [1], this will break if a new app runs over an old
kernel.
Why not simply expose this attribute to userspace and throw an error if a
userspace app uses it?

[1]
https://lore.kernel.org/lkml/CAASuNyUWoZ1wToEUYbdehux=yVnWQ=sukdyrkqfrd-72dol...@mail.gmail.com/

> 
>> +#endif
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v5] net: openvswitch: IPv6: Add IPv6 extension header support

2021-09-28 Thread Nicolas Dichtel
Le 27/09/2021 à 21:12, Cpp Code a écrit :
> To use this code there is a part of code in the userspace. We want to
> keep compatibility when we only update userspace part code or only
> kernel part code. This means we should have same values for constants
> and we can only add new ones at the end of list.
All attributes after OVS_KEY_ATTR_CT_STATE (ie 7 attributes) were added before
OVS_KEY_ATTR_TUNNEL_INFO.
Why is it not possible anymore?


Regards,
Nicolas

> 
> Best,
> Tom
> 
> On Wed, Sep 22, 2021 at 11:02 PM Nicolas Dichtel
>  wrote:
>>
>> Le 20/09/2021 à 20:20, Toms Atteka a écrit :
>>> This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and
>>> packets can be filtered using ipv6_ext flag.
>>>
>>> Signed-off-by: Toms Atteka 
>>> ---
>>>  include/uapi/linux/openvswitch.h |  12 +++
>>>  net/openvswitch/flow.c   | 140 +++
>>>  net/openvswitch/flow.h   |  14 
>>>  net/openvswitch/flow_netlink.c   |  24 +-
>>>  4 files changed, 189 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/uapi/linux/openvswitch.h 
>>> b/include/uapi/linux/openvswitch.h
>>> index a87b44cd5590..dc6eb5f6399f 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -346,6 +346,13 @@ enum ovs_key_attr {
>>>  #ifdef __KERNEL__
>>>   OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info */
>>>  #endif
>>> +
>>> +#ifndef __KERNEL__
>>> + PADDING,  /* Padding so kernel and non kernel field count would match 
>>> */
>>> +#endif
>>> +
>>> + OVS_KEY_ATTR_IPV6_EXTHDRS,  /* struct ovs_key_ipv6_exthdr */
>> Naive question, why not moving OVS_KEY_ATTR_IPV6_EXTHDRS above
>> OVS_KEY_ATTR_TUNNEL_INFO?
>>
>>
>>
>> Regards,
>> Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v5] net: openvswitch: IPv6: Add IPv6 extension header support

2021-09-23 Thread Nicolas Dichtel
Le 20/09/2021 à 20:20, Toms Atteka a écrit :
> This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and
> packets can be filtered using ipv6_ext flag.
> 
> Signed-off-by: Toms Atteka 
> ---
>  include/uapi/linux/openvswitch.h |  12 +++
>  net/openvswitch/flow.c   | 140 +++
>  net/openvswitch/flow.h   |  14 
>  net/openvswitch/flow_netlink.c   |  24 +-
>  4 files changed, 189 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/openvswitch.h 
> b/include/uapi/linux/openvswitch.h
> index a87b44cd5590..dc6eb5f6399f 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -346,6 +346,13 @@ enum ovs_key_attr {
>  #ifdef __KERNEL__
>   OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info */
>  #endif
> +
> +#ifndef __KERNEL__
> + PADDING,  /* Padding so kernel and non kernel field count would match */
> +#endif
> +
> + OVS_KEY_ATTR_IPV6_EXTHDRS,  /* struct ovs_key_ipv6_exthdr */
Naive question, why not moving OVS_KEY_ATTR_IPV6_EXTHDRS above
OVS_KEY_ATTR_TUNNEL_INFO?



Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v3] openvswitch: add TTL decrement action

2020-01-16 Thread Nicolas Dichtel
Le 16/01/2020 à 08:21, Pravin Shelar a écrit :
> On Wed, Jan 15, 2020 at 8:40 AM Matteo Croce  wrote:
[snip]
>> @@ -1050,4 +1051,5 @@ struct ovs_zone_limit {
>> __u32 count;
>>  };
>>
>> +#define OVS_DEC_TTL_ATTR_EXEC  0
> 
> I am not sure if we need this, But if you want the nested attribute
> then lets define enum with this single attribute and have actions as
> part of its data. This would be optional argument, so userspace can
> skip it, and in that case datapath can drop the packet.
And note that 0 is a reserved value and should not be used. Look at other
attributes.

Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/2] openvswitch: support asymmetric conntrack

2019-11-28 Thread Nicolas Dichtel
Le 18/11/2019 à 22:19, Aaron Conole a écrit :
> Nicolas Dichtel  writes:
> 
>> Le 08/11/2019 à 22:07, Aaron Conole a écrit :
>>> The openvswitch module shares a common conntrack and NAT infrastructure
>>> exposed via netfilter.  It's possible that a packet needs both SNAT and
>>> DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
>>> this because it runs through the NAT table twice - once on ingress and
>>> again after egress.  The openvswitch module doesn't have such capability.
>>>
>>> Like netfilter hook infrastructure, we should run through NAT twice to
>>> keep the symmetry.
>>>
>>> Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
>>> Signed-off-by: Aaron Conole 
>> In this case, ovs_ct_find_existing() won't be able to find the
>> conntrack, right?
> 
> vswitchd normally won't allow both actions to get programmed.  Even the
> kernel module won't allow it, so this really will only happen when the
> connection gets established via the nf_hook path, and then needs to be
> processed via openvswitch.  In those cases, the tuple lookup should be
> correct, because the nf_nat table should contain the correct tuple data,
> and the skbuff should have the correct tuples in the packet data to
> begin with.
> 
>> Inverting the tuple to find the conntrack doesn't work anymore with double 
>> NAT.
>> Am I wrong?
> 
> I think since the packet was double-NAT on the way out (via nf_hook
> path), then the incoming reply will have the correct NAT tuples and the
> lookup will happen just fine.  Just that during processing, both
> transformations aren't applied.
Ok, I didn't look deeply, thank you for the explanation.

Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/2] openvswitch: support asymmetric conntrack

2019-11-12 Thread Nicolas Dichtel
Le 08/11/2019 à 22:07, Aaron Conole a écrit :
> The openvswitch module shares a common conntrack and NAT infrastructure
> exposed via netfilter.  It's possible that a packet needs both SNAT and
> DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
> this because it runs through the NAT table twice - once on ingress and
> again after egress.  The openvswitch module doesn't have such capability.
> 
> Like netfilter hook infrastructure, we should run through NAT twice to
> keep the symmetry.
> 
> Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
> Signed-off-by: Aaron Conole 
In this case, ovs_ct_find_existing() won't be able to find the conntrack, right?
Inverting the tuple to find the conntrack doesn't work anymore with double NAT.
Am I wrong?


Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net] vxlan: fix ovs support

2017-03-13 Thread Nicolas Dichtel
The required changes in the function vxlan_dev_create() were missing
in commit 8bcdc4f3a20b.
The vxlan device is not registered anymore after this patch and the error
path causes an stack dump:
 WARNING: CPU: 3 PID: 1498 at net/core/dev.c:6713 
rollback_registered_many+0x9d/0x3f0

Fixes: 8bcdc4f3a20b ("vxlan: add changelink support")
CC: Roopa Prabhu <ro...@cumulusnetworks.com>
Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
 drivers/net/vxlan.c | 73 +
 1 file changed, 40 insertions(+), 33 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index e375560cc74e..bdb6ae16d4a8 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2976,6 +2976,44 @@ static int vxlan_dev_configure(struct net *src_net, 
struct net_device *dev,
return 0;
 }
 
+static int __vxlan_dev_create(struct net *net, struct net_device *dev,
+ struct vxlan_config *conf)
+{
+   struct vxlan_net *vn = net_generic(net, vxlan_net_id);
+   struct vxlan_dev *vxlan = netdev_priv(dev);
+   int err;
+
+   err = vxlan_dev_configure(net, dev, conf, false);
+   if (err)
+   return err;
+
+   dev->ethtool_ops = _ethtool_ops;
+
+   /* create an fdb entry for a valid default destination */
+   if (!vxlan_addr_any(>default_dst.remote_ip)) {
+   err = vxlan_fdb_create(vxlan, all_zeros_mac,
+  >default_dst.remote_ip,
+  NUD_REACHABLE | NUD_PERMANENT,
+  NLM_F_EXCL | NLM_F_CREATE,
+  vxlan->cfg.dst_port,
+  vxlan->default_dst.remote_vni,
+  vxlan->default_dst.remote_vni,
+  vxlan->default_dst.remote_ifindex,
+  NTF_SELF);
+   if (err)
+   return err;
+   }
+
+   err = register_netdevice(dev);
+   if (err) {
+   vxlan_fdb_delete_default(vxlan, vxlan->default_dst.remote_vni);
+   return err;
+   }
+
+   list_add(>next, >vxlan_list);
+   return 0;
+}
+
 static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[],
 struct net_device *dev, struct vxlan_config *conf,
 bool changelink)
@@ -3172,8 +3210,6 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct 
nlattr *data[],
 static int vxlan_newlink(struct net *src_net, struct net_device *dev,
 struct nlattr *tb[], struct nlattr *data[])
 {
-   struct vxlan_net *vn = net_generic(src_net, vxlan_net_id);
-   struct vxlan_dev *vxlan = netdev_priv(dev);
struct vxlan_config conf;
int err;
 
@@ -3181,36 +3217,7 @@ static int vxlan_newlink(struct net *src_net, struct 
net_device *dev,
if (err)
return err;
 
-   err = vxlan_dev_configure(src_net, dev, , false);
-   if (err)
-   return err;
-
-   dev->ethtool_ops = _ethtool_ops;
-
-   /* create an fdb entry for a valid default destination */
-   if (!vxlan_addr_any(>default_dst.remote_ip)) {
-   err = vxlan_fdb_create(vxlan, all_zeros_mac,
-  >default_dst.remote_ip,
-  NUD_REACHABLE | NUD_PERMANENT,
-  NLM_F_EXCL | NLM_F_CREATE,
-  vxlan->cfg.dst_port,
-  vxlan->default_dst.remote_vni,
-  vxlan->default_dst.remote_vni,
-  vxlan->default_dst.remote_ifindex,
-  NTF_SELF);
-   if (err)
-   return err;
-   }
-
-   err = register_netdevice(dev);
-   if (err) {
-   vxlan_fdb_delete_default(vxlan, vxlan->default_dst.remote_vni);
-   return err;
-   }
-
-   list_add(>next, >vxlan_list);
-
-   return 0;
+   return __vxlan_dev_create(src_net, dev, );
 }
 
 static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
@@ -3440,7 +3447,7 @@ struct net_device *vxlan_dev_create(struct net *net, 
const char *name,
if (IS_ERR(dev))
return dev;
 
-   err = vxlan_dev_configure(net, dev, conf, false);
+   err = __vxlan_dev_create(net, dev, conf);
if (err < 0) {
free_netdev(dev);
return ERR_PTR(err);
-- 
2.8.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev