Re: [PATCH net-next 3/3] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses
On 03/15/2017 04:22 PM, Jiri Benc wrote: > On Wed, 15 Mar 2017 15:29:29 +0100, Matthias Schiffer wrote: >> While ensuring that the destination address is link-local iff the source >> address is would also be an option, it didn't seem too useful as the >> destination address will be a multicast address anyways in "normal" VXLAN >> configurations. If we really want to check this, I guess the valid >> combinations are: >> >> source link-local - destination link-local UC >> source link-local - destination link-local MC >> source global/... - destination global/... UC >> source global/... - destination any MC >> >> Does this make sense? > > It does. > > Thanks! > > Jiri > While trying to integrate the additional checks, I noticed that the vxlan_dev_configure() function has some serious issues in the changelink case. An (probably incomplete) list of things that can go wrong: - vxlan_dev_configure may return with an error as late the "if (conf->mtu)" branch, after some settings have already been applied to the vxlan_dev or net_device, leading to partial application in some error cases - at the moment, changelink is allowed to change the address family of the source/destionation addresses, but VXLAN_F_IPV6 is never removed, and sockets are not recreated; I think changing the AF should just be disallowed - conf->mtu will be re-applied in changelink even when the lowerdev has not changed, possibly overwriting other MTU changes that have been made after device creation (having conf->mtu in addition to dev->mtu might be a bad idea in general?) Each of the issues could be fixed separately, but at the moment, I'm rather considering cleaning up the code by factoring out a vxlan_cfg_validate() from vxlan_dev_configure(), to clearly separate validation and application of the configuration. Is this the way to go, or do you have other suggestions? -- Matthias signature.asc Description: OpenPGP digital signature
Re: [PATCH net-next 3/3] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses
On Wed, 15 Mar 2017 15:29:29 +0100, Matthias Schiffer wrote: > While ensuring that the destination address is link-local iff the source > address is would also be an option, it didn't seem too useful as the > destination address will be a multicast address anyways in "normal" VXLAN > configurations. If we really want to check this, I guess the valid > combinations are: > > source link-local - destination link-local UC > source link-local - destination link-local MC > source global/... - destination global/... UC > source global/... - destination any MC > > Does this make sense? It does. Thanks! Jiri
Re: [PATCH net-next 3/3] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses
On 03/14/2017 04:28 PM, Jiri Benc wrote: > On Fri, 10 Mar 2017 23:39:44 +0100, Matthias Schiffer wrote: >> @@ -233,17 +234,30 @@ static struct vxlan_dev *vxlan_vs_find_vni(struct >> vxlan_sock *vs, __be32 vni) >> vni = 0; >> >> hlist_for_each_entry_rcu(vxlan, vni_head(vs, vni), hlist) { >> -if (vxlan->default_dst.remote_vni == vni) >> -return vxlan; >> +if (vxlan->default_dst.remote_vni != vni) >> +continue; >> + >> +if (IS_ENABLED(CONFIG_IPV6)) { >> +const struct vxlan_config *cfg = >cfg; >> + >> +if (cfg->remote_ifindex != 0 && >> +cfg->remote_ifindex != ifindex && >> +cfg->saddr.sa.sa_family == AF_INET6 && >> +(ipv6_addr_type(>saddr.sin6.sin6_addr) & >> + IPV6_ADDR_LINKLOCAL)) > > Calculating this (especially ipv6_addr_type) on every received packet > looks unnecessarily expensive. Just store the fact the the local > address is link-local in a flag during config. And compare the flag > first before considering remote_ifindex. > > This is especially important for lwtunnels which can have anything in > the saddr and remote_ifindex, yet those fields are ignored and the > vni 0 entry has to be returned. It also means that the link-local flag > must not be set for lwtunnels. > >> +#if IS_ENABLED(CONFIG_IPV6) >> +if (conf->remote_ifindex != tmp->cfg.remote_ifindex && >> +conf->saddr.sa.sa_family == AF_INET6 && >> +tmp->cfg.saddr.sa.sa_family == AF_INET6 && >> +(ipv6_addr_type(>saddr.sin6.sin6_addr) & >> + IPV6_ADDR_LINKLOCAL) && >> +(ipv6_addr_type(>cfg.saddr.sin6.sin6_addr) & >> + IPV6_ADDR_LINKLOCAL)) >> +continue; >> +#endif > > In patch 1, you're checking for either source and destination address > being link-local, while here you're consider only those that have both > addresses link-local. > > Wouldn't it be better to plainly reject configuration that has one > address link-local but not the other one? While ensuring that the destination address is link-local iff the source address is would also be an option, it didn't seem too useful as the destination address will be a multicast address anyways in "normal" VXLAN configurations. If we really want to check this, I guess the valid combinations are: source link-local - destination link-local UC source link-local - destination link-local MC source global/... - destination global/... UC source global/... - destination any MC Does this make sense? > > Jiri Thanks for your comments, I'll send a v2 soon. Matthias signature.asc Description: OpenPGP digital signature
Re: [PATCH net-next 3/3] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses
On Fri, 10 Mar 2017 23:39:44 +0100, Matthias Schiffer wrote: > @@ -233,17 +234,30 @@ static struct vxlan_dev *vxlan_vs_find_vni(struct > vxlan_sock *vs, __be32 vni) > vni = 0; > > hlist_for_each_entry_rcu(vxlan, vni_head(vs, vni), hlist) { > - if (vxlan->default_dst.remote_vni == vni) > - return vxlan; > + if (vxlan->default_dst.remote_vni != vni) > + continue; > + > + if (IS_ENABLED(CONFIG_IPV6)) { > + const struct vxlan_config *cfg = >cfg; > + > + if (cfg->remote_ifindex != 0 && > + cfg->remote_ifindex != ifindex && > + cfg->saddr.sa.sa_family == AF_INET6 && > + (ipv6_addr_type(>saddr.sin6.sin6_addr) & > + IPV6_ADDR_LINKLOCAL)) Calculating this (especially ipv6_addr_type) on every received packet looks unnecessarily expensive. Just store the fact the the local address is link-local in a flag during config. And compare the flag first before considering remote_ifindex. This is especially important for lwtunnels which can have anything in the saddr and remote_ifindex, yet those fields are ignored and the vni 0 entry has to be returned. It also means that the link-local flag must not be set for lwtunnels. > +#if IS_ENABLED(CONFIG_IPV6) > + if (conf->remote_ifindex != tmp->cfg.remote_ifindex && > + conf->saddr.sa.sa_family == AF_INET6 && > + tmp->cfg.saddr.sa.sa_family == AF_INET6 && > + (ipv6_addr_type(>saddr.sin6.sin6_addr) & > + IPV6_ADDR_LINKLOCAL) && > + (ipv6_addr_type(>cfg.saddr.sin6.sin6_addr) & > + IPV6_ADDR_LINKLOCAL)) > + continue; > +#endif In patch 1, you're checking for either source and destination address being link-local, while here you're consider only those that have both addresses link-local. Wouldn't it be better to plainly reject configuration that has one address link-local but not the other one? Jiri
[PATCH net-next 3/3] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses
As link-local addresses are only valid for a single interface, we can allow to use the same VNI for multiple independent VXLANs, as long as the used interfaces are distinct. This way, VXLANs can always be used as a drop-in replacement for VLANs with greater ID space. This also extends VNI lookup to respect the ifindex when link-local IPv6 addresses are used, so using the same VNI on multiple interfaces can actually work. Signed-off-by: Matthias Schiffer--- drivers/net/vxlan.c | 88 - 1 file changed, 60 insertions(+), 28 deletions(-) diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 4c0ef8bbad71..9cd6f6b92cf4 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -224,7 +224,8 @@ static struct vxlan_sock *vxlan_find_sock(struct net *net, sa_family_t family, return NULL; } -static struct vxlan_dev *vxlan_vs_find_vni(struct vxlan_sock *vs, __be32 vni) +static struct vxlan_dev *vxlan_vs_find_vni(struct vxlan_sock *vs, int ifindex, + __be32 vni) { struct vxlan_dev *vxlan; @@ -233,17 +234,30 @@ static struct vxlan_dev *vxlan_vs_find_vni(struct vxlan_sock *vs, __be32 vni) vni = 0; hlist_for_each_entry_rcu(vxlan, vni_head(vs, vni), hlist) { - if (vxlan->default_dst.remote_vni == vni) - return vxlan; + if (vxlan->default_dst.remote_vni != vni) + continue; + + if (IS_ENABLED(CONFIG_IPV6)) { + const struct vxlan_config *cfg = >cfg; + + if (cfg->remote_ifindex != 0 && + cfg->remote_ifindex != ifindex && + cfg->saddr.sa.sa_family == AF_INET6 && + (ipv6_addr_type(>saddr.sin6.sin6_addr) & +IPV6_ADDR_LINKLOCAL)) + continue; + } + + return vxlan; } return NULL; } /* Look up VNI in a per net namespace table */ -static struct vxlan_dev *vxlan_find_vni(struct net *net, __be32 vni, - sa_family_t family, __be16 port, - u32 flags) +static struct vxlan_dev *vxlan_find_vni(struct net *net, int ifindex, + __be32 vni, sa_family_t family, + __be16 port, u32 flags) { struct vxlan_sock *vs; @@ -251,7 +265,7 @@ static struct vxlan_dev *vxlan_find_vni(struct net *net, __be32 vni, if (!vs) return NULL; - return vxlan_vs_find_vni(vs, vni); + return vxlan_vs_find_vni(vs, ifindex, vni); } /* Fill in neighbour message in skbuff. */ @@ -1342,7 +1356,7 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb) vni = vxlan_vni(vxlan_hdr(skb)->vx_vni); - vxlan = vxlan_vs_find_vni(vs, vni); + vxlan = vxlan_vs_find_vni(vs, skb->dev->ifindex, vni); if (!vxlan) goto drop; @@ -2002,8 +2016,10 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan, } static int encap_bypass_if_local(struct sk_buff *skb, struct net_device *dev, -struct vxlan_dev *vxlan, union vxlan_addr *daddr, -__be16 dst_port, __be32 vni, struct dst_entry *dst, +struct vxlan_dev *vxlan, +union vxlan_addr *daddr, +__be16 dst_port, int dst_ifindex, __be32 vni, +struct dst_entry *dst, u32 rt_flags) { #if IS_ENABLED(CONFIG_IPV6) @@ -2019,7 +2035,7 @@ static int encap_bypass_if_local(struct sk_buff *skb, struct net_device *dev, struct vxlan_dev *dst_vxlan; dst_release(dst); - dst_vxlan = vxlan_find_vni(vxlan->net, vni, + dst_vxlan = vxlan_find_vni(vxlan->net, dst_ifindex, vni, daddr->sa.sa_family, dst_port, vxlan->flags); if (!dst_vxlan) { @@ -2051,6 +2067,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, struct dst_entry *ndst = NULL; __be32 vni, label; __u8 tos, ttl; + int ifindex; int err; u32 flags = vxlan->flags; bool udp_sum = false; @@ -2071,6 +2088,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port; vni = (rdst->remote_vni) ? : default_vni; + ifindex = rdst->remote_ifindex; local_ip = vxlan->cfg.saddr; dst_cache = >dst_cache;