On Wed, Dec 05, 2018 at 03:12:08PM -0200, Flavio Leitner wrote: > > Hi Ben, > > This patch introduced a regression in OSP environments using internal > ports in other netns. Their networking configuration is lost when > the service is restarted because the ports are recreated now. > > Before the patch it checked using netlink if the port with a specific > "name" was already there. I believe that's the check you referred as > expensive below. Anyways, the check is a lookup in all ports attached > to the DP regardless of the port's netns. > > After the patch it relies on the kernel to identify that situation. > Unfortunately the only protection there is register_netdevice() which > fails only if the port with that name exists in the current netns. > > If the port is in another netns, it will get a new dp_port and because > of that userspace will delete the old port. At this point the original > port is gone from the other netns and there a fresh port in the current > netns. > > I think the optimization is a good idea, so I came up with this kernel > patch to make sure we are not adding another vport with the same name. > It resolved the issue in my small env (want to do more tests though). > > diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c > index 252adfb6fc0b..291b4a71a910 100644 > --- a/net/openvswitch/datapath.c > +++ b/net/openvswitch/datapath.c > @@ -2022,6 +2022,11 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, > struct genl_info *info) > return -ENOMEM; > > ovs_lock(); > + vport = lookup_vport(sock_net(skb->sk), ovs_header, a); > + err = -EEXIST; > + if (!IS_ERR(vport) && vport) > + goto exit_unlock_free; > + > restart: > dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex); > err = -ENODEV; > > > However, OSP users using unpatched kernel with OVS 2.10 might trigger > the bug, so I wonder if we should revert the patch in 2.10 and work > on an improved fix for 2.11. Perhaps we can detect if the kernel fix > is in there (or not) by trying to add the same port twice once and > use that as a hint. Perhaps there is something cheaper in dpif to > verify if the vport is there that is not vulnerable to races.
This seems reasonable to me. Would you mind coming up with a series that reverts the OVS 2.10 patch and separately adds the improved fix? _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
