On Thu, Mar 21, 2019 at 03:04:37PM +0200, Liran Alon wrote:
>
>
> > On 21 Mar 2019, at 14:57, Michael S. Tsirkin <m...@redhat.com> wrote:
> >
> > On Thu, Mar 21, 2019 at 02:47:50PM +0200, Liran Alon wrote:
> >>
> >>
> >>> On 21 Mar 2019, at 14:37, Michael S. Tsirkin <m...@redhat.com> wrote:
> >>>
> >>> On Thu, Mar 21, 2019 at 12:07:57PM +0200, Liran Alon wrote:
> >>>>>>>> 2) It brings non-intuitive customer experience. For example, a
> >>>>>>>> customer may attempt to analyse connectivity issue by checking the
> >>>>>>>> connectivity
> >>>>>>>> on a net-failover slave (e.g. the VF) but will see no connectivity
> >>>>>>>> when in-fact checking the connectivity on the net-failover master
> >>>>>>>> netdev shows correct connectivity.
> >>>>>>>>
> >>>>>>>> The set of changes I vision to fix our issues are:
> >>>>>>>> 1) Hide net-failover slaves in a different netns created and managed
> >>>>>>>> by the kernel. But that user can enter to it and manage the netdevs
> >>>>>>>> there if wishes to do so explicitly.
> >>>>>>>> (E.g. Configure the net-failover VF slave in some special way).
> >>>>>>>> 2) Match the virtio-net and the VF based on a PV attribute instead
> >>>>>>>> of MAC. (Similar to as done in NetVSC). E.g. Provide a virtio-net
> >>>>>>>> interface to get PCI slot where the matching VF will be hot-plugged
> >>>>>>>> by hypervisor.
> >>>>>>>> 3) Have an explicit virtio-net control message to command hypervisor
> >>>>>>>> to switch data-path from virtio-net to VF and vice-versa. Instead of
> >>>>>>>> relying on intercepting the PCI master enable-bit
> >>>>>>>> as an indicator on when VF is about to be set up. (Similar to as
> >>>>>>>> done in NetVSC).
> >>>>>>>>
> >>>>>>>> Is there any clear issue we see regarding the above suggestion?
> >>>>>>>>
> >>>>>>>> -Liran
> >>>>>>>
> >>>>>>> The issue would be this: how do we avoid conflicting with namespaces
> >>>>>>> created by users?
> >>>>>>
> >>>>>> This is kinda controversial, but maybe separate netns names into 2
> >>>>>> groups: hidden and normal.
> >>>>>> To reference a hidden netns, you need to do it explicitly.
> >>>>>> Hidden and normal netns names can collide as they will be maintained
> >>>>>> in different namespaces (Yes I’m overloading the term namespace here…).
> >>>>>
> >>>>> Maybe it's an unnamed namespace. Hidden until userspace gives it a name?
> >>>>
> >>>> This is also a good idea that will solve the issue. Yes.
> >>>>
> >>>>>
> >>>>>> Does this seems reasonable?
> >>>>>>
> >>>>>> -Liran
> >>>>>
> >>>>> Reasonable I'd say yes, easy to implement probably no. But maybe I
> >>>>> missed a trick or two.
> >>>>
> >>>> BTW, from a practical point of view, I think that even until we figure
> >>>> out a solution on how to implement this,
> >>>> it was better to create an kernel auto-generated name (e.g.
> >>>> “kernel_net_failover_slaves")
> >>>> that will break only userspace workloads that by a very rare-chance have
> >>>> a netns that collides with this then
> >>>> the breakage we have today for the various userspace components.
> >>>>
> >>>> -Liran
> >>>
> >>> It seems quite easy to supply that as a module parameter. Do we need two
> >>> namespaces though? Won't some userspace still be confused by the two
> >>> slaves sharing the MAC address?
> >>
> >> That’s one reasonable option.
> >> Another one is that we will indeed change the mechanism by which we
> >> determine a VF should be bonded with a virtio-net device.
> >> i.e. Expose a new virtio-net property that specify the PCI slot of the VF
> >> to be bonded with.
> >>
> >> The second seems cleaner but I don’t have a strong opinion on this. Both
> >> seem reasonable to me and your suggestion is faster to implement from
> >> current state of things.
> >>
> >> -Liran
> >
> > OK. Now what happens if master is moved to another namespace? Do we need
> > to move the slaves too?
>
> No. Why would we move the slaves?

The reason we have a 3-device model at all is so users can fine-tune the
slaves. I don't see why this applies to the root namespace but not
a container. If it has access to failover it should have access to slaves.

> The whole point is to make most customer ignore the net-failover slaves and
> remain them “hidden” in their dedicated netns.

So that makes the common case easy. That is good. My worry is it might
make some uncommon cases impossible.

> We won’t prevent customer from explicitly moving the net-failover slaves out
> of this netns, but we will not move them out of there automatically.
>
>
> >
> > Also siwei's patch is then kind of extraneous right?
> > Attempts to rename a slave will now fail as it's in a namespace…
>
> I’m not sure actually. Isn't udev/systemd netns-aware?
> I would expect it to be able to provide names also to netdevs in netns
> different than default netns.

I think most people move devices after they are renamed.

> If that’s the case, Si-Wei patch to be able to rename a net-failover slave
> when it is already open is still required. As the race-condition still exists.
>
> -Liran
>
> >
> >>>
> >>> --
> >>> MST