On Wed, 7 Mar 2018 09:50:50 -0800
Alexander Duyck <alexander.du...@gmail.com> wrote:
> On Tue, Mar 6, 2018 at 6:38 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Tue, Mar 06, 2018 at 03:27:46PM -0800, Alexander Duyck wrote:
> >> > I definitelly vote for a separate common shared code for both netvsc and
> >> > virtio_net - even if you use 2 and 3 netdev model, you could share the
> >> > common code. Strict checks and limitation should be in place.
> >> Noted. But as I also mentioned there isn't that much "common" code
> >> between the two models. I think if anything we could probably look at
> >> peeling out a few bits such as "get_<iface>_bymac" which really would
> >> become dev_get_by_mac_and_ops in order to find the device for the
> >> notifiers. I probably wouldn't even put that in our driver and would
> >> instead put it in the core code since it almost makes more sense
> >> there. Beyond that sharing becomes much more challenging due to the
> >> differences in the Rx and Tx paths that build out of the difference
> >> between the 2 driver and 3 driver models.
> > At this point it might be worth it to articulate the advantages
> > of the 3 netdev model.
> The main advantages are performance on the lower devices. Specifically
> we can run LLTX and IFF_NOQUEUE on the upper device, meaning the VF
> doesn't take a performance hit when we start transmitting through it.
> In addition the paravirtual device is able to fully expose it's
> features, so for example virtio_net can maintain in-driver XDP, and we
> can provide generic XDP on the upper device. We can also have an
> asymmetric configuration where the number of queues or feature sets
> can be different between the ports.
> Basically what it comes down to is in the 3 netdev model we never have
> to deal with the issues of going from being a standard netdev to being
> a master netdev. The role of each netdev is clearly defined.
> As far as doing a generic solution it is the preferred way to go as
> well since we don't have to rewrite functionality of the lower device.
> Currently the only changes really needed are to add a phys_port_name
> operation so that you can distinguish between the bypass interface and
> the original paravirtual one. As such you have no confusion about what
> you are running.
> > If they are compelling, why wouldn't netvsc users want them?
> I believe part of the logic for Stephen's choice is that netvsc
> doesn't have a "BACKUP" bit like what we have virtio_net. As a result
> they have no way of knowing if a VF will ever show up or not. That
> makes the 3 netdev solution less appealing as they always end up in
> the bonding mode. In addition they have intentionally limited
> themselves to avoid features such as XDP that would cause issues with
> the 2 netdev model.
> > Alex, I think you were one of the strongest proponents of this model,
> > you should be well placed to provide a summary.
> Hopefully the information provided helps. In my mind the issue is that
> netvsc was not designed correctly, and as such it is using the 2
> netdev model as a bit of a crutch to handle the case where they wanted
> to add a VF bypass. At this point it has become a part of their
> architecture so it would be confusing for their user base to deal with
> having 2 netdevs spawn every time their driver sees a new device. In
> the case of virtio we only spawn 2 netdevs if the backup bit is set
> which implies a specific use case. When the bit isn't set we are only
> spawning one device, and as a result we can get much better
> performance out of the resultant combination of devices, and can
> expose functionality such as in-driver XDP in virtio.
I have experimented with toggling IFF_NOQUEUE and/or LLTX on the netvsc
device when doing passthrough. It didn't help performance much and there
are corner cases that make it painful, like what if qdisc was placed
by user on the netvsc device? Should the qdisc then be moved back
and forth to the VF device when it arrives or is removed?
As far as XDP support, it is on the plan to support XDP on the netvsc
device since the receive buffers do have to be copied. But there has
been no customer demand for it; and the VF model on Azure has limitations
which make a transparent XDP model pretty much impossible.