[virtio-dev] Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-06 Thread Michael S. Tsirkin
On Tue, Mar 06, 2018 at 03:27:46PM -0800, Alexander Duyck wrote:
> > I definitelly vote for a separate common shared code for both netvsc and
> > virtio_net - even if you use 2 and 3 netdev model, you could share the
> > common code. Strict checks and limitation should be in place.
> 
> Noted. But as I also mentioned there isn't that much "common" code
> between the two models. I think if anything we could probably look at
> peeling out a few bits such as "get__bymac" which really would
> become dev_get_by_mac_and_ops in order to find the device for the
> notifiers. I probably wouldn't even put that in our driver and would
> instead put it in the core code since it almost makes more sense
> there. Beyond that sharing becomes much more challenging due to the
> differences in the Rx and Tx paths that build out of the difference
> between the 2 driver and 3 driver models.

At this point it might be worth it to articulate the advantages
of the 3 netdev model.

If they are compelling, why wouldn't netvsc users want them?

Alex, I think you were one of the strongest proponents of this model,
you should be well placed to provide a summary.

-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH] pci-iov: Add support for unmanaged SR-IOV

2018-03-06 Thread Alexander Duyck
On Tue, Mar 6, 2018 at 12:19 PM, Don Dutile  wrote:
> On 03/05/2018 04:41 PM, Alexander Duyck wrote:
>>
>> On Mon, Mar 5, 2018 at 12:57 PM, Don Dutile  wrote:
>>>
>>> On 03/01/2018 03:22 PM, Alex Williamson wrote:


 On Wed, 28 Feb 2018 16:36:38 -0800
 Alexander Duyck  wrote:

> On Wed, Feb 28, 2018 at 2:59 PM, Alex Williamson
>  wrote:
>>
>>
>> On Wed, 28 Feb 2018 09:49:21 -0800
>> Alexander Duyck  wrote:
>>
>>>
>>> On Tue, Feb 27, 2018 at 2:25 PM, Alexander Duyck
>>>  wrote:


 On Tue, Feb 27, 2018 at 1:40 PM, Alex Williamson
  wrote:
>
>
> On Tue, 27 Feb 2018 11:06:54 -0800
> Alexander Duyck  wrote:
>
>>
>> From: Alexander Duyck 
>>
>> This patch is meant to add support for SR-IOV on devices when the
>> VFs are
>> not managed by the kernel. Examples of recent patches attempting
>> to
>> do this
>> include:
>
>
>
> It appears to enable sriov when the _pf_ is not managed by the
> kernel, but by "managed" we mean that either there is no pf driver
> or
> the pf driver doesn't provide an sriov_configure callback,
> intentionally or otherwise.
>
>>
>> virto - https://patchwork.kernel.org/patch/10241225/
>> pci-stub - https://patchwork.kernel.org/patch/10109935/
>> vfio - https://patchwork.kernel.org/patch/10103353/
>> uio - https://patchwork.kernel.org/patch/9974031/
>
>
>
> So is the goal to get around the issues with enabling sriov on each
> of
> the above drivers by doing it under the covers or are you really
> just
> trying to enable sriov for a truly unmanage (no pf driver) case?
> For
> example, should a driver explicitly not wanting sriov enabled
> implement
> a dummy sriov_configure function?
>
>>
>> Since this is quickly blowing up into a multi-driver problem it is
>> probably
>> best to implement this solution in one spot.
>>
>> This patch is an attempt to do that. What we do with this patch is
>> provide
>> a generic call to enable SR-IOV in the case that the PF driver is
>> either
>> not present, or the PF driver doesn't support configuring SR-IOV.
>>
>> A new sysfs value called sriov_unmanaged_autoprobe has been added.
>> This
>> value is used as the drivers_autoprobe setting of the VFs when
>> they
>> are
>> being managed by an external entity such as userspace or device
>> firmware
>> instead of being managed by the kernel.
>
>
>
> Documentation/ABI/testing/sysfs-bus-pci update is missing.



 I can make sure to update that in the next version.

>>
>> One side effect of this change is that the sriov_drivers_autoprobe
>> and
>> sriov_unmanaged_autoprobe will only apply their updates when
>> SR-IOV
>> is
>> disabled. Attempts to update them when SR-IOV is in use will only
>> update
>> the local value and will not update sriov->autoprobe.
>
>
>
> And we expect users to understand when sriov_drivers_autoprobe
> applies
> vs sriov_unmanaged_autoprobe, even though they're using the same
> interfaces to enable sriov?  Are all combinations expected to work,
> ex.
> unmanaged sriov is enabled, a native pf driver loads, vfs work?
> Not
> only does it seems like there's opportunity to use this
> incorrectly,
> I
> think maybe it might be difficult to use correctly.
>
>>
>> I based my patch set originally on the patch by Mark Rustad but
>> there isn't
>> much left after going through and cleaning out the bits that were
>> no
>> longer
>> needed, and after incorporating the feedback from David Miller.
>>
>> I have included the authors of the original 4 patches above in the
>> Cc here.
>> My hope is to get feedback and/or review on if this works for
>> their
>> use
>> cases.
>>
>> Cc: Mark Rustad 
>> Cc: Maximilian Heyne 
>> Cc: Liang-Min Wang 
>> Cc: David Woodhouse 
>> Signed-off-by: Alexander Duyck 

[virtio-dev] Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-06 Thread Alexander Duyck
On Tue, Mar 6, 2018 at 2:59 PM, Jiri Pirko  wrote:
> Tue, Mar 06, 2018 at 08:08:21PM CET, alexander.du...@gmail.com wrote:
>>On Mon, Mar 5, 2018 at 7:15 PM, Stephen Hemminger
>> wrote:
>>> On Mon, 5 Mar 2018 14:47:20 -0800
>>> Alexander Duyck  wrote:
>>>
 On Mon, Mar 5, 2018 at 2:30 PM, Jiri Pirko  wrote:
 > Mon, Mar 05, 2018 at 05:11:32PM CET, step...@networkplumber.org wrote:
 >>On Mon, 5 Mar 2018 10:21:18 +0100
 >>Jiri Pirko  wrote:
 >>
 >>> Sun, Mar 04, 2018 at 10:58:34PM CET, alexander.du...@gmail.com wrote:
 >>> >On Sun, Mar 4, 2018 at 10:50 AM, Jiri Pirko  wrote:
 >>> >> Sun, Mar 04, 2018 at 07:24:12PM CET, alexander.du...@gmail.com 
 >>> >> wrote:
 >>> >>>On Sat, Mar 3, 2018 at 11:13 PM, Jiri Pirko  
 >>> >>>wrote:
 >>>
 >>> [...]
 >>>
 >>> >
 >>> >>>Currently we only have agreement from Michael on taking this code, 
 >>> >>>as
 >>> >>>such we are working with virtio only for now. When the time comes 
 >>> >>>that
 >>> >>
 >>> >> If you do duplication of netvsc in-driver bonding in virtio_net, it 
 >>> >> will
 >>> >> stay there forever. So what you say is: "We will do it halfway now
 >>> >> and promise to fix it later". That later will never happen, I'm 
 >>> >> pretty
 >>> >> sure. That is why I push for in-driver bonding shared code as a 
 >>> >> part of
 >>> >> this patchset.
 >>> >
 >>> >You want this new approach and a copy of netvsc moved into either core
 >>> >or some module of its own. I say pick an architecture. We are looking
 >>> >at either 2 netdevs or 3. We are not going to support both because
 >>> >that will ultimately lead to a terrible user experience and make
 >>> >things quite confusing.
 >>> >
 >>> >> + if you would be pushing first driver to do this, I would 
 >>> >> understand.
 >>> >> But the first driver is already in. You are pushing second. This is 
 >>> >> the
 >>> >> time to do the sharing, unification of behaviour. Next time is too 
 >>> >> late.
 >>> >
 >>> >That is great, if we want to share then lets share. But what you are
 >>> >essentially telling us is that we need to fork this solution and
 >>> >maintain two code paths, one for 2 netdevs, and another for 3. At that
 >>> >point what is the point in merging them together?
 >>>
 >>> Of course, I vote for the same behaviour for netvsc and virtio_net. 
 >>> That
 >>> is my point from the very beginning.
 >>>
 >>> Stephen, what do you think? Could we please make virtio_net and netvsc
 >>> behave the same and to use a single code with well-defined checks and
 >>> restrictions for this feature?
 >>
 >>Eventually, yes both could share common code routines. In reality,
 >>the failover stuff is only a very small part of either driver so
 >>it is not worth stretching to try and cover too much. If you look,
 >>the failover code is just using routines that already exist for
 >>use by bonding, teaming, etc.
 >
 > Yeah, we consern was also about the code that processes the netdev
 > notifications and does auto-enslave and all related stuff.

 The concern was the driver model. If we expose 3 netdevs or 2 with the
 VF driver present. Somehow this is turning into a "merge netvsc into
 virtio" think and that isn't the subject that was being asked.

 Ideally we want one model for this. Either 3 netdevs or 2. The problem
 is 2 causes issues in terms of performance and will limit features of
 virtio, but 2 is the precedent set by netvsc. We need to figure out
 the path forward for this. There is talk about "sharing" but it is
 hard to make these two approaches share code when they are doing two
 very different setups and end up presenting themselves as two very
 different driver models.
>>>
>>> I appreciate this discussion, and it has helped a lot.
>>>
>>> Netvsc is stuck with 2 netdev model for the foreseeable future.
>>> We already failed once with the bonding model, and that created a lot of
>>> pain. The current model is working well and have convinced the major distros
>>> to support the two netdev model and don't want to back.
>>>
>>> Very open to optimizations and ways to smooth out the rough edges.
>>
>>Thank you for clarifying this Stephen.
>>
>>Okay. So with things defined such that we are doing a 2 netdev model
>>for netvsc, and a 3 netdev model for virtio, is it still in our
>>interest for us to try making a shared library between the two? In my
>>mind, the virtnet_bypass becomes the way we go forward for any future
>>solutions. I say we treat the netvsc approach as a "legacy" approach
>>and avoid creating any new libraries or drivers to support it, and
>>instead just focus on the 3 

[virtio-dev] [pci PATCH v3 3/3] virtio_pci: Add support for unmanaged SR-IOV on virtio_pci devices

2018-03-06 Thread Alexander Duyck
From: Alexander Duyck 

Hardware-realized virtio_pci devices can implement SR-IOV, so this
patch enables its use. The device in question is an upcoming Intel
NIC that implements both a virtio_net PF and virtio_net VFs. These
are hardware realizations of what has been up to now been a software
interface.

The device in question has the following 4-part PCI IDs:

PF: vendor: 1af4 device: 1041 subvendor: 8086 subdevice: 15fe
VF: vendor: 1af4 device: 1041 subvendor: 8086 subdevice: 05fe

The patch currently needs no check for device ID, because the callback
will never be made for devices that do not assert the capability or
when run on a platform incapable of SR-IOV.

One reason for this patch is because the hardware requires the
vendor ID of a VF to be the same as the vendor ID of the PF that
created it. So it seemed logical to simply have a fully-functioning
virtio_net PF create the VFs. This patch makes that possible.

Signed-off-by: Mark Rustad 
Signed-off-by: Alexander Duyck 
---
 drivers/virtio/virtio_pci_common.c |4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/virtio/virtio_pci_common.c 
b/drivers/virtio/virtio_pci_common.c
index 48d4d1cf1cb6..ca1549393255 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -584,6 +584,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
else
virtio_pci_modern_remove(vp_dev);
 
+   pci_disable_sriov(pci_dev);
pci_disable_device(pci_dev);
put_device(dev);
 }
@@ -596,6 +597,9 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
 #ifdef CONFIG_PM_SLEEP
.driver.pm  = _pci_pm_ops,
 #endif
+#ifdef CONFIG_PCI_IOV
+   .sriov_configure = pci_sriov_configure_unmanaged,
+#endif
 };
 
 module_pci_driver(virtio_pci_driver);


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [pci PATCH v3 2/3] vfio: Add support for unmanaged or userspace managed SR-IOV

2018-03-06 Thread Alexander Duyck
From: Alexander Duyck 

This patch is meant to allow assignment of an SR-IOV enabled PF, as in VFs
have been generated, with vfio-pci. My understanding is the primary use
case for this is something like DPDK running the PF while the VFs are all
assigned to guests.

A secondary effect of this is that it provides an interface through which
it would be possible to enable SR-IOV on drivers that may not have a
physical function that actually manages the device.

Enabling SR-IOV should be pretty straight forward. As long as there are no
userspace processes currently controlling the interface the number of VFs
can be changed, and VFs will be generated without drivers being loaded on
the host. Once the userspace process begins controlling the interface the
number of VFs cannot be updated via the sysfs until the control is
released.

Note the VFs will have drivers load on them in the host if the
sriov_unmanaged_autoprobe is updated to a value of 1. However the behavior
of the VFs in such a setup cannot be guaranteed as the PF will not be
available until the userspace process starts and begins to manage the
device.

For now I am leaving the value as locked when the PF is being controlled
from userspace as a form of synchronization. Basically this way we cannot
have the number of VFs change out from under the process so it should not
require any notification framework, and the configuration can just be read
out via configuration space accesses.

Signed-off-by: Alexander Duyck 
---
 drivers/vfio/pci/vfio_pci.c |   59 +++
 1 file changed, 59 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index b0f759476900..8025d7336071 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1224,6 +1224,8 @@ static void vfio_pci_remove(struct pci_dev *pdev)
VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM);
}
 
+   pci_disable_sriov(pdev);
+
if (!disable_idle_d3)
pci_set_power_state(pdev, PCI_D0);
 }
@@ -1260,12 +1262,69 @@ static pci_ers_result_t 
vfio_pci_aer_err_detected(struct pci_dev *pdev,
.error_detected = vfio_pci_aer_err_detected,
 };
 
+#ifdef CONFIG_PCI_IOV
+static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
+{
+   struct vfio_pci_device *vdev;
+   struct vfio_device *device;
+   int err;
+
+   device = vfio_device_get_from_dev(>dev);
+   if (device == NULL)
+   return -ENODEV;
+
+   vdev = vfio_device_data(device);
+   if (vdev == NULL) {
+   vfio_device_put(device);
+   return -ENODEV;
+   }
+
+   /*
+* If a userspace process is already using this device just return
+* busy and don't allow for any changes.
+*/
+   if (vdev->refcnt) {
+   pci_warn(pdev,
+"PF is currently in use, blocked until released by 
user\n");
+   return -EBUSY;
+   }
+
+   err = pci_sriov_configure_unmanaged(pdev, nr_virtfn);
+   if (err <= 0)
+   return err;
+
+   /*
+* We are now leaving VFs in the control of some unknown PF entity.
+*
+* Best case is a well behaved userspace PF is expected and any VMs
+* that the VFs will be assigned to are dependent on the userspace
+* entity anyway. An example being NFV where maybe the PF is acting
+* as an accelerated interface for a firewall or switch.
+*
+* Worst case is somebody really messed up and just enabled SR-IOV
+* on a device they were planning to assign to a VM somwhere.
+*
+* In either case it is probably best for us to set the taint flag
+* and warn the user since this could get really ugly really quick
+* if this wasn't what they were planning to do.
+*/
+   add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+   pci_warn(pdev,
+"Adding kernel taint for vfio-pci now managing SR-IOV PF 
device\n");
+
+   return nr_virtfn;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static struct pci_driver vfio_pci_driver = {
.name   = "vfio-pci",
.id_table   = NULL, /* only dynamic ids */
.probe  = vfio_pci_probe,
.remove = vfio_pci_remove,
.err_handler= _err_handlers,
+#ifdef CONFIG_PCI_IOV
+   .sriov_configure = vfio_pci_sriov_configure,
+#endif
 };
 
 struct vfio_devices {


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [pci PATCH v3 1/3] pci-iov: Add support for unmanaged SR-IOV

2018-03-06 Thread Alexander Duyck
From: Alexander Duyck 

This patch is meant to add some basic functionality to support for SR-IOV
on devices when the VFs are not managed by the kernel. The functions
provided here can be used by drivers such as vfio-pci and virtio to enable
SR-IOV on devices that are either managed by userspace, or by some sort of
firmware entity respectively.

A new sysfs value called sriov_unmanaged_autoprobe has been added. This
value is used as the drivers_autoprobe setting of the VFs when they are
being managed by an external entity such as userspace or device firmware
instead of being managed by the kernel.

One side effect of this change is that the sriov_drivers_autoprobe and
sriov_unmanaged_autoprobe will only apply their updates when SR-IOV is
disabled. Attempts to update them when SR-IOV is in use will only update
the local value and will not update sriov->autoprobe.

Signed-off-by: Alexander Duyck 
---

v3: Updated documentation to better explain sriov_unmanaged_autoprobe use

 Documentation/ABI/testing/sysfs-bus-pci |   24 
 drivers/pci/iov.c   |   37 +++
 drivers/pci/pci-driver.c|2 +-
 drivers/pci/pci-sysfs.c |   29 
 drivers/pci/pci.h   |4 +++
 include/linux/pci.h |1 +
 6 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci 
b/Documentation/ABI/testing/sysfs-bus-pci
index 44d4b2be92fd..d9d20611fc91 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -323,3 +323,27 @@ Description:
 
This is similar to /sys/bus/pci/drivers_autoprobe, but
affects only the VFs associated with a specific PF.
+
+What:  /sys/bus/pci/devices/.../sriov_unmanaged_autoprobe
+Date:  March 2018
+Contact:   Alexander Duyck 
+Description:
+   This file is associated with the PF of a device that
+   supports SR-IOV.  It determines whether newly-enabled VFs
+   are immediately bound to a driver when the PF driver does
+   not manage the VFs itself.  It initially contains 0, which
+   means the kernel will not automatically bind VFs to a driver.
+   If an application writes 1 to the file before enabling VFs,
+   the kernel will bind VFs to a compatible driver immediately
+   after they are enabled.
+
+   Currently the use of this setting is limited to drivers that
+   make use of pci_sriov_configure_unmanaged. Examples might
+   include drivers such as virtio_net which could expose a PCI
+   function that resembles a VF with the extra SR-IOV related
+   bits, or a PF attached to a vfio interface which is being
+   managed by userspace instead of the kernel directly.
+
+   This overrides /sys/bus/pci/devices/.../sriov_drivers_autoprobe
+   when a PF driver does not provide functionality to manage the
+   VFs when SR-IOV is enabled.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 677924ae0350..3dcec1fa86bd 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -446,6 +446,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
pci_read_config_word(dev, pos + PCI_SRIOV_VF_DID, >vf_device);
iov->pgsz = pgsz;
iov->self = dev;
+   iov->autoprobe = true;
iov->drivers_autoprobe = true;
pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, >cap);
pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, >link);
@@ -683,6 +684,9 @@ int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
if (!dev->is_physfn)
return -ENOSYS;
 
+   /* Update autoprobe setting to reflect managed device */
+   dev->sriov->autoprobe = dev->sriov->drivers_autoprobe;
+
return sriov_enable(dev, nr_virtfn);
 }
 EXPORT_SYMBOL_GPL(pci_enable_sriov);
@@ -807,3 +811,36 @@ int pci_sriov_get_totalvfs(struct pci_dev *dev)
return dev->sriov->total_VFs;
 }
 EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs);
+
+/**
+ * pci_sriov_configure_unmanaged - helper to configure unmanaged SR-IOV
+ * @dev: the PCI device
+ * @nr_virtfn: number of virtual functions to enable, 0 to disable
+ *
+ * Used to provide generic enable/disable SR-IOV option for devices
+ * that do not manage the VFs generated by their driver, or have no
+ * driver present.
+ */
+int pci_sriov_configure_unmanaged(struct pci_dev *dev, int nr_virtfn)
+{
+   int err;
+
+   might_sleep();
+
+   if (!dev->is_physfn)
+   return -ENODEV;
+
+   if (!nr_virtfn) {
+   sriov_disable(dev);
+
+   return 0;
+   }
+
+   /* Update autoprobe setting to reflect unmanaged device */
+  

[virtio-dev] Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-06 Thread Alexander Duyck
On Mon, Mar 5, 2018 at 7:15 PM, Stephen Hemminger
 wrote:
> On Mon, 5 Mar 2018 14:47:20 -0800
> Alexander Duyck  wrote:
>
>> On Mon, Mar 5, 2018 at 2:30 PM, Jiri Pirko  wrote:
>> > Mon, Mar 05, 2018 at 05:11:32PM CET, step...@networkplumber.org wrote:
>> >>On Mon, 5 Mar 2018 10:21:18 +0100
>> >>Jiri Pirko  wrote:
>> >>
>> >>> Sun, Mar 04, 2018 at 10:58:34PM CET, alexander.du...@gmail.com wrote:
>> >>> >On Sun, Mar 4, 2018 at 10:50 AM, Jiri Pirko  wrote:
>> >>> >> Sun, Mar 04, 2018 at 07:24:12PM CET, alexander.du...@gmail.com wrote:
>> >>> >>>On Sat, Mar 3, 2018 at 11:13 PM, Jiri Pirko  wrote:
>> >>>
>> >>> [...]
>> >>>
>> >>> >
>> >>> >>>Currently we only have agreement from Michael on taking this code, as
>> >>> >>>such we are working with virtio only for now. When the time comes that
>> >>> >>
>> >>> >> If you do duplication of netvsc in-driver bonding in virtio_net, it 
>> >>> >> will
>> >>> >> stay there forever. So what you say is: "We will do it halfway now
>> >>> >> and promise to fix it later". That later will never happen, I'm pretty
>> >>> >> sure. That is why I push for in-driver bonding shared code as a part 
>> >>> >> of
>> >>> >> this patchset.
>> >>> >
>> >>> >You want this new approach and a copy of netvsc moved into either core
>> >>> >or some module of its own. I say pick an architecture. We are looking
>> >>> >at either 2 netdevs or 3. We are not going to support both because
>> >>> >that will ultimately lead to a terrible user experience and make
>> >>> >things quite confusing.
>> >>> >
>> >>> >> + if you would be pushing first driver to do this, I would understand.
>> >>> >> But the first driver is already in. You are pushing second. This is 
>> >>> >> the
>> >>> >> time to do the sharing, unification of behaviour. Next time is too 
>> >>> >> late.
>> >>> >
>> >>> >That is great, if we want to share then lets share. But what you are
>> >>> >essentially telling us is that we need to fork this solution and
>> >>> >maintain two code paths, one for 2 netdevs, and another for 3. At that
>> >>> >point what is the point in merging them together?
>> >>>
>> >>> Of course, I vote for the same behaviour for netvsc and virtio_net. That
>> >>> is my point from the very beginning.
>> >>>
>> >>> Stephen, what do you think? Could we please make virtio_net and netvsc
>> >>> behave the same and to use a single code with well-defined checks and
>> >>> restrictions for this feature?
>> >>
>> >>Eventually, yes both could share common code routines. In reality,
>> >>the failover stuff is only a very small part of either driver so
>> >>it is not worth stretching to try and cover too much. If you look,
>> >>the failover code is just using routines that already exist for
>> >>use by bonding, teaming, etc.
>> >
>> > Yeah, we consern was also about the code that processes the netdev
>> > notifications and does auto-enslave and all related stuff.
>>
>> The concern was the driver model. If we expose 3 netdevs or 2 with the
>> VF driver present. Somehow this is turning into a "merge netvsc into
>> virtio" think and that isn't the subject that was being asked.
>>
>> Ideally we want one model for this. Either 3 netdevs or 2. The problem
>> is 2 causes issues in terms of performance and will limit features of
>> virtio, but 2 is the precedent set by netvsc. We need to figure out
>> the path forward for this. There is talk about "sharing" but it is
>> hard to make these two approaches share code when they are doing two
>> very different setups and end up presenting themselves as two very
>> different driver models.
>
> I appreciate this discussion, and it has helped a lot.
>
> Netvsc is stuck with 2 netdev model for the foreseeable future.
> We already failed once with the bonding model, and that created a lot of
> pain. The current model is working well and have convinced the major distros
> to support the two netdev model and don't want to back.
>
> Very open to optimizations and ways to smooth out the rough edges.

Thank you for clarifying this Stephen.

Okay. So with things defined such that we are doing a 2 netdev model
for netvsc, and a 3 netdev model for virtio, is it still in our
interest for us to try making a shared library between the two? In my
mind, the virtnet_bypass becomes the way we go forward for any future
solutions. I say we treat the netvsc approach as a "legacy" approach
and avoid creating any new libraries or drivers to support it, and
instead just focus on the 3 netdev approach as the way this is to be
done going forward. That way we avoid anyone else trying to implement
something like the 2 netdev solution in the future.

So getting back to the code here. Should we split the virtnet_bypass
code out into a separate module? My preference would be to let this
incubate as a part of virtio_net until either there is another user,
or it becomes big enough that it needs to be 

[virtio-dev] Re: [PATCH 1/4] eth: add speed and duplex definitions

2018-03-06 Thread Michael S. Tsirkin
On Tue, Mar 06, 2018 at 12:53:14PM -0500, Jason Baron wrote:
> 
> 
> On 03/02/2018 12:54 PM, Michael S. Tsirkin wrote:
> > On Thu, Mar 01, 2018 at 10:46:33PM -0500, Jason Baron wrote:
> >> Pull in definitions for SPEED_UNKNOWN, DUPLEX_UNKNOWN, DUPLEX_HALF,
> >> and DUPLEX_FULL.
> >>
> >> Signed-off-by: Jason Baron 
> >> Cc: "Michael S. Tsirkin" 
> >> Cc: Jason Wang 
> >> Cc: virtio-dev@lists.oasis-open.org
> >> ---
> >>  include/net/eth.h | 7 +++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/include/net/eth.h b/include/net/eth.h
> >> index 09054a5..9843678 100644
> >> --- a/include/net/eth.h
> >> +++ b/include/net/eth.h
> >> @@ -417,4 +417,11 @@ bool
> >>  eth_parse_ipv6_hdr(const struct iovec *pkt, int pkt_frags,
> >> size_t ip6hdr_off, eth_ip6_hdr_info *info);
> >>  
> >> +/* ethtool defines - from linux/ethtool.h */
> >> +#define SPEED_UNKNOWN   -1
> >> +
> >> +#define DUPLEX_HALF 0x00
> >> +#define DUPLEX_FULL 0x01
> >> +#define DUPLEX_UNKNOWN  0xff
> >> +
> >>  #endif
> > 
> > While that's not a lot, I think we should import linux/ethtool.h into
> > include/standard-headers/linux/ using scripts/update-linux-headers.sh
> > 
> 
> Ok, I had started down that path, by including
> include/uapi/linux/ethtool.h but that resulted in a few other headers -
> kernel.h, sysinfo.h. And so it seemed like a lot of headers for only a
> few lines. But I will re-visit it...
> 
> Thanks,
> 
> -Jason

I don't know why is sysinfo there. Want to try sending a patch to
drop it from linux/kernel.h?

-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 4/4] virtio-net: add linkspeed and duplex settings to virtio-net

2018-03-06 Thread Michael S. Tsirkin
On Tue, Mar 06, 2018 at 01:02:06PM -0500, Jason Baron wrote:
> 
> 
> On 03/04/2018 08:05 AM, Yan Vugenfirer wrote:
> > 
> > 
> >> On 2 Mar 2018, at 22:19, Michael S. Tsirkin  >> > wrote:
> >>
> >> On Fri, Mar 02, 2018 at 03:14:01PM +0800, Jason Wang wrote:
> >>>
> >>>
> >>> On 2018年03月02日 11:46, Jason Baron wrote:
>  Although linkspeed and duplex can be set in a linux guest via
>  'ethtool -s',
>  this requires custom ethtool commands for virtio-net by default.
> 
>  Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
>  the hypervisor to export a linkspeed and duplex setting. The user can
>  subsequently overwrite it later if desired via: 'ethtool -s'.
> 
>  Linkspeed and duplex settings can be set as:
>  '-device virtio-net,speed=1,duplex=full'
> >>>
> >>> I was thinking whether or not it's better to decide the duplex by the
> >>> type
> >>> of backends.
> >>>
> >>> E.g userspace and vhost-kernel implement a in fact half duplex. But dpdk
> >>> implement a full duplex.
> >>>
> >>> Thanks
> >>
> >> OTOH it's a priority for some people to be able to support migration
> >> between different backend types. Breaking that won't be nice.
> > 
> > I think that in this case we need a way to update the settings of link
> > speed and link duplex (maybe add QMP command). Migration between
> > different backend types should cause link down\link up events. And this
> > is a time for a driver to re-read the settings and update the OS.
> > 
> > Best regards,
> > Yan.
> > 
> 
> So the virtio_net driver in linux will re-read these settings on link up
> events. So I could add a qmp command to set these (in addition to the
> command-line) interface, if desired. Is there a consensus that we need
> to add a qmp command here? Or can that be treated as a future item, if
> somebody wants it?
> 
> Thanks,
> 
> -Jason

We already have ability to take link down and up.
I'd say it's a future item.

-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 4/4] virtio-net: add linkspeed and duplex settings to virtio-net

2018-03-06 Thread Jason Baron


On 03/04/2018 08:05 AM, Yan Vugenfirer wrote:
> 
> 
>> On 2 Mar 2018, at 22:19, Michael S. Tsirkin > > wrote:
>>
>> On Fri, Mar 02, 2018 at 03:14:01PM +0800, Jason Wang wrote:
>>>
>>>
>>> On 2018年03月02日 11:46, Jason Baron wrote:
 Although linkspeed and duplex can be set in a linux guest via
 'ethtool -s',
 this requires custom ethtool commands for virtio-net by default.

 Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
 the hypervisor to export a linkspeed and duplex setting. The user can
 subsequently overwrite it later if desired via: 'ethtool -s'.

 Linkspeed and duplex settings can be set as:
 '-device virtio-net,speed=1,duplex=full'
>>>
>>> I was thinking whether or not it's better to decide the duplex by the
>>> type
>>> of backends.
>>>
>>> E.g userspace and vhost-kernel implement a in fact half duplex. But dpdk
>>> implement a full duplex.
>>>
>>> Thanks
>>
>> OTOH it's a priority for some people to be able to support migration
>> between different backend types. Breaking that won't be nice.
> 
> I think that in this case we need a way to update the settings of link
> speed and link duplex (maybe add QMP command). Migration between
> different backend types should cause link down\link up events. And this
> is a time for a driver to re-read the settings and update the OS.
> 
> Best regards,
> Yan.
> 

So the virtio_net driver in linux will re-read these settings on link up
events. So I could add a qmp command to set these (in addition to the
command-line) interface, if desired. Is there a consensus that we need
to add a qmp command here? Or can that be treated as a future item, if
somebody wants it?

Thanks,

-Jason

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH 4/4] virtio-net: add linkspeed and duplex settings to virtio-net

2018-03-06 Thread Jason Baron


On 03/02/2018 03:22 PM, Michael S. Tsirkin wrote:
> On Fri, Mar 02, 2018 at 11:59:00AM -0500, Jason Baron wrote:
>> On 03/02/2018 02:14 AM, Jason Wang wrote:
>>>
>>>
>>> On 2018年03月02日 11:46, Jason Baron wrote:
 Although linkspeed and duplex can be set in a linux guest via 'ethtool
 -s',
 this requires custom ethtool commands for virtio-net by default.

 Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
 the hypervisor to export a linkspeed and duplex setting. The user can
 subsequently overwrite it later if desired via: 'ethtool -s'.

 Linkspeed and duplex settings can be set as:
 '-device virtio-net,speed=1,duplex=full'
>>>
>>> I was thinking whether or not it's better to decide the duplex by the
>>> type of backends.
>>>
>>> E.g userspace and vhost-kernel implement a in fact half duplex. But dpdk
>>> implement a full duplex.
>>
>> Interesting - could this be derived only from the backend 'type'. IE:
>> NET_CLIENT_DRIVER_TAP, NET_CLIENT_DRIVER_VHOST_USER...
>>
>>
>> I was also thinking this could be specified as 'duplex=backend', in
>> addition to the proposed 'duplex=full' or 'duplex=half'?
>>
>> Thanks,
>>
>> -Jason
> 
> I'd say it would make more sense to teach backends to obey what's
> specified by the user. E.g. if vhost gets a duplex config,
> create two threads.
> 
> But I think all that's for future, we can just fake it for
> now - the current uses don't seem to particularly care about whether
> virtio actually is or isn't a duplex.
> 
> 

Ok, I wouldn't add 'duplex=backend' when I re-post, and will leave any
automatic settings of duplex to 'future work'.

Thanks,

-Jason

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH 1/4] eth: add speed and duplex definitions

2018-03-06 Thread Jason Baron


On 03/02/2018 12:54 PM, Michael S. Tsirkin wrote:
> On Thu, Mar 01, 2018 at 10:46:33PM -0500, Jason Baron wrote:
>> Pull in definitions for SPEED_UNKNOWN, DUPLEX_UNKNOWN, DUPLEX_HALF,
>> and DUPLEX_FULL.
>>
>> Signed-off-by: Jason Baron 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Jason Wang 
>> Cc: virtio-dev@lists.oasis-open.org
>> ---
>>  include/net/eth.h | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/include/net/eth.h b/include/net/eth.h
>> index 09054a5..9843678 100644
>> --- a/include/net/eth.h
>> +++ b/include/net/eth.h
>> @@ -417,4 +417,11 @@ bool
>>  eth_parse_ipv6_hdr(const struct iovec *pkt, int pkt_frags,
>> size_t ip6hdr_off, eth_ip6_hdr_info *info);
>>  
>> +/* ethtool defines - from linux/ethtool.h */
>> +#define SPEED_UNKNOWN   -1
>> +
>> +#define DUPLEX_HALF 0x00
>> +#define DUPLEX_FULL 0x01
>> +#define DUPLEX_UNKNOWN  0xff
>> +
>>  #endif
> 
> While that's not a lot, I think we should import linux/ethtool.h into
> include/standard-headers/linux/ using scripts/update-linux-headers.sh
> 

Ok, I had started down that path, by including
include/uapi/linux/ethtool.h but that resulted in a few other headers -
kernel.h, sysinfo.h. And so it seemed like a lot of headers for only a
few lines. But I will re-visit it...

Thanks,

-Jason

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [virtio] [PATCH] introduction: add note on bitfield portability

2018-03-06 Thread Cornelia Huck
On Tue, 6 Mar 2018 16:35:46 +0200
"Michael S. Tsirkin"  wrote:

[You picked my old email address :)]

> Signed-off-by: Michael S. Tsirkin 
> ---
>  introduction.tex | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/introduction.tex b/introduction.tex
> index d0b770e..18ad76e 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -175,5 +175,12 @@ packed by C compilers on little-endian architectures but 
> not the
>  way bitfields are packed by C compilers on big-endian
>  architectures.
>  
> +Assuming that CPU_TO_BE16 converts a 16-bit integer from a native
> +CPU to the big-endian byte order, the following is the equivalent
> +portable C code to generate a value in this format:
> +\begin{lstlisting}
> +CPU_TO_BE16(B << 15 | A)
> +\end{lstlisting}
> +
>  \newpage
>  

Reviewed-by: Cornelia Huck 

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] [PATCH] introduction: add note on bitfield portability

2018-03-06 Thread Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin 
---
 introduction.tex | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/introduction.tex b/introduction.tex
index d0b770e..18ad76e 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -175,5 +175,12 @@ packed by C compilers on little-endian architectures but 
not the
 way bitfields are packed by C compilers on big-endian
 architectures.
 
+Assuming that CPU_TO_BE16 converts a 16-bit integer from a native
+CPU to the big-endian byte order, the following is the equivalent
+portable C code to generate a value in this format:
+\begin{lstlisting}
+CPU_TO_BE16(B << 15 | A)
+\end{lstlisting}
+
 \newpage
 
-- 
MST

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



[virtio-dev] Re: [PATCH] introduction: document bitfield notation

2018-03-06 Thread Cornelia Huck
On Mon, 5 Mar 2018 23:11:40 +0200
"Michael S. Tsirkin"  wrote:

> On Mon, Mar 05, 2018 at 05:04:40PM +0100, Cornelia Huck wrote:
> > On Mon, 5 Mar 2018 16:26:11 +0200
> > "Michael S. Tsirkin"  wrote:
> >   
> > > On Mon, Mar 05, 2018 at 03:20:35PM +0100, Cornelia Huck wrote:  
> > > > On Wed, 28 Feb 2018 21:16:32 +0200
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > Bitfields are a useful and familiar way to specify sub-byte structure
> > > > > layout. The only issue is that bitfield order isn't portable across
> > > > > architectures.  Document that we list bitfields from least to
> > > > > most significant one, and warn about portability issues.
> > > > > 
> > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > ---
> > > > >  introduction.tex | 18 ++
> > > > >  1 file changed, 18 insertions(+)
> > > > > 
> > > > > diff --git a/introduction.tex b/introduction.tex
> > > > > index 979881e..3cb7a70 100644
> > > > > --- a/introduction.tex
> > > > > +++ b/introduction.tex
> > > > > @@ -157,5 +157,23 @@ in little-endian byte order.
> > > > >  in big-endian byte order.
> > > > >  \end{description}
> > > > >  
> > > > > +When documenting sub-byte data fields, C-like bitfield notation
> > > > > +is used. Fields within an integer are always listed in order,
> > > > > +from the least significant to the most significant bit.
> > > > > +
> > > > > +For example:
> > > > > +\begin{lstlisting}
> > > > > +be16 A : 15;
> > > > > +be16 B : 1;
> > > > > +\end{lstlisting}
> > > > > +documents the value A stored in the low 15 bit of a 16 bit
> > > > > +integer and the value B stored in the high bit of the 16 bit
> > > > > +integer, the integer in turn using the big-endian byte order.
> > > > > +
> > > > > +Note that this notation typically matches the way bitfields are
> > > > > +packed by C compilers on little-endian architectures but not the
> > > > > +way bitfields are packed by C compilers on big-endian
> > > > > +architectures.
> > > > > +
> > > > >  \newpage
> > > > >  
> > > > 
> > > > I must admit that this explanation confuses me a bit.
> > > 
> > > What it is saying is that this is equivalent to
> > > 
> > > CPU_TO_BE16(B << 15 | A)
> > > 
> > > Maybe adding this part will clarify things?
> > > 
> > >   
> > > > Would some kind
> > > > of graphic representation be more helpful?
> > > 
> > > I'm not good at graphics :)  
> > 
> > Me neither :) But pseudo-graphics might be enough.
> >   
> > >   
> > > > For example, on s390 I would expect the structure to look like the
> > > > following:
> > > > 
> > > > |0  ..  14 | 15 |
> > > > |A |  B |
> > > > 
> > > > If you included another example for little-endian byte order, this
> > > > would clear up things more, I think.
> > > 
> > > 
> > > It's BE so I think it's
> > > 
> > > | 15 |14  ..  0 |
> > > | B  |A |
> > >   
> > 
> > But that's the same, no?  
> 
> I just tried to show that B is in byte 0,
> assuming bytes are numbered 0,1,2,3 left to right.
> 
> > Or it's just IBM bitorder striking again...  
> 
> That's what I'm saying these graphics do not help at all.
> 
> It's an integer, B is the most significant bit. Integer is stored
> in BE format, thus B is the MSB in the first byte.
> 
> Let me know whether writing 
> CPU_TO_BE16(B << 15 | A)
> 
> helps clarify things.

I think so.

Let's go with that, then. I'm probably a bad measure since I've been
subjected to the IBM notation for a long time...

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org