RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-12 Thread Parav Pandit via Virtualization


> From: Michael S. Tsirkin 
> Sent: Thursday, October 12, 2023 5:00 PM

> I am instead talking about devices that work with existing legacy linux 
> drivers
> with no traps.
> 
Yep, I understood.

> > I am not expecting OASIS to do anything extra for legacy registers.
> >
> > [1] The device MUST reset when 0 is written to device_status, and present a 0
> > in device_status once that is done.
> > [2] After writing 0 to device_status, the driver MUST wait for a read
> > of device_status to return 0 before reinitializing the device.
> 
> We can add a note explaining that legacy drivers do not wait after doing 
> reset,
> that is not a problem.
> If someone wants to make a device that works with existing legacy linux 
> drivers,
> they can do that.
> Won't work with all drivers though, which is why oasis did not want to
> standardize this.

Ok. thanks.


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-12 Thread Michael S. Tsirkin
On Thu, Oct 12, 2023 at 11:11:20AM +, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin 
> > Sent: Thursday, October 12, 2023 4:23 PM
> > 
> > On Tue, Sep 26, 2023 at 03:45:36AM +, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin 
> > > > Sent: Tuesday, September 26, 2023 12:06 AM
> > >
> > > > One can thinkably do that wait in hardware, though. Just defer
> > > > completion until read is done.
> > > >
> > > Once OASIS does such a new interface and if some hw vendor _actually_ wants
> > > to do such complex hw, maybe the vfio driver can adapt to it.
> > 
> > The reset behaviour I describe is already in the spec. What else do you want
> > OASIS to standardize? Virtio currently is just a register map it does not 
> > yet
> > include suggestions on how exactly do pci express transactions look. You 
> > feel we
> > should add that?
> 
> The reset behavior in the spec for modern devices, as listed in [1] and [2],
> is just fine.
> 
> What I meant is in the context of having MMIO based legacy registers to "defer
> completion until read is done".
> I think you meant, "Just defer read completion, until reset is done".

yes

> This means the hw needs to finish the device reset for thousands of devices
> within the PCI read completion timeout.

No, each device does its own reset.

> So if OASIS does such standardization, someone can implement it.
> 
> What I recollect is that OASIS didn't standardize such an anti-scale approach
> and took the admin command approach, which achieves better scale.
> Hope I clarified.

You are talking about the extension for trap and emulate.
I am instead talking about devices that work with
existing legacy linux drivers with no traps.

> I am not expecting OASIS to do anything extra for legacy registers.
> 
> [1] The device MUST reset when 0 is written to device_status, and present a 0 
> in device_status once that is done.
> [2] After writing 0 to device_status, the driver MUST wait for a read of 
> device_status to return 0 before reinitializing
> the device.

We can add a note explaining that legacy drivers do not wait
after doing reset, that is not a problem.
If someone wants to make a device that works with existing
legacy linux drivers, they can do that.
Won't work with all drivers though, which is why oasis did not
want to standardize this.
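
For illustration, a minimal conceptual sketch of the per-device "defer the read
completion until reset is done" idea discussed above; this is C-flavoured
pseudo-hardware with invented names, not anything from the spec or from this
series:

#include <stdbool.h>
#include <stdint.h>

struct vf_state {
        volatile bool reset_in_progress;        /* cleared by the reset engine */
        uint8_t device_status;
};

static void on_status_write(struct vf_state *vf, uint8_t val)
{
        if (val == 0) {
                vf->device_status = 0;
                vf->reset_in_progress = true;   /* kick off this function's reset */
        } else {
                vf->device_status = val;
        }
}

/* The read completion is generated only when this returns, which is how the
 * wait moves into the device instead of into the driver. No cross-device
 * serialization is implied: each function waits only on its own reset. */
static uint8_t on_status_read(struct vf_state *vf)
{
        while (vf->reset_in_progress)
                ;       /* hold the completion until the internal reset finishes */
        return vf->device_status;       /* reads back 0 after a completed reset */
}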

-- 
MST



RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-12 Thread Parav Pandit via Virtualization


> From: Michael S. Tsirkin 
> Sent: Thursday, October 12, 2023 4:23 PM
> 
> On Tue, Sep 26, 2023 at 03:45:36AM +, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin 
> > > Sent: Tuesday, September 26, 2023 12:06 AM
> >
> > > One can thinkably do that wait in hardware, though. Just defer
> > > completion until read is done.
> > >
> > Once OASIS does such a new interface and if some hw vendor _actually_ wants
> > to do such complex hw, maybe the vfio driver can adapt to it.
> 
> The reset behaviour I describe is already in the spec. What else do you want
> OASIS to standardize? Virtio currently is just a register map it does not yet
> include suggestions on how exactly do pci express transactions look. You feel 
> we
> should add that?

The reset behavior in the spec for modern devices, as listed in [1] and [2], is just fine.

What I meant is in the context of having MMIO based legacy registers to "defer
completion until read is done".
I think you meant, "Just defer read completion, until reset is done".
This means the hw needs to finish the device reset for thousands of devices
within the PCI read completion timeout.
So if OASIS does such standardization, someone can implement it.

What I recollect is that OASIS didn't standardize such an anti-scale approach and
took the admin command approach, which achieves better scale.
Hope I clarified.

I am not expecting OASIS to do anything extra for legacy registers.

[1] The device MUST reset when 0 is written to device_status, and present a 0 
in device_status once that is done.
[2] After writing 0 to device_status, the driver MUST wait for a read of 
device_status to return 0 before reinitializing
the device.
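
Taken together, [1] and [2] boil down to the following driver-side loop; this is
essentially the modern vp_reset() quoted later in this thread, shown here only
as a condensed reminder:

static void modern_reset(struct virtio_pci_modern_device *mdev)
{
        vp_modern_set_status(mdev, 0);          /* [1]: device starts resetting    */
        while (vp_modern_get_status(mdev))      /* [2]: wait until it reads back 0 */
                msleep(1);
}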


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-12 Thread Michael S. Tsirkin
On Tue, Sep 26, 2023 at 03:45:36AM +, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin 
> > Sent: Tuesday, September 26, 2023 12:06 AM
> 
> > One can thinkably do that wait in hardware, though. Just defer completion 
> > until
> > read is done.
> >
> Once OASIS does such a new interface and if some hw vendor _actually_ wants to
> do such complex hw, maybe the vfio driver can adapt to it.

The reset behaviour I describe is already in the spec. What else do you
want OASIS to standardize? Virtio currently is just a register map; it
does not yet include suggestions on how exactly PCI Express
transactions look. Do you feel we should add that?

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-07 Thread Jason Wang
On Tue, Sep 26, 2023 at 7:49 PM Michael S. Tsirkin  wrote:
>
> On Tue, Sep 26, 2023 at 10:32:39AM +0800, Jason Wang wrote:
> > It's the implementation details in legacy. The device needs to make
> > sure (reset) the driver can work (is done before get_status return).
>
> I think that there's no way to make it reliably work for all legacy drivers.

Yes, we may have ancient drivers.

>
> They just assumed a software backend and did not bother with DMA
> ordering. You can try to avoid resets, they are not that common so
> things will tend to mostly work if you don't stress them too much with
> things like hot plug/unplug in a loop.  Or you can try to use a driver
> after 2011 which is more aware of hardware ordering and flushes the
> reset write with a read.  One of these two tricks, I think, is the magic
> behind the device exposing memory bar 0 that you mention.

Right, this is what I see for hardware legacy devices shipped by some
cloud vendors.

Thanks

>
> --
> MST
>


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-02 Thread Parav Pandit via Virtualization



> From: Michael S. Tsirkin 
> Sent: Friday, September 22, 2023 9:23 PM

> > +static int virtiovf_pci_probe(struct pci_dev *pdev,
> > + const struct pci_device_id *id) {
> > +   const struct vfio_device_ops *ops = &virtiovf_acc_vfio_pci_ops;
> > +   struct virtiovf_pci_core_device *virtvdev;
> > +   int ret;
> > +
> > +   if (pdev->is_virtfn && virtiovf_support_legacy_access(pdev) &&
> > +   !virtiovf_bar0_exists(pdev) && pdev->msix_cap)
> 
> I see this is the reason you set MSIX to true. But I think it's a 
> misunderstanding -
> that true means MSIX is enabled by guest, not that it exists.

The MSI-X check here just looks like a sanity check to make sure that the guest
can enable MSI-X.
The MSI-X enable check should be in the read()/write() calls to decide which AQ
command to choose, i.e. whether to access the common config or the device
config, as written in the virtio spec.

Yishai, please fix the read()/write() calls to dynamically consider the offset
of 24/20 based on the MSI-X enabled state.
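
For context, a small self-contained sketch of the offset split being asked for
here; the 20/24 values mirror VIRTIO_PCI_CONFIG_OFF() from <linux/virtio_pci.h>,
while the enum and function names are invented for illustration and are not from
the series:

#include <stdbool.h>
#include <stdio.h>

enum legacy_region { LEGACY_COMMON_CFG, LEGACY_DEVICE_CFG };

/* Device-specific config starts at 24 when MSI-X is enabled, else at 20,
 * matching VIRTIO_PCI_CONFIG_OFF(msix_enabled) in the legacy layout. */
static unsigned int legacy_device_cfg_off(bool msix_enabled)
{
        return msix_enabled ? 24 : 20;
}

/* Decide which admin command a trapped BAR0 access should be turned into,
 * and rebase the offset for device-config accesses. */
static enum legacy_region classify(unsigned int offset, bool msix_enabled,
                                   unsigned int *rebased)
{
        unsigned int split = legacy_device_cfg_off(msix_enabled);

        if (offset < split) {
                *rebased = offset;              /* common-config admin command */
                return LEGACY_COMMON_CFG;
        }
        *rebased = offset - split;              /* device-config admin command */
        return LEGACY_DEVICE_CFG;
}

int main(void)
{
        unsigned int rebased;

        /* Offset 20 is device config without MSI-X but common config with it. */
        printf("%d %d\n", classify(20, false, &rebased),
                          classify(20, true, &rebased));
        return 0;
}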


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-10-01 Thread Parav Pandit via Virtualization



> From: Michael S. Tsirkin 
> Sent: Tuesday, September 26, 2023 10:30 PM

> For example, a transitional device
> must not in theory be safely passed through to guest userspace, because guest
> then might try to use it through the legacy BAR without acknowledging
> ACCESS_PLATFORM.
> Do any guests check this and fail? Hard to say.
>
ACCESS_PLATFORM is not offered on the legacy interface because the legacy
interface spec 0.9.5 didn't have it.
Whether the guest VM maps it to user space and uses GIOVA is completely unknown
to the device.
And all of this is just fine, because the IOMMU through vfio takes care of the
necessary translation with/without mapping the transitional device to the guest
user space.

Hence, it is not a compat problem.
Anyway, users will only attach a virtio device to the vfio-virtio driver when
they care to expose a transitional device in the guest.

I can see that in the future, when a user wants to do this optionally, a
devlink/sysfs knob will be added; at that point one needs a
disable_transitional flag.
So it may be worth optionally enabling transitional support on user request, as
Michael suggested.


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-27 Thread Michael S. Tsirkin
On Wed, Sep 27, 2023 at 08:20:05PM -0300, Jason Gunthorpe wrote:
> On Wed, Sep 27, 2023 at 05:38:55PM -0400, Michael S. Tsirkin wrote:
> > On Tue, Sep 26, 2023 at 10:50:57AM -0300, Jason Gunthorpe wrote:
> > > On Tue, Sep 26, 2023 at 01:42:52AM -0400, Michael S. Tsirkin wrote:
> > > > On Mon, Sep 25, 2023 at 09:40:59PM -0300, Jason Gunthorpe wrote:
> > > > > On Mon, Sep 25, 2023 at 03:44:11PM -0400, Michael S. Tsirkin wrote:
> > > > > > > VDPA is very different from this. You might call them both 
> > > > > > > mediation,
> > > > > > > sure, but then you need another word to describe the additional
> > > > > > > changes VDPA is doing.
> > > > > > 
> > > > > > Sorry about hijacking the thread a little bit, but could you
> > > > > > call out some of the changes that are the most problematic
> > > > > > for you?
> > > > > 
> > > > > I don't really know these details.
> > > > 
> > > > Maybe, you then should desist from saying things like "It entirely fails
> > > > to achieve the most important thing it needs to do!" You are not making
> > > > any new friends with saying this about a piece of software without
> > > > knowing the details.
> > > 
> > > I can't tell you what cloud operators are doing, but I can say with
> > > confidence that it is not the same as VDPA. As I said, if you want to
> > > know more details you need to ask a cloud operator.
> >
> > So it's not the changes that are problematic, it's that you have
> > customers who are not using vdpa. The "most important thing" that vdpa
> > fails at is simply converting your customers from vfio to vdpa.
> 
> I said the most important thing was that VFIO presents exactly the
> same virtio device to the VM as the baremetal. Do you dispute that,
> technically, VDPA does not actually achieve that?

I dispute that it is the most important. The important thing is to have
guests work.

> Then why is it so surprising that people don't want a solution that
> changes the vPCI ABI they worked hard to create in the first place?
> 
> I'm still baffled why you think everyone should use vdpa..
> 
> Jason

They shouldn't. If you want proprietary extensions then vfio is the way
to go, I don't think vdpa will support that.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-27 Thread Michael S. Tsirkin
On Tue, Sep 26, 2023 at 10:50:57AM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 26, 2023 at 01:42:52AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Sep 25, 2023 at 09:40:59PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Sep 25, 2023 at 03:44:11PM -0400, Michael S. Tsirkin wrote:
> > > > > VDPA is very different from this. You might call them both mediation,
> > > > > sure, but then you need another word to describe the additional
> > > > > changes VDPA is doing.
> > > > 
> > > > Sorry about hijacking the thread a little bit, but could you
> > > > call out some of the changes that are the most problematic
> > > > for you?
> > > 
> > > I don't really know these details.
> > 
> > Maybe, you then should desist from saying things like "It entirely fails
> > to achieve the most important thing it needs to do!" You are not making
> > any new friends with saying this about a piece of software without
> > knowing the details.
> 
> I can't tell you what cloud operators are doing, but I can say with
> confidence that it is not the same as VDPA. As I said, if you want to
> know more details you need to ask a cloud operator.
> 
> Jason

So it's not the changes that are problematic, it's that you have
customers who are not using vdpa. The "most important thing" that vdpa
fails at is simply converting your customers from vfio to vdpa.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-26 Thread Michael S. Tsirkin
On Tue, Sep 26, 2023 at 06:20:45PM +0300, Yishai Hadas wrote:
> On 21/09/2023 22:58, Alex Williamson wrote:
> > On Thu, 21 Sep 2023 15:40:40 +0300
> > Yishai Hadas  wrote:
> > 
> > > Introduce a vfio driver over virtio devices to support the legacy
> > > interface functionality for VFs.
> > > 
> > > Background, from the virtio spec [1].
> > > 
> > > In some systems, there is a need to support a virtio legacy driver with
> > > a device that does not directly support the legacy interface. In such
> > > scenarios, a group owner device can provide the legacy interface
> > > functionality for the group member devices. The driver of the owner
> > > device can then access the legacy interface of a member device on behalf
> > > of the legacy member device driver.
> > > 
> > > For example, with the SR-IOV group type, group members (VFs) can not
> > > present the legacy interface in an I/O BAR in BAR0 as expected by the
> > > legacy pci driver. If the legacy driver is running inside a virtual
> > > machine, the hypervisor executing the virtual machine can present a
> > > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> > > legacy driver accesses to this I/O BAR and forwards them to the group
> > > owner device (PF) using group administration commands.
> > > 
> > > 
> > > Specifically, this driver adds support for a virtio-net VF to be exposed
> > > as a transitional device to a guest driver and allows the legacy IO BAR
> > > functionality on top.
> > > 
> > > This allows a VM which uses a legacy virtio-net driver in the guest to
> > > work transparently over a VF whose driver in the host is this new
> > > driver.
> > > 
> > > The driver can be extended easily to support some other types of virtio
> > > devices (e.g virtio-blk), by adding in a few places the specific type
> > > properties as was done for virtio-net.
> > > 
> > > For now, only the virtio-net use case was tested and as such we introduce
> > > the support only for such a device.
> > > 
> > > Practically,
> > > Upon probing a VF for a virtio-net device, in case its PF supports
> > > legacy access over the virtio admin commands and the VF doesn't have BAR
> > > 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a
> > > transitional device with I/O BAR in BAR 0.
> > > 
> > > The existence of the simulated I/O bar is reported later on by
> > > overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device
> > > exposes itself as a transitional device by overwriting some properties
> > > upon reading its config space.
> > > 
> > > Once we report the existence of I/O BAR as BAR 0 a legacy driver in the
> > > guest may use it via read/write calls according to the virtio
> > > specification.
> > > 
> > > Any read/write towards the control parts of the BAR will be captured by
> > > the new driver and will be translated into admin commands towards the
> > > device.
> > > 
> > > Any data path read/write access (i.e. virtio driver notifications) will
> > > be forwarded to the physical BAR whose properties were supplied by
> > > the command VIRTIO_PCI_QUEUE_NOTIFY upon the probing/init flow.
> > > 
> > > With that code in place a legacy driver in the guest has the look and
> > > feel as if having a transitional device with legacy support for both its
> > > control and data path flows.
> > > 
> > > [1]
> > > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> > > 
> > > Signed-off-by: Yishai Hadas 
> > > ---
> > >   MAINTAINERS  |   6 +
> > >   drivers/vfio/pci/Kconfig |   2 +
> > >   drivers/vfio/pci/Makefile|   2 +
> > >   drivers/vfio/pci/virtio/Kconfig  |  15 +
> > >   drivers/vfio/pci/virtio/Makefile |   4 +
> > >   drivers/vfio/pci/virtio/cmd.c|   4 +-
> > >   drivers/vfio/pci/virtio/cmd.h|   8 +
> > >   drivers/vfio/pci/virtio/main.c   | 546 +++
> > >   8 files changed, 585 insertions(+), 2 deletions(-)
> > >   create mode 100644 drivers/vfio/pci/virtio/Kconfig
> > >   create mode 100644 drivers/vfio/pci/virtio/Makefile
> > >   create mode 100644 drivers/vfio/pci/virtio/main.c
> > > 
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index bf0f54c24f81..5098418c8389 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
> > >   S:  Maintained
> > >   F:  drivers/vfio/pci/mlx5/
> > > +VFIO VIRTIO PCI DRIVER
> > > +M:   Yishai Hadas 
> > > +L:   k...@vger.kernel.org
> > > +S:   Maintained
> > > +F:   drivers/vfio/pci/virtio
> > > +
> > >   VFIO PCI DEVICE SPECIFIC DRIVERS
> > >   R:  Jason Gunthorpe 
> > >   R:  Yishai Hadas 
> > > diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> > > index 8125e5f37832..18c397df566d 100644
> > > --- a/drivers/vfio/pci/Kconf

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-26 Thread Yishai Hadas via Virtualization

On 21/09/2023 22:58, Alex Williamson wrote:

On Thu, 21 Sep 2023 15:40:40 +0300
Yishai Hadas  wrote:


Introduce a vfio driver over virtio devices to support the legacy
interface functionality for VFs.

Background, from the virtio spec [1].

In some systems, there is a need to support a virtio legacy driver with
a device that does not directly support the legacy interface. In such
scenarios, a group owner device can provide the legacy interface
functionality for the group member devices. The driver of the owner
device can then access the legacy interface of a member device on behalf
of the legacy member device driver.

For example, with the SR-IOV group type, group members (VFs) can not
present the legacy interface in an I/O BAR in BAR0 as expected by the
legacy pci driver. If the legacy driver is running inside a virtual
machine, the hypervisor executing the virtual machine can present a
virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
legacy driver accesses to this I/O BAR and forwards them to the group
owner device (PF) using group administration commands.


Specifically, this driver adds support for a virtio-net VF to be exposed
as a transitional device to a guest driver and allows the legacy IO BAR
functionality on top.

This allows a VM which uses a legacy virtio-net driver in the guest to
work transparently over a VF whose driver in the host is this new
driver.

The driver can be extended easily to support some other types of virtio
devices (e.g virtio-blk), by adding in a few places the specific type
properties as was done for virtio-net.

For now, only the virtio-net use case was tested and as such we introduce
the support only for such a device.

Practically,
Upon probing a VF for a virtio-net device, in case its PF supports
legacy access over the virtio admin commands and the VF doesn't have BAR
0, we set some specific 'vfio_device_ops' to be able to simulate in SW a
transitional device with I/O BAR in BAR 0.

The existence of the simulated I/O bar is reported later on by
overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device
exposes itself as a transitional device by overwriting some properties
upon reading its config space.

Once we report the existence of I/O BAR as BAR 0 a legacy driver in the
guest may use it via read/write calls according to the virtio
specification.

Any read/write towards the control parts of the BAR will be captured by
the new driver and will be translated into admin commands towards the
device.

Any data path read/write access (i.e. virtio driver notifications) will
be forwarded to the physical BAR whose properties were supplied by
the command VIRTIO_PCI_QUEUE_NOTIFY upon the probing/init flow.

With that code in place a legacy driver in the guest has the look and
feel as if having a transitional device with legacy support for both its
control and data path flows.

[1]
https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c

Signed-off-by: Yishai Hadas 
---
  MAINTAINERS  |   6 +
  drivers/vfio/pci/Kconfig |   2 +
  drivers/vfio/pci/Makefile|   2 +
  drivers/vfio/pci/virtio/Kconfig  |  15 +
  drivers/vfio/pci/virtio/Makefile |   4 +
  drivers/vfio/pci/virtio/cmd.c|   4 +-
  drivers/vfio/pci/virtio/cmd.h|   8 +
  drivers/vfio/pci/virtio/main.c   | 546 +++
  8 files changed, 585 insertions(+), 2 deletions(-)
  create mode 100644 drivers/vfio/pci/virtio/Kconfig
  create mode 100644 drivers/vfio/pci/virtio/Makefile
  create mode 100644 drivers/vfio/pci/virtio/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index bf0f54c24f81..5098418c8389 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22624,6 +22624,12 @@ L: k...@vger.kernel.org
  S:Maintained
  F:drivers/vfio/pci/mlx5/
  
+VFIO VIRTIO PCI DRIVER

+M: Yishai Hadas 
+L: k...@vger.kernel.org
+S: Maintained
+F: drivers/vfio/pci/virtio
+
  VFIO PCI DEVICE SPECIFIC DRIVERS
  R:Jason Gunthorpe 
  R:Yishai Hadas 
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 8125e5f37832..18c397df566d 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig"
  
  source "drivers/vfio/pci/pds/Kconfig"
  
+source "drivers/vfio/pci/virtio/Kconfig"

+
  endmenu
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 45167be462d8..046139a4eca5 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI)   += mlx5/
  obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
  
  obj-$(CONFIG_PDS_VFIO_PCI) += pds/

+
+obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/
diff --git a/drivers/vfio/pci/virtio/Kconfig b/drivers/vfio/pci/virtio/Kconfig
new file mode 100644
index ..8

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-26 Thread Michael S. Tsirkin
On Tue, Sep 26, 2023 at 10:32:39AM +0800, Jason Wang wrote:
> It's the implementation details in legacy. The device needs to make
> sure (reset) the driver can work (is done before get_status return).

I think that there's no way to make it reliably work for all legacy drivers.

They just assumed a software backend and did not bother with DMA
ordering. You can try to avoid resets; they are not that common, so
things will tend to mostly work if you don't stress them too much with
things like hot plug/unplug in a loop.  Or you can try to use a driver
after 2011 which is more aware of hardware ordering and flushes the
reset write with a read.  One of these two tricks, I think, is the magic
behind the device exposing memory bar 0 that you mention.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Mon, Sep 25, 2023 at 09:40:59PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 25, 2023 at 03:44:11PM -0400, Michael S. Tsirkin wrote:
> > > VDPA is very different from this. You might call them both mediation,
> > > sure, but then you need another word to describe the additional
> > > changes VDPA is doing.
> > 
> > Sorry about hijacking the thread a little bit, but could you
> > call out some of the changes that are the most problematic
> > for you?
> 
> I don't really know these details.

Maybe, you then should desist from saying things like "It entirely fails
to achieve the most important thing it needs to do!" You are not making
any new friends with saying this about a piece of software without
knowing the details.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Mon, Sep 25, 2023 at 09:40:59PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 25, 2023 at 03:44:11PM -0400, Michael S. Tsirkin wrote:
> > > VDPA is very different from this. You might call them both mediation,
> > > sure, but then you need another word to describe the additional
> > > changes VDPA is doing.
> > 
> > Sorry about hijacking the thread a little bit, but could you
> > call out some of the changes that are the most problematic
> > for you?
> 
> I don't really know these details. The operators have an existing
> virtio world that is ABI toward the VM for them, and they do not want
> *anything* to change. The VM should be unaware whether the virtio device is
> created by old hypervisor software or new DPU software. It presents
> exactly the same ABI.
> 
> So the challenge really is to convince that VDPA delivers that, and
> frankly, I don't think it does. ABI toward the VM is very important
> here.

And to complete the picture, it is the DPU software/firmware that
is responsible for maintaining this ABI in your ideal world?


> > > In this model the DPU is an extension of the hypervisor/qemu
> > > environment and we shift code from x86 side to arm side to increase
> > > security, save power and increase total system performance.
> > 
> > I think I begin to understand. On the DPU you have some virtio
> > devices but also some non-virtio devices.  So you have to
> > use VFIO to talk to the DPU. Reusing VFIO to talk to virtio
> > devices too, simplifies things for you. 
> 
> Yes
> 
> > If guests will see vendor-specific devices from the DPU anyway, it
> > will be impossible to migrate such guests away from the DPU so the
> > cross-vendor migration capability is less important in this
> > use-case.  Is this a good summary?
> 
> Well, sort of. As I said before, the vendor here is the cloud
> operator, not the DPU supplier. The guest will see an AWS virtio-net
> function, for example.
> 
> The operator will ensure that all their different implementations of
> this function will interwork for migration.
> 
> So within the closed world of a single operator live migration will
> work just fine.
> 
> Since the hypervisor's controlled by the operator only migrate within
> the operators own environment anyhow, it is an already solved problem.
> 
> Jason


Okay the picture emerges I think. Thanks! I'll try to summarize later
for everyone's benefit.


-- 
MST



RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Parav Pandit via Virtualization



> From: Jason Wang 
> Sent: Tuesday, September 26, 2023 10:08 AM

> Right, so if we'd consider migration from virtio to vDPA, it needs to be 
> designed
> in a way that allows more involvement from hypervisor other than coupling it
> with a specific interface (like admin virtqueues).
It is not attached to the admin virtqueues.
One way is to use it via admin commands, as at [1].
One can define it without admin commands by explaining the technical
difficulties where admin commands may not or cannot work.

[1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00061.html


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Parav Pandit via Virtualization



> From: Jason Wang 
> Sent: Tuesday, September 26, 2023 10:07 AM


> 
> If you can't find a way to make legacy drivers work, use modern.
>
Understood.
This vfio series makes the legacy drivers work.
Thanks.
 
> That's it.
> 
> Thanks



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Jason Wang
On Mon, Sep 25, 2023 at 8:26 PM Jason Gunthorpe  wrote:
>
> On Mon, Sep 25, 2023 at 10:34:54AM +0800, Jason Wang wrote:
>
> > > Cloud vendors will similarly use DPUs to create a PCI functions that
> > > meet the cloud vendor's internal specification.
> >
> > This can only work if:
> >
> > 1) the internal specification has finer grain than the virtio spec
> > 2) so it can define what is not implemented in the virtio spec (like
> > migration and compatibility)
>
> Yes, and that is what is happening. Realistically the "spec" is just a
> piece of software that the Cloud vendor owns which is simply ported to
> multiple DPU vendors.
>
> It is the same as VDPA. If VDPA can make multiple NIC vendors
> consistent then why do you have a hard time believing we can do the
> same thing just on the ARM side of a DPU?

I don't. We all know vDPA can do more than virtio.

>
> > All of the above doesn't seem to be possible or realistic now, and it
> > actually has a risk to be not compatible with virtio spec. In the
> > future when virtio has live migration supported, they want to be able
> > to migrate between virtio and vDPA.
>
> Well, that is for the spec to design.

Right, so if we'd consider migration from virtio to vDPA, it needs to
be designed in a way that allows more involvement from hypervisor
other than coupling it with a specific interface (like admin
virtqueues).

>
> > > So, as I keep saying, in this scenario the goal is no mediation in the
> > > hypervisor.
> >
> > That's pretty fine, but I don't think trapping + relaying is not
> > mediation. Does it really matter what happens after trapping?
>
> It is not mediation in the sense that the kernel driver does not in
> any way make decisions on the behavior of the device. It simply
> transforms an IO operation into a device command and relays it to the
> device. The device still fully controls its own behavior.
>
> VDPA is very different from this. You might call them both mediation,
> sure, but then you need another word to describe the additional
> changes VDPA is doing.
>
> > > It is pointless, everything you think you need to do there
> > > is actually already being done in the DPU.
> >
> > Well, migration or even Qemu could be offloaded to DPU as well. If
> > that's the direction that's pretty fine.
>
> That's silly, of course qemu/kvm can't run in the DPU.

KVM can't for sure but part of Qemu could. This model has been used.

>
> However, we can empty qemu and the hypervisor out so all it does is
> run kvm and run vfio. In this model the DPU does all the OVS, storage,
> "VPDA", etc. qemu is just a passive relay of the DPU PCI functions
> into VM's vPCI functions.
>
> So, everything VDPA was doing in the environment is migrated into the
> DPU.

It really depends on the use cases. For example, in the case of DPU
what if we want to provide multiple virtio devices through a single
VF?

>
> In this model the DPU is an extension of the hypervisor/qemu
> environment and we shift code from x86 side to arm side to increase
> security, save power and increase total system performance.

That's pretty fine.

Thanks

>
> Jason
>


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Jason Wang
On Tue, Sep 26, 2023 at 11:45 AM Parav Pandit  wrote:
>
>
>
> > From: Michael S. Tsirkin 
> > Sent: Tuesday, September 26, 2023 12:06 AM
>
> > One can thinkably do that wait in hardware, though. Just defer completion 
> > until
> > read is done.
> >
> Once OASIS does such a new interface and if some hw vendor _actually_ wants to
> do such complex hw, maybe the vfio driver can adapt to it.

It is you who is trying to revive legacy in the spec. We all know legacy
is tricky but works.

> When we worked with you, we discussed that such hw does not have enough
> returns and hence the technical committee chose to proceed with admin commands.

I don't think my questions regarding the legacy transport got good
answers at that time. What's more, we all know the spec allows us to fix,
work around or even deprecate a feature.

Thanks


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Jason Wang
On Tue, Sep 26, 2023 at 12:01 PM Parav Pandit  wrote:
>
>
>
> > From: Jason Wang 
> > Sent: Tuesday, September 26, 2023 8:03 AM
> >
> > It's the implementation details in legacy. The device needs to make sure 
> > (reset)
> > the driver can work (is done before get_status return).
> It is part of the 0.9.5 and 1.x specification as I quoted those text above.

What I meant is: legacy devices need to find their way to make legacy
drivers work. That's how legacy works.

It's too late to add any normative wording to the 0.95 spec. So the device
behaviour is actually defined by the legacy drivers. That is why it is
tricky.

If you can't find a way to make legacy drivers work, use modern.

That's it.

Thanks


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Parav Pandit via Virtualization



> From: Jason Wang 
> Sent: Tuesday, September 26, 2023 8:03 AM
> 
> It's the implementation details in legacy. The device needs to make sure 
> (reset)
> the driver can work (is done before get_status return).
It is part of the 0.9.5 and 1.x specification as I quoted that text above.


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Parav Pandit via Virtualization



> From: Michael S. Tsirkin 
> Sent: Tuesday, September 26, 2023 12:06 AM

> One can thinkably do that wait in hardware, though. Just defer completion 
> until
> read is done.
>
Once OASIS does such a new interface and if some hw vendor _actually_ wants to do
such complex hw, maybe the vfio driver can adapt to it.
When we worked with you, we discussed that such hw does not have enough
returns and hence the technical committee chose to proceed with admin commands.
I will skip re-discussing all of it again here.

The current virtio spec delivers the best trade-offs of functionality,
performance and lightweight implementation, with a forward path towards
more features, such as migration, as Jason explained.
All with near zero driver, qemu and sw involvement for a rapidly growing
feature set...


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Zhu, Lingshan



On 9/26/2023 2:36 AM, Michael S. Tsirkin wrote:

On Mon, Sep 25, 2023 at 08:26:33AM +, Parav Pandit wrote:



From: Jason Wang 
Sent: Monday, September 25, 2023 8:00 AM

On Fri, Sep 22, 2023 at 8:25 PM Parav Pandit  wrote:



From: Jason Gunthorpe 
Sent: Friday, September 22, 2023 5:53 PM



And what's more, using MMIO BAR0 then it can work for legacy.

Oh? How? Our team didn't think so.

It does not. It was already discussed.
The device reset in legacy is not synchronous.

How do you know this?


Not sure the motivation of same discussion done in the OASIS with you and 
others in past.

Anyways, please find the answer below.

About reset,
The legacy device specification has not enforced the below-cited driver
requirement of 1.0.

"The driver SHOULD consider a driver-initiated reset complete when it reads device 
status as 0."
  
[1] https://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf

Basically, I think any drivers that did not read status (Linux pre-2011)
before freeing memory under DMA have a reset path that is racy wrt DMA, since
memory writes are posted and IO writes, while not posted, have completions
that do not order posted transactions, e.g. from the PCI Express spec:
    D2b
    An I/O or Configuration Write Completion is permitted to pass a
    Posted Request.
Having said that, there were a ton of driver races discovered on this
path in the years since; I suspect if one cares about this then
just avoiding stress on reset is wise.




The drivers do not wait for reset to complete; it was written for the sw

backend.

Do you see there's a flush after reset in the legacy driver?


Yes, it only flushes the write by reading it. The driver does not _wait_
for the reset to complete within the device like above.

One can thinkably do that wait in hardware, though. Just defer completion until
read is done.
I agree with MST. At least Intel devices work fine with vfio-pci and 
legacy driver without any changes.

So far so good.

Thanks
Zhu Lingshan



Please see the reset flow of the 1.x device below.
In fact the comment of the 1.x device also needs to be updated to indicate that
the driver needs to wait for the device to finish the reset.
I will send a separate patch improving this comment of vp_reset() to match
the spec.

static void vp_reset(struct virtio_device *vdev)
{
        struct virtio_pci_device *vp_dev = to_vp_device(vdev);
        struct virtio_pci_modern_device *mdev = &vp_dev->mdev;

        /* 0 status means a reset. */
        vp_modern_set_status(mdev, 0);
        /* After writing 0 to device_status, the driver MUST wait for a read of
         * device_status to return 0 before reinitializing the device.
         * This will flush out the status write, and flush in device writes,
         * including MSI-X interrupts, if any.
         */
        while (vp_modern_get_status(mdev))
                msleep(1);
        /* Flush pending VQ/configuration callbacks. */
        vp_synchronize_vectors(vdev);
}



static void vp_reset(struct virtio_device *vdev)
{
        struct virtio_pci_device *vp_dev = to_vp_device(vdev);
        /* 0 status means a reset. */
        vp_legacy_set_status(&vp_dev->ldev, 0);
        /* Flush out the status write, and flush in device writes,
         * including MSi-X interrupts, if any. */
        vp_legacy_get_status(&vp_dev->ldev);
        /* Flush pending VQ/configuration callbacks. */
        vp_synchronize_vectors(vdev);
}

Thanks




Hence MMIO BAR0 is not the best option in real implementations.



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Jason Wang
On Mon, Sep 25, 2023 at 4:26 PM Parav Pandit  wrote:
>
>
>
> > From: Jason Wang 
> > Sent: Monday, September 25, 2023 8:00 AM
> >
> > On Fri, Sep 22, 2023 at 8:25 PM Parav Pandit  wrote:
> > >
> > >
> > > > From: Jason Gunthorpe 
> > > > Sent: Friday, September 22, 2023 5:53 PM
> > >
> > >
> > > > > And what's more, using MMIO BAR0 then it can work for legacy.
> > > >
> > > > Oh? How? Our team didn't think so.
> > >
> > > It does not. It was already discussed.
> > > The device reset in legacy is not synchronous.
> >
> > How do you know this?
> >
> Not sure the motivation of same discussion done in the OASIS with you and 
> others in past.

That is exactly the same point.

It's too late to define the legacy behaviour accurately in the spec so
people will be lost in the legacy maze easily.

>
> Anyways, please find the answer below.
>
> About reset,
> The legacy device specification has not enforced the below-cited driver
> requirement of 1.0.
>
> "The driver SHOULD consider a driver-initiated reset complete when it reads 
> device status as 0."

We are talking about how to make devices work for legacy drivers. So
it has nothing to do with 1.0.

>
> [1] https://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf
>
> > > The drivers do not wait for reset to complete; it was written for the sw
> > backend.
> >
> > Do you see there's a flush after reset in the legacy driver?
> >
> Yes, it only flushes the write by reading it. The driver does not _wait_
> for the reset to complete within the device like above.

It's the implementation details in legacy. The device needs to make
sure (reset) the driver can work (is done before get_status return).

That's all.


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Mon, Sep 25, 2023 at 03:53:18PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 25, 2023 at 02:16:30PM -0400, Michael S. Tsirkin wrote:
> 
> > I do want to understand if there's a use-case that vdpa does not address
> > simply because it might be worth while to extend it to do so, and a
> > bunch of people working on it are at Red Hat and I might have some input
> > into how that labor is allocated. But if the use-case is simply "has to
> > be vfio and not vdpa" then I guess not.
> 
> If you strip away all the philosophical arguing VDPA has no way to
> isolate the control and data virtqs to different IOMMU configurations
> with this single PCI function.

Aha, so address space/PASID support then?

> The existing HW VDPA drivers provided device specific ways to handle
> this.
> 
> Without DMA isolation you can't assign the high speed data virtq's to
> the VM without mediating them as well.
> 
> > It could be that we are using mediation differently - in my world it's
> > when there's some host software on the path between guest and hardware,
> > and this qualifies.  
> 
> That is pretty general. As I said to Jason, if you want to use it that
> way then you need to make up a new word to describe what VDPA does as
> there is a clear difference in scope between this VFIO patch (relay IO
> commands to the device) and VDPA (intercept all the control plane,
> control virtq and bring it to a RedHat/qemu standard common behavior)

IIUC VDPA itself does not really bring it to either RedHat or qemu
standard, it just allows userspace to control behaviour - if userspace
is qemu then it's qemu deciding how it behaves. Which I guess this
doesn't. Right?  RedHat's not in the picture at all I think.

> > There is also a question of capability. Specifically iommufd support
> > is lacking in vdpa (though there are finally some RFC patches to
> > address that). All this is fine, could be enough to motivate
> > a work like this one.
> 
> I've answered many times, you just don't semm to like the answers or
> dismiss them as not relevant to you.
> 
> Jason


Not really. I think I lack some of the picture so I don't fully
understand. Or maybe I missed something else.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Mon, Sep 25, 2023 at 09:26:07AM -0300, Jason Gunthorpe wrote:
> > > So, as I keep saying, in this scenario the goal is no mediation in the
> > > hypervisor.
> > 
> > That's pretty fine, but I don't think trapping + relaying is not
> > mediation. Does it really matter what happens after trapping?
> 
> It is not mediation in the sense that the kernel driver does not in
> any way make decisions on the behavior of the device. It simply
> transforms an IO operation into a device command and relays it to the
> device. The device still fully controls its own behavior.
> 
> VDPA is very different from this. You might call them both mediation,
> sure, but then you need another word to describe the additional
> changes VDPA is doing.

Sorry about hijacking the thread a little bit, but could you
call out some of the changes that are the most problematic
for you?

> > > It is pointless, everything you think you need to do there
> > > is actually already being done in the DPU.
> > 
> > Well, migration or even Qemu could be offloaded to DPU as well. If
> > that's the direction that's pretty fine.
> 
> That's silly, of course qemu/kvm can't run in the DPU.
> 
> However, we can empty qemu and the hypervisor out so all it does is
> run kvm and run vfio. In this model the DPU does all the OVS, storage,
> "VPDA", etc. qemu is just a passive relay of the DPU PCI functions
> into VM's vPCI functions.
> 
> So, everything VDPA was doing in the environment is migrated into the
> DPU.
> 
> In this model the DPU is an extension of the hypervisor/qemu
> environment and we shift code from x86 side to arm side to increase
> security, save power and increase total system performance.
> 
> Jason

I think I begin to understand. On the DPU you have some virtio
devices but also some non-virtio devices.  So you have to
use VFIO to talk to the DPU. Reusing VFIO to talk to virtio
devices too, simplifies things for you. If guests will see
vendor-specific devices from the DPU anyway, it will be impossible
to migrate such guests away from the DPU so the cross-vendor
migration capability is less important in this use-case.
Is this a good summary?


-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Mon, Sep 25, 2023 at 08:26:33AM +, Parav Pandit wrote:
> 
> 
> > From: Jason Wang 
> > Sent: Monday, September 25, 2023 8:00 AM
> > 
> > On Fri, Sep 22, 2023 at 8:25 PM Parav Pandit  wrote:
> > >
> > >
> > > > From: Jason Gunthorpe 
> > > > Sent: Friday, September 22, 2023 5:53 PM
> > >
> > >
> > > > > And what's more, using MMIO BAR0 then it can work for legacy.
> > > >
> > > > Oh? How? Our team didn't think so.
> > >
> > > It does not. It was already discussed.
> > > The device reset in legacy is not synchronous.
> > 
> > How do you know this?
> >
> Not sure the motivation of same discussion done in the OASIS with you and 
> others in past.
> 
> Anyways, please find the answer below.
> 
> About reset,
> The legacy device specification has not enforced the below-cited driver
> requirement of 1.0.
> 
> "The driver SHOULD consider a driver-initiated reset complete when it reads 
> device status as 0."
>  
> [1] https://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf

Basically, I think any drivers that did not read status (Linux pre-2011)
before freeing memory under DMA have a reset path that is racy wrt DMA, since
memory writes are posted and IO writes, while not posted, have completions
that do not order posted transactions, e.g. from the PCI Express spec:
    D2b
    An I/O or Configuration Write Completion is permitted to pass a
    Posted Request.
Having said that, there were a ton of driver races discovered on this
path in the years since; I suspect if one cares about this then
just avoiding stress on reset is wise.
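
To make the ordering point concrete, here is a toy sketch of the two teardown
orderings being contrasted; the accessor names are invented stand-ins for this
illustration only, not real driver code:

#include <stdint.h>

/* Toy stand-ins for the real accessors. */
static volatile uint8_t toy_device_status = 0x0f;
static void write_device_status(uint8_t v) { toy_device_status = v; }
static uint8_t read_device_status(void)    { return toy_device_status; }
static void free_virtqueue_memory(void)    { /* kfree()/dma_free_coherent() in reality */ }

/* Pre-2011-style teardown: the status write is posted and nothing orders it
 * against the free, so the device may still DMA into freed ring memory. */
static void racy_teardown(void)
{
        write_device_status(0);
        free_virtqueue_memory();
}

/* 2011-and-later-style teardown: the non-posted read flushes the write (and,
 * on hardware that defers the read completion, the reset itself) before any
 * memory is released. */
static void safe_teardown(void)
{
        write_device_status(0);
        while (read_device_status() != 0)
                ;
        free_virtqueue_memory();
}

int main(void)
{
        racy_teardown();
        safe_teardown();
        return 0;
}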



> > > The drivers do not wait for reset to complete; it was written for the sw
> > backend.
> > 
> > Do you see there's a flush after reset in the legacy driver?
> > 
> Yes, it only flushes the write by reading it. The driver does not _wait_
> for the reset to complete within the device like above.

One can thinkably do that wait in hardware, though. Just defer completion until
read is done.

> Please see the reset flow of 1.x device as below.
> In fact the comment of the 1.x device also needs to be updated to indicate 
> that driver need to wait for the device to finish the reset.
> I will send separate patch for improving this comment of vp_reset() to match 
> the spec.
> 
> static void vp_reset(struct virtio_device *vdev)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
> 
> /* 0 status means a reset. */
> vp_modern_set_status(mdev, 0);
> /* After writing 0 to device_status, the driver MUST wait for a read of
>  * device_status to return 0 before reinitializing the device.
>  * This will flush out the status write, and flush in device writes,
>  * including MSI-X interrupts, if any.
>  */
> while (vp_modern_get_status(mdev))
> msleep(1);
> /* Flush pending VQ/configuration callbacks. */
> vp_synchronize_vectors(vdev);
> }
> 
> 
> > static void vp_reset(struct virtio_device *vdev) {
> > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > /* 0 status means a reset. */
> > vp_legacy_set_status(&vp_dev->ldev, 0);
> > /* Flush out the status write, and flush in device writes,
> >  * including MSi-X interrupts, if any. */
> > vp_legacy_get_status(&vp_dev->ldev);
> > /* Flush pending VQ/configuration callbacks. */
> > vp_synchronize_vectors(vdev);
> > }
> > 
> > Thanks
> > 
> > 
> > 
> > > Hence MMIO BAR0 is not the best option in real implementations.
> > >
> 


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Fri, Sep 22, 2023 at 01:19:28PM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 11:39:19AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2023 at 09:25:01AM -0300, Jason Gunthorpe wrote:
> > > On Fri, Sep 22, 2023 at 11:02:50AM +0800, Jason Wang wrote:
> > > > On Fri, Sep 22, 2023 at 3:53 AM Jason Gunthorpe  wrote:
> > > > >
> > > > > On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote:
> > > > >
> > > > > > that's easy/practical.  If instead VDPA gives the same speed with 
> > > > > > just
> > > > > > shadow vq then keeping this hack in vfio seems like less of a 
> > > > > > problem.
> > > > > > Finally if VDPA is faster then maybe you will reconsider using it ;)
> > > > >
> > > > > It is not all about the speed.
> > > > >
> > > > > VDPA presents another large and complex software stack in the
> > > > > hypervisor that can be eliminated by simply using VFIO.
> > > > 
> > > > vDPA supports standard virtio devices so how did you define
> > > > complexity?
> > > 
> > > As I said, VFIO is already required for other devices in these VMs. So
> > > anything incremental over base-line vfio-pci is complexity to
> > > minimize.
> > > 
> > > Everything vdpa does is either redundant or unnecessary compared to
> > > VFIO in these environments.
> > > 
> > > Jason
> > 
> > Yes but you know. There are all kinds of environments.  I guess you
> > consider yours the most mainstream and important, and are sure it will
> > always stay like this.  But if there's a driver that does what you need
> > then you use that.
> 
> Come on, you are the one saying we cannot do things in the best way
> possible because you want your way of doing things to be the only way
> allowed. Which of us thinks "yours the most mainstream and important" ??
> 
> I'm not telling you to throw away VPDA, I'm saying there are
> legimitate real world use cases where VFIO is the appropriate
> interface, not VDPA.
> 
> I want choice, not dogmatic exclusion that there is Only One True Way.

I don't particularly think there's only one way, vfio is already there.
I am specifically thinking about this patch, for example it
muddies the waters a bit: normally I think vfio exposed device
with the same ID, suddenly it changes the ID as visible to the guest.
But again, whether doing this kind of thing is OK is more up to Alex than me.

I do want to understand if there's a use-case that vdpa does not address
simply because it might be worth while to extend it to do so, and a
bunch of people working on it are at Red Hat and I might have some input
into how that labor is allocated. But if the use-case is simply "has to
be vfio and not vdpa" then I guess not.




> > You really should be explaining what vdpa *does not* do that you
> > need.
> 
> I think I've done that enough, but if you have been following my
> explanation you should see that the entire point of this design is to
> allow a virtio device to be created inside a DPU to a specific
> detailed specification (eg an AWS virtio-net device, for instance)
> 
> The implementation is in the DPU, and only the DPU.
> 
> At the end of the day VDPA uses mediation and creates some
> RedHat/VDPA/Qemu virtio-net device in the guest. It is emphatically
> NOT a perfect recreation of the "AWS virtio-net" we started out with.
> 
> It entirely fails to achieve the most important thing it needs to do!

It could be that we are using mediation differently - in my world it's
when there's some host software on the path between guest and hardware,
and this qualifies.  The difference between what this patch does and
what vdpa does seems quantitative, not qualitative. Which might be
enough to motivate this work, I don't mind. But you seem to feel
it is qualitative and I am genuinely curious about it, because
if yes then it might lead e.g. the virtio standard in new directions.

I can *imagine* all kind of reasons to want to use vfio as compared to vdpa;
here are some examples I came up with, quickly:
- maybe you have drivers that poke at registers not in virtio spec:
  vfio allows that, vdpa by design does not
- maybe you are using vfio with a lot of devices already and don't want
  to special-case handling for virtio devices on the host
do any of the above motivations ring a bell? Some of the things you
said seem to hint at that. If yes maybe include this in the cover
letter.

There is also a question of capability. Specifically iommufd support
is lacking in vdpa (though there are finally some RFC patches to
address that). All this is fine, could be enough to motivate
a work like this one. But I am very curious to know if there
is any other capability lacking in vdpa. I asked already and you
didn't answer so I guess not?




> Yishai will rework the series with your remarks, we can look again on
> v2, thanks for all the input!
> 
> Jason


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Michael S. Tsirkin
On Fri, Sep 22, 2023 at 01:22:33PM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 11:40:58AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2023 at 12:15:34PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Sep 22, 2023 at 11:13:18AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Sep 22, 2023 at 12:25:06PM +, Parav Pandit wrote:
> > > > > 
> > > > > > From: Jason Gunthorpe 
> > > > > > Sent: Friday, September 22, 2023 5:53 PM
> > > > > 
> > > > > 
> > > > > > > And what's more, using MMIO BAR0 then it can work for legacy.
> > > > > > 
> > > > > > Oh? How? Our team didn't think so.
> > > > > 
> > > > > It does not. It was already discussed.
> > > > > The device reset in legacy is not synchronous.
> > > > > The drivers do not wait for reset to complete; it was written for the 
> > > > > sw backend.
> > > > > Hence MMIO BAR0 is not the best option in real implementations.
> > > > 
> > > > Or maybe they made it synchronous in hardware, that's all.
> > > > After all same is true for the IO BAR0 e.g. for the PF: IO writes
> > > > are posted anyway.
> > > 
> > > IO writes are not posted in PCI.
> > 
> > Aha, I was confused. Thanks for the correction. I guess you just buffer
> > subsequent transactions while reset is going on and reset quickly enough
> > for it to be seamless then?
> 
> From a hardware perspective the CPU issues a non-posted IO write and
> then it stops processing until the far side returns an IO completion.
> 
> Using that you can emulate what the SW virtio model did and delay the
> CPU from restarting until the reset is completed.
> 
> Since MMIO is always posted, this is not possible to emulate directly
> using MMIO.
> 
> Converting IO into non-posted admin commands is a fairly close
> recreation to what actual HW would do.
> 
> Jason

I thought you asked how it is possible for hardware to support reset if
all it does is replace the IO BAR with a memory BAR. The answer is that
since 2011 the reset is followed by a read of the status field (which isn't
much older than the MSI-X support from 2009 that this code assumes).  If
one uses a Linux driver from 2011 onwards then all you need to do is defer
the response to this read until after the reset is complete.

If you are using older drivers or other OSes then a reset using a posted
write after the device has operated for a while might not be safe, so e.g.
you might trigger races if you remove drivers from the system or
trigger hot unplug.  For example:

static void virtio_pci_remove(struct pci_dev *pci_dev)
{
        ...
        unregister_virtio_device(&vp_dev->vdev);
        ^^^ triggers reset, then releases memory
        ...
        pci_disable_device(pci_dev);
        ^^^ blocks DMA by clearing bus master
}

here you could see some DMA into memory that has just been released.


As Jason mentions hardware exists that is used under one of these two
restrictions on the guest (Linux since 2011 or no resets while DMA is
going on), and it works fine with these existing guests.

Given the restrictions, virtio TC didn't elect to standardize this
approach and instead opted for the heavier approach of
converting IO into non-posted admin commands in software.


-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-25 Thread Parav Pandit via Virtualization


> From: Jason Wang 
> Sent: Monday, September 25, 2023 8:00 AM
> 
> On Fri, Sep 22, 2023 at 8:25 PM Parav Pandit  wrote:
> >
> >
> > > From: Jason Gunthorpe 
> > > Sent: Friday, September 22, 2023 5:53 PM
> >
> >
> > > > And what's more, using MMIO BAR0 then it can work for legacy.
> > >
> > > Oh? How? Our team didn't think so.
> >
> > It does not. It was already discussed.
> > The device reset in legacy is not synchronous.
> 
> How do you know this?
>
I am not sure of the motivation for repeating the same discussion that was
already held in OASIS with you and others in the past.

Anyways, please find the answer below.

About reset:
the legacy device specification never enforced the below-cited driver
requirement from the 1.0 spec.

"The driver SHOULD consider a driver-initiated reset complete when it reads
device status as 0."
 
[1] https://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf

> > The drivers do not wait for reset to complete; it was written for the sw
> backend.
> 
> Do you see there's a flush after reset in the legacy driver?
> 
Yes, but it only flushes the write by reading it back. The driver does not
_wait_ for the reset to complete within the device, as the requirement cited
above demands.

Please see the reset flow of the 1.x device below.
In fact, the comment in the 1.x path also needs to be updated to indicate that
the driver needs to wait for the device to finish the reset.
I will send a separate patch improving this comment in vp_reset() to match
the spec.

static void vp_reset(struct virtio_device *vdev)
{
        struct virtio_pci_device *vp_dev = to_vp_device(vdev);
        struct virtio_pci_modern_device *mdev = &vp_dev->mdev;

        /* 0 status means a reset. */
        vp_modern_set_status(mdev, 0);
        /* After writing 0 to device_status, the driver MUST wait for a read of
         * device_status to return 0 before reinitializing the device.
         * This will flush out the status write, and flush in device writes,
         * including MSI-X interrupts, if any.
         */
        while (vp_modern_get_status(mdev))
                msleep(1);
        /* Flush pending VQ/configuration callbacks. */
        vp_synchronize_vectors(vdev);
}


> static void vp_reset(struct virtio_device *vdev)
> {
>         struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>         /* 0 status means a reset. */
>         vp_legacy_set_status(&vp_dev->ldev, 0);
>         /* Flush out the status write, and flush in device writes,
>          * including MSi-X interrupts, if any. */
>         vp_legacy_get_status(&vp_dev->ldev);
>         /* Flush pending VQ/configuration callbacks. */
>         vp_synchronize_vectors(vdev);
> }
> 
> Thanks
> 
> 
> 
> > Hence MMIO BAR0 is not the best option in real implementations.
> >

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-24 Thread Zhu, Lingshan




On 9/22/2023 4:55 AM, Michael S. Tsirkin wrote:

On Thu, Sep 21, 2023 at 04:51:15PM -0300, Jason Gunthorpe wrote:

On Thu, Sep 21, 2023 at 03:17:25PM -0400, Michael S. Tsirkin wrote:

On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:

What is the huge amount of work am I asking to do?

You are asking us to invest in the complexity of VDPA through out
(keep it working, keep it secure, invest time in deploying and
debugging in the field)

I'm asking you to do nothing of the kind - I am saying that this code
will have to be duplicated in vdpa,

Why would that be needed?

For the same reason it was developed in the 1st place - presumably
because it adds efficient legacy guest support with the right card?
I get it, you specifically don't need VDPA functionality, but I don't
see why is this universal, or common.



and so I am asking what exactly is missing to just keep it all
there.

VFIO. Seriously, we don't want unnecessary mediation in this path at
all.

But which mediation is necessary is exactly up to the specific use-case.
I have no idea why would you want all of VFIO to e.g. pass access to
random config registers to the guest when it's a virtio device and the
config registers are all nicely listed in the spec. I know nvidia
hardware is so great, it has super robust cards with less security holes
than the vdpa driver, but I very much doubt this is universal for all
virtio offload cards.

I agree with MST.

note I didn't ask you to add iommufd to vdpa though that would be
nice ;)

I did once send someone to look.. It didn't succeed :(

Jason

Pity. Maybe there's some big difficulty blocking this? I'd like to know.



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-24 Thread Jason Wang
On Fri, Sep 22, 2023 at 8:11 PM Jason Gunthorpe  wrote:
>
> On Fri, Sep 22, 2023 at 11:01:23AM +0800, Jason Wang wrote:
>
> > > Even when it does, there is no real use case to live migrate a
> > > virtio-net function from, say, AWS to GCP.
> >
> > It can happen inside a single cloud vendor. For some reasons, DPU must
> > be purchased from different vendors. And vDPA has been used in that
> > case.
>
> Nope, you misunderstand the DPU scenario.
>
> Look at something like vmware DPU enablement. vmware runs the software
> side of the DPU and all their supported DPU HW, from every vendor,
> generates the same PCI functions on the x86. They are the same because
> the same software on the DPU side is creating them.
>
> There is no reason to put a mediation layer in the x86 if you also
> control the DPU.
>
> Cloud vendors will similarly use DPUs to create a PCI functions that
> meet the cloud vendor's internal specification.

This can only work if:

1) the internal specification has a finer grain than the virtio spec
2) so it can define what is not defined in the virtio spec (like
migration and compatibility)

Neither of the above seems possible or realistic now, and it
actually risks being incompatible with the virtio spec. In the
future, when virtio has live migration support, they will want to be
able to migrate between virtio and vDPA.

As I said, vDPA has been used for cross vendor live migration for a while.

> Regardless of DPU
> vendor.
>
> Fundamentally if you control the DPU SW and the hypervisor software
> you do not need hypervisor meditation because everything you could do
> in hypervisor mediation can just be done in the DPU. Putting it in the
> DPU is better in every regard.
>
> So, as I keep saying, in this scenario the goal is no mediation in the
> hypervisor.

That's pretty fine, but I'd argue that trapping + relaying is still
mediation. Does it really matter what happens after the trap?

> It is pointless, everything you think you need to do there
> is actually already being done in the DPU.

Well, migration or even Qemu could be offloaded to the DPU as well. If
that's the direction, that's pretty fine.

Thanks

>
> Jason
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-24 Thread Jason Wang
On Fri, Sep 22, 2023 at 8:25 PM Parav Pandit  wrote:
>
>
> > From: Jason Gunthorpe 
> > Sent: Friday, September 22, 2023 5:53 PM
>
>
> > > And what's more, using MMIO BAR0 then it can work for legacy.
> >
> > Oh? How? Our team didn't think so.
>
> It does not. It was already discussed.
> The device reset in legacy is not synchronous.

How do you know this?

> The drivers do not wait for reset to complete; it was written for the sw 
> backend.

Do you see there's a flush after reset in the legacy driver?

static void vp_reset(struct virtio_device *vdev)
{
        struct virtio_pci_device *vp_dev = to_vp_device(vdev);
        /* 0 status means a reset. */
        vp_legacy_set_status(&vp_dev->ldev, 0);
        /* Flush out the status write, and flush in device writes,
         * including MSi-X interrupts, if any. */
        vp_legacy_get_status(&vp_dev->ldev);
        /* Flush pending VQ/configuration callbacks. */
        vp_synchronize_vectors(vdev);
}

Thanks



> Hence MMIO BAR0 is not the best option in real implementations.
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 03:40:40PM +0300, Yishai Hadas wrote:
> Introduce a vfio driver over virtio devices to support the legacy
> interface functionality for VFs.
> 
> Background, from the virtio spec [1].
> 
> In some systems, there is a need to support a virtio legacy driver with
> a device that does not directly support the legacy interface. In such
> scenarios, a group owner device can provide the legacy interface
> functionality for the group member devices. The driver of the owner
> device can then access the legacy interface of a member device on behalf
> of the legacy member device driver.
> 
> For example, with the SR-IOV group type, group members (VFs) can not
> present the legacy interface in an I/O BAR in BAR0 as expected by the
> legacy pci driver. If the legacy driver is running inside a virtual
> machine, the hypervisor executing the virtual machine can present a
> virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> legacy driver accesses to this I/O BAR and forwards them to the group
> owner device (PF) using group administration commands.
> 
> 
> Specifically, this driver adds support for a virtio-net VF to be exposed
> as a transitional device to a guest driver and allows the legacy IO BAR
> functionality on top.
> 
> This allows a VM which uses a legacy virtio-net driver in the guest to
> work transparently over a VF which its driver in the host is that new
> driver.
> 
> The driver can be extended easily to support some other types of virtio
> devices (e.g virtio-blk), by adding in a few places the specific type
> properties as was done for virtio-net.
> 
> For now, only the virtio-net use case was tested and as such we introduce
> the support only for such a device.
> 
> Practically,
> Upon probing a VF for a virtio-net device, in case its PF supports
> legacy access over the virtio admin commands and the VF doesn't have BAR
> 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a
> transitional device with I/O BAR in BAR 0.
> 
> The existence of the simulated I/O bar is reported later on by
> overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device
> exposes itself as a transitional device by overwriting some properties
> upon reading its config space.
> 
> Once we report the existence of I/O BAR as BAR 0 a legacy driver in the
> guest may use it via read/write calls according to the virtio
> specification.
> 
> Any read/write towards the control parts of the BAR will be captured by
> the new driver and will be translated into admin commands towards the
> device.
> 
> Any data path read/write access (i.e. virtio driver notifications) will
> be forwarded to the physical BAR which its properties were supplied by
> the command VIRTIO_PCI_QUEUE_NOTIFY upon the probing/init flow.
> 
> With that code in place a legacy driver in the guest has the look and
> feel as if having a transitional device with legacy support for both its
> control and data path flows.
> 
> [1]
> https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> 
> Signed-off-by: Yishai Hadas 
> ---
>  MAINTAINERS  |   6 +
>  drivers/vfio/pci/Kconfig |   2 +
>  drivers/vfio/pci/Makefile|   2 +
>  drivers/vfio/pci/virtio/Kconfig  |  15 +
>  drivers/vfio/pci/virtio/Makefile |   4 +
>  drivers/vfio/pci/virtio/cmd.c|   4 +-
>  drivers/vfio/pci/virtio/cmd.h|   8 +
>  drivers/vfio/pci/virtio/main.c   | 546 +++
>  8 files changed, 585 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/vfio/pci/virtio/Kconfig
>  create mode 100644 drivers/vfio/pci/virtio/Makefile
>  create mode 100644 drivers/vfio/pci/virtio/main.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bf0f54c24f81..5098418c8389 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/vfio/pci/mlx5/
>  
> +VFIO VIRTIO PCI DRIVER
> +M:   Yishai Hadas 
> +L:   k...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/vfio/pci/virtio
> +
>  VFIO PCI DEVICE SPECIFIC DRIVERS
>  R:   Jason Gunthorpe 
>  R:   Yishai Hadas 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 8125e5f37832..18c397df566d 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig"
>  
>  source "drivers/vfio/pci/pds/Kconfig"
>  
> +source "drivers/vfio/pci/virtio/Kconfig"
> +
>  endmenu
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 45167be462d8..046139a4eca5 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI)   += mlx5/
>  obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
>  
>  obj-$(CONFIG_PDS_VFIO_PCI) += pds/
> +
> +obj-$(CO

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Fri, Sep 22, 2023 at 09:23:28AM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 05:47:23AM -0400, Michael S. Tsirkin wrote:
> 
> > it will require maintenance effort when virtio changes are made.  For
> > example it pokes at the device state - I don't see specific races right
> > now but in the past we did e.g. reset the device to recover from errors
> > and we might start doing it again.
> > 
> > If more of the logic is under the virtio directory, where we'll remember
> > to keep it in the loop and will be able to reuse it from vdpa
> > down the road, I would be more sympathetic.
> 
> This is inevitable, the VFIO live migration driver will need all this
> infrastructure too.
> 
> Jason
>  

I am not sure what you are saying and what is inevitable.
VDPA for sure will want live migration support.  I am not at all
sympathetic to efforts that want to duplicate that support for virtio
under VFIO. Put it in a library under the virtio directory,
with a sane, well-documented interface.
I don't maintain VFIO and Alex can merge what he wants,
but I won't merge patches that export virtio internals in a way
that will make virtio maintenance harder.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Fri, Sep 22, 2023 at 12:15:34PM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 11:13:18AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2023 at 12:25:06PM +, Parav Pandit wrote:
> > > 
> > > > From: Jason Gunthorpe 
> > > > Sent: Friday, September 22, 2023 5:53 PM
> > > 
> > > 
> > > > > And what's more, using MMIO BAR0 then it can work for legacy.
> > > > 
> > > > Oh? How? Our team didn't think so.
> > > 
> > > It does not. It was already discussed.
> > > The device reset in legacy is not synchronous.
> > > The drivers do not wait for reset to complete; it was written for the sw 
> > > backend.
> > > Hence MMIO BAR0 is not the best option in real implementations.
> > 
> > Or maybe they made it synchronous in hardware, that's all.
> > After all same is true for the IO BAR0 e.g. for the PF: IO writes
> > are posted anyway.
> 
> IO writes are not posted in PCI.

Aha, I was confused. Thanks for the correction. I guess you just buffer
subsequent transactions while reset is going on and reset quickly enough
for it to be seamless then?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Fri, Sep 22, 2023 at 09:25:01AM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 11:02:50AM +0800, Jason Wang wrote:
> > On Fri, Sep 22, 2023 at 3:53 AM Jason Gunthorpe  wrote:
> > >
> > > On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote:
> > >
> > > > that's easy/practical.  If instead VDPA gives the same speed with just
> > > > shadow vq then keeping this hack in vfio seems like less of a problem.
> > > > Finally if VDPA is faster then maybe you will reconsider using it ;)
> > >
> > > It is not all about the speed.
> > >
> > > VDPA presents another large and complex software stack in the
> > > hypervisor that can be eliminated by simply using VFIO.
> > 
> > vDPA supports standard virtio devices so how did you define
> > complexity?
> 
> As I said, VFIO is already required for other devices in these VMs. So
> anything incremental over base-line vfio-pci is complexity to
> minimize.
> 
> Everything vdpa does is either redundant or unnecessary compared to
> VFIO in these environments.
> 
> Jason

Yes, but you know, there are all kinds of environments.  I guess you
consider yours the most mainstream and important, and are sure it will
always stay like this.  But if there's a driver that does what you need
then you use that. You really should be explaining what vdpa
*does not* do that you need.

But anyway, if Alex wants to maintain this it's not too bad,
but I would like to see more code move into a library
living under the virtio directory. As it is structured now
it will make virtio core development harder.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Fri, Sep 22, 2023 at 12:25:06PM +, Parav Pandit wrote:
> 
> > From: Jason Gunthorpe 
> > Sent: Friday, September 22, 2023 5:53 PM
> 
> 
> > > And what's more, using MMIO BAR0 then it can work for legacy.
> > 
> > Oh? How? Our team didn't think so.
> 
> It does not. It was already discussed.
> The device reset in legacy is not synchronous.
> The drivers do not wait for reset to complete; it was written for the sw 
> backend.
> Hence MMIO BAR0 is not the best option in real implementations.

Or maybe they made it synchronous in hardware, that's all.
After all, the same is true for the IO BAR0, e.g. for the PF: IO writes are
posted anyway.

Whether that's possible would depend on the hardware architecture.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Parav Pandit via Virtualization


> From: Jason Gunthorpe 
> Sent: Friday, September 22, 2023 6:07 PM
> 
> On Thu, Sep 21, 2023 at 01:58:32PM -0600, Alex Williamson wrote:
> 
> > If the heart of this driver is simply pretending to have an I/O BAR
> > where I/O accesses into that BAR are translated to accesses in the
> > MMIO BAR, why can't this be done in the VMM, ie. QEMU?
> 
> That isn't exactly what it does; the IO bar access is translated into an admin
> queue command on the PF and executed by the PCI function.
> 
> So it would be difficult to do that in qemu without also somehow wiring up
> qemu to access the PF's kernel driver's admin queue.
> 
> It would have been nice if it was a trivial 1:1 translation to the MMIO bar, 
> but it
> seems that didn't entirely work with existing VMs. So OASIS standardized this
> approach.
> 
> The bigger picture is there is also a live migration standard & driver in the
> works that will re-use all this admin queue infrastructure anyhow, so the best
> course is to keep this in the kernel.

Additionally, in the future the AQ of the PF will also be used to provision the
VFs (virtio OASIS calls them member devices); such a framework also resides in
the kernel.
Such PFs are already in use by the kernel driver.

+1 for keeping this framework in the kernel.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Parav Pandit via Virtualization


> From: Jason Gunthorpe 
> Sent: Friday, September 22, 2023 5:53 PM


> > And what's more, using MMIO BAR0 then it can work for legacy.
> 
> Oh? How? Our team didn't think so.

It does not. It was already discussed.
The device reset in legacy is not synchronous.
The drivers do not wait for reset to complete; it was written for the sw 
backend.
Hence MMIO BAR0 is not the best option in real implementations.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 07:55:26PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 04:45:45PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 04:49:46PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 03:13:10PM -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote:
> > > > > > > vdpa is not vfio, I don't know how you can suggest vdpa is a
> > > > > > > replacement for a vfio driver. They are completely different
> > > > > > > things.
> > > > > > > Each side has its own strengths, and vfio especially is 
> > > > > > > accelerating
> > > > > > > in its capability in way that vpda is not. eg if an iommufd 
> > > > > > > conversion
> > > > > > > had been done by now for vdpa I might be more sympathetic.
> > > > > > 
> > > > > > Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
> > > > > > sick and I didn't know and kept assuming she's working on this. I 
> > > > > > don't
> > > > > > think it's a huge amount of work though.  I'll take a look.
> > > > > > Is there anything else though? Do tell.
> > > > > 
> > > > > Confidential compute will never work with VDPA's approach.
> > > > 
> > > > I don't see how what this patchset is doing is different
> > > > wrt to Confidential compute - you trap IO accesses and emulate.
> > > > Care to elaborate?
> > > 
> > > This patch series isn't about confidential compute, you asked about
> > > the future. VFIO will support confidential compute in the future, VDPA
> > > will not.
> > 
> > Nonsense it already works.
> 
> That isn't what I'm talking about. With a real PCI function and TDISP
> we can actually DMA directly from the guest's memory without needing
> the ugly bounce buffer hack. Then you can get decent performance.

Aha, TDISP.  But that one clearly does not need and can not use
this kind of hack?

> > But I did not ask about the future since I do not believe it
> > can be confidently predicted. I asked what is missing in VDPA
> > now for you to add this feature there and not in VFIO.
> 
> I don't see that VDPA needs this, VDPA should process the IO BAR on
> its own with its own logic, just like everything else it does.

First there's some logic here such as translating legacy IO
offsets to modern ones that could be reused.

But also, this is not just about the IO BAR, which indeed can easily be done
in software.  When a device operates in legacy mode there are subtle
differences from modern mode, such as a different header size for the net
device.
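
To illustrate the header point (quoting from memory of
include/uapi/linux/virtio_net.h, so double-check there): a legacy device
without VIRTIO_NET_F_MRG_RXBUF uses the short 10-byte header, while a
VERSION_1 device always uses the 12-byte layout with num_buffers, and the
endianness rules differ as well.

struct virtio_net_hdr {
        __u8 flags;
        __u8 gso_type;
        __virtio16 hdr_len;
        __virtio16 gso_size;
        __virtio16 csum_start;
        __virtio16 csum_offset;
};

struct virtio_net_hdr_mrg_rxbuf {
        struct virtio_net_hdr hdr;
        __virtio16 num_buffers; /* only with MRG_RXBUF or VERSION_1 */
};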

> This is specifically about avoiding mediation by relaying directly the
> IO BAR operations to the device itself.
> 
> That is the entire irony, this whole scheme was designed and
> standardized *specifically* to avoid complex mediation and here you
> are saying we should just use mediation.
> 
> Jason

Not exactly. What I had in mind is just having the logic in
the vdpa module so users don't need to know what the device
supports and what it doesn't. If we can, we bypass mediation
(to simplify the software stack); if we cannot, we do not.

Looking at it from the user's POV, it is just super confusing that
card ABC would need to be used with VDPA to drive legacy while
card DEF needs to be used with VFIO. And both VFIO and VDPA
will happily bind, too. Oh man ...


-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-22 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 07:48:36PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 04:16:25PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 04:53:45PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote:
> > > 
> > > > that's easy/practical.  If instead VDPA gives the same speed with just
> > > > shadow vq then keeping this hack in vfio seems like less of a problem.
> > > > Finally if VDPA is faster then maybe you will reconsider using it ;)
> > > 
> > > It is not all about the speed.
> > > 
> > > VDPA presents another large and complex software stack in the
> > > hypervisor that can be eliminated by simply using VFIO.
> > 
> > If all you want is passing through your card to guest
> > then yes this can be addressed "by simply using VFIO".
> 
> That is pretty much the goal, yes.
> 
> > And let me give you a simple example just from this patchset:
> > it assumes guest uses MSIX and just breaks if it doesn't.
> 
> It does? Really? Where did you see that?

This thing apparently:

+   opcode = (pos < VIRTIO_PCI_CONFIG_OFF(true)) ?
+   VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ :
+   VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ;

That "true" is supposed to be whether guest enabled MSI or not.


> > > VFIO is
> > > already required for other scenarios.
> > 
> > Required ... by some people? Most VMs I run don't use anything
> > outside of virtio.
> 
> Yes, some people. The sorts of people who run large data centers.
>
> > It seems to deal with emulating virtio which seems more like a vdpa
> > thing.
> 
> Alex described it right, it creates an SW trapped IO bar that relays
> the doorbell to an admin queue command.
> 
> > If you start adding virtio emulation to vfio then won't
> > you just end up with another vdpa? And if no why not?
> > And I don't buy the "we already invested in this vfio based solution",
> > sorry - that's not a reason upstream has to maintain it.
> 
> I think you would be well justified to object to actual mediation,
> like processing queues in VFIO or otherwise complex things.

This mediation is kind of smallish, I agree. Not completely devoid of
logic though.
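
Roughly the kind of logic I mean, as a hand-wavy sketch - the helper names
are invented, and I am assuming WRITE opcodes exist alongside the READ ones
quoted elsewhere in this thread; this is not the actual patch code:

static int relay_legacy_io_write(struct virtiovf_ctx *vf, loff_t pos,
                                 const void *buf, size_t len, bool msix_on)
{
        u16 opcode;

        /* Doorbell writes bypass the admin queue and go straight to the
         * member device's own notify area. */
        if (pos == VIRTIO_PCI_QUEUE_NOTIFY)
                return relay_notify(vf, buf, len);        /* invented helper */

        /* Everything else becomes an admin command on the group owner (PF). */
        opcode = (pos < VIRTIO_PCI_CONFIG_OFF(msix_on)) ?
                 VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE :
                 VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE;

        return send_admin_cmd(vf, opcode, pos, buf, len); /* invented helper */
}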

> Fortunately there is no need to do that with DPU HW. The legacy IO BAR
> is a weird quirk that just cannot be done without a software trap, and
> the OASIS standardization effort was for exactly this kind of
> simplistic transformation.
> 
> I also don't buy the "upstream has to maintain it" line. The team that
> submitted it will maintain it just fine, thank you.

it will require maintenance effort when virtio changes are made.  For
example it pokes at the device state - I don't see specific races right
now but in the past we did e.g. reset the device to recover from errors
and we might start doing it again.

If more of the logic is under the virtio directory, where we'll remember
to keep it in the loop and will be able to reuse it from vdpa
down the road, I would be more sympathetic.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Zhu, Lingshan




On 9/22/2023 2:39 AM, Jason Gunthorpe wrote:

On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote:

vdpa is not vfio, I don't know how you can suggest vdpa is a
replacement for a vfio driver. They are completely different
things.
Each side has its own strengths, and vfio especially is accelerating
in its capability in way that vpda is not. eg if an iommufd conversion
had been done by now for vdpa I might be more sympathetic.

Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
sick and I didn't know and kept assuming she's working on this. I don't
think it's a huge amount of work though.  I'll take a look.
Is there anything else though? Do tell.

Confidential compute will never work with VDPA's approach.
I don't understand why vDPA cannot and will never support confidential
computing.


Do you see any blockers?



There are a bunch of things that I think are important for virtio
that are completely out of scope for vfio, such as migrating
cross-vendor.

VFIO supports migration, if you want to have cross-vendor migration
then make a standard that describes the VFIO migration data format for
virtio devices.


What is the huge amount of work am I asking to do?

You are asking us to invest in the complexity of VDPA through out
(keep it working, keep it secure, invest time in deploying and
debugging in the field)

When it doesn't provide *ANY* value to the solution.

The starting point is a completely working vfio PCI function and the
end goal is to put that function into a VM. That is VFIO, not VDPA.

VPDA is fine for what it does, but it is not a reasonable replacement
for VFIO.

Jason


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Jason Wang
On Fri, Sep 22, 2023 at 3:53 AM Jason Gunthorpe  wrote:
>
> On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote:
>
> > that's easy/practical.  If instead VDPA gives the same speed with just
> > shadow vq then keeping this hack in vfio seems like less of a problem.
> > Finally if VDPA is faster then maybe you will reconsider using it ;)
>
> It is not all about the speed.
>
> VDPA presents another large and complex software stack in the
> hypervisor that can be eliminated by simply using VFIO.

vDPA supports standard virtio devices, so how do you define complexity?

From the application's point of view, what it wants is a simple virtio
device, not a virtio-pci device. That is what vDPA tries to present.

By simply counting LOCs: vdpa + vhost + vp_vdpa is much less code than
what VFIO has. It's not hard to see that it will still be much less
even once iommufd support is done.

Thanks



> VFIO is
> already required for other scenarios.
>
> This is about reducing complexity, reducing attack surface and
> increasing maintainability of the hypervisor environment.
>
> Jason
>
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Jason Wang
On Fri, Sep 22, 2023 at 6:55 AM Jason Gunthorpe  wrote:
>
> On Thu, Sep 21, 2023 at 04:45:45PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 04:49:46PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 03:13:10PM -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote:
> > > > > > > vdpa is not vfio, I don't know how you can suggest vdpa is a
> > > > > > > replacement for a vfio driver. They are completely different
> > > > > > > things.
> > > > > > > Each side has its own strengths, and vfio especially is 
> > > > > > > accelerating
> > > > > > > in its capability in way that vpda is not. eg if an iommufd 
> > > > > > > conversion
> > > > > > > had been done by now for vdpa I might be more sympathetic.
> > > > > >
> > > > > > Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
> > > > > > sick and I didn't know and kept assuming she's working on this. I 
> > > > > > don't
> > > > > > think it's a huge amount of work though.  I'll take a look.
> > > > > > Is there anything else though? Do tell.
> > > > >
> > > > > Confidential compute will never work with VDPA's approach.
> > > >
> > > > I don't see how what this patchset is doing is different
> > > > wrt to Confidential compute - you trap IO accesses and emulate.
> > > > Care to elaborate?
> > >
> > > This patch series isn't about confidential compute, you asked about
> > > the future. VFIO will support confidential compute in the future, VDPA
> > > will not.

What blocks vDPA from supporting that?

> >
> > Nonsense it already works.
>
> That isn't what I'm talking about. With a real PCI function and TDISP
> we can actually DMA directly from the guest's memory without needing
> the ugly bounce buffer hack. Then you can get decent performance.

This series requires trapping the legacy I/O BAR in VFIO. Why
can TDISP work when trapping in VFIO but not in vDPA? And if it works in
neither, how is TDISP relevant here?

>
> > But I did not ask about the future since I do not believe it
> > can be confidently predicted. I asked what is missing in VDPA
> > now for you to add this feature there and not in VFIO.
>
> I don't see that VDPA needs this, VDPA should process the IO BAR on
> its own with its own logic, just like everything else it does.
>
> This is specifically about avoiding mediation by relaying directly the
> IO BAR operations to the device itself.

So we have:

1) a new virtio-specific driver for VFIO
2) the existing vp_vdpa driver

How much difference is there between them in the context of mediation or
relaying? Or is it hard to introduce admin commands on the vDPA bus?

> That is the entire irony, this whole scheme was designed and
> standardized *specifically* to avoid complex mediation and here you
> are saying we should just use mediation.

No, using "simple VFIO passthrough" is just fine.

Thanks

>
> Jason
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Jason Wang
On Fri, Sep 22, 2023 at 4:16 AM Michael S. Tsirkin  wrote:
>
> On Thu, Sep 21, 2023 at 04:53:45PM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote:
> >
> > > that's easy/practical.  If instead VDPA gives the same speed with just
> > > shadow vq then keeping this hack in vfio seems like less of a problem.
> > > Finally if VDPA is faster then maybe you will reconsider using it ;)
> >
> > It is not all about the speed.
> >
> > VDPA presents another large and complex software stack in the
> > hypervisor that can be eliminated by simply using VFIO.
>
> If all you want is passing through your card to guest
> then yes this can be addressed "by simply using VFIO".

+1.

And what's more, using MMIO BAR0 then it can work for legacy.

I have handy virtio hardware from one vendor that works like this,
and I see it done by a lot of other vendors.

Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Jason Wang
On Fri, Sep 22, 2023 at 3:49 AM Jason Gunthorpe  wrote:
>
> On Thu, Sep 21, 2023 at 03:13:10PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote:
> > > > > vdpa is not vfio, I don't know how you can suggest vdpa is a
> > > > > replacement for a vfio driver. They are completely different
> > > > > things.
> > > > > Each side has its own strengths, and vfio especially is accelerating
> > > > > in its capability in way that vpda is not. eg if an iommufd conversion
> > > > > had been done by now for vdpa I might be more sympathetic.
> > > >
> > > > Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
> > > > sick and I didn't know and kept assuming she's working on this. I don't
> > > > think it's a huge amount of work though.  I'll take a look.
> > > > Is there anything else though? Do tell.
> > >
> > > Confidential compute will never work with VDPA's approach.
> >
> > I don't see how what this patchset is doing is different
> > wrt to Confidential compute - you trap IO accesses and emulate.
> > Care to elaborate?
>
> This patch series isn't about confidential compute, you asked about
> the future. VFIO will support confidential compute in the future, VDPA
> will not.
>
> > > > There are a bunch of things that I think are important for virtio
> > > > that are completely out of scope for vfio, such as migrating
> > > > cross-vendor.
> > >
> > > VFIO supports migration, if you want to have cross-vendor migration
> > > then make a standard that describes the VFIO migration data format for
> > > virtio devices.
> >
> > This has nothing to do with data formats - you need two devices to
> > behave identically. Which is what VDPA is about really.
>
> We've been looking at VFIO live migration extensively. Device
> mediation, like VDPA does, is one legitimate approach for live
> migration. It suits a certain type of heterogeneous environment well.
>
> But, it is equally legitimate to make the devices behave the same and
> have them process a common migration data.
>
> This can happen in public with standards, or it can happen in private
> within a cloud operator's "private-standard" environment.
>
> To date, in most of my discussions, I have not seen a strong appetite
> for such public standards. In part due to the complexity.
>
> Regardless, it is not the kernel community's job to insist on one
> approach or the other.
>
> > > You are asking us to invest in the complexity of VDPA through out
> > > (keep it working, keep it secure, invest time in deploying and
> > > debugging in the field)
> > >
> > > When it doesn't provide *ANY* value to the solution.
> >
> > There's no "the solution"
>
> Nonsense.
>
> > this sounds like a vendor only caring about solutions that involve
> > that vendor's hardware exclusively, a little.
>
> Not really.
>
> Understand the DPU provider is not the vendor here. The DPU provider
> gives a cloud operator a SDK to build these things. The operator is
> the vendor from your perspective.
>
> In many cases live migration never leaves the operator's confines in
> the first place.
>
> Even when it does, there is no real use case to live migrate a
> virtio-net function from, say, AWS to GCP.

It can happen inside a single cloud vendor. For various reasons, DPUs must
be purchased from different vendors, and vDPA has been used in that
case.

I've asked them to present this probably somewhere like KVM Forum.

>
> You are pushing for a lot of complexity and software that solves a
> problem people in this space don't actually have.
>
> As I said, VDPA is fine for the scenarios it addresses. It is an
> alternative, not a replacement, for VFIO.

We never tried to replace VFIO. I don't see any problem with just using
the current VFIO to assign a virtio-pci device to the guest.

The problem is the mediation (or what you call relaying) layer
you've invented.

Thanks

>
> Jason
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Alex Williamson
On Thu, 21 Sep 2023 16:20:59 -0400
"Michael S. Tsirkin"  wrote:

> On Thu, Sep 21, 2023 at 05:01:21PM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 21, 2023 at 01:58:32PM -0600, Alex Williamson wrote:
> >   
> > > > +static const struct pci_device_id virtiovf_pci_table[] = {
> > > > +   { 
> > > > PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_REDHAT_QUMRANET, 
> > > > PCI_ANY_ID) },  
> > > 
> > > libvirt will blindly use this driver for all devices matching this as
> > > we've discussed how it should make use of modules.alias.  I don't think
> > > this driver should be squatting on devices where it doesn't add value
> > > and it's not clear whether this is adding or subtracting value in all
> > > cases for the one NIC that it modifies.  How should libvirt choose when
> > > and where to use this driver?  What regressions are we going to see
> > > with VMs that previously saw "modern" virtio-net devices and now see a
> > > legacy compatible device?  Thanks,  
> > 
> > Maybe this approach needs to use a subsystem ID match?
> > 
> > Jason  
> 
> Maybe make users load it manually?
> 
> Please don't bind to virtio by default, you will break
> all guests.

This would never bind by default, it's only bound as a vfio override
driver, but if libvirt were trying to determine the correct driver to
use with vfio for a 0x1af4 device, it'd land on this one.  Thanks,

Alex

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 04:51:15PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 03:17:25PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> > > > What is the huge amount of work am I asking to do?
> > > 
> > > You are asking us to invest in the complexity of VDPA through out
> > > (keep it working, keep it secure, invest time in deploying and
> > > debugging in the field)
> > 
> > I'm asking you to do nothing of the kind - I am saying that this code
> > will have to be duplicated in vdpa,
> 
> Why would that be needed?

For the same reason it was developed in the 1st place - presumably
because it adds efficient legacy guest support with the right card?
I get it, you specifically don't need VDPA functionality, but I don't
see why this is universal, or common.


> > and so I am asking what exactly is missing to just keep it all
> > there.
> 
> VFIO. Seriously, we don't want unnecessary mediation in this path at
> all.

But which mediation is necessary is exactly up to the specific use-case.
I have no idea why you would want all of VFIO to e.g. pass access to
random config registers to the guest when it's a virtio device and the
config registers are all nicely listed in the spec. I know nvidia
hardware is so great, it has super robust cards with fewer security holes
than the vdpa driver, but I very much doubt this is universal for all
virtio offload cards.

> > note I didn't ask you to add iommufd to vdpa though that would be
> > nice ;)
> 
> I did once send someone to look.. It didn't succeed :(
> 
> Jason

Pity. Maybe there's some big difficulty blocking this? I'd like to know.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 04:49:46PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 03:13:10PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote:
> > > > > vdpa is not vfio, I don't know how you can suggest vdpa is a
> > > > > replacement for a vfio driver. They are completely different
> > > > > things.
> > > > > Each side has its own strengths, and vfio especially is accelerating
> > > > > in its capability in way that vpda is not. eg if an iommufd conversion
> > > > > had been done by now for vdpa I might be more sympathetic.
> > > > 
> > > > Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
> > > > sick and I didn't know and kept assuming she's working on this. I don't
> > > > think it's a huge amount of work though.  I'll take a look.
> > > > Is there anything else though? Do tell.
> > > 
> > > Confidential compute will never work with VDPA's approach.
> > 
> > I don't see how what this patchset is doing is different
> > wrt to Confidential compute - you trap IO accesses and emulate.
> > Care to elaborate?
> 
> This patch series isn't about confidential compute, you asked about
> the future. VFIO will support confidential compute in the future, VDPA
> will not.

Nonsense, it already works.

But I did not ask about the future since I do not believe it
can be confidently predicted. I asked what is missing in VDPA
now for you to add this feature there and not in VFIO.


> > > > There are a bunch of things that I think are important for virtio
> > > > that are completely out of scope for vfio, such as migrating
> > > > cross-vendor. 
> > > 
> > > VFIO supports migration, if you want to have cross-vendor migration
> > > then make a standard that describes the VFIO migration data format for
> > > virtio devices.
> > 
> > This has nothing to do with data formats - you need two devices to
> > behave identically. Which is what VDPA is about really.
> 
> We've been looking at VFIO live migration extensively. Device
> mediation, like VDPA does, is one legitimate approach for live
> migration. It suits a certain type of heterogeneous environment well.
> 
> But, it is equally legitimate to make the devices behave the same and
> have them process a common migration data.
> 
> This can happen in public with standards, or it can happen in private
> within a cloud operator's "private-standard" environment.
> 
> To date, in most of my discussions, I have not seen a strong appetite
> for such public standards. In part due to the complexity.
> 
> Regardless, it is not the kernel community's job to insist on one
> approach or the other.
>
> > > You are asking us to invest in the complexity of VDPA through out
> > > (keep it working, keep it secure, invest time in deploying and
> > > debugging in the field)
> > > 
> > > When it doesn't provide *ANY* value to the solution.
> > 
> > There's no "the solution"
> 
> Nonsense.

What, there's only one solution, that you use the definite article?

> > this sounds like a vendor only caring about solutions that involve
> > that vendor's hardware exclusively, a little.
> 
> Not really.
> 
> Understand the DPU provider is not the vendor here. The DPU provider
> gives a cloud operator a SDK to build these things. The operator is
> the vendor from your perspective.
> 
> In many cases live migration never leaves the operator's confines in
> the first place.
> 
> Even when it does, there is no real use case to live migrate a
> virtio-net function from, say, AWS to GCP.
> 
> You are pushing for a lot of complexity and software that solves a
> problem people in this space don't actually have.
> 
> As I said, VDPA is fine for the scenarios it addresses. It is an
> alternative, not a replacement, for VFIO.
> 
> Jason

Yea, VDPA does trap and emulate for config accesses, which is exactly
what this patch does.  So why it belongs in vfio, muddying up its
passthrough model, is beyond me - except that apparently there's some
specific deployment that happens to use vfio, so now whatever
that deployment needs has to go into vfio whether it belongs there or not.


-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 05:01:21PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 01:58:32PM -0600, Alex Williamson wrote:
> 
> > > +static const struct pci_device_id virtiovf_pci_table[] = {
> > > + { PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_REDHAT_QUMRANET, 
> > > PCI_ANY_ID) },
> > 
> > libvirt will blindly use this driver for all devices matching this as
> > we've discussed how it should make use of modules.alias.  I don't think
> > this driver should be squatting on devices where it doesn't add value
> > and it's not clear whether this is adding or subtracting value in all
> > cases for the one NIC that it modifies.  How should libvirt choose when
> > and where to use this driver?  What regressions are we going to see
> > with VMs that previously saw "modern" virtio-net devices and now see a
> > legacy compatible device?  Thanks,
> 
> Maybe this approach needs to use a subsystem ID match?
> 
> Jason

Maybe make users load it manually?

Please don't bind to virtio by default, you will break
all guests.
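
For illustration only, a subsystem-ID based match as suggested above
might look something like the sketch below. The device ID 0x1041 is the
modern virtio-net PCI ID; the subsystem vendor/device values are
made-up placeholders, not IDs anyone has actually reserved:

#include <linux/module.h>
#include <linux/pci.h>

/* Hypothetical ID table: claim only VFs whose vendor opted in via PCI
 * subsystem IDs, instead of squatting on every virtio-net device. */
static const struct pci_device_id virtiovf_pci_table[] = {
	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_REDHAT_QUMRANET, 0x1041,
			 0x1234 /* placeholder subvendor */,
			 0x0001 /* placeholder subdevice */) },
	{}
};
MODULE_DEVICE_TABLE(pci, virtiovf_pci_table);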

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 04:53:45PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote:
> 
> > that's easy/practical.  If instead VDPA gives the same speed with just
> > shadow vq then keeping this hack in vfio seems like less of a problem.
> > Finally if VDPA is faster then maybe you will reconsider using it ;)
> 
> It is not all about the speed.
> 
> VDPA presents another large and complex software stack in the
> hypervisor that can be eliminated by simply using VFIO.

If all you want is passing through your card to the guest,
then yes, this can be addressed "by simply using VFIO".

And let me give you a simple example just from this patchset:
it assumes the guest uses MSI-X and just breaks if it doesn't.
Since VDPA emulates, it can emulate INTx for the guest while doing MSI
on the host side. Yea, modern guests use MSI-X, but this is about legacy,
yes?


> VFIO is
> already required for other scenarios.

Required ... by some people? Most VMs I run don't use anything
outside of virtio.

> This is about reducing complexity, reducing attack surface and
> increasing maintainability of the hypervisor environment.
> 
> Jason

Generally you get better security if you don't let guests poke at
hardware when they don't have to. But sure, matter of preference -
use VFIO, it's great. I am worried about the specific patchset though.
It seems to deal with emulating virtio, which seems more like a vdpa
thing. If you start adding virtio emulation to vfio, then won't
you just end up with another vdpa? And if not, why not?
And I don't buy the "we already invested in this vfio based solution",
sorry - that's not a reason upstream has to maintain it.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Alex Williamson
On Thu, 21 Sep 2023 15:40:40 +0300
Yishai Hadas  wrote:

> Introduce a vfio driver over virtio devices to support the legacy
> interface functionality for VFs.
> 
> Background, from the virtio spec [1].
> 
> In some systems, there is a need to support a virtio legacy driver with
> a device that does not directly support the legacy interface. In such
> scenarios, a group owner device can provide the legacy interface
> functionality for the group member devices. The driver of the owner
> device can then access the legacy interface of a member device on behalf
> of the legacy member device driver.
> 
> For example, with the SR-IOV group type, group members (VFs) can not
> present the legacy interface in an I/O BAR in BAR0 as expected by the
> legacy pci driver. If the legacy driver is running inside a virtual
> machine, the hypervisor executing the virtual machine can present a
> virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> legacy driver accesses to this I/O BAR and forwards them to the group
> owner device (PF) using group administration commands.
> 
> 
> Specifically, this driver adds support for a virtio-net VF to be exposed
> as a transitional device to a guest driver and allows the legacy IO BAR
> functionality on top.
> 
> This allows a VM that uses a legacy virtio-net driver in the guest to
> work transparently over a VF whose host driver is this new driver.
> 
> The driver can easily be extended to support other types of virtio
> devices (e.g. virtio-blk) by adding the type-specific properties in a
> few places, as was done for virtio-net.
> 
> For now, only the virtio-net use case has been tested, and as such we
> introduce support only for that device.
> 
> Practically,
> Upon probing a VF for a virtio-net device, in case its PF supports
> legacy access over the virtio admin commands and the VF doesn't have BAR
> 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a
> transitional device with I/O BAR in BAR 0.
> 
> The existence of the simulated I/O bar is reported later on by
> overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device
> exposes itself as a transitional device by overwriting some properties
> upon reading its config space.
> 
> Once we report the existence of I/O BAR as BAR 0 a legacy driver in the
> guest may use it via read/write calls according to the virtio
> specification.
> 
> Any read/write towards the control parts of the BAR will be captured by
> the new driver and will be translated into admin commands towards the
> device.
> 
> Any data path read/write access (i.e. virtio driver notifications) will
> be forwarded to the physical BAR, whose properties were supplied by the
> VIRTIO_PCI_QUEUE_NOTIFY command upon the probing/init flow.
> 
> With that code in place, a legacy driver in the guest has the look and
> feel of a transitional device with legacy support for both its control
> and data path flows.
> 
> [1]
> https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> 
> Signed-off-by: Yishai Hadas 
> ---
>  MAINTAINERS  |   6 +
>  drivers/vfio/pci/Kconfig |   2 +
>  drivers/vfio/pci/Makefile|   2 +
>  drivers/vfio/pci/virtio/Kconfig  |  15 +
>  drivers/vfio/pci/virtio/Makefile |   4 +
>  drivers/vfio/pci/virtio/cmd.c|   4 +-
>  drivers/vfio/pci/virtio/cmd.h|   8 +
>  drivers/vfio/pci/virtio/main.c   | 546 +++
>  8 files changed, 585 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/vfio/pci/virtio/Kconfig
>  create mode 100644 drivers/vfio/pci/virtio/Makefile
>  create mode 100644 drivers/vfio/pci/virtio/main.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bf0f54c24f81..5098418c8389 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/vfio/pci/mlx5/
>  
> +VFIO VIRTIO PCI DRIVER
> +M:   Yishai Hadas 
> +L:   k...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/vfio/pci/virtio
> +
>  VFIO PCI DEVICE SPECIFIC DRIVERS
>  R:   Jason Gunthorpe 
>  R:   Yishai Hadas 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 8125e5f37832..18c397df566d 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig"
>  
>  source "drivers/vfio/pci/pds/Kconfig"
>  
> +source "drivers/vfio/pci/virtio/Kconfig"
> +
>  endmenu
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 45167be462d8..046139a4eca5 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI)   += mlx5/
>  obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
>  
>  obj-$(CONFIG_PDS_VFIO_PCI) += pds/
> +
> +obj-$(CONFIG_

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 03:16:37PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 01:55:42PM -0400, Michael S. Tsirkin wrote:
> 
> > That's not what I'm asking about though - not what shadow vq does,
> > shadow vq is a vdpa feature.
> 
> That's just VDPA then. We already talked about why VDPA is not a
> replacement for VFIO.

It does however work universally, by software, without any special
hardware support. Which is kind of why I am curious - if VDPA needs this
proxy code because shadow vq is slower then that's an argument for not
having it in two places, and trying to improve vdpa to use iommufd if
that's easy/practical.  If instead VDPA gives the same speed with just
shadow vq then keeping this hack in vfio seems like less of a problem.
Finally if VDPA is faster then maybe you will reconsider using it ;)

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> > What is the huge amount of work I am asking you to do?
> 
> You are asking us to invest in the complexity of VDPA throughout
> (keep it working, keep it secure, invest time in deploying and
> debugging in the field)

I'm asking you to do nothing of the kind - I am saying that this code
will have to be duplicated in vdpa, and so I am asking what exactly is
missing to just keep it all there. So far you said iommufd, and
note I didn't ask you to add iommufd to vdpa, though that would be nice ;)
I just said I'll look into it in the next several days.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote:
> > > vdpa is not vfio, I don't know how you can suggest vdpa is a
> > > replacement for a vfio driver. They are completely different
> > > things.
> > > Each side has its own strengths, and vfio especially is accelerating
> > > in its capability in a way that vdpa is not. E.g. if an iommufd conversion
> > > had been done by now for vdpa I might be more sympathetic.
> > 
> > Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
> > sick and I didn't know and kept assuming she's working on this. I don't
> > think it's a huge amount of work though.  I'll take a look.
> > Is there anything else though? Do tell.
> 
> Confidential compute will never work with VDPA's approach.

I don't see how what this patchset is doing is any different
with respect to confidential compute - you trap IO accesses and emulate.
Care to elaborate?


> > There are a bunch of things that I think are important for virtio
> > that are completely out of scope for vfio, such as migrating
> > cross-vendor. 
> 
> VFIO supports migration, if you want to have cross-vendor migration
> then make a standard that describes the VFIO migration data format for
> virtio devices.

This has nothing to do with data formats - you need two devices to
behave identically. Which is what VDPA is about really.

> > What is the huge amount of work I am asking you to do?
> 
> You are asking us to invest in the complexity of VDPA throughout
> (keep it working, keep it secure, invest time in deploying and
> debugging in the field)
> 
> When it doesn't provide *ANY* value to the solution.

There's no "the solution" - this sounds like a vendor only caring about
solutions that involve that vendor's hardware exclusively, a little.

> The starting point is a completely working vfio PCI function and the
> end goal is to put that function into a VM. That is VFIO, not VDPA.
> 
> VDPA is fine for what it does, but it is not a reasonable replacement
> for VFIO.
> 
> Jason

VDPA basically should be a kind of "VFIO for virtio".

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 02:44:50PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 01:21:26PM -0400, Michael S. Tsirkin wrote:
> > Yea it's very useful - it's also useful for vdpa whether this patchset
> > goes in or not.  At some level, if vdpa can't keep up then maybe going
> > the vfio route is justified. I'm not sure why no one has fixed iommufd
> > yet - it looks like a small amount of work. I'll see if I can address it
> > quickly because we already have virtio accelerators under vdpa and it
> > seems confusing to people to use vdpa for some and vfio for others, with
> > overlapping but slightly incompatible functionality.  I'll get back next
> > week, in either case. I am however genuinely curious whether all the new
> > functionality is actually useful for these legacy guests.
> 
> It doesn't have much to do with the guests - this is new hypervisor
> functionality to make the hypervisor do more things. This stuff can
> still work with old VMs.
> 
> > > > Another question I'm interested in is whether there's actually a
> > > > performance benefit to using this as compared to just software
> > > > vhost. I note there's a VM exit on each IO access, so ... perhaps?
> > > > Would be nice to see some numbers.
> > > 
> > > At least a single trap compared with an entire per-packet SW flow
> > > undoubtedly uses a lot less CPU power in the hypervisor.
> >
> > Something like the shadow vq thing will be more or less equivalent
> > then?
> 
> Huh? It still has the entire netdev stack to go through on every
> packet before it reaches the real virtio device.

No - shadow vq just tweaks the descriptor and forwards it to
the modern vdpa hardware. No net stack involved.
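
Roughly, the forwarding looks like the sketch below (this is not QEMU's
actual SVQ code; the types and helpers are made up for illustration):
the guest's descriptor is copied into a shadow ring the device can see,
its address is translated, and the device is kicked directly.

#include <stdint.h>

/* Hypothetical types/helpers, for illustration only. */
struct svq_desc { uint64_t addr; uint32_t len; uint16_t flags; };
struct shadow_vq;
uint64_t svq_translate_addr(struct shadow_vq *svq, uint64_t gpa, uint32_t len);
void svq_ring_push(struct shadow_vq *svq, const struct svq_desc *desc);
void svq_device_kick(struct shadow_vq *svq);

/* Forward one guest descriptor through the shadow ring to the real
 * vdpa device: copy it, rewrite the address, publish, kick. No packet
 * ever traverses the host netdev stack on this path. */
static void svq_forward_one(struct shadow_vq *svq, const struct svq_desc *guest)
{
	struct svq_desc shadow = *guest;

	shadow.addr = svq_translate_addr(svq, guest->addr, guest->len);
	svq_ring_push(svq, &shadow);
	svq_device_kick(svq);
}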

> > That's upstream in qemu and needs no hardware support. Worth comparing
> > against.  Anyway, there's presumably actual hardware this was tested
> > with, so why guess? Just test and post numbers.
> 
> Our prior benchmarking put our VDPA/VFIO solutions at something like
> 2x-3x improvement over the qemu SW path it replaces.
> Parav said 10% is lost, so 10% of 3x is still 3x better :)
> 
> I thought we all agreed on this when vdpa was created in the first
> place, the all SW path was hopeless to get high performance out of?
> 
> Jason

That's not what I'm asking about though - not what shadow vq does,
shadow vq is a vdpa feature.



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 05:09:04PM +, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin 
> > Sent: Thursday, September 21, 2023 10:31 PM
> 
> > Another question I'm interested in is whether there's actually a performance
> > benefit to using this as compared to just software vhost. I note there's a 
> > VM exit
> > on each IO access, so ... perhaps?
> > Would be nice to see some numbers.
> 
> Packet rate and bandwidth are close to a modern device, only 10% lower, due
> to the batching of driver notifications.
> Bandwidth was tested with iperf with one and multiple queues.
> Packet rate was tested with testpmd.

Nice, good to know.  Could you compare this with vdpa with shadow vq
enabled?  That's probably the closest equivalent that needs
no kernel or hardware work.

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 02:07:09PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 01:01:12PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 01:52:24PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 10:43:50AM -0600, Alex Williamson wrote:
> > > 
> > > > > With that code in place, a legacy driver in the guest has the look and
> > > > > feel of a transitional device with legacy support for both its
> > > > > control and data path flows.
> > > > 
> > > > Why do we need to enable a "legacy" driver in the guest?  The very name
> > > > suggests there's an alternative driver that perhaps doesn't require
> > > > this I/O BAR.  Why don't we just require the non-legacy driver in the
> > > > guest rather than increase our maintenance burden?  Thanks,
> > > 
> > > It was my reaction also.
> > > 
> > > Apparently there is a big deployed base of people using old guest VMs
> > > with old drivers and they do not want to update their VMs. It is the
> > > same basic reason why qemu supports all those weird old machine types
> > > and HW emulations. The desire is to support these old devices so that
> > > old VMs can work unchanged.
> > > 
> > > Jason
> > 
> > And you are saying all these very old VMs use such a large number of
> > legacy devices that over-counting of locked memory due to vdpa not
> > correctly using iommufd is a problem that urgently needs to be solved,
> > otherwise the solution has no value?
> 
> No one has said that.
> 
> iommufd is gaining alot more functions than just pinned memory
> accounting.

Yea it's very useful - it's also useful for vdpa whether this patchset
goes in or not.  At some level, if vdpa can't keep up then maybe going
the vfio route is justified. I'm not sure why no one has fixed iommufd
yet - it looks like a small amount of work. I'll see if I can address it
quickly because we already have virtio accelerators under vdpa and it
seems confusing to people to use vdpa for some and vfio for others, with
overlapping but slightly incompatible functionality.  I'll get back next
week, in either case. I am however genuinely curious whether all the new
functionality is actually useful for these legacy guests.

> > Another question I'm interested in is whether there's actually a
> > performance benefit to using this as compared to just software
> > vhost. I note there's a VM exit on each IO access, so ... perhaps?
> > Would be nice to see some numbers.
> 
> At least a single trap compared with an entire per-packet SW flow
> undoubtedly uses a lot less CPU power in the hypervisor.
> 
> Jason

Something like the shadow vq thing will be more or less equivalent then?
That's upstream in qemu and needs no hardware support. Worth comparing
against.  Anyway, there's presumably actual hardware this was tested
with, so why guess? Just test and post numbers.

-- 
MST



RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Parav Pandit via Virtualization



> From: Michael S. Tsirkin 
> Sent: Thursday, September 21, 2023 10:31 PM

> Another question I'm interested in is whether there's actually a performance
> benefit to using this as compared to just software vhost. I note there's a VM 
> exit
> on each IO access, so ... perhaps?
> Would be nice to see some numbers.

Packet rate and bandwidth are close to a modern device, only 10% lower, due
to the batching of driver notifications.
Bandwidth was tested with iperf with one and multiple queues.
Packet rate was tested with testpmd.


Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 01:52:24PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 10:43:50AM -0600, Alex Williamson wrote:
> 
> > > With that code in place, a legacy driver in the guest has the look and
> > > feel of a transitional device with legacy support for both its
> > > control and data path flows.
> > 
> > Why do we need to enable a "legacy" driver in the guest?  The very name
> > suggests there's an alternative driver that perhaps doesn't require
> > this I/O BAR.  Why don't we just require the non-legacy driver in the
> > guest rather than increase our maintenance burden?  Thanks,
> 
> It was my reaction also.
> 
> Apparently there is a big deployed base of people using old guest VMs
> with old drivers and they do not want to update their VMs. It is the
> same basic reason why qemu supports all those weird old machine types
> and HW emulations. The desire is to support these old devices so that
> old VMs can work unchanged.
> 
> Jason

And you are saying all these very old VMs use such a large number of
legacy devices that over-counting of locked memory due to vdpa not
correctly using iommufd is a problem that urgently needs to be solved,
otherwise the solution has no value?

Another question I'm interested in is whether there's actually a
performance benefit to using this as compared to just software
vhost. I note there's a VM exit on each IO access, so ... perhaps?
Would be nice to see some numbers.


-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 01:41:39PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 10:16:04AM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 21, 2023 at 11:11:25AM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 21, 2023 at 09:16:21AM -0400, Michael S. Tsirkin wrote:
> > > 
> > > > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > > > index bf0f54c24f81..5098418c8389 100644
> > > > > --- a/MAINTAINERS
> > > > > +++ b/MAINTAINERS
> > > > > @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
> > > > >  S:   Maintained
> > > > >  F:   drivers/vfio/pci/mlx5/
> > > > >  
> > > > > +VFIO VIRTIO PCI DRIVER
> > > > > +M:   Yishai Hadas 
> > > > > +L:   k...@vger.kernel.org
> > > > > +S:   Maintained
> > > > > +F:   drivers/vfio/pci/virtio
> > > > > +
> > > > >  VFIO PCI DEVICE SPECIFIC DRIVERS
> > > > >  R:   Jason Gunthorpe 
> > > > >  R:   Yishai Hadas 
> > > > 
> > > > Tying two subsystems together like this is going to cause pain when
> > > > merging. God forbid there's something e.g. virtio net specific
> > > > (and there's going to be for sure) - now we are talking 3
> > > > subsystems.
> > > 
> > > Cross subsystem stuff is normal in the kernel.
> > 
> > Yea. But it's completely spurious here - virtio has its own way
> > to work with userspace, which is vdpa, so let's just use that.
> > Keeps things nice and contained.
> 
> vdpa is not vfio, I don't know how you can suggest vdpa is a
> replacement for a vfio driver. They are completely different
> things.
> Each side has its own strengths, and vfio especially is accelerating
> in its capability in a way that vdpa is not. E.g. if an iommufd conversion
> had been done by now for vdpa I might be more sympathetic.

Yea, I agree iommufd is a big problem with vdpa right now. Cindy was
sick and I didn't know and kept assuming she's working on this. I don't
think it's a huge amount of work though.  I'll take a look.
Is there anything else though? Do tell.

> Asking for
> someone else to do a huge amount of pointless work to improve vdpa
> just to get to the level this vfio driver is already at is ridiculous.
> 
> vdpa is great for certain kinds of HW, let it focus on that, don't try
> to paint it as an alternative to vfio. It isn't.
> 
> Jason

There are a bunch of things that I think are important for virtio
that are completely out of scope for vfio, such as migrating
cross-vendor. What is the huge amount of work I am asking you to do?



-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Alex Williamson
On Thu, 21 Sep 2023 15:40:40 +0300
Yishai Hadas  wrote:

> Introduce a vfio driver over virtio devices to support the legacy
> interface functionality for VFs.
> 
> Background, from the virtio spec [1].
> 
> In some systems, there is a need to support a virtio legacy driver with
> a device that does not directly support the legacy interface. In such
> scenarios, a group owner device can provide the legacy interface
> functionality for the group member devices. The driver of the owner
> device can then access the legacy interface of a member device on behalf
> of the legacy member device driver.
> 
> For example, with the SR-IOV group type, group members (VFs) can not
> present the legacy interface in an I/O BAR in BAR0 as expected by the
> legacy pci driver. If the legacy driver is running inside a virtual
> machine, the hypervisor executing the virtual machine can present a
> virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> legacy driver accesses to this I/O BAR and forwards them to the group
> owner device (PF) using group administration commands.
> 
> 
> Specifically, this driver adds support for a virtio-net VF to be exposed
> as a transitional device to a guest driver and allows the legacy IO BAR
> functionality on top.
> 
> This allows a VM that uses a legacy virtio-net driver in the guest to
> work transparently over a VF whose host driver is this new driver.
> 
> The driver can easily be extended to support other types of virtio
> devices (e.g. virtio-blk) by adding the type-specific properties in a
> few places, as was done for virtio-net.
> 
> For now, only the virtio-net use case has been tested, and as such we
> introduce support only for that device.
> 
> Practically,
> Upon probing a VF for a virtio-net device, in case its PF supports
> legacy access over the virtio admin commands and the VF doesn't have BAR
> 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a
> transitional device with I/O BAR in BAR 0.
> 
> The existence of the simulated I/O bar is reported later on by
> overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device
> exposes itself as a transitional device by overwriting some properties
> upon reading its config space.
> 
> Once we report the existence of I/O BAR as BAR 0 a legacy driver in the
> guest may use it via read/write calls according to the virtio
> specification.
> 
> Any read/write towards the control parts of the BAR will be captured by
> the new driver and will be translated into admin commands towards the
> device.
> 
> Any data path read/write access (i.e. virtio driver notifications) will
> be forwarded to the physical BAR, whose properties were supplied by the
> VIRTIO_PCI_QUEUE_NOTIFY command upon the probing/init flow.
> 
> With that code in place, a legacy driver in the guest has the look and
> feel of a transitional device with legacy support for both its control
> and data path flows.

Why do we need to enable a "legacy" driver in the guest?  The very name
suggests there's an alternative driver that perhaps doesn't require
this I/O BAR.  Why don't we just require the non-legacy driver in the
guest rather than increase our maintenance burden?  Thanks,

Alex

> 
> [1]
> https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> 
> Signed-off-by: Yishai Hadas 
> ---
>  MAINTAINERS  |   6 +
>  drivers/vfio/pci/Kconfig |   2 +
>  drivers/vfio/pci/Makefile|   2 +
>  drivers/vfio/pci/virtio/Kconfig  |  15 +
>  drivers/vfio/pci/virtio/Makefile |   4 +
>  drivers/vfio/pci/virtio/cmd.c|   4 +-
>  drivers/vfio/pci/virtio/cmd.h|   8 +
>  drivers/vfio/pci/virtio/main.c   | 546 +++
>  8 files changed, 585 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/vfio/pci/virtio/Kconfig
>  create mode 100644 drivers/vfio/pci/virtio/Makefile
>  create mode 100644 drivers/vfio/pci/virtio/main.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bf0f54c24f81..5098418c8389 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/vfio/pci/mlx5/
>  
> +VFIO VIRTIO PCI DRIVER
> +M:   Yishai Hadas 
> +L:   k...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/vfio/pci/virtio
> +
>  VFIO PCI DEVICE SPECIFIC DRIVERS
>  R:   Jason Gunthorpe 
>  R:   Yishai Hadas 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 8125e5f37832..18c397df566d 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig"
>  
>  source "drivers/vfio/pci/pds/Kconfig"
>  
> +source "drivers/vfio/pci/virtio/Kconfig"
> +
>  endmenu
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 45

Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 11:11:25AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 21, 2023 at 09:16:21AM -0400, Michael S. Tsirkin wrote:
> 
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index bf0f54c24f81..5098418c8389 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
> > >  S:   Maintained
> > >  F:   drivers/vfio/pci/mlx5/
> > >  
> > > +VFIO VIRTIO PCI DRIVER
> > > +M:   Yishai Hadas 
> > > +L:   k...@vger.kernel.org
> > > +S:   Maintained
> > > +F:   drivers/vfio/pci/virtio
> > > +
> > >  VFIO PCI DEVICE SPECIFIC DRIVERS
> > >  R:   Jason Gunthorpe 
> > >  R:   Yishai Hadas 
> > 
> > Tying two subsystems together like this is going to cause pain when
> > merging. God forbid there's something e.g. virtio net specific
> > (and there's going to be for sure) - now we are talking 3
> > subsystems.
> 
> Cross subsystem stuff is normal in the kernel.

Yea. But it's completely spurious here - virtio has its own way
to work with userspace, which is vdpa, so let's just use that.
Keeps things nice and contained.

> Drivers should be
> placed in their most logical spot - this driver exposes a VFIO
> interface so it belongs here.
> 
> Your exact argument works the same from the VFIO perspective, someone
> has to have code that belongs to them outside their little sphere
> here.
> 
> > Case in point: all other virtio drivers are nicely grouped, have a common
> > mailing list etc etc.  This one is completely separate to the point
> > where people won't even remember to copy the virtio mailing list.
> 
> The virtio mailing list should probably be added to the maintainers
> entry.
> 
> Jason



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin
On Thu, Sep 21, 2023 at 03:40:40PM +0300, Yishai Hadas wrote:
> +#define VIRTIO_LEGACY_IO_BAR_HEADER_LEN 20
> +#define VIRTIO_LEGACY_IO_BAR_MSIX_HEADER_LEN 4

This is exactly part of VIRTIO_PCI_CONFIG_OFF duplicated.
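
For reference, the existing VIRTIO_PCI_CONFIG_OFF macro in
include/uapi/linux/virtio_pci.h already encodes both lengths, 20 bytes
for the legacy header plus 4 more when MSI-X is enabled:

/* Per-driver config space starts after the legacy header: 24 bytes
 * with MSI-X enabled, 20 bytes without. */
#define VIRTIO_PCI_CONFIG_OFF(msix_enabled)	((msix_enabled) ? 24 : 20)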

-- 
MST



Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Michael S. Tsirkin


>  MAINTAINERS  |   6 +
>  drivers/vfio/pci/Kconfig |   2 +
>  drivers/vfio/pci/Makefile|   2 +
>  drivers/vfio/pci/virtio/Kconfig  |  15 +
>  drivers/vfio/pci/virtio/Makefile |   4 +
>  drivers/vfio/pci/virtio/cmd.c|   4 +-
>  drivers/vfio/pci/virtio/cmd.h|   8 +
>  drivers/vfio/pci/virtio/main.c   | 546 +++
>  8 files changed, 585 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/vfio/pci/virtio/Kconfig
>  create mode 100644 drivers/vfio/pci/virtio/Makefile
>  create mode 100644 drivers/vfio/pci/virtio/main.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bf0f54c24f81..5098418c8389 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -22624,6 +22624,12 @@ L:   k...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/vfio/pci/mlx5/
>  
> +VFIO VIRTIO PCI DRIVER
> +M:   Yishai Hadas 
> +L:   k...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/vfio/pci/virtio
> +
>  VFIO PCI DEVICE SPECIFIC DRIVERS
>  R:   Jason Gunthorpe 
>  R:   Yishai Hadas 

Tying two subsystems together like this is going to cause pain when
merging. God forbid there's something e.g. virtio net specific
(and there's going to be for sure) - now we are talking 3 subsystems.

Case in point: all other virtio drivers are nicely grouped, have a common
mailing list etc etc.  This one is completely separate to the point
where people won't even remember to copy the virtio mailing list.


diff --git a/drivers/vfio/pci/virtio/Kconfig b/drivers/vfio/pci/virtio/Kconfig
new file mode 100644
index ..89eddce8b1bd
--- /dev/null
+++ b/drivers/vfio/pci/virtio/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config VIRTIO_VFIO_PCI
+tristate "VFIO support for VIRTIO PCI devices"
+depends on VIRTIO_PCI
+select VFIO_PCI_CORE
+help
+  This provides support for exposing VIRTIO VF devices using the VFIO
+  framework that can work with a legacy virtio driver in the guest.
+  Based on PCIe spec, VFs do not support I/O Space; thus, VF BARs shall
+  not indicate I/O Space.
+  Because of that, this driver emulates an I/O BAR in software to let a VF be
+  seen as a transitional device in the guest and let it work with
+  a legacy driver.
+
+  If you don't know what to do here, say N.

I don't promise we'll remember to poke at vfio if we tweak something
in the virtio kconfig.

-- 
MST



[PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices

2023-09-21 Thread Yishai Hadas via Virtualization
Introduce a vfio driver over virtio devices to support the legacy
interface functionality for VFs.

Background, from the virtio spec [1].

In some systems, there is a need to support a virtio legacy driver with
a device that does not directly support the legacy interface. In such
scenarios, a group owner device can provide the legacy interface
functionality for the group member devices. The driver of the owner
device can then access the legacy interface of a member device on behalf
of the legacy member device driver.

For example, with the SR-IOV group type, group members (VFs) can not
present the legacy interface in an I/O BAR in BAR0 as expected by the
legacy pci driver. If the legacy driver is running inside a virtual
machine, the hypervisor executing the virtual machine can present a
virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
legacy driver accesses to this I/O BAR and forwards them to the group
owner device (PF) using group administration commands.


Specifically, this driver adds support for a virtio-net VF to be exposed
as a transitional device to a guest driver and allows the legacy IO BAR
functionality on top.

This allows a VM that uses a legacy virtio-net driver in the guest to
work transparently over a VF whose host driver is this new driver.

The driver can easily be extended to support other types of virtio
devices (e.g. virtio-blk) by adding the type-specific properties in a
few places, as was done for virtio-net.

For now, only the virtio-net use case has been tested, and as such we
introduce support only for that device.

Practically,
Upon probing a VF for a virtio-net device, in case its PF supports
legacy access over the virtio admin commands and the VF doesn't have BAR
0, we set some specific 'vfio_device_ops' to be able to simulate in SW a
transitional device with I/O BAR in BAR 0.

The existence of the simulated I/O bar is reported later on by
overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device
exposes itself as a transitional device by overwriting some properties
upon reading its config space.

Once we report the existence of I/O BAR as BAR 0 a legacy driver in the
guest may use it via read/write calls according to the virtio
specification.

Any read/write towards the control parts of the BAR will be captured by
the new driver and will be translated into admin commands towards the
device.

Any data path read/write access (i.e. virtio driver notifications) will
be forwarded to the physical BAR, whose properties were supplied by the
VIRTIO_PCI_QUEUE_NOTIFY command upon the probing/init flow.

With that code in place, a legacy driver in the guest has the look and
feel of a transitional device with legacy support for both its control
and data path flows.
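
As a rough illustration of that trap-and-translate flow (this is not the
actual code in this patch; the struct and the admin-command helper below
are hypothetical), a write handler for the emulated legacy I/O BAR could
be shaped like this:

#include <linux/io.h>
#include <linux/pci.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/virtio_pci.h>	/* VIRTIO_PCI_QUEUE_NOTIFY */

/* Hypothetical per-VF state; the real driver's structures differ. */
struct virtiovf_legacy_dev {
	void __iomem *notify_base;	/* mapped physical notify area of the VF */
	struct pci_dev *pf_dev;		/* group owner (PF) taking admin commands */
	int vf_id;			/* group member id */
};

/* Hypothetical helper wrapping a legacy-access admin command to the PF. */
int virtiovf_admin_legacy_io_write(struct pci_dev *pf, int vf_id,
				   loff_t off, size_t count, const u8 *data);

static ssize_t virtiovf_legacy_bar0_write(struct virtiovf_legacy_dev *vdev,
					  const char __user *buf,
					  size_t count, loff_t pos)
{
	u8 data[8];

	if (count > sizeof(data) || copy_from_user(data, buf, count))
		return -EFAULT;

	if (pos == VIRTIO_PCI_QUEUE_NOTIFY && count == 2) {
		u16 queue;

		/* Data path: forward the driver kick to the physical BAR. */
		memcpy(&queue, data, sizeof(queue));
		iowrite16(queue, vdev->notify_base);
	} else {
		/* Control path: translate into an admin command to the PF. */
		int ret = virtiovf_admin_legacy_io_write(vdev->pf_dev,
							 vdev->vf_id,
							 pos, count, data);
		if (ret)
			return ret;
	}

	return count;
}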

[1]
https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c

Signed-off-by: Yishai Hadas 
---
 MAINTAINERS  |   6 +
 drivers/vfio/pci/Kconfig |   2 +
 drivers/vfio/pci/Makefile|   2 +
 drivers/vfio/pci/virtio/Kconfig  |  15 +
 drivers/vfio/pci/virtio/Makefile |   4 +
 drivers/vfio/pci/virtio/cmd.c|   4 +-
 drivers/vfio/pci/virtio/cmd.h|   8 +
 drivers/vfio/pci/virtio/main.c   | 546 +++
 8 files changed, 585 insertions(+), 2 deletions(-)
 create mode 100644 drivers/vfio/pci/virtio/Kconfig
 create mode 100644 drivers/vfio/pci/virtio/Makefile
 create mode 100644 drivers/vfio/pci/virtio/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index bf0f54c24f81..5098418c8389 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22624,6 +22624,12 @@ L: k...@vger.kernel.org
 S: Maintained
 F: drivers/vfio/pci/mlx5/
 
+VFIO VIRTIO PCI DRIVER
+M: Yishai Hadas 
+L: k...@vger.kernel.org
+S: Maintained
+F: drivers/vfio/pci/virtio
+
 VFIO PCI DEVICE SPECIFIC DRIVERS
 R: Jason Gunthorpe 
 R: Yishai Hadas 
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 8125e5f37832..18c397df566d 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig"
 
 source "drivers/vfio/pci/pds/Kconfig"
 
+source "drivers/vfio/pci/virtio/Kconfig"
+
 endmenu
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 45167be462d8..046139a4eca5 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI)   += mlx5/
 obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
 
 obj-$(CONFIG_PDS_VFIO_PCI) += pds/
+
+obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/
diff --git a/drivers/vfio/pci/virtio/Kconfig b/drivers/vfio/pci/virtio/Kconfig
new file mode 100644
index ..89eddce8b1bd
--- /dev/null
+++ b/drivers/vfio/pci/virtio/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+confi