On Fri, May 5, 2023 at 8:49 PM Parav Pandit <[email protected]> wrote:
>
>
>
> > From: Jason Wang <[email protected]>
> > Sent: Thursday, May 4, 2023 11:27 PM
> >
> > So the "single stack" is kind of misleading, you need a dedicated virtio
> > mediation layer which has different code path than a simpler vfio-pci which 
> > is
> > completely duplicated with vDPA subsystem.
> Huh. No, it is not duplicated.
> Vfio-pci provides the framework for extensions rather than being just plain
> vfio-pci.

I'm not sure how to define "simple" here; do you mean mdev?

> I am not debating here vdpa vs non vdpa yet again.
>
> > And you lose all the advantages of
> > vDPA in this way. The device should not be designed for a single type of
> > software stack; it needs to leave the decision to the hypervisor/cloud
> > vendors.
> >
> It is left to the hypervisor/cloud user to decide whether to use vdpa, vfio,
> or something else.
>
> >
> > >     virtio device type (net/blk) and be future compatible with a
> > >     single vfio stack using SR-IOV or other scalable device
> > >     virtualization technology to map PCI devices to the guest VM.
> > >     (as transitional or otherwise)
> > >
> > > Motivation/Background:
> > > ----------------------
> > > The existing virtio transitional PCI device is missing support for PCI
> > > SR-IOV based devices. In practice it does not work beyond the PCI PF,
> > > other than as a software-emulated device. It currently has the
> > > system-level limitations cited below:
> > >
> > > [a] PCIe spec citation:
> > > VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
> > >
> > > [b] CPU arch citation:
> > > Intel 64 and IA-32 Architectures Software Developer’s Manual:
> > > The processor’s I/O address space is separate and distinct from the
> > > physical-memory address space. The I/O address space consists of 64K
> > > individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> > >
> > > [c] PCIe spec citation:
> > > If a bridge implements an I/O address range,...I/O address range will
> > > be aligned to a 4 KB boundary.
> > >
> > > The above use-case requirements can be solved by the PCI PF group owner
> > > enabling access to the legacy registers of its group member PCI VFs
> > > through an admin virtqueue of the group owner PCI PF.
> > >
> > > Software usage example:
> > > -----------------------
> > > The most common way to use the device and map it to the guest VM is
> > > through the vfio driver framework in the Linux kernel.
> > >
> > >                  +----------------------+
> > >                  |pci_dev_id = 0x100X   |
> > > +---------------|pci_rev_id = 0x0      |-----+
> > > |vfio device    |BAR0 = I/O region     |     |
> > > |               |Other attributes      |     |
> > > |               +----------------------+     |
> > > |                                            |
> > > +   +--------------+     +-----------------+ |
> > > |   |I/O BAR to AQ |     | Other vfio      | |
> > > |   |rd/wr mapper  |     | functionalities | |
> > > |   +--------------+     +-----------------+ |
> > > |                                            |
> > > +------+-------------------------+-----------+
> > >         |                         |
> >
> >
> > So the mapper here is actually the control path mediation layer which
> > duplicates vDPA.
> >
> Yet again, no. It implements a PCI-level abstraction.
> It does not touch the whole QEMU layer and is not at all involved in the
> virtio device flow of understanding device reset, device config space, the
> cvq, feature bits and more.

I think you are missing the fact that, with the help of the general vdpa
device, QEMU can choose not to understand any of what you mention here.
Vhost-vDPA provides a much simpler device abstraction than vfio-pci.
If a cloud vendor wants a tiny/thin hypervisor layer, that can certainly be
done through vDPA.
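
To make the comparison concrete, here is a minimal sketch of what the "I/O
BAR to AQ rd/wr mapper" box above boils down to. All names below (the
aq_lreg_* helpers, lreg_bar0_access) are hypothetical placeholders invented
for illustration; they are not defined by this proposal or by any existing
driver.

#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers: in a real variant driver these would be backed by
 * legacy register read/write commands posted on the owner PF's admin
 * virtqueue, targeting the given group member (VF).
 */
static int aq_lreg_write(uint64_t vf_id, uint8_t offset, const void *val,
                         uint8_t size)
{
        (void)vf_id; (void)offset; (void)val; (void)size;
        return 0; /* placeholder */
}

static int aq_lreg_read(uint64_t vf_id, uint8_t offset, void *val,
                        uint8_t size)
{
        (void)vf_id; (void)offset; (void)val; (void)size;
        return 0; /* placeholder */
}

/*
 * Called when the guest reads or writes the emulated legacy I/O BAR0 of a
 * VF: the access is simply replayed over the PF's admin queue, with no
 * interpretation of the virtio semantics behind the register.
 */
int lreg_bar0_access(uint64_t vf_id, uint8_t offset, void *val,
                     uint8_t size, bool is_write)
{
        return is_write ? aq_lreg_write(vf_id, offset, val, size)
                        : aq_lreg_read(vf_id, offset, val, size);
}

Whether that thin translation lives in a vfio variant driver or behind
vhost-vDPA is exactly the trade-off being debated here.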

> All of these were discussed in v0; let's not repeat them.
>
> >
> > >    +----+------------+       +----+------------+
> > >    | +-----+         |       | PCI VF device A |
> > >    | | AQ  |-------------+---->+-------------+ |
> > >    | +-----+         |   |   | | legacy regs | |
> > >    | PCI PF device   |   |   | +-------------+ |
> > >    +-----------------+   |   +-----------------+
> > >                          |
> > >                          |   +----+------------+
> > >                          |   | PCI VF device N |
> > >                          +---->+-------------+ |
> > >                              | | legacy regs | |
> > >                              | +-------------+ |
> > >                              +-----------------+
> > >
> > > 2. Virtio pci driver to bind to the listed device id and
> > >     use it as native device in the host.
> >
> >
> > How can this be done now?
> >
> Currently a PCI VF binds to the virtio driver, and without any vdpa layering,
> virtio net/blk etc. devices are created on top of the virtio PCI VF device.
> I am not sure I understood your question.

I meant the current virtio-pci driver can use what you propose here.

>
> > > +\begin{lstlisting}
> > > +struct virtio_admin_cmd_lreg_wr_data {
> > > +   u8 offset; /* Starting byte offset of the register(s) to write */
> > > +   u8 size; /* Number of bytes to write into the register. */
> > > +   u8 registers[]; /* Register value(s) to be written, 'size' bytes */
> > > +};
> > > +\end{lstlisting}
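
For illustration only, a small user-space sketch of how a driver might build
this payload. The u8 typedef, the 255-byte bound derived from the u8 'size'
field, and the submit_lreg_wr() helper are assumptions made for the example;
only the structure layout itself comes from the proposal.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t u8;

struct virtio_admin_cmd_lreg_wr_data {
        u8 offset;      /* Starting byte offset of the register(s) to write */
        u8 size;        /* Number of bytes to write into the register */
        u8 registers[]; /* Register value(s) to be written, 'size' bytes */
};

/*
 * Hypothetical: hand the payload to the PF driver, which would place it in
 * the command-specific data of an admin virtqueue command addressed to the
 * group member vf_id.
 */
static int submit_lreg_wr(uint64_t vf_id, const void *data, size_t len)
{
        (void)vf_id; (void)data; (void)len;
        return 0; /* placeholder */
}

/* Write 'size' bytes starting at legacy register 'offset' of the given VF. */
int lreg_write(uint64_t vf_id, u8 offset, const void *val, u8 size)
{
        u8 buf[2 + 255]; /* fixed header plus the largest payload a u8 'size' can express */
        struct virtio_admin_cmd_lreg_wr_data *wr = (void *)buf;

        wr->offset = offset;
        wr->size = size;
        memcpy(wr->registers, val, size);
        return submit_lreg_wr(vf_id, buf, sizeof(*wr) + size);
}

The one-byte offset and size fields appear to rely on the emulated legacy I/O
register block being well under 256 bytes.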
> >
> >
> > So this actually implements a transport; I wonder if it would be better
> > (and simpler) to do it on top of the transport vq proposal:
> >
> > https://lists.oasis-open.org/archives/virtio-comment/202208/msg00003.html
> >
> I also wonder why TVQ cannot use AQ.

It can for sure, but whether a single virtqueue type should be used for both
administration and transport is still questionable.

>
> > Then it aligns with SIOV natively.
> >
> SIOV is not a well-defined spec; whenever it is defined, it can use the AQ or
> the TVQ.
>
> We also discussed that hypervisor mediation of the control path is not
> desired in some use cases, hence I will leave that discussion to the future,
> when SIOV arrives.

We need to plan ahead. We don't want to end up with a redundant design. For
example, this proposal is actually a partial transport implementation; the
transport virtqueue can do much better in this case.

Thanks

