On Fri, Jul 07, 2023 at 12:27:19AM +0300, Parav Pandit wrote:
> This short series introduces legacy register access commands that let the
> group owner device access the legacy registers of its member devices.
> Currently this applies to PCI PF and VF devices. If any future SIOV devices
> need to support legacy registers, they can use the same commands with the
> group member identifiers of those SIOV devices.
>
> The overview, motivation, and use cases are described further below.

Cornelia, want to apply 1,2 as editorial?

> Patch summary:
> --------------
> patch-1 split rows of admin opcode tables by a line
> patch-2 fix section numbering
> patch-3 add legacy region access commands
>
> It uses the newly introduced administration command facility with 4 new
> commands and a new optional command to query the legacy notification region.
>
> Usecase:
>
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at a scale of thousands,
>    typically one to eight devices per VM.
>
> 2. A hypervisor/system needs to provide such devices using a
>    vendor-agnostic driver in the hypervisor system.
>
> 3. A hypervisor system prefers to have a single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM
>    (as transitional or otherwise).
>
> Motivation/Background:
> ----------------------
> The existing virtio transitional PCI device lacks support for
> PCI SR-IOV based devices. In practice it works only as a PCI PF
> or as a software emulated device. It has the system level
> limitations cited below:
>
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
>
> [b] CPU arch citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual:
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
>
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range will be
> aligned to a 4 KB boundary.
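
To put a number on [b] and [c] together: a rough sketch, assuming every
device that needs an I/O BAR sits behind its own bridge and so consumes at
least one 4 KB aligned window. The macro and function names are mine, for
illustration only.

```c
#include <stdint.h>

/* [b]: x86 I/O port space, ports 0 through FFFFH. */
#define IO_SPACE_BYTES  (64u * 1024u)
/* [c]: a bridge I/O address range is aligned to a 4 KB boundary. */
#define BRIDGE_IO_ALIGN (4u * 1024u)

/* Upper bound on bridge I/O windows that fit in the whole port space. */
static inline uint32_t max_bridge_io_windows(void)
{
    return IO_SPACE_BYTES / BRIDGE_IO_ALIGN;
}
```

That is 16 windows at best, before firmware and platform devices take
their share, which is nowhere near the thousands of devices in the use case.
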
>
> Overview:
> ---------
> The above use case requirements are solved by the PCI PF group owner
> accessing the legacy registers of its group member PCI VFs using an
> admin virtqueue of the group owner PCI PF.
>
> Two new admin virtqueue commands are added which read/write PCI VF
> registers.
>
> Software usage example:
> -----------------------
>
> 1. One way to use and map to the guest VM is by using the vfio driver
>    framework in the Linux kernel.
>
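>                  +----------------------+
>                  |pci_dev_id = 0x100X   |
> +----------------|pci_rev_id = 0x0      |-----+
> |vfio device     |BAR0 = I/O region     |     |
> |                |Other attributes      |     |
> |                +----------------------+     |
> |                                             |
> +   +--------------+     +-----------------+  |
> |   |I/O BAR to AQ |     | Other vfio      |  |
> |   |rd/wr mapper  |     | functionalities |  |
> |   +--------------+     +-----------------+  |
> |                                             |
> +------+-------------------------+------------+
>        |                         |
>   Legacy region         Driver notification
>     access                       |
>        |                         |
>   +----+------------+       +----+------------+
>   | +-----+         |       | PCI VF device A |
>   | | AQ  |-------------+---->+-------------+ |
>   | +-----+         |   |   | | legacy regs | |
>   | PCI PF device   |   |   | +-------------+ |
>   +-----------------+   |   +-----------------+
>                         |
>                         |   +----+------------+
>                         |   | PCI VF device N |
>                         +---->+-------------+ |
>                             | | legacy regs | |
>                             | +-------------+ |
>                             +-----------------+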
>
> 2. A virtio PCI driver can bind to the listed device ID and
>    use it in the host.
>
> 3. Use it in a lightweight hypervisor to run a bare-metal OS.
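
For the "I/O BAR to AQ rd/wr mapper" box in example 1, here is a minimal
sketch of how I read the idea, not the actual vfio code. Assumptions, all
mine: the four commands are read/write pairs for the common and the
device-specific legacy configuration, MSI-X is disabled so the legacy common
configuration occupies the first 20 bytes of BAR0, and the enum, struct and
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the real opcodes are defined by patch 3. */
enum legacy_cmd {
    LEGACY_COMMON_CFG_WRITE,
    LEGACY_COMMON_CFG_READ,
    LEGACY_DEV_CFG_WRITE,
    LEGACY_DEV_CFG_READ,
};

/* Hypothetical in-memory request, not the wire format. */
struct legacy_req {
    enum legacy_cmd cmd;
    uint64_t group_member_id; /* assumed: identifies the VF in the PF's group */
    uint32_t offset;          /* offset within the selected region */
    uint32_t len;             /* access width in bytes */
};

/* Legacy common configuration size with MSI-X disabled (assumed here). */
#define LEGACY_COMMON_CFG_SIZE 20u

/* Translate a trapped guest access to emulated BAR0 into an AQ request. */
static bool map_io_access(uint64_t vf_id, uint32_t bar_off, uint32_t len,
                          bool is_write, struct legacy_req *req)
{
    bool common = bar_off < LEGACY_COMMON_CFG_SIZE;

    /* Only 1, 2 and 4 byte port accesses are accepted. */
    if (len != 1 && len != 2 && len != 4)
        return false;
    /* Reject accesses that straddle the common/device boundary. */
    if (common && bar_off + len > LEGACY_COMMON_CFG_SIZE)
        return false;

    if (common)
        req->cmd = is_write ? LEGACY_COMMON_CFG_WRITE : LEGACY_COMMON_CFG_READ;
    else
        req->cmd = is_write ? LEGACY_DEV_CFG_WRITE : LEGACY_DEV_CFG_READ;
    req->group_member_id = vf_id;
    req->offset = common ? bar_off : bar_off - LEGACY_COMMON_CFG_SIZE;
    req->len = len;
    return true;
}
```

A 1-byte guest write at BAR0 offset 18 (device status) would then become a
common configuration write at offset 18 for that VF, submitted on the PF's
admin virtqueue.
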
>
> Please review.
>
> Alternatives considered:
>
> 1. Exposing BAR0 as an MMIO BAR that follows the legacy registers template
> Pros:
> a. Kind of works with legacy drivers, as some of them use an API
>    that is agnostic to MMIO vs. I/O BAR.
> b. Does not require hypervisor intervention.
> Cons:
> a. Device reset is extremely hard to implement in the device at scale, as
>    the driver does not wait for device reset completion.
> b. Device register width related problems persist, and the hypervisor
>    cannot fix them even if it wishes to.
>
>