Re: [DRAFT RFC] ioeventfd/ioregionfd support in vfio-user

2021-01-28 Thread John Levon
On Tue, Jan 26, 2021 at 11:01:04AM +, Stefan Hajnoczi wrote:

> > 4) (ioregionfd issue) region_id is 4 bytes, which seems a little awkward 
> > from
> > the server side: this has to encode both the region ID as well as the 
> > offset of
> > the sub-region within that region. Can this be 64 bits wide?
> 
> The user_data field in Elena's ioregionfd patches is 64 bits. Does this
> solve the problem?

I had missed this had been changed from the proposal at 
https://www.spinics.net/lists/kvm/msg208139.html

Thanks for pointing this out.

regards
john


Re: [DRAFT RFC] ioeventfd/ioregionfd support in vfio-user

2021-01-26 Thread Stefan Hajnoczi
On Thu, Jan 21, 2021 at 04:09:09PM +, John Levon wrote:
> 
> Hi, please take a look. For context, this addition is against the current
> https://github.com/nutanix/libvfio-user/blob/master/docs/vfio-user.rst
> specification.
> 
> kvm@ readers: Stefan suggested this may be of interest from a VFIO point of
> view, in case there is any potential cross-over in defining how to wire up 
> these
> fds.

After talking to Alex Williamson about including these APIs in the
kernel VFIO ioctls it became clear to me that there is an architectural
difference between classic vfio-pci devices and vfio-user devices:

The current model with kernel VFIO is that the VMM sets up ioeventfds
using ioctl(VFIO_DEVICE_IOEVENTFD). The VMM has device-specific
knowledge that allows it to add ioeventfds for specific hardware
registers.

In vfio-user the VMM has no knowledge of the specific device. Instead
the device tells the VMM how to configure ioeventfds and ioregionfds.
There are a few reasons for this:

1. It is not practical to add knowledge of every vfio-user device
   implementation into the VMM. A cleaner solution is for the device to
   advertise its desired configuration to the VMM instead.

2. The device implementation may prefer a specific ioeventfd or
   ioregionfd configuration to fit its multi-threading architecture. A
   multi-threaded (multi-queue) device implementation may want
   ioeventfds and ioregionfds to be configured so that the right thread
   receives certain MMIO/PIO accesses.

3. The device implementation may want to specify ioregionfd's user_data
   value. This is the application-specific value that is passed along
   with an MMIO/PIO access.

I think mdev fits in somewhere between these two models. I can imagine
complex mdev devices wanting to control the ioeventfd and ioregionfd
configuration just like vfio-user. But in simple cases it doesn't
matter.

I suggest including John's ioeventfd/ioregionfd work into the kernel
VFIO interface so mdev devices can take advantage of it too.

> Note that this is a new message instead of adding a new region capability to
> VFIO_USER_DEVICE_GET_REGION_INFO: with a new capability, there's no way for 
> the
> server to know if ioeventfd/ioregionfd is actually used/requested by the 
> client
> (the client would just have to close those fds if it didn't want to use FDs). 
> An
> explicit new call is much clearer for this.
> 
> The ioregionfd feature itself is at proposal stage, so there's a good chance 
> of
> further changes depending on that.
> 
> I also have these pending issues so far:
> 
> 1) I'm not familiar with CCW bus, so don't know if
> KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY flag makes sense or is supportable in
> vfio-user context
> 
> 2) if ioregionfd subsumes all ioeventfd use cases, we can remove the 
> distinction
> from this API, and the client caller can translate to ioregionfd or ioeventfd 
> as
> needed
> 
> 3) is it neccessary for the client to indicate support (e.g. lacking 
> ioregionfd
> support) ?
> 
> 4) (ioregionfd issue) region_id is 4 bytes, which seems a little awkward from
> the server side: this has to encode both the region ID as well as the offset 
> of
> the sub-region within that region. Can this be 64 bits wide?

The user_data field in Elena's ioregionfd patches is 64 bits. Does this
solve the problem?

> 
> thanks
> john
> 
> (NB: I edited the diff so the new sections are more readable.)
> 
> diff --git a/docs/vfio-user.rst b/docs/vfio-user.rst
> index e3adc7a..e7274a2 100644
> --- a/docs/vfio-user.rst
> +++ b/docs/vfio-user.rst
> @@ -161,6 +161,17 @@ in the region info reply of a device-specific region. 
> Such regions are reflected
>  in ``struct vfio_device_info.num_regions``. Thus, for PCI devices this value 
> can
>  be equal to, or higher than, VFIO_PCI_NUM_REGIONS.
>  
> +Region I/O via file descriptors
> +---
> +
> +For unmapped regions, region I/O from the client is done via
> +VFIO_USER_REGION_READ/WRITE.  As an optimization, ioeventfds or ioregionfds 
> may
> +be configured for sub-regions of some regions. A client may request 
> information
> +on these sub-regions via VFIO_USER_DEVICE_GET_REGION_IO_FDS; by configuring 
> the
> +returned file descriptors as ioeventfds or ioregionfds, the server can be
> +directly notified of I/O (for example, by KVM) without taking a trip through 
> the
> +client.
> +
>  Interrupts
>  ^^
>  The client uses VFIO_USER_DEVICE_GET_IRQ_INFO messages to query the server 
> for
> @@ -293,37 +304,39 @@ Commands
>  The following table lists the VFIO message command IDs, and whether the
>  message command is sent from the client or the server.
>  
> -+--+-+---+
> -| Name | Command | Request Direction |
> -+==+=+===+
> -| VFIO_USER_VERSION| 1   | client -> server  |
> 

[DRAFT RFC] ioeventfd/ioregionfd support in vfio-user

2021-01-21 Thread John Levon


Hi, please take a look. For context, this addition is against the current
https://github.com/nutanix/libvfio-user/blob/master/docs/vfio-user.rst
specification.

kvm@ readers: Stefan suggested this may be of interest from a VFIO point of
view, in case there is any potential cross-over in defining how to wire up these
fds.

Note that this is a new message instead of adding a new region capability to
VFIO_USER_DEVICE_GET_REGION_INFO: with a new capability, there's no way for the
server to know if ioeventfd/ioregionfd is actually used/requested by the client
(the client would just have to close those fds if it didn't want to use FDs). An
explicit new call is much clearer for this.

The ioregionfd feature itself is at proposal stage, so there's a good chance of
further changes depending on that.

I also have these pending issues so far:

1) I'm not familiar with CCW bus, so don't know if
KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY flag makes sense or is supportable in
vfio-user context

2) if ioregionfd subsumes all ioeventfd use cases, we can remove the distinction
from this API, and the client caller can translate to ioregionfd or ioeventfd as
needed

3) is it neccessary for the client to indicate support (e.g. lacking ioregionfd
support) ?

4) (ioregionfd issue) region_id is 4 bytes, which seems a little awkward from
the server side: this has to encode both the region ID as well as the offset of
the sub-region within that region. Can this be 64 bits wide?

thanks
john

(NB: I edited the diff so the new sections are more readable.)

diff --git a/docs/vfio-user.rst b/docs/vfio-user.rst
index e3adc7a..e7274a2 100644
--- a/docs/vfio-user.rst
+++ b/docs/vfio-user.rst
@@ -161,6 +161,17 @@ in the region info reply of a device-specific region. Such 
regions are reflected
 in ``struct vfio_device_info.num_regions``. Thus, for PCI devices this value 
can
 be equal to, or higher than, VFIO_PCI_NUM_REGIONS.
 
+Region I/O via file descriptors
+---
+
+For unmapped regions, region I/O from the client is done via
+VFIO_USER_REGION_READ/WRITE.  As an optimization, ioeventfds or ioregionfds may
+be configured for sub-regions of some regions. A client may request information
+on these sub-regions via VFIO_USER_DEVICE_GET_REGION_IO_FDS; by configuring the
+returned file descriptors as ioeventfds or ioregionfds, the server can be
+directly notified of I/O (for example, by KVM) without taking a trip through 
the
+client.
+
 Interrupts
 ^^
 The client uses VFIO_USER_DEVICE_GET_IRQ_INFO messages to query the server for
@@ -293,37 +304,39 @@ Commands
 The following table lists the VFIO message command IDs, and whether the
 message command is sent from the client or the server.
 
-+--+-+---+
-| Name | Command | Request Direction |
-+==+=+===+
-| VFIO_USER_VERSION| 1   | client -> server  |
-+--+-+---+
-| VFIO_USER_DMA_MAP| 2   | client -> server  |
-+--+-+---+
-| VFIO_USER_DMA_UNMAP  | 3   | client -> server  |
-+--+-+---+
-| VFIO_USER_DEVICE_GET_INFO| 4   | client -> server  |
-+--+-+---+
-| VFIO_USER_DEVICE_GET_REGION_INFO | 5   | client -> server  |
-+--+-+---+
-| VFIO_USER_DEVICE_GET_IRQ_INFO| 6   | client -> server  |
-+--+-+---+
-| VFIO_USER_DEVICE_SET_IRQS| 7   | client -> server  |
-+--+-+---+
-| VFIO_USER_REGION_READ| 8   | client -> server  |
-+--+-+---+
-| VFIO_USER_REGION_WRITE   | 9   | client -> server  |
-+--+-+---+
-| VFIO_USER_DMA_READ   | 10  | server -> client  |
-+--+-+---+
-| VFIO_USER_DMA_WRITE  | 11  | server -> client  |
-+--+-+---+
-| VFIO_USER_VM_INTERRUPT   | 12  | server -> client  |
-+--+-+---+
-| VFIO_USER_DEVICE_RESET   | 13  | client -> server  |
-+--+-+---+
-| VFIO_USER_DIRTY_PAGES| 14  | client -> server  |
-+--+-+---+
+++-+---+
+| Name   | Command | Request Direction |