Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-19 Thread Michael S. Tsirkin
On Fri, May 19, 2017 at 03:46:36PM +0800, Jason Wang wrote:
> 
> 
> On 2017年05月18日 16:43, Maxime Coquelin wrote:
> > > > 
> > > > +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the
> > > > slave, and the
> > > > +master initiated the slave to master communication channel using the
> > > > +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB
> > > > miss and access
> > > > +failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests
> > > > to the master
> > > > +with a struct vhost_iotlb_msg payload. For miss events, the
> > > > iotlb payload has
> > > > +to be filled with the miss message type (1), the I/O virtual
> > > > address and the
> > > > +permissions flags. For access failure event, the iotlb payload
> > > > has to be
> > > > +filled with the access failure message type (4), the I/O
> > > > virtual address and
> > > > +the permissions flags.
> > > 
> > > I don't think slave should cache invalid entries. If it does not,
> > > how can it detect access failure as opposed to a miss?
> > 
> > Of course, invalid cache entries should not be cached.
> > The VHOST_IOTLB_ACCESS_FAIL has been specified for the Kernel backend,
> > even if the latter does not implement it yet.
> 
> Yes, I leave this for future use e.g reporting copy_to_user() failure to
> userspace.
> 
> Thanks

Interesting. And it's not handled now.
So let's add a text "reserved for reporting internal access
errors in the future. Should not be used for now.".

-- 
MST



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-19 Thread Maxime Coquelin



On 05/19/2017 08:48 AM, Jason Wang wrote:



On 2017年05月17日 22:10, Maxime Coquelin wrote:



On 05/17/2017 04:53 AM, Jason Wang wrote:



On 2017年05月16日 23:16, Michael S. Tsirkin wrote:

On Mon, May 15, 2017 at 01:45:28PM +0800, Jason Wang wrote:


On 2017年05月13日 08:02, Michael S. Tsirkin wrote:

On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:

On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
docs/specs/vhost-user.txt | 75 
+++

hw/virtio/vhost-user.c| 31 
2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
   log offset: offset from start of supplied file descriptor
   where logging starts (i.e. where guest address 0 
would be logged)

+ * An IOTLB message
+ -
+   | iova | size | user address | permissions flags | type |
+ -
+
+   IOVA: a 64-bit guest I/O virtual address

guest -> VM

Ok.


+   Size: a 64-bit size

How do you specify "all memory"? give special meaning to size 0?

Good point, it does not support all memory currently.
It is not vhost-user specific, but general to the vhost 
implementation.

But iommu needs it to support passthrough.

Probably not, we will just pass the mappings in vhost_memory_region to
vhost. Its memory_size is also a __u64.

Thanks

That's different since that's chunks of qemu virtual memory.

IOMMU maps IOVA to GPA.



But we're in fact cache IOVA -> HVA mapping in the remote IOTLB. When 
passthrough mode is enabled, IOVA == GPA, so passing mappings in 
vhost_memory_region should be fine.


Not sure this is a good idea, because when configured in passthrough,
QEMU will see the IOMMU as enabled, so the the VIRTIO_F_IOMMU_PLATFORM
feature will be negotiated if both guest and backend support it.
So how the backend will know whether it should directly pick the
translation directly into the vhost_memory_region, or translate it
through the device IOTLB?


This no need for backend to know about this, since IOVA equals to GPA, 
vhost_memory_region stores IOVA -> HVA mapping. If we pass them, device 
IOTLB should work as usual?


Ok, I think there were a misunderstanding. I understood you said there
were no need to use the device IOTLB in this case.





Maybe the solution would be for QEMU to wrap "all memory" IOTLB updates
& invalidations to vhost_memory_regions, since the backend won't anyway
be able to perform accesses outside these regions?


This is just what I mean, you can refer Peter's series. >


The only possible "issue" with "all memory" is if you can not use a 
single TLB invalidation to invalidate all caches in remote TLB.


If needed, maybe we could introduce a new VHOST_IOTLB_INVALIDATE message
type?
For older kernel backend that doesn't support it, -EINVAL will be
returned, so QEMU could handle it another way in this case.


We could, but not sure it was really needed.


I meant VHOST_IOTLB_INVALIDATE_ALL, and yes, I'm not sure this is
needed. But this is an option we have it turns out to be at some point.

Thanks,
Maxime



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-19 Thread Jason Wang



On 2017年05月18日 16:43, Maxime Coquelin wrote:


+When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, 
and the

+master initiated the slave to master communication channel using the
+VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss 
and access
+failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to 
the master
+with a struct vhost_iotlb_msg payload. For miss events, the iotlb 
payload has
+to be filled with the miss message type (1), the I/O virtual 
address and the
+permissions flags. For access failure event, the iotlb payload has 
to be
+filled with the access failure message type (4), the I/O virtual 
address and

+the permissions flags.


I don't think slave should cache invalid entries. If it does not,
how can it detect access failure as opposed to a miss?


Of course, invalid cache entries should not be cached.
The VHOST_IOTLB_ACCESS_FAIL has been specified for the Kernel backend,
even if the latter does not implement it yet. 


Yes, I leave this for future use e.g reporting copy_to_user() failure to 
userspace.


Thanks



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-18 Thread Jason Wang



On 2017年05月17日 22:10, Maxime Coquelin wrote:



On 05/17/2017 04:53 AM, Jason Wang wrote:



On 2017年05月16日 23:16, Michael S. Tsirkin wrote:

On Mon, May 15, 2017 at 01:45:28PM +0800, Jason Wang wrote:


On 2017年05月13日 08:02, Michael S. Tsirkin wrote:

On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:

On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
docs/specs/vhost-user.txt | 75 
+++

hw/virtio/vhost-user.c| 31 
2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
   log offset: offset from start of supplied file descriptor
   where logging starts (i.e. where guest address 0 
would be logged)

+ * An IOTLB message
+ -
+   | iova | size | user address | permissions flags | type |
+ -
+
+   IOVA: a 64-bit guest I/O virtual address

guest -> VM

Ok.


+   Size: a 64-bit size

How do you specify "all memory"? give special meaning to size 0?

Good point, it does not support all memory currently.
It is not vhost-user specific, but general to the vhost 
implementation.

But iommu needs it to support passthrough.

Probably not, we will just pass the mappings in vhost_memory_region to
vhost. Its memory_size is also a __u64.

Thanks

That's different since that's chunks of qemu virtual memory.

IOMMU maps IOVA to GPA.



But we're in fact cache IOVA -> HVA mapping in the remote IOTLB. When 
passthrough mode is enabled, IOVA == GPA, so passing mappings in 
vhost_memory_region should be fine.


Not sure this is a good idea, because when configured in passthrough,
QEMU will see the IOMMU as enabled, so the the VIRTIO_F_IOMMU_PLATFORM
feature will be negotiated if both guest and backend support it.
So how the backend will know whether it should directly pick the
translation directly into the vhost_memory_region, or translate it
through the device IOTLB?


This no need for backend to know about this, since IOVA equals to GPA, 
vhost_memory_region stores IOVA -> HVA mapping. If we pass them, device 
IOTLB should work as usual?




Maybe the solution would be for QEMU to wrap "all memory" IOTLB updates
& invalidations to vhost_memory_regions, since the backend won't anyway
be able to perform accesses outside these regions?


This is just what I mean, you can refer Peter's series.



The only possible "issue" with "all memory" is if you can not use a 
single TLB invalidation to invalidate all caches in remote TLB.


If needed, maybe we could introduce a new VHOST_IOTLB_INVALIDATE message
type?
For older kernel backend that doesn't support it, -EINVAL will be
returned, so QEMU could handle it another way in this case.


We could, but not sure it was really needed.



But this is only theoretical problem since it only happen when we 
have a 1 byte mapping [2^64 - 1, 2^64) cached in remote TLB. Consider:


- E.g intel IOMMU has a range limitation for invalidation (1G currently)
- Looks like all existed IOMMU use page aligned mappings

It was probably not a big issue. And for safety we could use two 
invalidations to make sure all caches were flushed remotely. Or just 
change the protocol from start, size to start, end. Vhost-kernel is 
probably too late for this change, but I'm still not quite sure it is 
worthwhile.


I'm not for diverging the protocol between kernel & user backends.

Thanks,
Maxime


Yes, agree.

Thanks




Thanks





Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-18 Thread Maxime Coquelin



On 05/17/2017 06:48 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
  docs/specs/vhost-user.txt | 75 +++
  hw/virtio/vhost-user.c| 31 
  2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
 log offset: offset from start of supplied file descriptor
 where logging starts (i.e. where guest address 0 would be logged)
  
+ * An IOTLB message

+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit guest I/O virtual address
+   Size: a 64-bit size
+   User address: a 64-bit user address
+   Permissions flags: a 8-bit bit field:
+- Bit 0: Read access
+- Bit 1: Write access
+   Type: a 8-bit IOTLB message type:
+- 1: IOTLB miss
+- 2: IOTLB update
+- 3: IOTLB invalidate
+- 4: IOTLB access fail
+
  In QEMU the vhost-user message is implemented with the following struct:
  
  typedef struct VhostUserMsg {

@@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
  struct vhost_vring_addr addr;
  VhostUserMemory memory;
  VhostUserLog log;
+struct vhost_iotlb_msg iotlb;
  };
  } QEMU_PACKED VhostUserMsg;
  
@@ -253,6 +271,31 @@ Once the source has finished migration, rings will be stopped by

  the source. No further update must be done before rings are
  restarted.
  
+IOMMU support

+-
+
+When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
+to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
+requests to the slave with a struct vhost_iotlb_msg payload. For update events,
+the iotlb payload has to be filled with the update message type (2), the I/O
+virtual address, the size, the user virtual address, and the permissions
+flags. For invalidation events, the iotlb payload has to be filled with the
+invalidation message type (3), the I/O virtual address and the size. On
+success, the slave is expected to reply with a zero payload, non-zero
+otherwise.
+
+When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
+master initiated the slave to master communication channel using the
+VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
+failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to the master
+with a struct vhost_iotlb_msg payload. For miss events, the iotlb payload has
+to be filled with the miss message type (1), the I/O virtual address and the
+permissions flags. For access failure event, the iotlb payload has to be
+filled with the access failure message type (4), the I/O virtual address and
+the permissions flags.


I don't think slave should cache invalid entries. If it does not,
how can it detect access failure as opposed to a miss?


Of course, invalid cache entries should not be cached.
The VHOST_IOTLB_ACCESS_FAIL has been specified for the Kernel backend,
even if the latter does not implement it yet.

In the case of user backend, I think it could be used without caching
entries.

After the backend sends a miss requests, the master will send an update
requests.

If for some reasons, the update entry is invalid (e.g. is outside the
memory ranges shared by set_mem_table), it won't be inserted into the
cache. So, when looking again at the cache, the backend will still face
a miss, and so can notify the master of the failing access.

That said, I specified this message type to mimic the kernel backend, I
don't what the master could do of such information.




For synchronization purpose, the slave may rely on the
+reply-ack feature, so the master may send a reply when operation is completed


What does completed mean in this context?


Completed means either the master has sent the IOTLB entry update to the
slave through the master->slave channel, or the IOVA is invalid (so
nothing sent).




+if the reply-ack feature is negotiated and slaves requests a reply.
+


This is not very clear to me.
So slave sends an access to master. Master finds a pte that
overlaps. What does it send to guest? Initial PTE?
All PTEs to cover the request? Part of the PTE that overlaps
the request?


Actually, the slave only sends the IOVA and access permissions, not the
size. So the master will send the PTE overlap

Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-17 Thread Michael S. Tsirkin
On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:
> This patch specifies and implements the master/slave communication
> to support device IOTLB in slave.
> 
> The vhost_iotlb_msg structure introduced for kernel backends is
> re-used, making the design close between the two backends.
> 
> An exception is the use of the secondary channel to enable the
> slave to send IOTLB miss requests to the master.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  docs/specs/vhost-user.txt | 75 
> +++
>  hw/virtio/vhost-user.c| 31 
>  2 files changed, 106 insertions(+)
> 
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 5fa7016..4a1f0c3 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
> log offset: offset from start of supplied file descriptor
> where logging starts (i.e. where guest address 0 would be logged)
>  
> + * An IOTLB message
> +   -
> +   | iova | size | user address | permissions flags | type |
> +   -
> +
> +   IOVA: a 64-bit guest I/O virtual address
> +   Size: a 64-bit size
> +   User address: a 64-bit user address
> +   Permissions flags: a 8-bit bit field:
> +- Bit 0: Read access
> +- Bit 1: Write access
> +   Type: a 8-bit IOTLB message type:
> +- 1: IOTLB miss
> +- 2: IOTLB update
> +- 3: IOTLB invalidate
> +- 4: IOTLB access fail
> +
>  In QEMU the vhost-user message is implemented with the following struct:
>  
>  typedef struct VhostUserMsg {
> @@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
>  struct vhost_vring_addr addr;
>  VhostUserMemory memory;
>  VhostUserLog log;
> +struct vhost_iotlb_msg iotlb;
>  };
>  } QEMU_PACKED VhostUserMsg;
>  
> @@ -253,6 +271,31 @@ Once the source has finished migration, rings will be 
> stopped by
>  the source. No further update must be done before rings are
>  restarted.
>  
> +IOMMU support
> +-
> +
> +When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
> +to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
> +requests to the slave with a struct vhost_iotlb_msg payload. For update 
> events,
> +the iotlb payload has to be filled with the update message type (2), the I/O
> +virtual address, the size, the user virtual address, and the permissions
> +flags. For invalidation events, the iotlb payload has to be filled with the
> +invalidation message type (3), the I/O virtual address and the size. On
> +success, the slave is expected to reply with a zero payload, non-zero
> +otherwise.
> +
> +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
> +master initiated the slave to master communication channel using the
> +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
> +failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to the master
> +with a struct vhost_iotlb_msg payload. For miss events, the iotlb payload has
> +to be filled with the miss message type (1), the I/O virtual address and the
> +permissions flags. For access failure event, the iotlb payload has to be
> +filled with the access failure message type (4), the I/O virtual address and
> +the permissions flags.

I don't think slave should cache invalid entries. If it does not,
how can it detect access failure as opposed to a miss?

> For synchronization purpose, the slave may rely on the
> +reply-ack feature, so the master may send a reply when operation is completed

What does completed mean in this context?

> +if the reply-ack feature is negotiated and slaves requests a reply.
> +

This is not very clear to me.

So slave sends an access to master. Master finds a pte that
overlaps. What does it send to guest? Initial PTE?
All PTEs to cover the request? Part of the PTE that overlaps
the request?


>  Slave communication
>  ---
>  
> @@ -514,6 +557,38 @@ Master message types
>If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
>with zero for success, non-zero otherwise.
>  
> + * VHOST_USER_IOTLB_MSG
> +
> +  Id: 22
> +  Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
> +  Master payload: struct vhost_iotlb_msg
> +  Slave payload: u64
> +
> +  Send IOTLB messages with struct vhost_iotlb_msg as payload.
> +  Master sends such requests to update and invalidate entries in the 
> device
> +  IOTLB. The slave has to acknowledge the request with sending zero as 
> u64
> +  payload for success, non-zero otherwise.
> +  This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
> +  has been successfully negotiated.
> +
> +Slave message types
> +---
> +
> + * VHOST_USER_SLAVE_IOTL

Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-17 Thread Maxime Coquelin



On 05/13/2017 02:02 AM, Michael S. Tsirkin wrote:

@@ -253,6 +271,31 @@ Once the source has finished migration, rings will be 
stopped by
   the source. No further update must be done before rings are
   restarted.
+IOMMU support
+-
+
+When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
+to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
+requests to the slave with a struct vhost_iotlb_msg payload.

Always? This seems a bit strange since iommu can be enabled/disabled
dynamically.

Ok, what about:

When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated and iommu
is enbaled, the master sends IOTLB entries update & invalidation via
VHOST_USER_IOTLB_MSG requests to the slave with a struct vhost_iotlb_msg
payload.



Closing channel seems like a wrong thing to do for this.

Sorry, I'm not sure to get your comment.

What happens when guest disables the IOMMU?


I think you mean for example when rebooting the guest with IOMMU
disabled. In this case, I think that the user backend should clean its
IOTLB cache on the next SET_FEATURE call, as the kernel backend does.

Note that I was wrong when stating:
"When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated and iommu
is enabled..."

Actually, VIRTIO_F_IOMMU_PLATFORM is negotiated if both supported by the
backend and the guest, and if an iommu device is attached. If the guest
kernel doesn't enable the IOMMU (e.g. intel_iommu=off in kernel
cmdline), the master replies to IOTLB misses with identity IOTLB
entries updates (Passthrough) (it requires recent fixes from Peter to
work[0]).

Maxime

[0]: http://patchwork.ozlabs.org/patch/763379/



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-17 Thread Maxime Coquelin



On 05/17/2017 04:53 AM, Jason Wang wrote:



On 2017年05月16日 23:16, Michael S. Tsirkin wrote:

On Mon, May 15, 2017 at 01:45:28PM +0800, Jason Wang wrote:


On 2017年05月13日 08:02, Michael S. Tsirkin wrote:

On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:

On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
docs/specs/vhost-user.txt | 75 
+++

hw/virtio/vhost-user.c| 31 
2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
   log offset: offset from start of supplied file descriptor
   where logging starts (i.e. where guest address 0 would 
be logged)

+ * An IOTLB message
+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit guest I/O virtual address

guest -> VM

Ok.


+   Size: a 64-bit size

How do you specify "all memory"? give special meaning to size 0?

Good point, it does not support all memory currently.
It is not vhost-user specific, but general to the vhost 
implementation.

But iommu needs it to support passthrough.

Probably not, we will just pass the mappings in vhost_memory_region to
vhost. Its memory_size is also a __u64.

Thanks

That's different since that's chunks of qemu virtual memory.

IOMMU maps IOVA to GPA.



But we're in fact cache IOVA -> HVA mapping in the remote IOTLB. When 
passthrough mode is enabled, IOVA == GPA, so passing mappings in 
vhost_memory_region should be fine.


Not sure this is a good idea, because when configured in passthrough,
QEMU will see the IOMMU as enabled, so the the VIRTIO_F_IOMMU_PLATFORM
feature will be negotiated if both guest and backend support it.
So how the backend will know whether it should directly pick the
translation directly into the vhost_memory_region, or translate it
through the device IOTLB?

Maybe the solution would be for QEMU to wrap "all memory" IOTLB updates
& invalidations to vhost_memory_regions, since the backend won't anyway
be able to perform accesses outside these regions?

The only possible "issue" with "all memory" is if you can not use a 
single TLB invalidation to invalidate all caches in remote TLB.


If needed, maybe we could introduce a new VHOST_IOTLB_INVALIDATE message
type?
For older kernel backend that doesn't support it, -EINVAL will be
returned, so QEMU could handle it another way in this case.

But this 
is only theoretical problem since it only happen when we have a 1 byte 
mapping [2^64 - 1, 2^64) cached in remote TLB. Consider:


- E.g intel IOMMU has a range limitation for invalidation (1G currently)
- Looks like all existed IOMMU use page aligned mappings

It was probably not a big issue. And for safety we could use two 
invalidations to make sure all caches were flushed remotely. Or just 
change the protocol from start, size to start, end. Vhost-kernel is 
probably too late for this change, but I'm still not quite sure it is 
worthwhile.


I'm not for diverging the protocol between kernel & user backends.

Thanks,
Maxime


Thanks




Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-16 Thread Jason Wang



On 2017年05月16日 23:16, Michael S. Tsirkin wrote:

On Mon, May 15, 2017 at 01:45:28PM +0800, Jason Wang wrote:


On 2017年05月13日 08:02, Michael S. Tsirkin wrote:

On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:

On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
docs/specs/vhost-user.txt | 75 
+++
hw/virtio/vhost-user.c| 31 
2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
   log offset: offset from start of supplied file descriptor
   where logging starts (i.e. where guest address 0 would be logged)
+ * An IOTLB message
+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit guest I/O virtual address

guest -> VM

Ok.


+   Size: a 64-bit size

How do you specify "all memory"? give special meaning to size 0?

Good point, it does not support all memory currently.
It is not vhost-user specific, but general to the vhost implementation.

But iommu needs it to support passthrough.

Probably not, we will just pass the mappings in vhost_memory_region to
vhost. Its memory_size is also a __u64.

Thanks

That's different since that's chunks of qemu virtual memory.

IOMMU maps IOVA to GPA.



But we're in fact cache IOVA -> HVA mapping in the remote IOTLB. When 
passthrough mode is enabled, IOVA == GPA, so passing mappings in 
vhost_memory_region should be fine.


The only possible "issue" with "all memory" is if you can not use a 
single TLB invalidation to invalidate all caches in remote TLB. But this 
is only theoretical problem since it only happen when we have a 1 byte 
mapping [2^64 - 1, 2^64) cached in remote TLB. Consider:


- E.g intel IOMMU has a range limitation for invalidation (1G currently)
- Looks like all existed IOMMU use page aligned mappings

It was probably not a big issue. And for safety we could use two 
invalidations to make sure all caches were flushed remotely. Or just 
change the protocol from start, size to start, end. Vhost-kernel is 
probably too late for this change, but I'm still not quite sure it is 
worthwhile.


Thanks



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-16 Thread Michael S. Tsirkin
On Mon, May 15, 2017 at 01:45:28PM +0800, Jason Wang wrote:
> 
> 
> On 2017年05月13日 08:02, Michael S. Tsirkin wrote:
> > On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:
> > > 
> > > On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:
> > > > On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:
> > > > > This patch specifies and implements the master/slave communication
> > > > > to support device IOTLB in slave.
> > > > > 
> > > > > The vhost_iotlb_msg structure introduced for kernel backends is
> > > > > re-used, making the design close between the two backends.
> > > > > 
> > > > > An exception is the use of the secondary channel to enable the
> > > > > slave to send IOTLB miss requests to the master.
> > > > > 
> > > > > Signed-off-by: Maxime Coquelin 
> > > > > ---
> > > > >docs/specs/vhost-user.txt | 75 
> > > > > +++
> > > > >hw/virtio/vhost-user.c| 31 
> > > > >2 files changed, 106 insertions(+)
> > > > > 
> > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > > > > index 5fa7016..4a1f0c3 100644
> > > > > --- a/docs/specs/vhost-user.txt
> > > > > +++ b/docs/specs/vhost-user.txt
> > > > > @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
> > > > >   log offset: offset from start of supplied file descriptor
> > > > >   where logging starts (i.e. where guest address 0 would be 
> > > > > logged)
> > > > > + * An IOTLB message
> > > > > +   -
> > > > > +   | iova | size | user address | permissions flags | type |
> > > > > +   -
> > > > > +
> > > > > +   IOVA: a 64-bit guest I/O virtual address
> > > > guest -> VM
> > > Ok.
> > > 
> > > > > +   Size: a 64-bit size
> > > > How do you specify "all memory"? give special meaning to size 0?
> > > Good point, it does not support all memory currently.
> > > It is not vhost-user specific, but general to the vhost implementation.
> > But iommu needs it to support passthrough.
> 
> Probably not, we will just pass the mappings in vhost_memory_region to
> vhost. Its memory_size is also a __u64.
> 
> Thanks

That's different since that's chunks of qemu virtual memory.

IOMMU maps IOVA to GPA.

-- 
MST



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-16 Thread Michael S. Tsirkin
On Tue, May 16, 2017 at 10:19:24AM +0200, Maxime Coquelin wrote:
> 
> 
> On 05/12/2017 04:21 PM, Maxime Coquelin wrote:
> > 
> > 
> > On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:
> > > On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:
> > > > This patch specifies and implements the master/slave communication
> > > > to support device IOTLB in slave.
> > > > 
> > > > The vhost_iotlb_msg structure introduced for kernel backends is
> > > > re-used, making the design close between the two backends.
> > > > 
> > > > An exception is the use of the secondary channel to enable the
> > > > slave to send IOTLB miss requests to the master.
> > > > 
> > > > Signed-off-by: Maxime Coquelin 
> > > > ---
> > > >   docs/specs/vhost-user.txt | 75
> > > > +++
> > > >   hw/virtio/vhost-user.c| 31 
> > > >   2 files changed, 106 insertions(+)
> > > > 
> > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > > > index 5fa7016..4a1f0c3 100644
> > > > --- a/docs/specs/vhost-user.txt
> > > > +++ b/docs/specs/vhost-user.txt
> > > > @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
> > > >  log offset: offset from start of supplied file descriptor
> > > >  where logging starts (i.e. where guest address 0 would
> > > > be logged)
> > > > + * An IOTLB message
> > > > +   -
> > > > +   | iova | size | user address | permissions flags | type |
> > > > +   -
> > > > +
> > > > +   IOVA: a 64-bit guest I/O virtual address
> > > 
> > > guest -> VM
> > 
> > Ok.
> 
> It seems that VM is never used in the doc, but guest is:
> $ grep -i guest docs/specs/vhost-user.txt | wc -l
> 13
> grep -i vm docs/specs/vhost-user.txt | wc -l
> 0
> 
> I think I should keep "guest" for consistency?
> 
> Thanks,
> Maxime

"I/O virtual address programmed by the guest" ?




Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-16 Thread Maxime Coquelin



On 05/12/2017 04:21 PM, Maxime Coquelin wrote:



On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
  docs/specs/vhost-user.txt | 75 
+++

  hw/virtio/vhost-user.c| 31 
  2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
 log offset: offset from start of supplied file descriptor
 where logging starts (i.e. where guest address 0 would be 
logged)

+ * An IOTLB message
+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit guest I/O virtual address


guest -> VM


Ok.


It seems that VM is never used in the doc, but guest is:
$ grep -i guest docs/specs/vhost-user.txt | wc -l
13
grep -i vm docs/specs/vhost-user.txt | wc -l
0

I think I should keep "guest" for consistency?

Thanks,
Maxime



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-14 Thread Jason Wang



On 2017年05月13日 08:02, Michael S. Tsirkin wrote:

On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:


On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
   docs/specs/vhost-user.txt | 75 
+++
   hw/virtio/vhost-user.c| 31 
   2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
  log offset: offset from start of supplied file descriptor
  where logging starts (i.e. where guest address 0 would be logged)
+ * An IOTLB message
+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit guest I/O virtual address

guest -> VM

Ok.


+   Size: a 64-bit size

How do you specify "all memory"? give special meaning to size 0?

Good point, it does not support all memory currently.
It is not vhost-user specific, but general to the vhost implementation.

But iommu needs it to support passthrough.


Probably not, we will just pass the mappings in vhost_memory_region to 
vhost. Its memory_size is also a __u64.


Thanks



Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-12 Thread Michael S. Tsirkin
On Fri, May 12, 2017 at 04:21:58PM +0200, Maxime Coquelin wrote:
> 
> 
> On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:
> > On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:
> > > This patch specifies and implements the master/slave communication
> > > to support device IOTLB in slave.
> > > 
> > > The vhost_iotlb_msg structure introduced for kernel backends is
> > > re-used, making the design close between the two backends.
> > > 
> > > An exception is the use of the secondary channel to enable the
> > > slave to send IOTLB miss requests to the master.
> > > 
> > > Signed-off-by: Maxime Coquelin 
> > > ---
> > >   docs/specs/vhost-user.txt | 75 
> > > +++
> > >   hw/virtio/vhost-user.c| 31 
> > >   2 files changed, 106 insertions(+)
> > > 
> > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > > index 5fa7016..4a1f0c3 100644
> > > --- a/docs/specs/vhost-user.txt
> > > +++ b/docs/specs/vhost-user.txt
> > > @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
> > >  log offset: offset from start of supplied file descriptor
> > >  where logging starts (i.e. where guest address 0 would be logged)
> > > + * An IOTLB message
> > > +   -
> > > +   | iova | size | user address | permissions flags | type |
> > > +   -
> > > +
> > > +   IOVA: a 64-bit guest I/O virtual address
> > 
> > guest -> VM
> 
> Ok.
> 
> > 
> > > +   Size: a 64-bit size
> > 
> > How do you specify "all memory"? give special meaning to size 0?
> 
> Good point, it does not support all memory currently.
> It is not vhost-user specific, but general to the vhost implementation.

But iommu needs it to support passthrough.

> 
> > > +   User address: a 64-bit user address
> > > +   Permissions flags: a 8-bit bit field:
> > > +- Bit 0: Read access
> > > +- Bit 1: Write access
> > 
> > Can both bits be set? Can none?
> 
> Both. I will change it by listing values directly:
>  - 0 : No access
>  - 1 : Read
>  - 2 : Write
>  - 3 : Read Write
> 
> > > +   Type: a 8-bit IOTLB message type:
> > > +- 1: IOTLB miss
> > > +- 2: IOTLB update
> > > +- 3: IOTLB invalidate
> > > +- 4: IOTLB access fail
> > > +
> > >   In QEMU the vhost-user message is implemented with the following struct:
> > >   typedef struct VhostUserMsg {
> > > @@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
> > >   struct vhost_vring_addr addr;
> > >   VhostUserMemory memory;
> > >   VhostUserLog log;
> > > +struct vhost_iotlb_msg iotlb;
> > >   };
> > >   } QEMU_PACKED VhostUserMsg;
> > > @@ -253,6 +271,31 @@ Once the source has finished migration, rings will 
> > > be stopped by
> > >   the source. No further update must be done before rings are
> > >   restarted.
> > > +IOMMU support
> > > +-
> > > +
> > > +When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master 
> > > has
> > > +to send IOTLB entries update & invalidation by sending 
> > > VHOST_USER_IOTLB_MSG
> > > +requests to the slave with a struct vhost_iotlb_msg payload.
> > 
> > Always? This seems a bit strange since iommu can be enabled/disabled
> > dynamically.
> Ok, what about:
> 
> When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated and iommu
> is enbaled, the master sends IOTLB entries update & invalidation via
> VHOST_USER_IOTLB_MSG requests to the slave with a struct vhost_iotlb_msg
> payload.
> 
> 
> > Closing channel seems like a wrong thing to do for this.
> 
> Sorry, I'm not sure to get your comment.

What happens when guest disables the IOMMU?

> > > For update events,
> > > +the iotlb payload has to be filled with the update message type (2), the 
> > > I/O
> > > +virtual address, the size, the user virtual address, and the permissions
> > > +flags. For invalidation events, the iotlb payload has to be filled with 
> > > the
> > > +invalidation message type (3), the I/O virtual address and the size. On
> > > +success, the slave is expected to reply with a zero payload, non-zero
> > > +otherwise.
> > > +
> > > +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and 
> > > the
> > > +master initiated the slave to master communication channel using the
> > > +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and 
> > > access
> > > +failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to the 
> > > master
> > > +with a struct vhost_iotlb_msg payload. For miss events, the iotlb 
> > > payload has
> > > +to be filled with the miss message type (1), the I/O virtual address and 
> > > the
> > > +permissions flags. For access failure event, the iotlb payload has to be
> > > +filled with the access failure message type (4), the I/O virtual address 
> > > and
> > > +the permissions flags. For synchronization purpose, the slave may rely 
> > > on t

Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-12 Thread Maxime Coquelin



On 05/11/2017 08:25 PM, Michael S. Tsirkin wrote:

On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---
  docs/specs/vhost-user.txt | 75 +++
  hw/virtio/vhost-user.c| 31 
  2 files changed, 106 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..4a1f0c3 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
 log offset: offset from start of supplied file descriptor
 where logging starts (i.e. where guest address 0 would be logged)
  
+ * An IOTLB message

+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit guest I/O virtual address


guest -> VM


Ok.




+   Size: a 64-bit size


How do you specify "all memory"? give special meaning to size 0?


Good point, it does not support all memory currently.
It is not vhost-user specific, but general to the vhost implementation.


+   User address: a 64-bit user address
+   Permissions flags: a 8-bit bit field:
+- Bit 0: Read access
+- Bit 1: Write access


Can both bits be set? Can none?


Both. I will change it by listing values directly:
 - 0 : No access
 - 1 : Read
 - 2 : Write
 - 3 : Read Write


+   Type: a 8-bit IOTLB message type:
+- 1: IOTLB miss
+- 2: IOTLB update
+- 3: IOTLB invalidate
+- 4: IOTLB access fail
+
  In QEMU the vhost-user message is implemented with the following struct:
  
  typedef struct VhostUserMsg {

@@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
  struct vhost_vring_addr addr;
  VhostUserMemory memory;
  VhostUserLog log;
+struct vhost_iotlb_msg iotlb;
  };
  } QEMU_PACKED VhostUserMsg;
  
@@ -253,6 +271,31 @@ Once the source has finished migration, rings will be stopped by

  the source. No further update must be done before rings are
  restarted.
  
+IOMMU support

+-
+
+When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
+to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
+requests to the slave with a struct vhost_iotlb_msg payload.


Always? This seems a bit strange since iommu can be enabled/disabled
dynamically.

Ok, what about:

When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated and iommu
is enbaled, the master sends IOTLB entries update & invalidation via
VHOST_USER_IOTLB_MSG requests to the slave with a struct vhost_iotlb_msg
payload.



Closing channel seems like a wrong thing to do for this.


Sorry, I'm not sure to get your comment.


For update events,
+the iotlb payload has to be filled with the update message type (2), the I/O
+virtual address, the size, the user virtual address, and the permissions
+flags. For invalidation events, the iotlb payload has to be filled with the
+invalidation message type (3), the I/O virtual address and the size. On
+success, the slave is expected to reply with a zero payload, non-zero
+otherwise.
+
+When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
+master initiated the slave to master communication channel using the
+VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
+failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to the master
+with a struct vhost_iotlb_msg payload. For miss events, the iotlb payload has
+to be filled with the miss message type (1), the I/O virtual address and the
+permissions flags. For access failure event, the iotlb payload has to be
+filled with the access failure message type (4), the I/O virtual address and
+the permissions flags. For synchronization purpose, the slave may rely on the
+reply-ack feature, so the master may send a reply when operation is completed
+if the reply-ack feature is negotiated and slaves requests a reply.
+
  Slave communication
  ---
  
@@ -514,6 +557,38 @@ Master message types

If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
with zero for success, non-zero otherwise.
  
+ * VHOST_USER_IOTLB_MSG

+
+  Id: 22
+  Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
+  Master payload: struct vhost_iotlb_msg
+  Slave payload: u64
+
+  Send IOTLB messages with struct vhost_iotlb_msg as payload.
+  Master sends such requests to update and invalidate entries in the device
+  IOTLB. The slave has to acknowledge the

Re: [Qemu-devel] [PATCH 6/6] spec/vhost-user spec: Add IOMMU support

2017-05-11 Thread Michael S. Tsirkin
On Thu, May 11, 2017 at 02:32:46PM +0200, Maxime Coquelin wrote:
> This patch specifies and implements the master/slave communication
> to support device IOTLB in slave.
> 
> The vhost_iotlb_msg structure introduced for kernel backends is
> re-used, making the design close between the two backends.
> 
> An exception is the use of the secondary channel to enable the
> slave to send IOTLB miss requests to the master.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  docs/specs/vhost-user.txt | 75 
> +++
>  hw/virtio/vhost-user.c| 31 
>  2 files changed, 106 insertions(+)
> 
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 5fa7016..4a1f0c3 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
> log offset: offset from start of supplied file descriptor
> where logging starts (i.e. where guest address 0 would be logged)
>  
> + * An IOTLB message
> +   -
> +   | iova | size | user address | permissions flags | type |
> +   -
> +
> +   IOVA: a 64-bit guest I/O virtual address

guest -> VM

> +   Size: a 64-bit size

How do you specify "all memory"? give special meaning to size 0?

> +   User address: a 64-bit user address
> +   Permissions flags: a 8-bit bit field:
> +- Bit 0: Read access
> +- Bit 1: Write access

Can both bits be set? Can none?

> +   Type: a 8-bit IOTLB message type:
> +- 1: IOTLB miss
> +- 2: IOTLB update
> +- 3: IOTLB invalidate
> +- 4: IOTLB access fail
> +
>  In QEMU the vhost-user message is implemented with the following struct:
>  
>  typedef struct VhostUserMsg {
> @@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
>  struct vhost_vring_addr addr;
>  VhostUserMemory memory;
>  VhostUserLog log;
> +struct vhost_iotlb_msg iotlb;
>  };
>  } QEMU_PACKED VhostUserMsg;
>  
> @@ -253,6 +271,31 @@ Once the source has finished migration, rings will be 
> stopped by
>  the source. No further update must be done before rings are
>  restarted.
>  
> +IOMMU support
> +-
> +
> +When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
> +to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
> +requests to the slave with a struct vhost_iotlb_msg payload.

Always? This seems a bit strange since iommu can be enabled/disabled
dynamically. Closing channel seems like a wrong thing to do for this.

> For update events,
> +the iotlb payload has to be filled with the update message type (2), the I/O
> +virtual address, the size, the user virtual address, and the permissions
> +flags. For invalidation events, the iotlb payload has to be filled with the
> +invalidation message type (3), the I/O virtual address and the size. On
> +success, the slave is expected to reply with a zero payload, non-zero
> +otherwise.
> +
> +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
> +master initiated the slave to master communication channel using the
> +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
> +failure events by sending VHOST_USER_SLAVE_IOTLB_MSG requests to the master
> +with a struct vhost_iotlb_msg payload. For miss events, the iotlb payload has
> +to be filled with the miss message type (1), the I/O virtual address and the
> +permissions flags. For access failure event, the iotlb payload has to be
> +filled with the access failure message type (4), the I/O virtual address and
> +the permissions flags. For synchronization purpose, the slave may rely on the
> +reply-ack feature, so the master may send a reply when operation is completed
> +if the reply-ack feature is negotiated and slaves requests a reply.
> +
>  Slave communication
>  ---
>  
> @@ -514,6 +557,38 @@ Master message types
>If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
>with zero for success, non-zero otherwise.
>  
> + * VHOST_USER_IOTLB_MSG
> +
> +  Id: 22
> +  Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
> +  Master payload: struct vhost_iotlb_msg
> +  Slave payload: u64
> +
> +  Send IOTLB messages with struct vhost_iotlb_msg as payload.
> +  Master sends such requests to update and invalidate entries in the 
> device
> +  IOTLB. The slave has to acknowledge the request with sending zero as 
> u64
> +  payload for success, non-zero otherwise.
> +  This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
> +  has been successfully negotiated.
> +
> +Slave message types
> +---
> +
> + * VHOST_USER_SLAVE_IOTLB_MSG
> +
> +  Id: 1
> +  Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
> +  Slave payload: struct v