Re: [PATCH v2 3/3] Lift max memory slots limit imposed by vhost-user

2020-02-25 Thread Michael S. Tsirkin
On Sun, Feb 09, 2020 at 12:43:35PM -0500, Raphael Norwitz wrote:
> > > The current feature implementation does not work with postcopy migration
> > > and cannot be enabled if the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has
> > > also been negotiated.
> > 
> > Hmm what would it take to lift the restrictions?
> > conflicting features like this make it very hard for users to make
> > an informed choice about what to support.
> >
> 
> We would need a setup with a backend which supports these features (REPLY_ACK
> and postcopy migration). At first glance it looks like DPDK could work but
> I'm not sure how easy it will be to test postcopy migration with the resources
> we have.

Yes, DPDK works with postcopy. I understand it's of no
immediate interest to you but I'm afraid it just becomes too messy
if everyone keeps breaking it.
VHOST_USER_PROTOCOL_F_REPLY_ACK was added by a contributor from
Nutanix, surely you can ping them internally for a test-case :).
 

-- 
MST




Re: [PATCH v2 3/3] Lift max memory slots limit imposed by vhost-user

2020-02-24 Thread Raphael Norwitz
Ping

On Sun, Feb 09, 2020 at 12:43:35PM -0500, Raphael Norwitz wrote:
> 
> On Thu, Feb 06, 2020 at 03:32:38AM -0500, Michael S. Tsirkin wrote:
> > 
> > On Wed, Jan 15, 2020 at 09:57:06PM -0500, Raphael Norwitz wrote:
> > > The current vhost-user implementation in QEMU imposes a limit on the
> > > maximum number of memory slots exposed to a VM using a vhost-user
> > > device. This change provides a new protocol feature
> > > VHOST_USER_F_CONFIGURE_SLOTS which, when enabled, lifts this limit and
> > > allows a VM with a vhost-user device to expose a configurable number of
> > > memory slots, up to the ACPI defined maximum. Existing backends which
> > > do not support this protocol feature are unaffected.
> > 
> > Hmm ACPI maximum seems to be up to 512 - is this too much to fit in a
> > single message?  So can't we just increase the number (after negotiating
> > with remote) and be done with it, instead of add/remove?  Or is there
> > another reason to prefer add/remove?
> >
> 
> As mentioned in my cover letter, we experimented with simply increasing the
> message size and it didn’t work on our setup. We debugged down to the socket
> layer and found that on the receiving end the messages were truncated at
> around 512 bytes, or around 16 memory regions. To support 512 memory regions
> we would need a message size of around 32 (the size of one memory region
> descriptor) * 512 (the maximum number of regions) + 8 (the region count and
> padding) ~= 16k packet size. That would be 64 times larger than the next
> largest message size. We thought it would be cleaner and more in line with
> the rest of the protocol to keep the message sizes smaller. In particular,
> we thought memory regions should be treated like the rings, which are sent
> over one message at a time instead of in one large message. Whether or not
> such a large message size can be made to work in our case, separate messages
> will always work on Linux, and most likely on all other UNIX platforms QEMU
> is used on.
> 

> > > 
> > > This feature works by using three new messages,
> > > VHOST_USER_GET_MAX_MEM_SLOTS, VHOST_USER_ADD_MEM_REG and
> > > VHOST_USER_REM_MEM_REG. VHOST_USER_GET_MAX_MEM_SLOTS gets the
> > > number of memory slots the backend is willing to accept when the
> > > backend is initialized. Then, when the memory tables are set or updated,
> > > a series of VHOST_USER_ADD_MEM_REG and VHOST_USER_REM_MEM_REG messages
> > > are sent to transmit the regions to map and/or unmap instead of trying
> > > to send all the regions in one fixed size VHOST_USER_SET_MEM_TABLE
> > > message.
> > > 
> > > The vhost_user struct maintains a shadow state of the VM’s memory
> > > regions. When the memory tables are modified, the
> > > vhost_user_set_mem_table() function compares the new device memory state
> > > to the shadow state and only sends regions which need to be unmapped or
> > > mapped in. The regions which must be unmapped are sent first, followed
> > > by the new regions to be mapped in. After all the messages have been
> > > sent, the shadow state is set to the current virtual device state.
> > > 
> > > The current feature implementation does not work with postcopy migration
> > > and cannot be enabled if the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has
> > > also been negotiated.
> > 
> > Hmm what would it take to lift the restrictions?
> > conflicting features like this make it very hard for users to make
> > an informed choice about what to support.
> >
> 
> We would need a setup with a backend which supports these features (REPLY_ACK
> and postcopy migration). At first glance it looks like DPDK could work but
> I'm not sure how easy it will be to test postcopy migration with the resources
> we have.
>  

> > > Signed-off-by: Raphael Norwitz 
> > > Signed-off-by: Peter Turschmid 
> > > Suggested-by: Mike Cui 
> > > ---
> > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > index af83fdd..fed6d02 100644
> > > --- a/hw/virtio/vhost-user.c
> > > +++ b/hw/virtio/vhost-user.c
> > > @@ -35,11 +35,29 @@
> > >  #include <linux/userfaultfd.h>
> > >  #endif
> > >  
> > > -#define VHOST_MEMORY_MAX_NREGIONS    8
> > > +#define VHOST_MEMORY_LEGACY_NREGIONS    8
> > 
> > Hardly legacy when this is intended to always be used e.g. with
> > postcopy, right?
> >
> 
> How about 'BASELINE'?

> > > +    msg->hdr.size = sizeof(msg->payload.mem_reg.padding);
> > > +    msg->hdr.size += sizeof(VhostUserMemoryRegion);
> > > +
> > > +    /*
> > > +     * Send VHOST_USER_REM_MEM_REG for memory regions in our shadow state
> > > +     * which are not found not in the device's memory state.
> > 
> > double negation - could not parse this.
> >
> 
> Apologies - typo here. It should say “Send VHOST_USER_REM_MEM_REG for memory
> regions in our shadow state which are not found in the device's memory
> state.” i.e. send messages to remove regions which are in the shadow state
> but not in the updated device state.
> > > +     */
> > > +    for (i = 0; i < u->num_shadow_regions; ++i) {
> > > +        reg = dev->mem->regions;
> > > +
> > > +        for (j = 0; j < 
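
A rough sketch of the complete removal pass that the truncated hunk appears
to implement, assuming the names visible in the quoted snippet; the matching
criteria and the vhost_user_write() call are illustrative, not confirmed by
the hunk:

    /* For each shadow region, look for a match in the device's current
     * memory state; if none is found, ask the backend to unmap it. */
    for (i = 0; i < u->num_shadow_regions; ++i) {
        bool found = false;

        for (j = 0; j < dev->mem->nregions; ++j) {
            reg = dev->mem->regions + j;
            if (reg->guest_phys_addr == u->shadow_regions[i].guest_phys_addr &&
                reg->memory_size == u->shadow_regions[i].memory_size &&
                reg->userspace_addr == u->shadow_regions[i].userspace_addr) {
                found = true;
                break;
            }
        }

        if (!found) {
            /* Region left the memory map: send VHOST_USER_REM_MEM_REG. */
            msg->hdr.request = VHOST_USER_REM_MEM_REG;
            msg->payload.mem_reg.region = u->shadow_regions[i];
            if (vhost_user_write(dev, msg, NULL, 0) < 0) {
                return -1;
            }
        }
    }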

Re: [PATCH v2 3/3] Lift max memory slots limit imposed by vhost-user

2020-02-10 Thread Raphael Norwitz
On Thu, Feb 06, 2020 at 03:32:38AM -0500, Michael S. Tsirkin wrote:
> 
> On Wed, Jan 15, 2020 at 09:57:06PM -0500, Raphael Norwitz wrote:
> > The current vhost-user implementation in QEMU imposes a limit on the
> > maximum number of memory slots exposed to a VM using a vhost-user
> > device. This change provides a new protocol feature
> > VHOST_USER_F_CONFIGURE_SLOTS which, when enabled, lifts this limit and
> > allows a VM with a vhost-user device to expose a configurable number of
> > memory slots, up to the ACPI defined maximum. Existing backends which
> > do not support this protocol feature are unaffected.
> 
> Hmm ACPI maximum seems to be up to 512 - is this too much to fit in a
> single message?  So can't we just increase the number (after negotiating
> with remote) and be done with it, instead of add/remove?  Or is there
> another reason to prefer add/remove?
>

As mentioned in my cover letter, we experimented with simply increasing the
message size and it didn’t work on our setup. We debugged down to the socket
layer and found that on the receiving end the messages were truncated at
around 512 bytes, or around 16 memory regions. To support 512 memory regions
we would need a message size of around 32 (the size of one memory region
descriptor) * 512 (the maximum number of regions) + 8 (the region count and
padding) ~= 16k packet size. That would be 64 times larger than the next
largest message size. We thought it would be cleaner and more in line with
the rest of the protocol to keep the message sizes smaller. In particular,
we thought memory regions should be treated like the rings, which are sent
over one message at a time instead of in one large message. Whether or not
such a large message size can be made to work in our case, separate messages
will always work on Linux, and most likely on all other UNIX platforms QEMU
is used on.
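
A minimal sketch of that arithmetic, assuming the 32-byte wire layout of a
single memory region descriptor (four u64 fields) and taking the extra 8
bytes to be the region count and padding fields at the head of the table:

    #include <stdint.h>
    #include <stdio.h>

    /* Wire layout of one region descriptor: 4 x u64 = 32 bytes. */
    typedef struct VhostUserMemoryRegion {
        uint64_t guest_phys_addr;
        uint64_t memory_size;
        uint64_t userspace_addr;
        uint64_t mmap_offset;
    } VhostUserMemoryRegion;

    int main(void)
    {
        /* 32 * 512 + 8 = 16392 bytes, i.e. ~16k; by contrast the current
         * 8-slot VHOST_USER_SET_MEM_TABLE payload is 32 * 8 + 8 = 264. */
        size_t table_512 = sizeof(VhostUserMemoryRegion) * 512 + 8;
        printf("512-slot table payload: %zu bytes\n", table_512);
        return 0;
    }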

> > 
> > This feature works by using three new messages,
> > VHOST_USER_GET_MAX_MEM_SLOTS, VHOST_USER_ADD_MEM_REG and
> > VHOST_USER_REM_MEM_REG. VHOST_USER_GET_MAX_MEM_SLOTS gets the
> > number of memory slots the backend is willing to accept when the
> > backend is initialized. Then, when the memory tables are set or updated,
> > a series of VHOST_USER_ADD_MEM_REG and VHOST_USER_REM_MEM_REG messages
> > are sent to transmit the regions to map and/or unmap instead of trying
> > to send all the regions in one fixed size VHOST_USER_SET_MEM_TABLE
> > message.
> > 
> > The vhost_user struct maintains a shadow state of the VM’s memory
> > regions. When the memory tables are modified, the
> > vhost_user_set_mem_table() function compares the new device memory state
> > to the shadow state and only sends regions which need to be unmapped or
> > mapped in. The regions which must be unmapped are sent first, followed
> > by the new regions to be mapped in. After all the messages have been
> > sent, the shadow state is set to the current virtual device state.
> > 
> > The current feature implementation does not work with postcopy migration
> > and cannot be enabled if the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has
> > also been negotiated.
> 
> Hmm what would it take to lift the restrictions?
> conflicting features like this make it very hard for users to make
> an informed choice about what to support.
>

We would need a setup with a backend which supports these features (REPLY_ACK
and postcopy migration). At first glance it looks like DPDK could work but
I'm not sure how easy it will be to test postcopy migration with the resources
we have.
 
> > Signed-off-by: Raphael Norwitz 
> > Signed-off-by: Peter Turschmid 
> > Suggested-by: Mike Cui 
> > ---
> >  docs/interop/vhost-user.rst |  43 
> >  hw/virtio/vhost-user.c      | 254 
> >  2 files changed, 275 insertions(+), 22 deletions(-)
> > 
> > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > index 5f8b3a4..ae9acf2 100644
> > --- a/docs/interop/vhost-user.rst
> > +++ b/docs/interop/vhost-user.rst
> > @@ -786,6 +786,7 @@ Protocol features
> >#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11
> >#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
> >#define VHOST_USER_PROTOCOL_F_RESET_DEVICE   13
> > +  #define VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS   14
> >  
> >  Master message types
> >  
> > @@ -1205,6 +1206,48 @@ Master message types
> >Only valid if the ``VHOST_USER_PROTOCOL_F_RESET_DEVICE`` protocol
> >feature is set by the backend.
> >  
> > +``VHOST_USER_GET_MAX_MEM_SLOTS``
> > +  :id: 35
> > +  :equivalent ioctl: N/A
> > +  :slave payload: u64
> > +
> > +  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
> > +  successfully negotiated, this message is submitted by master to the
> > +  slave. The slave should return the message with a u64 payload
> > +  containing the maximum number of memory slots for QEMU to expose to
> > +  the guest. This message is not supported with postcopy migration or if
> > +  the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has also been negotiated.
> > +
> > 
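
A sketch of how the master side might consume that reply during device
setup; vhost_user_get_u64() and virtio_has_feature() are existing QEMU
helpers, but their use here, and the memory_slots field, are illustrative
assumptions rather than the patch's actual code:

    uint64_t max_slots = 0;

    if (virtio_has_feature(dev->protocol_features,
                           VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS)) {
        /* Ask the backend how many memory slots it will accept. */
        if (vhost_user_get_u64(dev, VHOST_USER_GET_MAX_MEM_SLOTS,
                               &max_slots) < 0) {
            return -1;
        }
        /* Cap what QEMU exposes to the guest at what the backend accepts. */
        u->user->memory_slots = max_slots;
    }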

Re: [PATCH v2 3/3] Lift max memory slots limit imposed by vhost-user

2020-02-06 Thread Michael S. Tsirkin
On Wed, Jan 15, 2020 at 09:57:06PM -0500, Raphael Norwitz wrote:
> The current vhost-user implementation in QEMU imposes a limit on the
> maximum number of memory slots exposed to a VM using a vhost-user
> device. This change provides a new protocol feature
> VHOST_USER_F_CONFIGURE_SLOTS which, when enabled, lifts this limit and
> allows a VM with a vhost-user device to expose a configurable number of
> memory slots, up to the ACPI defined maximum. Existing backends which
> do not support this protocol feature are unaffected.

Hmm ACPI maximum seems to be up to 512 - is this too much to fit in a
single message?  So can't we just increase the number (after negotiating
with remote) and be done with it, instead of add/remove?  Or is there
another reason to prefer add/remove?

> 
> This feature works by using three new messages,
> VHOST_USER_GET_MAX_MEM_SLOTS, VHOST_USER_ADD_MEM_REG and
> VHOST_USER_REM_MEM_REG. VHOST_USER_GET_MAX_MEM_SLOTS gets the
> number of memory slots the backend is willing to accept when the
> backend is initialized. Then, when the memory tables are set or updated,
> a series of VHOST_USER_ADD_MEM_REG and VHOST_USER_REM_MEM_REG messages
> are sent to transmit the regions to map and/or unmap instead of trying
> to send all the regions in one fixed size VHOST_USER_SET_MEM_TABLE
> message.
> 
> The vhost_user struct maintains a shadow state of the VM’s memory
> regions. When the memory tables are modified, the
> vhost_user_set_mem_table() function compares the new device memory state
> to the shadow state and only sends regions which need to be unmapped or
> mapped in. The regions which must be unmapped are sent first, followed
> by the new regions to be mapped in. After all the messages have been
> sent, the shadow state is set to the current virtual device state.
> 
> The current feature implementation does not work with postcopy migration
> and cannot be enabled if the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has
> also been negotiated.

Hmm what would it take to lift the restrictions?
conflicting features like this make it very hard for users to make
an informed choice about what to support.


> Signed-off-by: Raphael Norwitz 
> Signed-off-by: Peter Turschmid 
> Suggested-by: Mike Cui 
> ---
>  docs/interop/vhost-user.rst |  43 
>  hw/virtio/vhost-user.c      | 254 
>  2 files changed, 275 insertions(+), 22 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5f8b3a4..ae9acf2 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -786,6 +786,7 @@ Protocol features
>#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11
>#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
>#define VHOST_USER_PROTOCOL_F_RESET_DEVICE   13
> +  #define VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS   14
>  
>  Master message types
>  
> @@ -1205,6 +1206,48 @@ Master message types
>Only valid if the ``VHOST_USER_PROTOCOL_F_RESET_DEVICE`` protocol
>feature is set by the backend.
>  
> +``VHOST_USER_GET_MAX_MEM_SLOTS``
> +  :id: 35
> +  :equivalent ioctl: N/A
> +  :slave payload: u64
> +
> +  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
> +  successfully negotiated, this message is submitted by master to the
> +  slave. The slave should return the message with a u64 payload
> +  containing the maximum number of memory slots for QEMU to expose to
> +  the guest. This message is not supported with postcopy migration or if
> +  the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has also been negotiated.
> +
> +``VHOST_USER_ADD_MEM_REG``
> +  :id: 36
> +  :equivalent ioctl: N/A
> +  :slave payload: memory region
> +
> +  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
> +  successfully negotiated, this message is submitted by master to the slave.
> +  The message payload contains a memory region descriptor struct, describing
> +  a region of guest memory which the slave device must map in. When the
> > +  VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been successfully
> +  negotiated, along with the VHOST_USER_REM_MEM_REG message, this message is
> +  used to set and update the memory tables of the slave device. This message
> +  is not supported with postcopy migration or if the
> +  VHOST_USER_PROTOCOL_F_REPLY_ACK feature has also been negotiated.
> +
> +``VHOST_USER_REM_MEM_REG``
> +  :id: 37
> +  :equivalent ioctl: N/A
> +  :slave payload: memory region
> +
> +  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
> +  successfully negotiated, this message is submitted by master to the slave.
> +  The message payload contains a memory region descriptor struct, describing
> +  a region of guest memory which the slave device must unmap. When the
> > +  VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been successfully
> +  negotiated, along with the VHOST_USER_ADD_MEM_REG message, 

[PATCH v2 3/3] Lift max memory slots limit imposed by vhost-user

2020-01-27 Thread Raphael Norwitz
The current vhost-user implementation in QEMU imposes a limit on the
maximum number of memory slots exposed to a VM using a vhost-user
device. This change provides a new protocol feature
VHOST_USER_F_CONFIGURE_SLOTS which, when enabled, lifts this limit and
allows a VM with a vhost-user device to expose a configurable number of
memory slots, up to the ACPI defined maximum. Existing backends which
do not support this protocol feature are unaffected.

This feature works by using three new messages,
VHOST_USER_GET_MAX_MEM_SLOTS, VHOST_USER_ADD_MEM_REG and
VHOST_USER_REM_MEM_REG. VHOST_USER_GET_MAX_MEM_SLOTS gets the
number of memory slots the backend is willing to accept when the
backend is initialized. Then, when the memory tables are set or updated,
a series of VHOST_USER_ADD_MEM_REG and VHOST_USER_REM_MEM_REG messages
are sent to transmit the regions to map and/or unmap instead of trying
to send all the regions in one fixed size VHOST_USER_SET_MEM_TABLE
message.
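
In code terms, a condensed sketch of the new request ids (which match the
spec addition below) and the region payload they carry; the real
VhostUserRequest enum has many earlier entries omitted here:

    #include <stdint.h>

    /* New master-to-slave requests (condensed sketch). */
    typedef enum VhostUserRequest {
        VHOST_USER_GET_MAX_MEM_SLOTS = 35, /* slave replies with a u64 */
        VHOST_USER_ADD_MEM_REG       = 36, /* payload: region to map in */
        VHOST_USER_REM_MEM_REG       = 37, /* payload: region to unmap */
    } VhostUserRequest;

    /* One guest memory region, as carried by ADD/REM_MEM_REG. */
    typedef struct VhostUserMemoryRegion {
        uint64_t guest_phys_addr; /* guest physical address */
        uint64_t memory_size;     /* region length in bytes */
        uint64_t userspace_addr;  /* QEMU virtual address */
        uint64_t mmap_offset;     /* offset into the mmap'd fd */
    } VhostUserMemoryRegion;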

The vhost_user struct maintains a shadow state of the VM’s memory
regions. When the memory tables are modified, the
vhost_user_set_mem_table() function compares the new device memory state
to the shadow state and only sends regions which need to be unmapped or
mapped in. The regions which must be unmapped are sent first, followed
by the new regions to be mapped in. After all the messages have been
sent, the shadow state is set to the current virtual device state.
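
A high-level sketch of that flow; the two helper functions here are invented
for illustration (the patch's actual function names may differ), while
u->num_shadow_regions and the vhost_dev fields follow the code quoted
elsewhere in this thread:

    /* Diff the new device memory state against the shadow copy. */
    static int vhost_user_set_mem_table_slots(struct vhost_dev *dev)
    {
        struct vhost_user *u = dev->opaque;

        /* 1. VHOST_USER_REM_MEM_REG for shadow regions absent from
         *    the new device memory state. */
        if (send_remove_regions(dev, u) < 0) {
            return -1;
        }

        /* 2. VHOST_USER_ADD_MEM_REG for regions in the new state that
         *    are absent from the shadow state. */
        if (send_add_regions(dev, u) < 0) {
            return -1;
        }

        /* 3. The shadow state now mirrors the device state. */
        memcpy(u->shadow_regions, dev->mem->regions,
               dev->mem->nregions * sizeof(dev->mem->regions[0]));
        u->num_shadow_regions = dev->mem->nregions;
        return 0;
    }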

The current feature implementation does not work with postcopy migration
and cannot be enabled if the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has
also been negotiated.

Signed-off-by: Raphael Norwitz 
Signed-off-by: Peter Turschmid 
Suggested-by: Mike Cui 
---
 docs/interop/vhost-user.rst |  43 
 hw/virtio/vhost-user.c      | 254 
 2 files changed, 275 insertions(+), 22 deletions(-)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5f8b3a4..ae9acf2 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -786,6 +786,7 @@ Protocol features
   #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11
   #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
   #define VHOST_USER_PROTOCOL_F_RESET_DEVICE   13
+  #define VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS   14
 
 Master message types
 
@@ -1205,6 +1206,48 @@ Master message types
   Only valid if the ``VHOST_USER_PROTOCOL_F_RESET_DEVICE`` protocol
   feature is set by the backend.
 
+``VHOST_USER_GET_MAX_MEM_SLOTS``
+  :id: 35
+  :equivalent ioctl: N/A
+  :slave payload: u64
+
+  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
+  successfully negotiated, this message is submitted by master to the
+  slave. The slave should return the message with a u64 payload
+  containing the maximum number of memory slots for QEMU to expose to
+  the guest. This message is not supported with postcopy migration or if
+  the VHOST_USER_PROTOCOL_F_REPLY_ACK feature has also been negotiated.
+
+``VHOST_USER_ADD_MEM_REG``
+  :id: 36
+  :equivalent ioctl: N/A
+  :slave payload: memory region
+
+  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
+  successfully negotiated, this message is submitted by master to the slave.
+  The message payload contains a memory region descriptor struct, describing
+  a region of guest memory which the slave device must map in. When the
+  VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been successfully
+  negotiated, along with the VHOST_USER_REM_MEM_REG message, this message is
+  used to set and update the memory tables of the slave device. This message
+  is not supported with postcopy migration or if the
+  VHOST_USER_PROTOCOL_F_REPLY_ACK feature has also been negotiated.
+
+``VHOST_USER_REM_MEM_REG``
+  :id: 37
+  :equivalent ioctl: N/A
+  :slave payload: memory region
+
+  When the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been
+  successfully negotiated, this message is submitted by master to the slave.
+  The message payload contains a memory region descriptor struct, describing
+  a region of guest memory which the slave device must unmap. When the
+  VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS protocol feature has been successfully
+  negotiated, along with the VHOST_USER_ADD_MEM_REG message, this message is
+  used to set and update the memory tables of the slave device. This message
+  is not supported with postcopy migration or if the
+  VHOST_USER_PROTOCOL_F_REPLY_ACK feature has also been negotiated.
+
 Slave message types
 -------------------
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index af83fdd..fed6d02 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -35,11 +35,29 @@
 #include <linux/userfaultfd.h>
 #endif
 
-#define VHOST_MEMORY_MAX_NREGIONS    8
+#define VHOST_MEMORY_LEGACY_NREGIONS    8
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 #define VHOST_USER_SLAVE_MAX_FDS 8
 
 /*
+ * Set maximum number of RAM slots