Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Stefan Hajnoczi
On Tue, 9 May 2023 at 11:35, Eugenio Perez Martin  wrote:
>
> On Tue, May 9, 2023 at 5:09 PM Stefan Hajnoczi  wrote:
> >
> > On Tue, May 09, 2023 at 08:45:33AM +0200, Eugenio Perez Martin wrote:
> > > On Mon, May 8, 2023 at 10:10 PM Stefan Hajnoczi  
> > > wrote:
> > > >
> > > > On Thu, Apr 20, 2023 at 03:29:44PM +0200, Eugenio Pérez wrote:
> > > > > On Wed, 2023-04-19 at 07:21 -0400, Stefan Hajnoczi wrote:
> > > > > > On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  
> > > > > > wrote:
> > > > > > > On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > > > > > > > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi 
> > > > > > > > 
> > > > > > > > wrote:
> > > > > > > > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > > > > > > > 
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > > > > > > > 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez 
> > > > > > > > > > > Martin
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna 
> > > > > > > > > > > > > Czenczek wrote:
> > > > > > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > > > > > transporting the
> > > > > > > > > > > > > > back-end's (virtiofsd's) state through qemu's 
> > > > > > > > > > > > > > migration
> > > > > > > > > > > > > > stream.  To do
> > > > > > > > > > > > > > this, we need to be able to transfer virtiofsd's 
> > > > > > > > > > > > > > internal
> > > > > > > > > > > > > > state to and
> > > > > > > > > > > > > > from virtiofsd.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Because virtiofsd's internal state will not be too 
> > > > > > > > > > > > > > large, we
> > > > > > > > > > > > > > believe it
> > > > > > > > > > > > > > is best to transfer it as a single binary blob 
> > > > > > > > > > > > > > after the
> > > > > > > > > > > > > > streaming
> > > > > > > > > > > > > > phase.  Because this method should be useful to 
> > > > > > > > > > > > > > other vhost-
> > > > > > > > > > > > > > user
> > > > > > > > > > > > > > implementations, too, it is introduced as a 
> > > > > > > > > > > > > > general-purpose
> > > > > > > > > > > > > > addition to
> > > > > > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > > > > > >This feature signals support for transferring 
> > > > > > > > > > > > > > state, and is
> > > > > > > > > > > > > > added so
> > > > > > > > > > > > > >that migration can fail early when the back-end 
> > > > > > > > > > > > > > has no
> > > > > > > > > > > > > > support.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and 
> > > > > > > > > > > > > > back-end
> > > > > > > > > > > > > > negotiate a pipe
> > > > > > > > > > > > > >over which to transfer the state.  The front-end 
> > > > > > > > > > > > > > sends an
> > > > > > > > > > > > > > FD to the
> > > > > > > > > > > > > >back-end into/from which it can write/read its 
> > > > > > > > > > > > > > state, and
> > > > > > > > > > > > > > the back-end
> > > > > > > > > > > > > >can decide to either use it, or reply with a 
> > > > > > > > > > > > > > different FD
> > > > > > > > > > > > > > for the
> > > > > > > > > > > > > >front-end to override the front-end's choice.
> > > > > > > > > > > > > >The front-end creates a simple pipe to transfer 
> > > > > > > > > > > > > > the state,
> > > > > > > > > > > > > > but maybe
> > > > > > > > > > > > > >the back-end already has an FD into/from which 
> > > > > > > > > > > > > > it has to
> > > > > > > > > > > > > > write/read
> > > > > > > > > > > > > >its state, in which case it will want to 
> > > > > > > > > > > > > > override the
> > > > > > > > > > > > > > simple pipe.
> > > > > > > > > > > > > >Conversely, maybe in the future we find a way to 
> > > > > > > > > > > > > > have the
> > > > > > > > > > > > > > front-end
> > > > > > > > > > > > > >get an immediate FD for the migration stream (in 
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > cases), in which
> > > > > > > > > > > > > >case we will want to send this to the back-end 
> > > > > > > > > > > > > > instead of
> > > > > > > > > > > > > > creating a
> > > > > > > > > > > > > >pipe.
> > > > > > > > > > > > > >Hence the negotiation: If one side has a better 
> > > > > > > > > > > > > > idea than a
> > > > > > > > > > > > > > plain
> > > > > > > > > > > > > >pipe, we will want to use that.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Eugenio Perez Martin
On Tue, May 9, 2023 at 5:09 PM Stefan Hajnoczi  wrote:
>
> On Tue, May 09, 2023 at 08:45:33AM +0200, Eugenio Perez Martin wrote:
> > On Mon, May 8, 2023 at 10:10 PM Stefan Hajnoczi  wrote:
> > >
> > > On Thu, Apr 20, 2023 at 03:29:44PM +0200, Eugenio Pérez wrote:
> > > > On Wed, 2023-04-19 at 07:21 -0400, Stefan Hajnoczi wrote:
> > > > > On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  
> > > > > wrote:
> > > > > > On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > > > > > > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi 
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > > > > > > 
> > > > > > > > wrote:
> > > > > > > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > > > > > > 
> > > > > > > > > wrote:
> > > > > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez 
> > > > > > > > > > Martin
> > > > > > > > > > wrote:
> > > > > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna 
> > > > > > > > > > > > Czenczek wrote:
> > > > > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > > > > transporting the
> > > > > > > > > > > > > back-end's (virtiofsd's) state through qemu's 
> > > > > > > > > > > > > migration
> > > > > > > > > > > > > stream.  To do
> > > > > > > > > > > > > this, we need to be able to transfer virtiofsd's 
> > > > > > > > > > > > > internal
> > > > > > > > > > > > > state to and
> > > > > > > > > > > > > from virtiofsd.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Because virtiofsd's internal state will not be too 
> > > > > > > > > > > > > large, we
> > > > > > > > > > > > > believe it
> > > > > > > > > > > > > is best to transfer it as a single binary blob after 
> > > > > > > > > > > > > the
> > > > > > > > > > > > > streaming
> > > > > > > > > > > > > phase.  Because this method should be useful to other 
> > > > > > > > > > > > > vhost-
> > > > > > > > > > > > > user
> > > > > > > > > > > > > implementations, too, it is introduced as a 
> > > > > > > > > > > > > general-purpose
> > > > > > > > > > > > > addition to
> > > > > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > > > > >This feature signals support for transferring 
> > > > > > > > > > > > > state, and is
> > > > > > > > > > > > > added so
> > > > > > > > > > > > >that migration can fail early when the back-end 
> > > > > > > > > > > > > has no
> > > > > > > > > > > > > support.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > > > > negotiate a pipe
> > > > > > > > > > > > >over which to transfer the state.  The front-end 
> > > > > > > > > > > > > sends an
> > > > > > > > > > > > > FD to the
> > > > > > > > > > > > >back-end into/from which it can write/read its 
> > > > > > > > > > > > > state, and
> > > > > > > > > > > > > the back-end
> > > > > > > > > > > > >can decide to either use it, or reply with a 
> > > > > > > > > > > > > different FD
> > > > > > > > > > > > > for the
> > > > > > > > > > > > >front-end to override the front-end's choice.
> > > > > > > > > > > > >The front-end creates a simple pipe to transfer 
> > > > > > > > > > > > > the state,
> > > > > > > > > > > > > but maybe
> > > > > > > > > > > > >the back-end already has an FD into/from which it 
> > > > > > > > > > > > > has to
> > > > > > > > > > > > > write/read
> > > > > > > > > > > > >its state, in which case it will want to override 
> > > > > > > > > > > > > the
> > > > > > > > > > > > > simple pipe.
> > > > > > > > > > > > >Conversely, maybe in the future we find a way to 
> > > > > > > > > > > > > have the
> > > > > > > > > > > > > front-end
> > > > > > > > > > > > >get an immediate FD for the migration stream (in 
> > > > > > > > > > > > > some
> > > > > > > > > > > > > cases), in which
> > > > > > > > > > > > >case we will want to send this to the back-end 
> > > > > > > > > > > > > instead of
> > > > > > > > > > > > > creating a
> > > > > > > > > > > > >pipe.
> > > > > > > > > > > > >Hence the negotiation: If one side has a better 
> > > > > > > > > > > > > idea than a
> > > > > > > > > > > > > plain
> > > > > > > > > > > > >pipe, we will want to use that.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been 
> > > > > > > > > > > > > transferred
> > > > > > > > > > > > > through the
> > > > > > > > > > > > >pipe (the end indicated by EOF), the front-end 
> > > > > > > > > > > > > invokes this
> > > > > > > > > > > > > function
> > > > > > > > > > > > >to 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Eugenio Perez Martin
On Tue, May 9, 2023 at 11:01 AM Hanna Czenczek  wrote:
>
> On 09.05.23 08:31, Eugenio Perez Martin wrote:
> > On Mon, May 8, 2023 at 9:12 PM Stefan Hajnoczi  wrote:
>
> [...]
>
> >> VHOST_USER_GET_VRING_BASE itself isn't really enough because it stops a
> >> specific virtqueue but not the whole device. Unfortunately stopping all
> >> virtqueues is not the same as SUSPEND since spontaneous device activity
> >> is possible independent of any virtqueue (e.g. virtio-scsi events and
> >> maybe virtio-net link status).
> >>
> >> That's why I think SUSPEND is necessary for a solution that's generic
> >> enough to cover all device types.
> >>
> > I agree.
> >
> > In particular virtiofsd is already resetting all the device at
> > VHOST_USER_GET_VRING_BASE if I'm not wrong, so that's even more of a
> > reason to implement suspend call.
>
> Oh, no, just the vring in question.  Not the whole device.
>
> In addition, we still need the GET_VRING_BASE call anyway, because,
> well, we want to restore the vring on the destination via SET_VRING_BASE.
>

Ok, that makes sense, sorry for the confusion!

Thanks!




Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Stefan Hajnoczi
On Tue, May 09, 2023 at 08:45:33AM +0200, Eugenio Perez Martin wrote:
> On Mon, May 8, 2023 at 10:10 PM Stefan Hajnoczi  wrote:
> >
> > On Thu, Apr 20, 2023 at 03:29:44PM +0200, Eugenio Pérez wrote:
> > > On Wed, 2023-04-19 at 07:21 -0400, Stefan Hajnoczi wrote:
> > > > On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  wrote:
> > > > > On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > > > > > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi 
> > > > > > wrote:
> > > > > > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > > > > > 
> > > > > > > > wrote:
> > > > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > > > > wrote:
> > > > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > > > transporting the
> > > > > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > > > > stream.  To do
> > > > > > > > > > > > this, we need to be able to transfer virtiofsd's 
> > > > > > > > > > > > internal
> > > > > > > > > > > > state to and
> > > > > > > > > > > > from virtiofsd.
> > > > > > > > > > > >
> > > > > > > > > > > > Because virtiofsd's internal state will not be too 
> > > > > > > > > > > > large, we
> > > > > > > > > > > > believe it
> > > > > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > > > > streaming
> > > > > > > > > > > > phase.  Because this method should be useful to other 
> > > > > > > > > > > > vhost-
> > > > > > > > > > > > user
> > > > > > > > > > > > implementations, too, it is introduced as a 
> > > > > > > > > > > > general-purpose
> > > > > > > > > > > > addition to
> > > > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > > >
> > > > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > > > >This feature signals support for transferring state, 
> > > > > > > > > > > > and is
> > > > > > > > > > > > added so
> > > > > > > > > > > >that migration can fail early when the back-end has 
> > > > > > > > > > > > no
> > > > > > > > > > > > support.
> > > > > > > > > > > >
> > > > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > > > negotiate a pipe
> > > > > > > > > > > >over which to transfer the state.  The front-end 
> > > > > > > > > > > > sends an
> > > > > > > > > > > > FD to the
> > > > > > > > > > > >back-end into/from which it can write/read its 
> > > > > > > > > > > > state, and
> > > > > > > > > > > > the back-end
> > > > > > > > > > > >can decide to either use it, or reply with a 
> > > > > > > > > > > > different FD
> > > > > > > > > > > > for the
> > > > > > > > > > > >front-end to override the front-end's choice.
> > > > > > > > > > > >The front-end creates a simple pipe to transfer the 
> > > > > > > > > > > > state,
> > > > > > > > > > > > but maybe
> > > > > > > > > > > >the back-end already has an FD into/from which it 
> > > > > > > > > > > > has to
> > > > > > > > > > > > write/read
> > > > > > > > > > > >its state, in which case it will want to override the
> > > > > > > > > > > > simple pipe.
> > > > > > > > > > > >Conversely, maybe in the future we find a way to 
> > > > > > > > > > > > have the
> > > > > > > > > > > > front-end
> > > > > > > > > > > >get an immediate FD for the migration stream (in some
> > > > > > > > > > > > cases), in which
> > > > > > > > > > > >case we will want to send this to the back-end 
> > > > > > > > > > > > instead of
> > > > > > > > > > > > creating a
> > > > > > > > > > > >pipe.
> > > > > > > > > > > >Hence the negotiation: If one side has a better idea 
> > > > > > > > > > > > than a
> > > > > > > > > > > > plain
> > > > > > > > > > > >pipe, we will want to use that.
> > > > > > > > > > > >
> > > > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been 
> > > > > > > > > > > > transferred
> > > > > > > > > > > > through the
> > > > > > > > > > > >pipe (the end indicated by EOF), the front-end 
> > > > > > > > > > > > invokes this
> > > > > > > > > > > > function
> > > > > > > > > > > >to verify success.  There is no in-band way (through 
> > > > > > > > > > > > the
> > > > > > > > > > > > pipe) to
> > > > > > > > > > > >indicate failure, so we need to check explicitly.
> > > > > > > > > > > >
> > > > > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > > > > (which includes establishing the direction of 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Hanna Czenczek

On 09.05.23 08:31, Eugenio Perez Martin wrote:

On Mon, May 8, 2023 at 9:12 PM Stefan Hajnoczi  wrote:


[...]


VHOST_USER_GET_VRING_BASE itself isn't really enough because it stops a
specific virtqueue but not the whole device. Unfortunately stopping all
virtqueues is not the same as SUSPEND since spontaneous device activity
is possible independent of any virtqueue (e.g. virtio-scsi events and
maybe virtio-net link status).

That's why I think SUSPEND is necessary for a solution that's generic
enough to cover all device types.


I agree.

In particular virtiofsd is already resetting all the device at
VHOST_USER_GET_VRING_BASE if I'm not wrong, so that's even more of a
reason to implement suspend call.


Oh, no, just the vring in question.  Not the whole device.

In addition, we still need the GET_VRING_BASE call anyway, because, 
well, we want to restore the vring on the destination via SET_VRING_BASE.


Hanna




Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Eugenio Perez Martin
On Mon, May 8, 2023 at 10:10 PM Stefan Hajnoczi  wrote:
>
> On Thu, Apr 20, 2023 at 03:29:44PM +0200, Eugenio Pérez wrote:
> > On Wed, 2023-04-19 at 07:21 -0400, Stefan Hajnoczi wrote:
> > > On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  wrote:
> > > > On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > > > > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi 
> > > > > wrote:
> > > > > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > > > > 
> > > > > > wrote:
> > > > > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > > > wrote:
> > > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek 
> > > > > > > > > > wrote:
> > > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > > transporting the
> > > > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > > > stream.  To do
> > > > > > > > > > > this, we need to be able to transfer virtiofsd's internal
> > > > > > > > > > > state to and
> > > > > > > > > > > from virtiofsd.
> > > > > > > > > > >
> > > > > > > > > > > Because virtiofsd's internal state will not be too large, 
> > > > > > > > > > > we
> > > > > > > > > > > believe it
> > > > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > > > streaming
> > > > > > > > > > > phase.  Because this method should be useful to other 
> > > > > > > > > > > vhost-
> > > > > > > > > > > user
> > > > > > > > > > > implementations, too, it is introduced as a 
> > > > > > > > > > > general-purpose
> > > > > > > > > > > addition to
> > > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > >
> > > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > > >This feature signals support for transferring state, 
> > > > > > > > > > > and is
> > > > > > > > > > > added so
> > > > > > > > > > >that migration can fail early when the back-end has no
> > > > > > > > > > > support.
> > > > > > > > > > >
> > > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > > negotiate a pipe
> > > > > > > > > > >over which to transfer the state.  The front-end sends 
> > > > > > > > > > > an
> > > > > > > > > > > FD to the
> > > > > > > > > > >back-end into/from which it can write/read its state, 
> > > > > > > > > > > and
> > > > > > > > > > > the back-end
> > > > > > > > > > >can decide to either use it, or reply with a different 
> > > > > > > > > > > FD
> > > > > > > > > > > for the
> > > > > > > > > > >front-end to override the front-end's choice.
> > > > > > > > > > >The front-end creates a simple pipe to transfer the 
> > > > > > > > > > > state,
> > > > > > > > > > > but maybe
> > > > > > > > > > >the back-end already has an FD into/from which it has 
> > > > > > > > > > > to
> > > > > > > > > > > write/read
> > > > > > > > > > >its state, in which case it will want to override the
> > > > > > > > > > > simple pipe.
> > > > > > > > > > >Conversely, maybe in the future we find a way to have 
> > > > > > > > > > > the
> > > > > > > > > > > front-end
> > > > > > > > > > >get an immediate FD for the migration stream (in some
> > > > > > > > > > > cases), in which
> > > > > > > > > > >case we will want to send this to the back-end instead 
> > > > > > > > > > > of
> > > > > > > > > > > creating a
> > > > > > > > > > >pipe.
> > > > > > > > > > >Hence the negotiation: If one side has a better idea 
> > > > > > > > > > > than a
> > > > > > > > > > > plain
> > > > > > > > > > >pipe, we will want to use that.
> > > > > > > > > > >
> > > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred
> > > > > > > > > > > through the
> > > > > > > > > > >pipe (the end indicated by EOF), the front-end invokes 
> > > > > > > > > > > this
> > > > > > > > > > > function
> > > > > > > > > > >to verify success.  There is no in-band way (through 
> > > > > > > > > > > the
> > > > > > > > > > > pipe) to
> > > > > > > > > > >indicate failure, so we need to check explicitly.
> > > > > > > > > > >
> > > > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > > > (which includes establishing the direction of transfer and
> > > > > > > > > > > migration
> > > > > > > > > > > phase), the sending side writes its data into the pipe, 
> > > > > > > > > > > and
> > > > > > > > > > > the reading
> > > > > > > > > > > side reads it until it sees an EOF.  Then, the front-end 
> > > > > > > > > > > will
> > > > > > > > > > > check for
> > > > > > > > > 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-09 Thread Eugenio Perez Martin
On Mon, May 8, 2023 at 9:12 PM Stefan Hajnoczi  wrote:
>
> On Thu, Apr 20, 2023 at 03:27:51PM +0200, Eugenio Pérez wrote:
> > On Tue, 2023-04-18 at 16:40 -0400, Stefan Hajnoczi wrote:
> > > On Tue, 18 Apr 2023 at 14:31, Eugenio Perez Martin 
> > > wrote:
> > > > On Tue, Apr 18, 2023 at 7:59 PM Stefan Hajnoczi  
> > > > wrote:
> > > > > On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> > > > > > On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi 
> > > > > > wrote:
> > > > > > > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin <
> > > > > > > epere...@redhat.com> wrote:
> > > > > > > > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi 
> > > > > > > >  > > > > > > > > wrote:
> > > > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > > > > wrote:
> > > > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek
> > > > > > > > > > > wrote:
> > > > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > > > transporting the
> > > > > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > > > > stream.  To do
> > > > > > > > > > > > this, we need to be able to transfer virtiofsd's 
> > > > > > > > > > > > internal
> > > > > > > > > > > > state to and
> > > > > > > > > > > > from virtiofsd.
> > > > > > > > > > > >
> > > > > > > > > > > > Because virtiofsd's internal state will not be too 
> > > > > > > > > > > > large, we
> > > > > > > > > > > > believe it
> > > > > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > > > > streaming
> > > > > > > > > > > > phase.  Because this method should be useful to other 
> > > > > > > > > > > > vhost-
> > > > > > > > > > > > user
> > > > > > > > > > > > implementations, too, it is introduced as a 
> > > > > > > > > > > > general-purpose
> > > > > > > > > > > > addition to
> > > > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > > >
> > > > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > > > >   This feature signals support for transferring state, 
> > > > > > > > > > > > and
> > > > > > > > > > > > is added so
> > > > > > > > > > > >   that migration can fail early when the back-end has no
> > > > > > > > > > > > support.
> > > > > > > > > > > >
> > > > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > > > negotiate a pipe
> > > > > > > > > > > >   over which to transfer the state.  The front-end 
> > > > > > > > > > > > sends an
> > > > > > > > > > > > FD to the
> > > > > > > > > > > >   back-end into/from which it can write/read its state, 
> > > > > > > > > > > > and
> > > > > > > > > > > > the back-end
> > > > > > > > > > > >   can decide to either use it, or reply with a 
> > > > > > > > > > > > different FD
> > > > > > > > > > > > for the
> > > > > > > > > > > >   front-end to override the front-end's choice.
> > > > > > > > > > > >   The front-end creates a simple pipe to transfer the 
> > > > > > > > > > > > state,
> > > > > > > > > > > > but maybe
> > > > > > > > > > > >   the back-end already has an FD into/from which it has 
> > > > > > > > > > > > to
> > > > > > > > > > > > write/read
> > > > > > > > > > > >   its state, in which case it will want to override the
> > > > > > > > > > > > simple pipe.
> > > > > > > > > > > >   Conversely, maybe in the future we find a way to have 
> > > > > > > > > > > > the
> > > > > > > > > > > > front-end
> > > > > > > > > > > >   get an immediate FD for the migration stream (in some
> > > > > > > > > > > > cases), in which
> > > > > > > > > > > >   case we will want to send this to the back-end 
> > > > > > > > > > > > instead of
> > > > > > > > > > > > creating a
> > > > > > > > > > > >   pipe.
> > > > > > > > > > > >   Hence the negotiation: If one side has a better idea 
> > > > > > > > > > > > than
> > > > > > > > > > > > a plain
> > > > > > > > > > > >   pipe, we will want to use that.
> > > > > > > > > > > >
> > > > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been 
> > > > > > > > > > > > transferred
> > > > > > > > > > > > through the
> > > > > > > > > > > >   pipe (the end indicated by EOF), the front-end invokes
> > > > > > > > > > > > this function
> > > > > > > > > > > >   to verify success.  There is no in-band way (through 
> > > > > > > > > > > > the
> > > > > > > > > > > > pipe) to
> > > > > > > > > > > >   indicate failure, so we need to check explicitly.
> > > > > > > > > > > >
> > > > > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > > > > (which includes establishing the direction of transfer 
> > > > > > > > > > > > and

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-08 Thread Stefan Hajnoczi
On Thu, Apr 20, 2023 at 03:29:44PM +0200, Eugenio Pérez wrote:
> On Wed, 2023-04-19 at 07:21 -0400, Stefan Hajnoczi wrote:
> > On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  wrote:
> > > On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > > > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi 
> > > > wrote:
> > > > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > > > 
> > > > > wrote:
> > > > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > > > 
> > > > > > wrote:
> > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > > wrote:
> > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek 
> > > > > > > > > wrote:
> > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > transporting the
> > > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > > stream.  To do
> > > > > > > > > > this, we need to be able to transfer virtiofsd's internal
> > > > > > > > > > state to and
> > > > > > > > > > from virtiofsd.
> > > > > > > > > > 
> > > > > > > > > > Because virtiofsd's internal state will not be too large, we
> > > > > > > > > > believe it
> > > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > > streaming
> > > > > > > > > > phase.  Because this method should be useful to other vhost-
> > > > > > > > > > user
> > > > > > > > > > implementations, too, it is introduced as a general-purpose
> > > > > > > > > > addition to
> > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > 
> > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > >This feature signals support for transferring state, and 
> > > > > > > > > > is
> > > > > > > > > > added so
> > > > > > > > > >that migration can fail early when the back-end has no
> > > > > > > > > > support.
> > > > > > > > > > 
> > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > negotiate a pipe
> > > > > > > > > >over which to transfer the state.  The front-end sends an
> > > > > > > > > > FD to the
> > > > > > > > > >back-end into/from which it can write/read its state, and
> > > > > > > > > > the back-end
> > > > > > > > > >can decide to either use it, or reply with a different FD
> > > > > > > > > > for the
> > > > > > > > > >front-end to override the front-end's choice.
> > > > > > > > > >The front-end creates a simple pipe to transfer the 
> > > > > > > > > > state,
> > > > > > > > > > but maybe
> > > > > > > > > >the back-end already has an FD into/from which it has to
> > > > > > > > > > write/read
> > > > > > > > > >its state, in which case it will want to override the
> > > > > > > > > > simple pipe.
> > > > > > > > > >Conversely, maybe in the future we find a way to have the
> > > > > > > > > > front-end
> > > > > > > > > >get an immediate FD for the migration stream (in some
> > > > > > > > > > cases), in which
> > > > > > > > > >case we will want to send this to the back-end instead of
> > > > > > > > > > creating a
> > > > > > > > > >pipe.
> > > > > > > > > >Hence the negotiation: If one side has a better idea 
> > > > > > > > > > than a
> > > > > > > > > > plain
> > > > > > > > > >pipe, we will want to use that.
> > > > > > > > > > 
> > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred
> > > > > > > > > > through the
> > > > > > > > > >pipe (the end indicated by EOF), the front-end invokes 
> > > > > > > > > > this
> > > > > > > > > > function
> > > > > > > > > >to verify success.  There is no in-band way (through the
> > > > > > > > > > pipe) to
> > > > > > > > > >indicate failure, so we need to check explicitly.
> > > > > > > > > > 
> > > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > > (which includes establishing the direction of transfer and
> > > > > > > > > > migration
> > > > > > > > > > phase), the sending side writes its data into the pipe, and
> > > > > > > > > > the reading
> > > > > > > > > > side reads it until it sees an EOF.  Then, the front-end 
> > > > > > > > > > will
> > > > > > > > > > check for
> > > > > > > > > > success via CHECK_DEVICE_STATE, which on the destination 
> > > > > > > > > > side
> > > > > > > > > > includes
> > > > > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > > > > > 
> > > > > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > > > > ---
> > > > > > > > > >   include/hw/virtio/vhost-backend.h |  24 +
> > > > > > > > > >   include/hw/virtio/vhost.h |  79 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-05-08 Thread Stefan Hajnoczi
On Thu, Apr 20, 2023 at 03:27:51PM +0200, Eugenio Pérez wrote:
> On Tue, 2023-04-18 at 16:40 -0400, Stefan Hajnoczi wrote:
> > On Tue, 18 Apr 2023 at 14:31, Eugenio Perez Martin 
> > wrote:
> > > On Tue, Apr 18, 2023 at 7:59 PM Stefan Hajnoczi  
> > > wrote:
> > > > On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> > > > > On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi 
> > > > > wrote:
> > > > > > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin <
> > > > > > epere...@redhat.com> wrote:
> > > > > > > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi 
> > > > > > >  > > > > > > > wrote:
> > > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > > > wrote:
> > > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek
> > > > > > > > > > wrote:
> > > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > > transporting the
> > > > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > > > stream.  To do
> > > > > > > > > > > this, we need to be able to transfer virtiofsd's internal
> > > > > > > > > > > state to and
> > > > > > > > > > > from virtiofsd.
> > > > > > > > > > > 
> > > > > > > > > > > Because virtiofsd's internal state will not be too large, 
> > > > > > > > > > > we
> > > > > > > > > > > believe it
> > > > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > > > streaming
> > > > > > > > > > > phase.  Because this method should be useful to other 
> > > > > > > > > > > vhost-
> > > > > > > > > > > user
> > > > > > > > > > > implementations, too, it is introduced as a 
> > > > > > > > > > > general-purpose
> > > > > > > > > > > addition to
> > > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > > 
> > > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > > >   This feature signals support for transferring state, and
> > > > > > > > > > > is added so
> > > > > > > > > > >   that migration can fail early when the back-end has no
> > > > > > > > > > > support.
> > > > > > > > > > > 
> > > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > > negotiate a pipe
> > > > > > > > > > >   over which to transfer the state.  The front-end sends 
> > > > > > > > > > > an
> > > > > > > > > > > FD to the
> > > > > > > > > > >   back-end into/from which it can write/read its state, 
> > > > > > > > > > > and
> > > > > > > > > > > the back-end
> > > > > > > > > > >   can decide to either use it, or reply with a different 
> > > > > > > > > > > FD
> > > > > > > > > > > for the
> > > > > > > > > > >   front-end to override the front-end's choice.
> > > > > > > > > > >   The front-end creates a simple pipe to transfer the 
> > > > > > > > > > > state,
> > > > > > > > > > > but maybe
> > > > > > > > > > >   the back-end already has an FD into/from which it has to
> > > > > > > > > > > write/read
> > > > > > > > > > >   its state, in which case it will want to override the
> > > > > > > > > > > simple pipe.
> > > > > > > > > > >   Conversely, maybe in the future we find a way to have 
> > > > > > > > > > > the
> > > > > > > > > > > front-end
> > > > > > > > > > >   get an immediate FD for the migration stream (in some
> > > > > > > > > > > cases), in which
> > > > > > > > > > >   case we will want to send this to the back-end instead 
> > > > > > > > > > > of
> > > > > > > > > > > creating a
> > > > > > > > > > >   pipe.
> > > > > > > > > > >   Hence the negotiation: If one side has a better idea 
> > > > > > > > > > > than
> > > > > > > > > > > a plain
> > > > > > > > > > >   pipe, we will want to use that.
> > > > > > > > > > > 
> > > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred
> > > > > > > > > > > through the
> > > > > > > > > > >   pipe (the end indicated by EOF), the front-end invokes
> > > > > > > > > > > this function
> > > > > > > > > > >   to verify success.  There is no in-band way (through the
> > > > > > > > > > > pipe) to
> > > > > > > > > > >   indicate failure, so we need to check explicitly.
> > > > > > > > > > > 
> > > > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > > > (which includes establishing the direction of transfer and
> > > > > > > > > > > migration
> > > > > > > > > > > phase), the sending side writes its data into the pipe, 
> > > > > > > > > > > and
> > > > > > > > > > > the reading
> > > > > > > > > > > side reads it until it sees an EOF.  Then, the front-end
> > > > > > > > > > > will check for
> > > > > > > > > > > success via CHECK_DEVICE_STATE, which on the destination
> > > > > > > > > > > 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-20 Thread Eugenio Pérez
On Wed, 2023-04-19 at 07:21 -0400, Stefan Hajnoczi wrote:
> On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  wrote:
> > On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi 
> > > wrote:
> > > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > > wrote:
> > > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > > wrote:
> > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > wrote:
> > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > transporting the
> > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > stream.  To do
> > > > > > > > > this, we need to be able to transfer virtiofsd's internal
> > > > > > > > > state to and
> > > > > > > > > from virtiofsd.
> > > > > > > > > 
> > > > > > > > > Because virtiofsd's internal state will not be too large, we
> > > > > > > > > believe it
> > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > streaming
> > > > > > > > > phase.  Because this method should be useful to other vhost-
> > > > > > > > > user
> > > > > > > > > implementations, too, it is introduced as a general-purpose
> > > > > > > > > addition to
> > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > 
> > > > > > > > > These are the additions to the protocol:
> > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > >This feature signals support for transferring state, and is
> > > > > > > > > added so
> > > > > > > > >that migration can fail early when the back-end has no
> > > > > > > > > support.
> > > > > > > > > 
> > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > negotiate a pipe
> > > > > > > > >over which to transfer the state.  The front-end sends an
> > > > > > > > > FD to the
> > > > > > > > >back-end into/from which it can write/read its state, and
> > > > > > > > > the back-end
> > > > > > > > >can decide to either use it, or reply with a different FD
> > > > > > > > > for the
> > > > > > > > >front-end to override the front-end's choice.
> > > > > > > > >The front-end creates a simple pipe to transfer the state,
> > > > > > > > > but maybe
> > > > > > > > >the back-end already has an FD into/from which it has to
> > > > > > > > > write/read
> > > > > > > > >its state, in which case it will want to override the
> > > > > > > > > simple pipe.
> > > > > > > > >Conversely, maybe in the future we find a way to have the
> > > > > > > > > front-end
> > > > > > > > >get an immediate FD for the migration stream (in some
> > > > > > > > > cases), in which
> > > > > > > > >case we will want to send this to the back-end instead of
> > > > > > > > > creating a
> > > > > > > > >pipe.
> > > > > > > > >Hence the negotiation: If one side has a better idea than a
> > > > > > > > > plain
> > > > > > > > >pipe, we will want to use that.
> > > > > > > > > 
> > > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred
> > > > > > > > > through the
> > > > > > > > >pipe (the end indicated by EOF), the front-end invokes this
> > > > > > > > > function
> > > > > > > > >to verify success.  There is no in-band way (through the
> > > > > > > > > pipe) to
> > > > > > > > >indicate failure, so we need to check explicitly.
> > > > > > > > > 
> > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > (which includes establishing the direction of transfer and
> > > > > > > > > migration
> > > > > > > > > phase), the sending side writes its data into the pipe, and
> > > > > > > > > the reading
> > > > > > > > > side reads it until it sees an EOF.  Then, the front-end will
> > > > > > > > > check for
> > > > > > > > > success via CHECK_DEVICE_STATE, which on the destination side
> > > > > > > > > includes
> > > > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > > > > 
> > > > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > > > ---
> > > > > > > > >   include/hw/virtio/vhost-backend.h |  24 +
> > > > > > > > >   include/hw/virtio/vhost.h |  79 
> > > > > > > > >   hw/virtio/vhost-user.c| 147
> > > > > > > > > ++
> > > > > > > > >   hw/virtio/vhost.c |  37 
> > > > > > > > >   4 files changed, 287 insertions(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/include/hw/virtio/vhost-backend.h
> > > > > > > > > b/include/hw/virtio/vhost-backend.h
> > > > > > > > > index 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-20 Thread Eugenio Pérez
On Tue, 2023-04-18 at 16:40 -0400, Stefan Hajnoczi wrote:
> On Tue, 18 Apr 2023 at 14:31, Eugenio Perez Martin 
> wrote:
> > On Tue, Apr 18, 2023 at 7:59 PM Stefan Hajnoczi  wrote:
> > > On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> > > > On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi 
> > > > wrote:
> > > > > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin <
> > > > > epere...@redhat.com> wrote:
> > > > > > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  > > > > > > wrote:
> > > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin
> > > > > > > wrote:
> > > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > > > stefa...@redhat.com> wrote:
> > > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek
> > > > > > > > > wrote:
> > > > > > > > > > So-called "internal" virtio-fs migration refers to
> > > > > > > > > > transporting the
> > > > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > > > stream.  To do
> > > > > > > > > > this, we need to be able to transfer virtiofsd's internal
> > > > > > > > > > state to and
> > > > > > > > > > from virtiofsd.
> > > > > > > > > > 
> > > > > > > > > > Because virtiofsd's internal state will not be too large, we
> > > > > > > > > > believe it
> > > > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > > > streaming
> > > > > > > > > > phase.  Because this method should be useful to other vhost-
> > > > > > > > > > user
> > > > > > > > > > implementations, too, it is introduced as a general-purpose
> > > > > > > > > > addition to
> > > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > > > 
> > > > > > > > > > These are the additions to the protocol:
> > > > > > > > > > - New vhost-user protocol feature
> > > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > > >   This feature signals support for transferring state, and
> > > > > > > > > > is added so
> > > > > > > > > >   that migration can fail early when the back-end has no
> > > > > > > > > > support.
> > > > > > > > > > 
> > > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end
> > > > > > > > > > negotiate a pipe
> > > > > > > > > >   over which to transfer the state.  The front-end sends an
> > > > > > > > > > FD to the
> > > > > > > > > >   back-end into/from which it can write/read its state, and
> > > > > > > > > > the back-end
> > > > > > > > > >   can decide to either use it, or reply with a different FD
> > > > > > > > > > for the
> > > > > > > > > >   front-end to override the front-end's choice.
> > > > > > > > > >   The front-end creates a simple pipe to transfer the state,
> > > > > > > > > > but maybe
> > > > > > > > > >   the back-end already has an FD into/from which it has to
> > > > > > > > > > write/read
> > > > > > > > > >   its state, in which case it will want to override the
> > > > > > > > > > simple pipe.
> > > > > > > > > >   Conversely, maybe in the future we find a way to have the
> > > > > > > > > > front-end
> > > > > > > > > >   get an immediate FD for the migration stream (in some
> > > > > > > > > > cases), in which
> > > > > > > > > >   case we will want to send this to the back-end instead of
> > > > > > > > > > creating a
> > > > > > > > > >   pipe.
> > > > > > > > > >   Hence the negotiation: If one side has a better idea than
> > > > > > > > > > a plain
> > > > > > > > > >   pipe, we will want to use that.
> > > > > > > > > > 
> > > > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred
> > > > > > > > > > through the
> > > > > > > > > >   pipe (the end indicated by EOF), the front-end invokes
> > > > > > > > > > this function
> > > > > > > > > >   to verify success.  There is no in-band way (through the
> > > > > > > > > > pipe) to
> > > > > > > > > >   indicate failure, so we need to check explicitly.
> > > > > > > > > > 
> > > > > > > > > > Once the transfer pipe has been established via
> > > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > > (which includes establishing the direction of transfer and
> > > > > > > > > > migration
> > > > > > > > > > phase), the sending side writes its data into the pipe, and
> > > > > > > > > > the reading
> > > > > > > > > > side reads it until it sees an EOF.  Then, the front-end
> > > > > > > > > > will check for
> > > > > > > > > > success via CHECK_DEVICE_STATE, which on the destination
> > > > > > > > > > side includes
> > > > > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > > > > > 
> > > > > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > > > > ---
> > > > > > > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > > > > > > >  include/hw/virtio/vhost.h |  79 
> > > > > > > > > >  hw/virtio/vhost-user.c| 147
> > > > > > > > > > ++
> > > > > > 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-20 Thread Eugenio Pérez
On Wed, 2023-04-19 at 13:10 +0200, Hanna Czenczek wrote:
> On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi  wrote:
> > > On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin 
> > > wrote:
> > > > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi 
> > > > wrote:
> > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi <
> > > > > > stefa...@redhat.com> wrote:
> > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > > > So-called "internal" virtio-fs migration refers to transporting
> > > > > > > > the
> > > > > > > > back-end's (virtiofsd's) state through qemu's migration
> > > > > > > > stream.  To do
> > > > > > > > this, we need to be able to transfer virtiofsd's internal state
> > > > > > > > to and
> > > > > > > > from virtiofsd.
> > > > > > > > 
> > > > > > > > Because virtiofsd's internal state will not be too large, we
> > > > > > > > believe it
> > > > > > > > is best to transfer it as a single binary blob after the
> > > > > > > > streaming
> > > > > > > > phase.  Because this method should be useful to other vhost-user
> > > > > > > > implementations, too, it is introduced as a general-purpose
> > > > > > > > addition to
> > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > 
> > > > > > > > These are the additions to the protocol:
> > > > > > > > - New vhost-user protocol feature
> > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > >This feature signals support for transferring state, and is
> > > > > > > > added so
> > > > > > > >that migration can fail early when the back-end has no
> > > > > > > > support.
> > > > > > > > 
> > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate
> > > > > > > > a pipe
> > > > > > > >over which to transfer the state.  The front-end sends an FD
> > > > > > > > to the
> > > > > > > >back-end into/from which it can write/read its state, and the
> > > > > > > > back-end
> > > > > > > >can decide to either use it, or reply with a different FD for
> > > > > > > > the
> > > > > > > >front-end to override the front-end's choice.
> > > > > > > >The front-end creates a simple pipe to transfer the state,
> > > > > > > > but maybe
> > > > > > > >the back-end already has an FD into/from which it has to
> > > > > > > > write/read
> > > > > > > >its state, in which case it will want to override the simple
> > > > > > > > pipe.
> > > > > > > >Conversely, maybe in the future we find a way to have the
> > > > > > > > front-end
> > > > > > > >get an immediate FD for the migration stream (in some cases),
> > > > > > > > in which
> > > > > > > >case we will want to send this to the back-end instead of
> > > > > > > > creating a
> > > > > > > >pipe.
> > > > > > > >Hence the negotiation: If one side has a better idea than a
> > > > > > > > plain
> > > > > > > >pipe, we will want to use that.
> > > > > > > > 
> > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred
> > > > > > > > through the
> > > > > > > >pipe (the end indicated by EOF), the front-end invokes this
> > > > > > > > function
> > > > > > > >to verify success.  There is no in-band way (through the
> > > > > > > > pipe) to
> > > > > > > >indicate failure, so we need to check explicitly.
> > > > > > > > 
> > > > > > > > Once the transfer pipe has been established via
> > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > (which includes establishing the direction of transfer and
> > > > > > > > migration
> > > > > > > > phase), the sending side writes its data into the pipe, and the
> > > > > > > > reading
> > > > > > > > side reads it until it sees an EOF.  Then, the front-end will
> > > > > > > > check for
> > > > > > > > success via CHECK_DEVICE_STATE, which on the destination side
> > > > > > > > includes
> > > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > > > 
> > > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > > ---
> > > > > > > >   include/hw/virtio/vhost-backend.h |  24 +
> > > > > > > >   include/hw/virtio/vhost.h |  79 
> > > > > > > >   hw/virtio/vhost-user.c| 147
> > > > > > > > ++
> > > > > > > >   hw/virtio/vhost.c |  37 
> > > > > > > >   4 files changed, 287 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/include/hw/virtio/vhost-backend.h
> > > > > > > > b/include/hw/virtio/vhost-backend.h
> > > > > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > > > > >   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > > > > > >   } 

Re: [Virtio-fs] [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Stefan Hajnoczi
On Wed, 19 Apr 2023 at 07:16, Hanna Czenczek  wrote:
>
> On 19.04.23 13:10, Stefan Hajnoczi wrote:
> > On Wed, 19 Apr 2023 at 06:57, Hanna Czenczek  wrote:
> >> On 18.04.23 19:59, Stefan Hajnoczi wrote:
> >>> On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
>  On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  
>  wrote:
> > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin 
> >  wrote:
> >> On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  
> >> wrote:
> >>> On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
>  On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi 
>   wrote:
> > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> >> So-called "internal" virtio-fs migration refers to transporting the
> >> back-end's (virtiofsd's) state through qemu's migration stream.  
> >> To do
> >> this, we need to be able to transfer virtiofsd's internal state to 
> >> and
> >> from virtiofsd.
> >>
> >> Because virtiofsd's internal state will not be too large, we 
> >> believe it
> >> is best to transfer it as a single binary blob after the streaming
> >> phase.  Because this method should be useful to other vhost-user
> >> implementations, too, it is introduced as a general-purpose 
> >> addition to
> >> the protocol, not limited to vhost-user-fs.
> >>
> >> These are the additions to the protocol:
> >> - New vhost-user protocol feature 
> >> VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> >> This feature signals support for transferring state, and is 
> >> added so
> >> that migration can fail early when the back-end has no support.
> >>
> >> - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> >> pipe
> >> over which to transfer the state.  The front-end sends an FD 
> >> to the
> >> back-end into/from which it can write/read its state, and the 
> >> back-end
> >> can decide to either use it, or reply with a different FD for 
> >> the
> >> front-end to override the front-end's choice.
> >> The front-end creates a simple pipe to transfer the state, but 
> >> maybe
> >> the back-end already has an FD into/from which it has to 
> >> write/read
> >> its state, in which case it will want to override the simple 
> >> pipe.
> >> Conversely, maybe in the future we find a way to have the 
> >> front-end
> >> get an immediate FD for the migration stream (in some cases), 
> >> in which
> >> case we will want to send this to the back-end instead of 
> >> creating a
> >> pipe.
> >> Hence the negotiation: If one side has a better idea than a 
> >> plain
> >> pipe, we will want to use that.
> >>
> >> - CHECK_DEVICE_STATE: After the state has been transferred through 
> >> the
> >> pipe (the end indicated by EOF), the front-end invokes this 
> >> function
> >> to verify success.  There is no in-band way (through the pipe) 
> >> to
> >> indicate failure, so we need to check explicitly.
> >>
> >> Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> >> (which includes establishing the direction of transfer and 
> >> migration
> >> phase), the sending side writes its data into the pipe, and the 
> >> reading
> >> side reads it until it sees an EOF.  Then, the front-end will 
> >> check for
> >> success via CHECK_DEVICE_STATE, which on the destination side 
> >> includes
> >> checking for integrity (i.e. errors during deserialization).
> >>
> >> Suggested-by: Stefan Hajnoczi 
> >> Signed-off-by: Hanna Czenczek 
> >> ---
> >>include/hw/virtio/vhost-backend.h |  24 +
> >>include/hw/virtio/vhost.h |  79 
> >>hw/virtio/vhost-user.c| 147 
> >> ++
> >>hw/virtio/vhost.c |  37 
> >>4 files changed, 287 insertions(+)
> >>
> >> diff --git a/include/hw/virtio/vhost-backend.h 
> >> b/include/hw/virtio/vhost-backend.h
> >> index ec3fbae58d..5935b32fe3 100644
> >> --- a/include/hw/virtio/vhost-backend.h
> >> +++ b/include/hw/virtio/vhost-backend.h
> >> @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> >>VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> >>} VhostSetConfigType;
> >>
> >> +typedef enum VhostDeviceStateDirection {
> >> +/* Transfer 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Hanna Czenczek

On 19.04.23 13:21, Stefan Hajnoczi wrote:

On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  wrote:

On 18.04.23 09:54, Eugenio Perez Martin wrote:

On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi  wrote:

On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin  wrote:

On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi  wrote:

On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:

On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
This feature signals support for transferring state, and is added so
that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
over which to transfer the state.  The front-end sends an FD to the
back-end into/from which it can write/read its state, and the back-end
can decide to either use it, or reply with a different FD for the
front-end to override the front-end's choice.
The front-end creates a simple pipe to transfer the state, but maybe
the back-end already has an FD into/from which it has to write/read
its state, in which case it will want to override the simple pipe.
Conversely, maybe in the future we find a way to have the front-end
get an immediate FD for the migration stream (in some cases), in which
case we will want to send this to the back-end instead of creating a
pipe.
Hence the negotiation: If one side has a better idea than a plain
pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
pipe (the end indicated by EOF), the front-end invokes this function
to verify success.  There is no in-band way (through the pipe) to
indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
   include/hw/virtio/vhost-backend.h |  24 +
   include/hw/virtio/vhost.h |  79 
   hw/virtio/vhost-user.c| 147 ++
   hw/virtio/vhost.c |  37 
   4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
   } VhostSetConfigType;

+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

/* Suspend a device so it does not process virtqueue requests anymore
 *
 * After the return of ioctl the device must preserve all the necessary 
state
 * (the virtqueue vring base plus the possible device specific states) that 
is
 * required for restoring in the future. The device must not change its
 * configuration after that point.
 */
#define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

/* Resume a device so it can resume processing virtqueue requests
 *
 * After the return of this ioctl the device will have restored all the
 * necessary states and it is fully operational to continue processing the
 * virtqueue descriptors.
 */
#define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Stefan Hajnoczi
On Wed, 19 Apr 2023 at 07:10, Hanna Czenczek  wrote:
>
> On 18.04.23 09:54, Eugenio Perez Martin wrote:
> > On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi  wrote:
> >> On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin  
> >> wrote:
> >>> On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi  
> >>> wrote:
>  On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > wrote:
> >> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> >>> So-called "internal" virtio-fs migration refers to transporting the
> >>> back-end's (virtiofsd's) state through qemu's migration stream.  To do
> >>> this, we need to be able to transfer virtiofsd's internal state to and
> >>> from virtiofsd.
> >>>
> >>> Because virtiofsd's internal state will not be too large, we believe 
> >>> it
> >>> is best to transfer it as a single binary blob after the streaming
> >>> phase.  Because this method should be useful to other vhost-user
> >>> implementations, too, it is introduced as a general-purpose addition 
> >>> to
> >>> the protocol, not limited to vhost-user-fs.
> >>>
> >>> These are the additions to the protocol:
> >>> - New vhost-user protocol feature 
> >>> VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> >>>This feature signals support for transferring state, and is added 
> >>> so
> >>>that migration can fail early when the back-end has no support.
> >>>
> >>> - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> >>> pipe
> >>>over which to transfer the state.  The front-end sends an FD to the
> >>>back-end into/from which it can write/read its state, and the 
> >>> back-end
> >>>can decide to either use it, or reply with a different FD for the
> >>>front-end to override the front-end's choice.
> >>>The front-end creates a simple pipe to transfer the state, but 
> >>> maybe
> >>>the back-end already has an FD into/from which it has to write/read
> >>>its state, in which case it will want to override the simple pipe.
> >>>Conversely, maybe in the future we find a way to have the front-end
> >>>get an immediate FD for the migration stream (in some cases), in 
> >>> which
> >>>case we will want to send this to the back-end instead of creating 
> >>> a
> >>>pipe.
> >>>Hence the negotiation: If one side has a better idea than a plain
> >>>pipe, we will want to use that.
> >>>
> >>> - CHECK_DEVICE_STATE: After the state has been transferred through the
> >>>pipe (the end indicated by EOF), the front-end invokes this 
> >>> function
> >>>to verify success.  There is no in-band way (through the pipe) to
> >>>indicate failure, so we need to check explicitly.
> >>>
> >>> Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> >>> (which includes establishing the direction of transfer and migration
> >>> phase), the sending side writes its data into the pipe, and the 
> >>> reading
> >>> side reads it until it sees an EOF.  Then, the front-end will check 
> >>> for
> >>> success via CHECK_DEVICE_STATE, which on the destination side includes
> >>> checking for integrity (i.e. errors during deserialization).
> >>>
> >>> Suggested-by: Stefan Hajnoczi 
> >>> Signed-off-by: Hanna Czenczek 
> >>> ---
> >>>   include/hw/virtio/vhost-backend.h |  24 +
> >>>   include/hw/virtio/vhost.h |  79 
> >>>   hw/virtio/vhost-user.c| 147 
> >>> ++
> >>>   hw/virtio/vhost.c |  37 
> >>>   4 files changed, 287 insertions(+)
> >>>
> >>> diff --git a/include/hw/virtio/vhost-backend.h 
> >>> b/include/hw/virtio/vhost-backend.h
> >>> index ec3fbae58d..5935b32fe3 100644
> >>> --- a/include/hw/virtio/vhost-backend.h
> >>> +++ b/include/hw/virtio/vhost-backend.h
> >>> @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> >>>   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> >>>   } VhostSetConfigType;
> >>>
> >>> +typedef enum VhostDeviceStateDirection {
> >>> +/* Transfer state from back-end (device) to front-end */
> >>> +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> >>> +/* Transfer state from front-end to back-end (device) */
> >>> +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> >>> +} VhostDeviceStateDirection;
> >>> +
> >>> +typedef enum VhostDeviceStatePhase {
> >>> +/* The device (and all its vrings) is stopped */
> >>> +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> >>> +} VhostDeviceStatePhase;
> >> vDPA has:
> >>
> >>/* Suspend a device so it does not process virtqueue requests 
> >> anymore
> >> *
> >>

Re: [Virtio-fs] [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Hanna Czenczek

On 19.04.23 13:10, Stefan Hajnoczi wrote:

On Wed, 19 Apr 2023 at 06:57, Hanna Czenczek  wrote:

On 18.04.23 19:59, Stefan Hajnoczi wrote:

On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:

On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  wrote:

On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  wrote:

On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  wrote:

On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:

On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
This feature signals support for transferring state, and is added so
that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
over which to transfer the state.  The front-end sends an FD to the
back-end into/from which it can write/read its state, and the back-end
can decide to either use it, or reply with a different FD for the
front-end to override the front-end's choice.
The front-end creates a simple pipe to transfer the state, but maybe
the back-end already has an FD into/from which it has to write/read
its state, in which case it will want to override the simple pipe.
Conversely, maybe in the future we find a way to have the front-end
get an immediate FD for the migration stream (in some cases), in which
case we will want to send this to the back-end instead of creating a
pipe.
Hence the negotiation: If one side has a better idea than a plain
pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
pipe (the end indicated by EOF), the front-end invokes this function
to verify success.  There is no in-band way (through the pipe) to
indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
   include/hw/virtio/vhost-backend.h |  24 +
   include/hw/virtio/vhost.h |  79 
   hw/virtio/vhost-user.c| 147 ++
   hw/virtio/vhost.c |  37 
   4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
   } VhostSetConfigType;

+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

/* Suspend a device so it does not process virtqueue requests anymore
 *
 * After the return of ioctl the device must preserve all the necessary 
state
 * (the virtqueue vring base plus the possible device specific states) that 
is
 * required for restoring in the future. The device must not change its
 * configuration after that point.
 */
#define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

/* Resume a device so it can resume processing virtqueue requests
 *
 * After the return of this ioctl the device will have restored all the
 * necessary states and it is fully operational to continue processing the
 * virtqueue descriptors.
 */
#define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Hanna Czenczek

On 18.04.23 09:54, Eugenio Perez Martin wrote:

On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi  wrote:

On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin  wrote:

On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi  wrote:

On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:

On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
   This feature signals support for transferring state, and is added so
   that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
   over which to transfer the state.  The front-end sends an FD to the
   back-end into/from which it can write/read its state, and the back-end
   can decide to either use it, or reply with a different FD for the
   front-end to override the front-end's choice.
   The front-end creates a simple pipe to transfer the state, but maybe
   the back-end already has an FD into/from which it has to write/read
   its state, in which case it will want to override the simple pipe.
   Conversely, maybe in the future we find a way to have the front-end
   get an immediate FD for the migration stream (in some cases), in which
   case we will want to send this to the back-end instead of creating a
   pipe.
   Hence the negotiation: If one side has a better idea than a plain
   pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
   pipe (the end indicated by EOF), the front-end invokes this function
   to verify success.  There is no in-band way (through the pipe) to
   indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
  include/hw/virtio/vhost-backend.h |  24 +
  include/hw/virtio/vhost.h |  79 
  hw/virtio/vhost-user.c| 147 ++
  hw/virtio/vhost.c |  37 
  4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
  } VhostSetConfigType;

+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

   /* Suspend a device so it does not process virtqueue requests anymore
*
* After the return of ioctl the device must preserve all the necessary state
* (the virtqueue vring base plus the possible device specific states) that 
is
* required for restoring in the future. The device must not change its
* configuration after that point.
*/
   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

   /* Resume a device so it can resume processing virtqueue requests
*
* After the return of this ioctl the device will have restored all the
* necessary states and it is fully operational to continue processing the
* virtqueue descriptors.
*/
   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid
overlapping/duplicated functionality.


That's what I had in mind in the first versions. I proposed VHOST_STOP
instead of VHOST_VDPA_STOP for 

Re: [Virtio-fs] [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Stefan Hajnoczi
On Wed, 19 Apr 2023 at 06:57, Hanna Czenczek  wrote:
>
> On 18.04.23 19:59, Stefan Hajnoczi wrote:
> > On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> >> On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  wrote:
> >>> On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  
> >>> wrote:
>  On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  
>  wrote:
> > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> >> On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> >> wrote:
> >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
>  So-called "internal" virtio-fs migration refers to transporting the
>  back-end's (virtiofsd's) state through qemu's migration stream.  To 
>  do
>  this, we need to be able to transfer virtiofsd's internal state to 
>  and
>  from virtiofsd.
> 
>  Because virtiofsd's internal state will not be too large, we believe 
>  it
>  is best to transfer it as a single binary blob after the streaming
>  phase.  Because this method should be useful to other vhost-user
>  implementations, too, it is introduced as a general-purpose addition 
>  to
>  the protocol, not limited to vhost-user-fs.
> 
>  These are the additions to the protocol:
>  - New vhost-user protocol feature 
>  VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> This feature signals support for transferring state, and is added 
>  so
> that migration can fail early when the back-end has no support.
> 
>  - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
>  pipe
> over which to transfer the state.  The front-end sends an FD to 
>  the
> back-end into/from which it can write/read its state, and the 
>  back-end
> can decide to either use it, or reply with a different FD for the
> front-end to override the front-end's choice.
> The front-end creates a simple pipe to transfer the state, but 
>  maybe
> the back-end already has an FD into/from which it has to 
>  write/read
> its state, in which case it will want to override the simple pipe.
> Conversely, maybe in the future we find a way to have the 
>  front-end
> get an immediate FD for the migration stream (in some cases), in 
>  which
> case we will want to send this to the back-end instead of 
>  creating a
> pipe.
> Hence the negotiation: If one side has a better idea than a plain
> pipe, we will want to use that.
> 
>  - CHECK_DEVICE_STATE: After the state has been transferred through 
>  the
> pipe (the end indicated by EOF), the front-end invokes this 
>  function
> to verify success.  There is no in-band way (through the pipe) to
> indicate failure, so we need to check explicitly.
> 
>  Once the transfer pipe has been established via SET_DEVICE_STATE_FD
>  (which includes establishing the direction of transfer and migration
>  phase), the sending side writes its data into the pipe, and the 
>  reading
>  side reads it until it sees an EOF.  Then, the front-end will check 
>  for
>  success via CHECK_DEVICE_STATE, which on the destination side 
>  includes
>  checking for integrity (i.e. errors during deserialization).
> 
>  Suggested-by: Stefan Hajnoczi 
>  Signed-off-by: Hanna Czenczek 
>  ---
>    include/hw/virtio/vhost-backend.h |  24 +
>    include/hw/virtio/vhost.h |  79 
>    hw/virtio/vhost-user.c| 147 
>  ++
>    hw/virtio/vhost.c |  37 
>    4 files changed, 287 insertions(+)
> 
>  diff --git a/include/hw/virtio/vhost-backend.h 
>  b/include/hw/virtio/vhost-backend.h
>  index ec3fbae58d..5935b32fe3 100644
>  --- a/include/hw/virtio/vhost-backend.h
>  +++ b/include/hw/virtio/vhost-backend.h
>  @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
>    VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
>    } VhostSetConfigType;
> 
>  +typedef enum VhostDeviceStateDirection {
>  +/* Transfer state from back-end (device) to front-end */
>  +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
>  +/* Transfer state from front-end to back-end (device) */
>  +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
>  +} VhostDeviceStateDirection;
>  +
>  +typedef enum VhostDeviceStatePhase {
>  +/* The device (and all its vrings) is 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Stefan Hajnoczi
On Wed, 19 Apr 2023 at 06:45, Hanna Czenczek  wrote:
>
> On 14.04.23 17:17, Eugenio Perez Martin wrote:
> > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  wrote:
>
> [...]
>
> >> Basically, what I’m hearing is that I need to implement a different
> >> feature that has no practical impact right now, and also fix bugs around
> >> it along the way...
> >>
> > To fix this properly requires iterative device migration in qemu as
> > far as I know, instead of using VMStates [1]. This way the state is
> > requested to virtiofsd before the device reset.
> >
> > What does virtiofsd do when the state is totally sent? Does it keep
> > processing requests and generating new state or is only a one shot
> > that will suspend the daemon? If it is the second I think it still can
> > be done in one shot at the end, always indicating "no more state" at
> > save_live_pending and sending all the state at
> > save_live_complete_precopy.
>
> This sounds to me as if we should reset all devices during migration,
> and I don’t understand that.  virtiofsd will not immediately process
> requests when the state is sent, because the device is still stopped,
> but when it is re-enabled (e.g. because of a failed migration), it will
> have retained its state and continue processing requests as if nothing
> happened.  A reset would break this and other stateful back-ends, as I
> think Stefan has mentioned somewhere else.
>
> It seems to me as if there are devices that need a reset, and so need
> suspend+resume around it, but I also think there are back-ends that
> don’t, where this would only unnecessarily complicate the back-end
> implementation.

Existing vhost-user backends must continue working, so I think having
two code paths is (almost) unavoidable.

One approach is to add SUSPEND/RESUME to the vhost-user protocol with
a corresponding VHOST_USER_PROTOCOL_F_SUSPEND feature bit. vhost-user
frontends can identify backends that support SUSPEND/RESUME instead of
device reset. Old vhost-user backends will continue to use device
reset.

I said avoiding two code paths is almost unavoidable. It may be
possible to rely on existing VHOST_USER_GET_VRING_BASE's semantics (it
stops a single virtqueue) instead of SUSPEND. RESUME is replaced by
VHOST_USER_SET_VRING_* and gets the device going again. However, I'm
not 100% sure if this will work (even for all existing devices). It
would require carefully studying both the spec and various
implementations to see if it's viable. There's a chance of losing the
performance optimization that VHOST_USER_SET_STATUS provided to DPDK
if the device is not reset.

In my opinion SUSPEND/RESUME is the cleanest way to do this.

Stefan



Re: [Virtio-fs] [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Hanna Czenczek

On 18.04.23 19:59, Stefan Hajnoczi wrote:

On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:

On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  wrote:

On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  wrote:

On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  wrote:

On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:

On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
   This feature signals support for transferring state, and is added so
   that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
   over which to transfer the state.  The front-end sends an FD to the
   back-end into/from which it can write/read its state, and the back-end
   can decide to either use it, or reply with a different FD for the
   front-end to override the front-end's choice.
   The front-end creates a simple pipe to transfer the state, but maybe
   the back-end already has an FD into/from which it has to write/read
   its state, in which case it will want to override the simple pipe.
   Conversely, maybe in the future we find a way to have the front-end
   get an immediate FD for the migration stream (in some cases), in which
   case we will want to send this to the back-end instead of creating a
   pipe.
   Hence the negotiation: If one side has a better idea than a plain
   pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
   pipe (the end indicated by EOF), the front-end invokes this function
   to verify success.  There is no in-band way (through the pipe) to
   indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
  include/hw/virtio/vhost-backend.h |  24 +
  include/hw/virtio/vhost.h |  79 
  hw/virtio/vhost-user.c| 147 ++
  hw/virtio/vhost.c |  37 
  4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
  } VhostSetConfigType;

+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

   /* Suspend a device so it does not process virtqueue requests anymore
*
* After the return of ioctl the device must preserve all the necessary state
* (the virtqueue vring base plus the possible device specific states) that 
is
* required for restoring in the future. The device must not change its
* configuration after that point.
*/
   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

   /* Resume a device so it can resume processing virtqueue requests
*
* After the return of this ioctl the device will have restored all the
* necessary states and it is fully operational to continue processing the
* virtqueue descriptors.
*/
   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid
overlapping/duplicated functionality.


That's what I had in mind in the 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Hanna Czenczek

On 17.04.23 17:12, Stefan Hajnoczi wrote:

[...]


This brings to mind how iterative migration will work. The interface for
iterative migration is basically the same as non-iterative migration
plus a method to query the number of bytes remaining. When the number of
bytes falls below a threshold, the vCPUs are stopped and the remainder
of the data is read.

Some details from VFIO migration:
- The VMM must explicitly change the state when transitioning from
   iterative and non-iterative migration, but the data transfer fd
   remains the same.
- The state of the device (running, stopped, resuming, etc) doesn't
   change asynchronously, it's always driven by the VMM. However, setting
   the state can fail and then the new state may be an error state.

Mapping this to SET_DEVICE_STATE_FD:
- VhostDeviceStatePhase is extended with
   VHOST_TRANSFER_STATE_PHASE_RUNNING = 1 for iterative migration. The
   frontend sends SET_DEVICE_STATE_FD again with
   VHOST_TRANSFER_STATE_PHASE_STOPPED when entering non-iterative
   migration and the frontend sends the iterative fd from the previous
   SET_DEVICE_STATE_FD call to the backend. The backend may reply with
   another fd, if necessary. If the backend changes the fd, then the
   contents of the previous fd must be fully read and transferred before
   the contents of the new fd are migrated. (Maybe this is too complex
   and we should forbid changing the fd when going from RUNNING ->
   STOPPED.)
- CHECK_DEVICE_STATE can be extended to report the number of bytes
   remaining. The semantics change so that CHECK_DEVICE_STATE can be
   called while the VMM is still reading from the fd. It becomes:

 enum CheckDeviceStateResult {
 Saving(bytes_remaining : usize),
Failed(error_code : u64),
 }


Sounds good.  Personally, I’d forbid changing the FD when just changing 
state, which raises the question of whether there should then be a 
separate command for just changing the state (like VFIO_DEVICE_FEATURE 
..._MIG_DEVICE_STATE?), but that would be a question for then.


Changing the CHECK_DEVICE_STATE interface sounds good to me.

Hanna




Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-19 Thread Hanna Czenczek

On 14.04.23 17:17, Eugenio Perez Martin wrote:

On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  wrote:


[...]


Basically, what I’m hearing is that I need to implement a different
feature that has no practical impact right now, and also fix bugs around
it along the way...


To fix this properly requires iterative device migration in qemu as
far as I know, instead of using VMStates [1]. This way the state is
requested to virtiofsd before the device reset.

What does virtiofsd do when the state is totally sent? Does it keep
processing requests and generating new state or is only a one shot
that will suspend the daemon? If it is the second I think it still can
be done in one shot at the end, always indicating "no more state" at
save_live_pending and sending all the state at
save_live_complete_precopy.


This sounds to me as if we should reset all devices during migration, 
and I don’t understand that.  virtiofsd will not immediately process 
requests when the state is sent, because the device is still stopped, 
but when it is re-enabled (e.g. because of a failed migration), it will 
have retained its state and continue processing requests as if nothing 
happened.  A reset would break this and other stateful back-ends, as I 
think Stefan has mentioned somewhere else.


It seems to me as if there are devices that need a reset, and so need 
suspend+resume around it, but I also think there are back-ends that 
don’t, where this would only unnecessarily complicate the back-end 
implementation.


Hanna




Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-18 Thread Stefan Hajnoczi
On Tue, 18 Apr 2023 at 14:31, Eugenio Perez Martin  wrote:
>
> On Tue, Apr 18, 2023 at 7:59 PM Stefan Hajnoczi  wrote:
> >
> > On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> > > On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  
> > > wrote:
> > > >
> > > > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin 
> > > >  wrote:
> > > > >
> > > > > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  
> > > > > wrote:
> > > > > >
> > > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin 
> > > > > > wrote:
> > > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > > > > So-called "internal" virtio-fs migration refers to 
> > > > > > > > > transporting the
> > > > > > > > > back-end's (virtiofsd's) state through qemu's migration 
> > > > > > > > > stream.  To do
> > > > > > > > > this, we need to be able to transfer virtiofsd's internal 
> > > > > > > > > state to and
> > > > > > > > > from virtiofsd.
> > > > > > > > >
> > > > > > > > > Because virtiofsd's internal state will not be too large, we 
> > > > > > > > > believe it
> > > > > > > > > is best to transfer it as a single binary blob after the 
> > > > > > > > > streaming
> > > > > > > > > phase.  Because this method should be useful to other 
> > > > > > > > > vhost-user
> > > > > > > > > implementations, too, it is introduced as a general-purpose 
> > > > > > > > > addition to
> > > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > > >
> > > > > > > > > These are the additions to the protocol:
> > > > > > > > > - New vhost-user protocol feature 
> > > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > > >   This feature signals support for transferring state, and is 
> > > > > > > > > added so
> > > > > > > > >   that migration can fail early when the back-end has no 
> > > > > > > > > support.
> > > > > > > > >
> > > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end 
> > > > > > > > > negotiate a pipe
> > > > > > > > >   over which to transfer the state.  The front-end sends an 
> > > > > > > > > FD to the
> > > > > > > > >   back-end into/from which it can write/read its state, and 
> > > > > > > > > the back-end
> > > > > > > > >   can decide to either use it, or reply with a different FD 
> > > > > > > > > for the
> > > > > > > > >   front-end to override the front-end's choice.
> > > > > > > > >   The front-end creates a simple pipe to transfer the state, 
> > > > > > > > > but maybe
> > > > > > > > >   the back-end already has an FD into/from which it has to 
> > > > > > > > > write/read
> > > > > > > > >   its state, in which case it will want to override the 
> > > > > > > > > simple pipe.
> > > > > > > > >   Conversely, maybe in the future we find a way to have the 
> > > > > > > > > front-end
> > > > > > > > >   get an immediate FD for the migration stream (in some 
> > > > > > > > > cases), in which
> > > > > > > > >   case we will want to send this to the back-end instead of 
> > > > > > > > > creating a
> > > > > > > > >   pipe.
> > > > > > > > >   Hence the negotiation: If one side has a better idea than a 
> > > > > > > > > plain
> > > > > > > > >   pipe, we will want to use that.
> > > > > > > > >
> > > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred 
> > > > > > > > > through the
> > > > > > > > >   pipe (the end indicated by EOF), the front-end invokes this 
> > > > > > > > > function
> > > > > > > > >   to verify success.  There is no in-band way (through the 
> > > > > > > > > pipe) to
> > > > > > > > >   indicate failure, so we need to check explicitly.
> > > > > > > > >
> > > > > > > > > Once the transfer pipe has been established via 
> > > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > > (which includes establishing the direction of transfer and 
> > > > > > > > > migration
> > > > > > > > > phase), the sending side writes its data into the pipe, and 
> > > > > > > > > the reading
> > > > > > > > > side reads it until it sees an EOF.  Then, the front-end will 
> > > > > > > > > check for
> > > > > > > > > success via CHECK_DEVICE_STATE, which on the destination side 
> > > > > > > > > includes
> > > > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > > > >
> > > > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > > > ---
> > > > > > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > > > > > >  include/hw/virtio/vhost.h |  79 
> > > > > > > > >  hw/virtio/vhost-user.c| 147 
> > > > > > > > > ++
> > > > > > > > >  hw/virtio/vhost.c |  37 
> > > > > > > > >  4 files changed, 287 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > > > > > 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-18 Thread Eugenio Perez Martin
On Tue, Apr 18, 2023 at 7:59 PM Stefan Hajnoczi  wrote:
>
> On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> > On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  wrote:
> > >
> > > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  
> > > wrote:
> > > >
> > > > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  
> > > > wrote:
> > > > >
> > > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > > > So-called "internal" virtio-fs migration refers to transporting 
> > > > > > > > the
> > > > > > > > back-end's (virtiofsd's) state through qemu's migration stream. 
> > > > > > > >  To do
> > > > > > > > this, we need to be able to transfer virtiofsd's internal state 
> > > > > > > > to and
> > > > > > > > from virtiofsd.
> > > > > > > >
> > > > > > > > Because virtiofsd's internal state will not be too large, we 
> > > > > > > > believe it
> > > > > > > > is best to transfer it as a single binary blob after the 
> > > > > > > > streaming
> > > > > > > > phase.  Because this method should be useful to other vhost-user
> > > > > > > > implementations, too, it is introduced as a general-purpose 
> > > > > > > > addition to
> > > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > > >
> > > > > > > > These are the additions to the protocol:
> > > > > > > > - New vhost-user protocol feature 
> > > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > > >   This feature signals support for transferring state, and is 
> > > > > > > > added so
> > > > > > > >   that migration can fail early when the back-end has no 
> > > > > > > > support.
> > > > > > > >
> > > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end 
> > > > > > > > negotiate a pipe
> > > > > > > >   over which to transfer the state.  The front-end sends an FD 
> > > > > > > > to the
> > > > > > > >   back-end into/from which it can write/read its state, and the 
> > > > > > > > back-end
> > > > > > > >   can decide to either use it, or reply with a different FD for 
> > > > > > > > the
> > > > > > > >   front-end to override the front-end's choice.
> > > > > > > >   The front-end creates a simple pipe to transfer the state, 
> > > > > > > > but maybe
> > > > > > > >   the back-end already has an FD into/from which it has to 
> > > > > > > > write/read
> > > > > > > >   its state, in which case it will want to override the simple 
> > > > > > > > pipe.
> > > > > > > >   Conversely, maybe in the future we find a way to have the 
> > > > > > > > front-end
> > > > > > > >   get an immediate FD for the migration stream (in some cases), 
> > > > > > > > in which
> > > > > > > >   case we will want to send this to the back-end instead of 
> > > > > > > > creating a
> > > > > > > >   pipe.
> > > > > > > >   Hence the negotiation: If one side has a better idea than a 
> > > > > > > > plain
> > > > > > > >   pipe, we will want to use that.
> > > > > > > >
> > > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred 
> > > > > > > > through the
> > > > > > > >   pipe (the end indicated by EOF), the front-end invokes this 
> > > > > > > > function
> > > > > > > >   to verify success.  There is no in-band way (through the 
> > > > > > > > pipe) to
> > > > > > > >   indicate failure, so we need to check explicitly.
> > > > > > > >
> > > > > > > > Once the transfer pipe has been established via 
> > > > > > > > SET_DEVICE_STATE_FD
> > > > > > > > (which includes establishing the direction of transfer and 
> > > > > > > > migration
> > > > > > > > phase), the sending side writes its data into the pipe, and the 
> > > > > > > > reading
> > > > > > > > side reads it until it sees an EOF.  Then, the front-end will 
> > > > > > > > check for
> > > > > > > > success via CHECK_DEVICE_STATE, which on the destination side 
> > > > > > > > includes
> > > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > > >
> > > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > > ---
> > > > > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > > > > >  include/hw/virtio/vhost.h |  79 
> > > > > > > >  hw/virtio/vhost-user.c| 147 
> > > > > > > > ++
> > > > > > > >  hw/virtio/vhost.c |  37 
> > > > > > > >  4 files changed, 287 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > > > > b/include/hw/virtio/vhost-backend.h
> > > > > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > > > > >  

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-18 Thread Stefan Hajnoczi
On Tue, Apr 18, 2023 at 10:09:30AM +0200, Eugenio Perez Martin wrote:
> On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  wrote:
> >
> > On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  
> > wrote:
> > >
> > > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  
> > > wrote:
> > > >
> > > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi 
> > > > >  wrote:
> > > > > >
> > > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > > So-called "internal" virtio-fs migration refers to transporting 
> > > > > > > the
> > > > > > > back-end's (virtiofsd's) state through qemu's migration stream.  
> > > > > > > To do
> > > > > > > this, we need to be able to transfer virtiofsd's internal state 
> > > > > > > to and
> > > > > > > from virtiofsd.
> > > > > > >
> > > > > > > Because virtiofsd's internal state will not be too large, we 
> > > > > > > believe it
> > > > > > > is best to transfer it as a single binary blob after the streaming
> > > > > > > phase.  Because this method should be useful to other vhost-user
> > > > > > > implementations, too, it is introduced as a general-purpose 
> > > > > > > addition to
> > > > > > > the protocol, not limited to vhost-user-fs.
> > > > > > >
> > > > > > > These are the additions to the protocol:
> > > > > > > - New vhost-user protocol feature 
> > > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > > >   This feature signals support for transferring state, and is 
> > > > > > > added so
> > > > > > >   that migration can fail early when the back-end has no support.
> > > > > > >
> > > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate 
> > > > > > > a pipe
> > > > > > >   over which to transfer the state.  The front-end sends an FD to 
> > > > > > > the
> > > > > > >   back-end into/from which it can write/read its state, and the 
> > > > > > > back-end
> > > > > > >   can decide to either use it, or reply with a different FD for 
> > > > > > > the
> > > > > > >   front-end to override the front-end's choice.
> > > > > > >   The front-end creates a simple pipe to transfer the state, but 
> > > > > > > maybe
> > > > > > >   the back-end already has an FD into/from which it has to 
> > > > > > > write/read
> > > > > > >   its state, in which case it will want to override the simple 
> > > > > > > pipe.
> > > > > > >   Conversely, maybe in the future we find a way to have the 
> > > > > > > front-end
> > > > > > >   get an immediate FD for the migration stream (in some cases), 
> > > > > > > in which
> > > > > > >   case we will want to send this to the back-end instead of 
> > > > > > > creating a
> > > > > > >   pipe.
> > > > > > >   Hence the negotiation: If one side has a better idea than a 
> > > > > > > plain
> > > > > > >   pipe, we will want to use that.
> > > > > > >
> > > > > > > - CHECK_DEVICE_STATE: After the state has been transferred 
> > > > > > > through the
> > > > > > >   pipe (the end indicated by EOF), the front-end invokes this 
> > > > > > > function
> > > > > > >   to verify success.  There is no in-band way (through the pipe) 
> > > > > > > to
> > > > > > >   indicate failure, so we need to check explicitly.
> > > > > > >
> > > > > > > Once the transfer pipe has been established via 
> > > > > > > SET_DEVICE_STATE_FD
> > > > > > > (which includes establishing the direction of transfer and 
> > > > > > > migration
> > > > > > > phase), the sending side writes its data into the pipe, and the 
> > > > > > > reading
> > > > > > > side reads it until it sees an EOF.  Then, the front-end will 
> > > > > > > check for
> > > > > > > success via CHECK_DEVICE_STATE, which on the destination side 
> > > > > > > includes
> > > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > > >
> > > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > > ---
> > > > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > > > >  include/hw/virtio/vhost.h |  79 
> > > > > > >  hw/virtio/vhost-user.c| 147 
> > > > > > > ++
> > > > > > >  hw/virtio/vhost.c |  37 
> > > > > > >  4 files changed, 287 insertions(+)
> > > > > > >
> > > > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > > > b/include/hw/virtio/vhost-backend.h
> > > > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > > > > >  } VhostSetConfigType;
> > > > > > >
> > > > > > > +typedef enum VhostDeviceStateDirection {
> > > > > > > +/* Transfer state from back-end (device) to front-end */
> > > > > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > > > > +/* Transfer state 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-18 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 9:46 PM Stefan Hajnoczi  wrote:
>
> On Mon, 17 Apr 2023 at 15:12, Eugenio Perez Martin  
> wrote:
> >
> > On Mon, Apr 17, 2023 at 9:08 PM Stefan Hajnoczi  wrote:
> > >
> > > On Mon, 17 Apr 2023 at 14:56, Eugenio Perez Martin  
> > > wrote:
> > > >
> > > > On Mon, Apr 17, 2023 at 5:18 PM Stefan Hajnoczi  
> > > > wrote:
> > > > >
> > > > > On Fri, Apr 14, 2023 at 05:17:02PM +0200, Eugenio Perez Martin wrote:
> > > > > > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  
> > > > > > wrote:
> > > > > > >
> > > > > > > On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > > > > > > > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek 
> > > > > > > >  wrote:
> > > > > > > >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > > > > > > >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek 
> > > > > > > >>> wrote:
> > > > > > > Basically, what I’m hearing is that I need to implement a 
> > > > > > > different
> > > > > > > feature that has no practical impact right now, and also fix bugs 
> > > > > > > around
> > > > > > > it along the way...
> > > > > > >
> > > > > >
> > > > > > To fix this properly requires iterative device migration in qemu as
> > > > > > far as I know, instead of using VMStates [1]. This way the state is
> > > > > > requested to virtiofsd before the device reset.
> > > > >
> > > > > I don't follow. Many devices are fine with non-iterative migration. 
> > > > > They
> > > > > shouldn't be forced to do iterative migration.
> > > > >
> > > >
> > > > Sorry I think I didn't express myself well. I didn't mean to force
> > > > virtiofsd to support the iterative migration, but to use the device
> > > > iterative migration API in QEMU to send the needed commands before
> > > > vhost_dev_stop. In that regard, the device or the vhost-user commands
> > > > would not require changes.
> > > >
> > > > I think it is convenient in the long run for virtiofsd, as if the
> > > > state grows so much that it's not feasible to fetch it in one shot
> > > > there is no need to make changes in the qemu migration protocol. I
> > > > think it is not unlikely in virtiofs, but maybe I'm missing something
> > > > obvious and it's state will never grow.
> > >
> > > I don't understand. vCPUs are still running at that point and the
> > > device state could change. It's not safe to save the full device state
> > > until vCPUs have stopped (after vhost_dev_stop).
> > >
> >
> > I think the vCPU is already stopped at save_live_complete_precopy
> > callback. Maybe my understanding is wrong?
>
> Agreed, vCPUs are stopped in save_live_complete_precopy(). However,
> you wrote "use the device iterative migration API in QEMU to send the
> needed commands before vhost_dev_stop". save_live_complete_precopy()
> runs after vhost_dev_stop() so it doesn't seem to solve the problem.
>

You're right, and it actually makes the most sense.

So I guess this converges with the other thread, let's follow the
discussion there.

Thanks!




Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-18 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 9:33 PM Stefan Hajnoczi  wrote:
>
> On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  
> wrote:
> >
> > On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  wrote:
> > >
> > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > > > wrote:
> > > > >
> > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > > > back-end's (virtiofsd's) state through qemu's migration stream.  To 
> > > > > > do
> > > > > > this, we need to be able to transfer virtiofsd's internal state to 
> > > > > > and
> > > > > > from virtiofsd.
> > > > > >
> > > > > > Because virtiofsd's internal state will not be too large, we 
> > > > > > believe it
> > > > > > is best to transfer it as a single binary blob after the streaming
> > > > > > phase.  Because this method should be useful to other vhost-user
> > > > > > implementations, too, it is introduced as a general-purpose 
> > > > > > addition to
> > > > > > the protocol, not limited to vhost-user-fs.
> > > > > >
> > > > > > These are the additions to the protocol:
> > > > > > - New vhost-user protocol feature 
> > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > >   This feature signals support for transferring state, and is added 
> > > > > > so
> > > > > >   that migration can fail early when the back-end has no support.
> > > > > >
> > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> > > > > > pipe
> > > > > >   over which to transfer the state.  The front-end sends an FD to 
> > > > > > the
> > > > > >   back-end into/from which it can write/read its state, and the 
> > > > > > back-end
> > > > > >   can decide to either use it, or reply with a different FD for the
> > > > > >   front-end to override the front-end's choice.
> > > > > >   The front-end creates a simple pipe to transfer the state, but 
> > > > > > maybe
> > > > > >   the back-end already has an FD into/from which it has to 
> > > > > > write/read
> > > > > >   its state, in which case it will want to override the simple pipe.
> > > > > >   Conversely, maybe in the future we find a way to have the 
> > > > > > front-end
> > > > > >   get an immediate FD for the migration stream (in some cases), in 
> > > > > > which
> > > > > >   case we will want to send this to the back-end instead of 
> > > > > > creating a
> > > > > >   pipe.
> > > > > >   Hence the negotiation: If one side has a better idea than a plain
> > > > > >   pipe, we will want to use that.
> > > > > >
> > > > > > - CHECK_DEVICE_STATE: After the state has been transferred through 
> > > > > > the
> > > > > >   pipe (the end indicated by EOF), the front-end invokes this 
> > > > > > function
> > > > > >   to verify success.  There is no in-band way (through the pipe) to
> > > > > >   indicate failure, so we need to check explicitly.
> > > > > >
> > > > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > > > (which includes establishing the direction of transfer and migration
> > > > > > phase), the sending side writes its data into the pipe, and the 
> > > > > > reading
> > > > > > side reads it until it sees an EOF.  Then, the front-end will check 
> > > > > > for
> > > > > > success via CHECK_DEVICE_STATE, which on the destination side 
> > > > > > includes
> > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > >
> > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > ---
> > > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > > >  include/hw/virtio/vhost.h |  79 
> > > > > >  hw/virtio/vhost-user.c| 147 
> > > > > > ++
> > > > > >  hw/virtio/vhost.c |  37 
> > > > > >  4 files changed, 287 insertions(+)
> > > > > >
> > > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > > b/include/hw/virtio/vhost-backend.h
> > > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > > > >  } VhostSetConfigType;
> > > > > >
> > > > > > +typedef enum VhostDeviceStateDirection {
> > > > > > +/* Transfer state from back-end (device) to front-end */
> > > > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > > > +/* Transfer state from front-end to back-end (device) */
> > > > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > > > +} VhostDeviceStateDirection;
> > > > > > +
> > > > > > +typedef enum VhostDeviceStatePhase {
> > > > > > +/* The device (and all its vrings) is stopped */
> > > > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > > > +} VhostDeviceStatePhase;
> > > > >
> 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-18 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 9:21 PM Stefan Hajnoczi  wrote:
>
> On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin  
> wrote:
> >
> > On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi  wrote:
> > >
> > > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > > > wrote:
> > > > >
> > > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > > > back-end's (virtiofsd's) state through qemu's migration stream.  To 
> > > > > > do
> > > > > > this, we need to be able to transfer virtiofsd's internal state to 
> > > > > > and
> > > > > > from virtiofsd.
> > > > > >
> > > > > > Because virtiofsd's internal state will not be too large, we 
> > > > > > believe it
> > > > > > is best to transfer it as a single binary blob after the streaming
> > > > > > phase.  Because this method should be useful to other vhost-user
> > > > > > implementations, too, it is introduced as a general-purpose 
> > > > > > addition to
> > > > > > the protocol, not limited to vhost-user-fs.
> > > > > >
> > > > > > These are the additions to the protocol:
> > > > > > - New vhost-user protocol feature 
> > > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > > >   This feature signals support for transferring state, and is added 
> > > > > > so
> > > > > >   that migration can fail early when the back-end has no support.
> > > > > >
> > > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> > > > > > pipe
> > > > > >   over which to transfer the state.  The front-end sends an FD to 
> > > > > > the
> > > > > >   back-end into/from which it can write/read its state, and the 
> > > > > > back-end
> > > > > >   can decide to either use it, or reply with a different FD for the
> > > > > >   front-end to override the front-end's choice.
> > > > > >   The front-end creates a simple pipe to transfer the state, but 
> > > > > > maybe
> > > > > >   the back-end already has an FD into/from which it has to 
> > > > > > write/read
> > > > > >   its state, in which case it will want to override the simple pipe.
> > > > > >   Conversely, maybe in the future we find a way to have the 
> > > > > > front-end
> > > > > >   get an immediate FD for the migration stream (in some cases), in 
> > > > > > which
> > > > > >   case we will want to send this to the back-end instead of 
> > > > > > creating a
> > > > > >   pipe.
> > > > > >   Hence the negotiation: If one side has a better idea than a plain
> > > > > >   pipe, we will want to use that.
> > > > > >
> > > > > > - CHECK_DEVICE_STATE: After the state has been transferred through 
> > > > > > the
> > > > > >   pipe (the end indicated by EOF), the front-end invokes this 
> > > > > > function
> > > > > >   to verify success.  There is no in-band way (through the pipe) to
> > > > > >   indicate failure, so we need to check explicitly.
> > > > > >
> > > > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > > > (which includes establishing the direction of transfer and migration
> > > > > > phase), the sending side writes its data into the pipe, and the 
> > > > > > reading
> > > > > > side reads it until it sees an EOF.  Then, the front-end will check 
> > > > > > for
> > > > > > success via CHECK_DEVICE_STATE, which on the destination side 
> > > > > > includes
> > > > > > checking for integrity (i.e. errors during deserialization).
> > > > > >
> > > > > > Suggested-by: Stefan Hajnoczi 
> > > > > > Signed-off-by: Hanna Czenczek 
> > > > > > ---
> > > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > > >  include/hw/virtio/vhost.h |  79 
> > > > > >  hw/virtio/vhost-user.c| 147 
> > > > > > ++
> > > > > >  hw/virtio/vhost.c |  37 
> > > > > >  4 files changed, 287 insertions(+)
> > > > > >
> > > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > > b/include/hw/virtio/vhost-backend.h
> > > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > > > >  } VhostSetConfigType;
> > > > > >
> > > > > > +typedef enum VhostDeviceStateDirection {
> > > > > > +/* Transfer state from back-end (device) to front-end */
> > > > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > > > +/* Transfer state from front-end to back-end (device) */
> > > > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > > > +} VhostDeviceStateDirection;
> > > > > > +
> > > > > > +typedef enum VhostDeviceStatePhase {
> > > > > > +/* The device (and all its vrings) is stopped */
> > > > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > > > +} VhostDeviceStatePhase;
> > > > >
> 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Mon, 17 Apr 2023 at 15:12, Eugenio Perez Martin  wrote:
>
> On Mon, Apr 17, 2023 at 9:08 PM Stefan Hajnoczi  wrote:
> >
> > On Mon, 17 Apr 2023 at 14:56, Eugenio Perez Martin  
> > wrote:
> > >
> > > On Mon, Apr 17, 2023 at 5:18 PM Stefan Hajnoczi  
> > > wrote:
> > > >
> > > > On Fri, Apr 14, 2023 at 05:17:02PM +0200, Eugenio Perez Martin wrote:
> > > > > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  
> > > > > wrote:
> > > > > >
> > > > > > On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > > > > > > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  
> > > > > > > wrote:
> > > > > > >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > > > > > >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > > Basically, what I’m hearing is that I need to implement a different
> > > > > > feature that has no practical impact right now, and also fix bugs 
> > > > > > around
> > > > > > it along the way...
> > > > > >
> > > > >
> > > > > To fix this properly requires iterative device migration in qemu as
> > > > > far as I know, instead of using VMStates [1]. This way the state is
> > > > > requested to virtiofsd before the device reset.
> > > >
> > > > I don't follow. Many devices are fine with non-iterative migration. They
> > > > shouldn't be forced to do iterative migration.
> > > >
> > >
> > > Sorry I think I didn't express myself well. I didn't mean to force
> > > virtiofsd to support the iterative migration, but to use the device
> > > iterative migration API in QEMU to send the needed commands before
> > > vhost_dev_stop. In that regard, the device or the vhost-user commands
> > > would not require changes.
> > >
> > > I think it is convenient in the long run for virtiofsd, as if the
> > > state grows so much that it's not feasible to fetch it in one shot
> > > there is no need to make changes in the qemu migration protocol. I
> > > think it is not unlikely in virtiofs, but maybe I'm missing something
> > > obvious and it's state will never grow.
> >
> > I don't understand. vCPUs are still running at that point and the
> > device state could change. It's not safe to save the full device state
> > until vCPUs have stopped (after vhost_dev_stop).
> >
>
> I think the vCPU is already stopped at save_live_complete_precopy
> callback. Maybe my understanding is wrong?

Agreed, vCPUs are stopped in save_live_complete_precopy(). However,
you wrote "use the device iterative migration API in QEMU to send the
needed commands before vhost_dev_stop". save_live_complete_precopy()
runs after vhost_dev_stop() so it doesn't seem to solve the problem.

Stefan



Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Mon, 17 Apr 2023 at 15:10, Eugenio Perez Martin  wrote:
>
> On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  wrote:
> >
> > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > > wrote:
> > > >
> > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > > > this, we need to be able to transfer virtiofsd's internal state to and
> > > > > from virtiofsd.
> > > > >
> > > > > Because virtiofsd's internal state will not be too large, we believe 
> > > > > it
> > > > > is best to transfer it as a single binary blob after the streaming
> > > > > phase.  Because this method should be useful to other vhost-user
> > > > > implementations, too, it is introduced as a general-purpose addition 
> > > > > to
> > > > > the protocol, not limited to vhost-user-fs.
> > > > >
> > > > > These are the additions to the protocol:
> > > > > - New vhost-user protocol feature 
> > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > >   This feature signals support for transferring state, and is added so
> > > > >   that migration can fail early when the back-end has no support.
> > > > >
> > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> > > > > pipe
> > > > >   over which to transfer the state.  The front-end sends an FD to the
> > > > >   back-end into/from which it can write/read its state, and the 
> > > > > back-end
> > > > >   can decide to either use it, or reply with a different FD for the
> > > > >   front-end to override the front-end's choice.
> > > > >   The front-end creates a simple pipe to transfer the state, but maybe
> > > > >   the back-end already has an FD into/from which it has to write/read
> > > > >   its state, in which case it will want to override the simple pipe.
> > > > >   Conversely, maybe in the future we find a way to have the front-end
> > > > >   get an immediate FD for the migration stream (in some cases), in 
> > > > > which
> > > > >   case we will want to send this to the back-end instead of creating a
> > > > >   pipe.
> > > > >   Hence the negotiation: If one side has a better idea than a plain
> > > > >   pipe, we will want to use that.
> > > > >
> > > > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > > > >   pipe (the end indicated by EOF), the front-end invokes this function
> > > > >   to verify success.  There is no in-band way (through the pipe) to
> > > > >   indicate failure, so we need to check explicitly.
> > > > >
> > > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > > (which includes establishing the direction of transfer and migration
> > > > > phase), the sending side writes its data into the pipe, and the 
> > > > > reading
> > > > > side reads it until it sees an EOF.  Then, the front-end will check 
> > > > > for
> > > > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > > > checking for integrity (i.e. errors during deserialization).
> > > > >
> > > > > Suggested-by: Stefan Hajnoczi 
> > > > > Signed-off-by: Hanna Czenczek 
> > > > > ---
> > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > >  include/hw/virtio/vhost.h |  79 
> > > > >  hw/virtio/vhost-user.c| 147 
> > > > > ++
> > > > >  hw/virtio/vhost.c |  37 
> > > > >  4 files changed, 287 insertions(+)
> > > > >
> > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > b/include/hw/virtio/vhost-backend.h
> > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > > >  } VhostSetConfigType;
> > > > >
> > > > > +typedef enum VhostDeviceStateDirection {
> > > > > +/* Transfer state from back-end (device) to front-end */
> > > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > > +/* Transfer state from front-end to back-end (device) */
> > > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > > +} VhostDeviceStateDirection;
> > > > > +
> > > > > +typedef enum VhostDeviceStatePhase {
> > > > > +/* The device (and all its vrings) is stopped */
> > > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > > +} VhostDeviceStatePhase;
> > > >
> > > > vDPA has:
> > > >
> > > >   /* Suspend a device so it does not process virtqueue requests anymore
> > > >*
> > > >* After the return of ioctl the device must preserve all the 
> > > > necessary state
> > > >* (the virtqueue vring base plus the possible device specific 
> > > > states) that is
> > > >* required for restoring in the future. The device must not 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Mon, 17 Apr 2023 at 15:08, Eugenio Perez Martin  wrote:
>
> On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi  wrote:
> >
> > On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > > wrote:
> > > >
> > > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > > > this, we need to be able to transfer virtiofsd's internal state to and
> > > > > from virtiofsd.
> > > > >
> > > > > Because virtiofsd's internal state will not be too large, we believe 
> > > > > it
> > > > > is best to transfer it as a single binary blob after the streaming
> > > > > phase.  Because this method should be useful to other vhost-user
> > > > > implementations, too, it is introduced as a general-purpose addition 
> > > > > to
> > > > > the protocol, not limited to vhost-user-fs.
> > > > >
> > > > > These are the additions to the protocol:
> > > > > - New vhost-user protocol feature 
> > > > > VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > > >   This feature signals support for transferring state, and is added so
> > > > >   that migration can fail early when the back-end has no support.
> > > > >
> > > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> > > > > pipe
> > > > >   over which to transfer the state.  The front-end sends an FD to the
> > > > >   back-end into/from which it can write/read its state, and the 
> > > > > back-end
> > > > >   can decide to either use it, or reply with a different FD for the
> > > > >   front-end to override the front-end's choice.
> > > > >   The front-end creates a simple pipe to transfer the state, but maybe
> > > > >   the back-end already has an FD into/from which it has to write/read
> > > > >   its state, in which case it will want to override the simple pipe.
> > > > >   Conversely, maybe in the future we find a way to have the front-end
> > > > >   get an immediate FD for the migration stream (in some cases), in 
> > > > > which
> > > > >   case we will want to send this to the back-end instead of creating a
> > > > >   pipe.
> > > > >   Hence the negotiation: If one side has a better idea than a plain
> > > > >   pipe, we will want to use that.
> > > > >
> > > > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > > > >   pipe (the end indicated by EOF), the front-end invokes this function
> > > > >   to verify success.  There is no in-band way (through the pipe) to
> > > > >   indicate failure, so we need to check explicitly.
> > > > >
> > > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > > (which includes establishing the direction of transfer and migration
> > > > > phase), the sending side writes its data into the pipe, and the 
> > > > > reading
> > > > > side reads it until it sees an EOF.  Then, the front-end will check 
> > > > > for
> > > > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > > > checking for integrity (i.e. errors during deserialization).
> > > > >
> > > > > Suggested-by: Stefan Hajnoczi 
> > > > > Signed-off-by: Hanna Czenczek 
> > > > > ---
> > > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > > >  include/hw/virtio/vhost.h |  79 
> > > > >  hw/virtio/vhost-user.c| 147 
> > > > > ++
> > > > >  hw/virtio/vhost.c |  37 
> > > > >  4 files changed, 287 insertions(+)
> > > > >
> > > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > > b/include/hw/virtio/vhost-backend.h
> > > > > index ec3fbae58d..5935b32fe3 100644
> > > > > --- a/include/hw/virtio/vhost-backend.h
> > > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > > >  } VhostSetConfigType;
> > > > >
> > > > > +typedef enum VhostDeviceStateDirection {
> > > > > +/* Transfer state from back-end (device) to front-end */
> > > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > > +/* Transfer state from front-end to back-end (device) */
> > > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > > +} VhostDeviceStateDirection;
> > > > > +
> > > > > +typedef enum VhostDeviceStatePhase {
> > > > > +/* The device (and all its vrings) is stopped */
> > > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > > +} VhostDeviceStatePhase;
> > > >
> > > > vDPA has:
> > > >
> > > >   /* Suspend a device so it does not process virtqueue requests anymore
> > > >*
> > > >* After the return of ioctl the device must preserve all the 
> > > > necessary state
> > > >* (the virtqueue vring base plus the possible device specific 
> > > > states) that is
> > > >* required for restoring in the future. The device must not 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 9:08 PM Stefan Hajnoczi  wrote:
>
> On Mon, 17 Apr 2023 at 14:56, Eugenio Perez Martin  
> wrote:
> >
> > On Mon, Apr 17, 2023 at 5:18 PM Stefan Hajnoczi  wrote:
> > >
> > > On Fri, Apr 14, 2023 at 05:17:02PM +0200, Eugenio Perez Martin wrote:
> > > > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  
> > > > wrote:
> > > > >
> > > > > On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > > > > > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  
> > > > > > wrote:
> > > > > >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > > > > >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > > Basically, what I’m hearing is that I need to implement a different
> > > > > feature that has no practical impact right now, and also fix bugs 
> > > > > around
> > > > > it along the way...
> > > > >
> > > >
> > > > To fix this properly requires iterative device migration in qemu as
> > > > far as I know, instead of using VMStates [1]. This way the state is
> > > > requested to virtiofsd before the device reset.
> > >
> > > I don't follow. Many devices are fine with non-iterative migration. They
> > > shouldn't be forced to do iterative migration.
> > >
> >
> > Sorry I think I didn't express myself well. I didn't mean to force
> > virtiofsd to support the iterative migration, but to use the device
> > iterative migration API in QEMU to send the needed commands before
> > vhost_dev_stop. In that regard, the device or the vhost-user commands
> > would not require changes.
> >
> > I think it is convenient in the long run for virtiofsd, as if the
> > state grows so much that it's not feasible to fetch it in one shot
> > there is no need to make changes in the qemu migration protocol. I
> > think it is not unlikely in virtiofs, but maybe I'm missing something
> > obvious and it's state will never grow.
>
> I don't understand. vCPUs are still running at that point and the
> device state could change. It's not safe to save the full device state
> until vCPUs have stopped (after vhost_dev_stop).
>

I think the vCPU is already stopped at save_live_complete_precopy
callback. Maybe my understanding is wrong?

Thanks!

> If you're suggestion somehow doing non-iterative migration but during
> the iterative phase, then I don't think that's possible?
>
> Stefan
>




Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 5:38 PM Stefan Hajnoczi  wrote:
>
> On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > wrote:
> > >
> > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > > this, we need to be able to transfer virtiofsd's internal state to and
> > > > from virtiofsd.
> > > >
> > > > Because virtiofsd's internal state will not be too large, we believe it
> > > > is best to transfer it as a single binary blob after the streaming
> > > > phase.  Because this method should be useful to other vhost-user
> > > > implementations, too, it is introduced as a general-purpose addition to
> > > > the protocol, not limited to vhost-user-fs.
> > > >
> > > > These are the additions to the protocol:
> > > > - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > >   This feature signals support for transferring state, and is added so
> > > >   that migration can fail early when the back-end has no support.
> > > >
> > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> > > >   over which to transfer the state.  The front-end sends an FD to the
> > > >   back-end into/from which it can write/read its state, and the back-end
> > > >   can decide to either use it, or reply with a different FD for the
> > > >   front-end to override the front-end's choice.
> > > >   The front-end creates a simple pipe to transfer the state, but maybe
> > > >   the back-end already has an FD into/from which it has to write/read
> > > >   its state, in which case it will want to override the simple pipe.
> > > >   Conversely, maybe in the future we find a way to have the front-end
> > > >   get an immediate FD for the migration stream (in some cases), in which
> > > >   case we will want to send this to the back-end instead of creating a
> > > >   pipe.
> > > >   Hence the negotiation: If one side has a better idea than a plain
> > > >   pipe, we will want to use that.
> > > >
> > > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > > >   pipe (the end indicated by EOF), the front-end invokes this function
> > > >   to verify success.  There is no in-band way (through the pipe) to
> > > >   indicate failure, so we need to check explicitly.
> > > >
> > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > (which includes establishing the direction of transfer and migration
> > > > phase), the sending side writes its data into the pipe, and the reading
> > > > side reads it until it sees an EOF.  Then, the front-end will check for
> > > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > > checking for integrity (i.e. errors during deserialization).
> > > >
> > > > Suggested-by: Stefan Hajnoczi 
> > > > Signed-off-by: Hanna Czenczek 
> > > > ---
> > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > >  include/hw/virtio/vhost.h |  79 
> > > >  hw/virtio/vhost-user.c| 147 ++
> > > >  hw/virtio/vhost.c |  37 
> > > >  4 files changed, 287 insertions(+)
> > > >
> > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > b/include/hw/virtio/vhost-backend.h
> > > > index ec3fbae58d..5935b32fe3 100644
> > > > --- a/include/hw/virtio/vhost-backend.h
> > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > >  } VhostSetConfigType;
> > > >
> > > > +typedef enum VhostDeviceStateDirection {
> > > > +/* Transfer state from back-end (device) to front-end */
> > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > +/* Transfer state from front-end to back-end (device) */
> > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > +} VhostDeviceStateDirection;
> > > > +
> > > > +typedef enum VhostDeviceStatePhase {
> > > > +/* The device (and all its vrings) is stopped */
> > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > +} VhostDeviceStatePhase;
> > >
> > > vDPA has:
> > >
> > >   /* Suspend a device so it does not process virtqueue requests anymore
> > >*
> > >* After the return of ioctl the device must preserve all the necessary 
> > > state
> > >* (the virtqueue vring base plus the possible device specific states) 
> > > that is
> > >* required for restoring in the future. The device must not change its
> > >* configuration after that point.
> > >*/
> > >   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> > >
> > >   /* Resume a device so it can resume processing virtqueue requests
> > >*
> > >* After the return of this ioctl the device will have restored all the
> > >* necessary states and it is 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Mon, 17 Apr 2023 at 14:56, Eugenio Perez Martin  wrote:
>
> On Mon, Apr 17, 2023 at 5:18 PM Stefan Hajnoczi  wrote:
> >
> > On Fri, Apr 14, 2023 at 05:17:02PM +0200, Eugenio Perez Martin wrote:
> > > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  wrote:
> > > >
> > > > On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > > > > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  
> > > > > wrote:
> > > > >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > > > >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > Basically, what I’m hearing is that I need to implement a different
> > > > feature that has no practical impact right now, and also fix bugs around
> > > > it along the way...
> > > >
> > >
> > > To fix this properly requires iterative device migration in qemu as
> > > far as I know, instead of using VMStates [1]. This way the state is
> > > requested to virtiofsd before the device reset.
> >
> > I don't follow. Many devices are fine with non-iterative migration. They
> > shouldn't be forced to do iterative migration.
> >
>
> Sorry I think I didn't express myself well. I didn't mean to force
> virtiofsd to support the iterative migration, but to use the device
> iterative migration API in QEMU to send the needed commands before
> vhost_dev_stop. In that regard, the device or the vhost-user commands
> would not require changes.
>
> I think it is convenient in the long run for virtiofsd, as if the
> state grows so much that it's not feasible to fetch it in one shot
> there is no need to make changes in the qemu migration protocol. I
> think it is not unlikely in virtiofs, but maybe I'm missing something
> obvious and it's state will never grow.

I don't understand. vCPUs are still running at that point and the
device state could change. It's not safe to save the full device state
until vCPUs have stopped (after vhost_dev_stop).

If you're suggestion somehow doing non-iterative migration but during
the iterative phase, then I don't think that's possible?

Stefan



Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 7:14 PM Stefan Hajnoczi  wrote:
>
> On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > wrote:
> > >
> > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > > this, we need to be able to transfer virtiofsd's internal state to and
> > > > from virtiofsd.
> > > >
> > > > Because virtiofsd's internal state will not be too large, we believe it
> > > > is best to transfer it as a single binary blob after the streaming
> > > > phase.  Because this method should be useful to other vhost-user
> > > > implementations, too, it is introduced as a general-purpose addition to
> > > > the protocol, not limited to vhost-user-fs.
> > > >
> > > > These are the additions to the protocol:
> > > > - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > >   This feature signals support for transferring state, and is added so
> > > >   that migration can fail early when the back-end has no support.
> > > >
> > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> > > >   over which to transfer the state.  The front-end sends an FD to the
> > > >   back-end into/from which it can write/read its state, and the back-end
> > > >   can decide to either use it, or reply with a different FD for the
> > > >   front-end to override the front-end's choice.
> > > >   The front-end creates a simple pipe to transfer the state, but maybe
> > > >   the back-end already has an FD into/from which it has to write/read
> > > >   its state, in which case it will want to override the simple pipe.
> > > >   Conversely, maybe in the future we find a way to have the front-end
> > > >   get an immediate FD for the migration stream (in some cases), in which
> > > >   case we will want to send this to the back-end instead of creating a
> > > >   pipe.
> > > >   Hence the negotiation: If one side has a better idea than a plain
> > > >   pipe, we will want to use that.
> > > >
> > > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > > >   pipe (the end indicated by EOF), the front-end invokes this function
> > > >   to verify success.  There is no in-band way (through the pipe) to
> > > >   indicate failure, so we need to check explicitly.
> > > >
> > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > (which includes establishing the direction of transfer and migration
> > > > phase), the sending side writes its data into the pipe, and the reading
> > > > side reads it until it sees an EOF.  Then, the front-end will check for
> > > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > > checking for integrity (i.e. errors during deserialization).
> > > >
> > > > Suggested-by: Stefan Hajnoczi 
> > > > Signed-off-by: Hanna Czenczek 
> > > > ---
> > > >  include/hw/virtio/vhost-backend.h |  24 +
> > > >  include/hw/virtio/vhost.h |  79 
> > > >  hw/virtio/vhost-user.c| 147 ++
> > > >  hw/virtio/vhost.c |  37 
> > > >  4 files changed, 287 insertions(+)
> > > >
> > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > b/include/hw/virtio/vhost-backend.h
> > > > index ec3fbae58d..5935b32fe3 100644
> > > > --- a/include/hw/virtio/vhost-backend.h
> > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > >  } VhostSetConfigType;
> > > >
> > > > +typedef enum VhostDeviceStateDirection {
> > > > +/* Transfer state from back-end (device) to front-end */
> > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > +/* Transfer state from front-end to back-end (device) */
> > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > +} VhostDeviceStateDirection;
> > > > +
> > > > +typedef enum VhostDeviceStatePhase {
> > > > +/* The device (and all its vrings) is stopped */
> > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > +} VhostDeviceStatePhase;
> > >
> > > vDPA has:
> > >
> > >   /* Suspend a device so it does not process virtqueue requests anymore
> > >*
> > >* After the return of ioctl the device must preserve all the necessary 
> > > state
> > >* (the virtqueue vring base plus the possible device specific states) 
> > > that is
> > >* required for restoring in the future. The device must not change its
> > >* configuration after that point.
> > >*/
> > >   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> > >
> > >   /* Resume a device so it can resume processing virtqueue requests
> > >*
> > >* After the return of this ioctl the device will have restored all the
> > >* necessary states and it is 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Eugenio Perez Martin
On Mon, Apr 17, 2023 at 5:18 PM Stefan Hajnoczi  wrote:
>
> On Fri, Apr 14, 2023 at 05:17:02PM +0200, Eugenio Perez Martin wrote:
> > On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  wrote:
> > >
> > > On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > > > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  wrote:
> > > >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > > >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > >  So-called "internal" virtio-fs migration refers to transporting the
> > >  back-end's (virtiofsd's) state through qemu's migration stream.  To 
> > >  do
> > >  this, we need to be able to transfer virtiofsd's internal state to 
> > >  and
> > >  from virtiofsd.
> > > 
> > >  Because virtiofsd's internal state will not be too large, we believe 
> > >  it
> > >  is best to transfer it as a single binary blob after the streaming
> > >  phase.  Because this method should be useful to other vhost-user
> > >  implementations, too, it is introduced as a general-purpose addition 
> > >  to
> > >  the protocol, not limited to vhost-user-fs.
> > > 
> > >  These are the additions to the protocol:
> > >  - New vhost-user protocol feature 
> > >  VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > >  This feature signals support for transferring state, and is 
> > >  added so
> > >  that migration can fail early when the back-end has no support.
> > > 
> > >  - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a 
> > >  pipe
> > >  over which to transfer the state.  The front-end sends an FD to 
> > >  the
> > >  back-end into/from which it can write/read its state, and the 
> > >  back-end
> > >  can decide to either use it, or reply with a different FD for the
> > >  front-end to override the front-end's choice.
> > >  The front-end creates a simple pipe to transfer the state, but 
> > >  maybe
> > >  the back-end already has an FD into/from which it has to 
> > >  write/read
> > >  its state, in which case it will want to override the simple 
> > >  pipe.
> > >  Conversely, maybe in the future we find a way to have the 
> > >  front-end
> > >  get an immediate FD for the migration stream (in some cases), in 
> > >  which
> > >  case we will want to send this to the back-end instead of 
> > >  creating a
> > >  pipe.
> > >  Hence the negotiation: If one side has a better idea than a plain
> > >  pipe, we will want to use that.
> > > 
> > >  - CHECK_DEVICE_STATE: After the state has been transferred through 
> > >  the
> > >  pipe (the end indicated by EOF), the front-end invokes this 
> > >  function
> > >  to verify success.  There is no in-band way (through the pipe) to
> > >  indicate failure, so we need to check explicitly.
> > > 
> > >  Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > >  (which includes establishing the direction of transfer and migration
> > >  phase), the sending side writes its data into the pipe, and the 
> > >  reading
> > >  side reads it until it sees an EOF.  Then, the front-end will check 
> > >  for
> > >  success via CHECK_DEVICE_STATE, which on the destination side 
> > >  includes
> > >  checking for integrity (i.e. errors during deserialization).
> > > 
> > >  Suggested-by: Stefan Hajnoczi 
> > >  Signed-off-by: Hanna Czenczek 
> > >  ---
> > > include/hw/virtio/vhost-backend.h |  24 +
> > > include/hw/virtio/vhost.h |  79 
> > > hw/virtio/vhost-user.c| 147 
> > >  ++
> > > hw/virtio/vhost.c |  37 
> > > 4 files changed, 287 insertions(+)
> > > 
> > >  diff --git a/include/hw/virtio/vhost-backend.h 
> > >  b/include/hw/virtio/vhost-backend.h
> > >  index ec3fbae58d..5935b32fe3 100644
> > >  --- a/include/hw/virtio/vhost-backend.h
> > >  +++ b/include/hw/virtio/vhost-backend.h
> > >  @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > } VhostSetConfigType;
> > > 
> > >  +typedef enum VhostDeviceStateDirection {
> > >  +/* Transfer state from back-end (device) to front-end */
> > >  +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > >  +/* Transfer state from front-end to back-end (device) */
> > >  +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > >  +} VhostDeviceStateDirection;
> > >  +
> > >  +typedef enum VhostDeviceStatePhase {
> > >  +/* The device (and all its vrings) is stopped */
> > >  +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > >  +} VhostDeviceStatePhase;
> > > >>> vDPA has:
> > > >>>
> > > 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Eugenio Perez Martin
On Thu, Apr 13, 2023 at 7:32 PM Hanna Czenczek  wrote:
>
> On 13.04.23 12:14, Eugenio Perez Martin wrote:
> > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > wrote:
> >> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> >>> So-called "internal" virtio-fs migration refers to transporting the
> >>> back-end's (virtiofsd's) state through qemu's migration stream.  To do
> >>> this, we need to be able to transfer virtiofsd's internal state to and
> >>> from virtiofsd.
> >>>
> >>> Because virtiofsd's internal state will not be too large, we believe it
> >>> is best to transfer it as a single binary blob after the streaming
> >>> phase.  Because this method should be useful to other vhost-user
> >>> implementations, too, it is introduced as a general-purpose addition to
> >>> the protocol, not limited to vhost-user-fs.
> >>>
> >>> These are the additions to the protocol:
> >>> - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> >>>This feature signals support for transferring state, and is added so
> >>>that migration can fail early when the back-end has no support.
> >>>
> >>> - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> >>>over which to transfer the state.  The front-end sends an FD to the
> >>>back-end into/from which it can write/read its state, and the back-end
> >>>can decide to either use it, or reply with a different FD for the
> >>>front-end to override the front-end's choice.
> >>>The front-end creates a simple pipe to transfer the state, but maybe
> >>>the back-end already has an FD into/from which it has to write/read
> >>>its state, in which case it will want to override the simple pipe.
> >>>Conversely, maybe in the future we find a way to have the front-end
> >>>get an immediate FD for the migration stream (in some cases), in which
> >>>case we will want to send this to the back-end instead of creating a
> >>>pipe.
> >>>Hence the negotiation: If one side has a better idea than a plain
> >>>pipe, we will want to use that.
> >>>
> >>> - CHECK_DEVICE_STATE: After the state has been transferred through the
> >>>pipe (the end indicated by EOF), the front-end invokes this function
> >>>to verify success.  There is no in-band way (through the pipe) to
> >>>indicate failure, so we need to check explicitly.
> >>>
> >>> Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> >>> (which includes establishing the direction of transfer and migration
> >>> phase), the sending side writes its data into the pipe, and the reading
> >>> side reads it until it sees an EOF.  Then, the front-end will check for
> >>> success via CHECK_DEVICE_STATE, which on the destination side includes
> >>> checking for integrity (i.e. errors during deserialization).
> >>>
> >>> Suggested-by: Stefan Hajnoczi 
> >>> Signed-off-by: Hanna Czenczek 
> >>> ---
> >>>   include/hw/virtio/vhost-backend.h |  24 +
> >>>   include/hw/virtio/vhost.h |  79 
> >>>   hw/virtio/vhost-user.c| 147 ++
> >>>   hw/virtio/vhost.c |  37 
> >>>   4 files changed, 287 insertions(+)
> >>>
> >>> diff --git a/include/hw/virtio/vhost-backend.h 
> >>> b/include/hw/virtio/vhost-backend.h
> >>> index ec3fbae58d..5935b32fe3 100644
> >>> --- a/include/hw/virtio/vhost-backend.h
> >>> +++ b/include/hw/virtio/vhost-backend.h
> >>> @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> >>>   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> >>>   } VhostSetConfigType;
> >>>
> >>> +typedef enum VhostDeviceStateDirection {
> >>> +/* Transfer state from back-end (device) to front-end */
> >>> +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> >>> +/* Transfer state from front-end to back-end (device) */
> >>> +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> >>> +} VhostDeviceStateDirection;
> >>> +
> >>> +typedef enum VhostDeviceStatePhase {
> >>> +/* The device (and all its vrings) is stopped */
> >>> +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> >>> +} VhostDeviceStatePhase;
> >> vDPA has:
> >>
> >>/* Suspend a device so it does not process virtqueue requests anymore
> >> *
> >> * After the return of ioctl the device must preserve all the necessary 
> >> state
> >> * (the virtqueue vring base plus the possible device specific states) 
> >> that is
> >> * required for restoring in the future. The device must not change its
> >> * configuration after that point.
> >> */
> >>#define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> >>
> >>/* Resume a device so it can resume processing virtqueue requests
> >> *
> >> * After the return of this ioctl the device will have restored all the
> >> * necessary states and it is fully operational to continue processing 
> >> the
> >> * virtqueue descriptors.
> >> */
> >>#define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)
> 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:
> >
> > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > So-called "internal" virtio-fs migration refers to transporting the
> > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > this, we need to be able to transfer virtiofsd's internal state to and
> > > from virtiofsd.
> > >
> > > Because virtiofsd's internal state will not be too large, we believe it
> > > is best to transfer it as a single binary blob after the streaming
> > > phase.  Because this method should be useful to other vhost-user
> > > implementations, too, it is introduced as a general-purpose addition to
> > > the protocol, not limited to vhost-user-fs.
> > >
> > > These are the additions to the protocol:
> > > - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > >   This feature signals support for transferring state, and is added so
> > >   that migration can fail early when the back-end has no support.
> > >
> > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> > >   over which to transfer the state.  The front-end sends an FD to the
> > >   back-end into/from which it can write/read its state, and the back-end
> > >   can decide to either use it, or reply with a different FD for the
> > >   front-end to override the front-end's choice.
> > >   The front-end creates a simple pipe to transfer the state, but maybe
> > >   the back-end already has an FD into/from which it has to write/read
> > >   its state, in which case it will want to override the simple pipe.
> > >   Conversely, maybe in the future we find a way to have the front-end
> > >   get an immediate FD for the migration stream (in some cases), in which
> > >   case we will want to send this to the back-end instead of creating a
> > >   pipe.
> > >   Hence the negotiation: If one side has a better idea than a plain
> > >   pipe, we will want to use that.
> > >
> > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > >   pipe (the end indicated by EOF), the front-end invokes this function
> > >   to verify success.  There is no in-band way (through the pipe) to
> > >   indicate failure, so we need to check explicitly.
> > >
> > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > (which includes establishing the direction of transfer and migration
> > > phase), the sending side writes its data into the pipe, and the reading
> > > side reads it until it sees an EOF.  Then, the front-end will check for
> > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > checking for integrity (i.e. errors during deserialization).
> > >
> > > Suggested-by: Stefan Hajnoczi 
> > > Signed-off-by: Hanna Czenczek 
> > > ---
> > >  include/hw/virtio/vhost-backend.h |  24 +
> > >  include/hw/virtio/vhost.h |  79 
> > >  hw/virtio/vhost-user.c| 147 ++
> > >  hw/virtio/vhost.c |  37 
> > >  4 files changed, 287 insertions(+)
> > >
> > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > b/include/hw/virtio/vhost-backend.h
> > > index ec3fbae58d..5935b32fe3 100644
> > > --- a/include/hw/virtio/vhost-backend.h
> > > +++ b/include/hw/virtio/vhost-backend.h
> > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > >  } VhostSetConfigType;
> > >
> > > +typedef enum VhostDeviceStateDirection {
> > > +/* Transfer state from back-end (device) to front-end */
> > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > +/* Transfer state from front-end to back-end (device) */
> > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > +} VhostDeviceStateDirection;
> > > +
> > > +typedef enum VhostDeviceStatePhase {
> > > +/* The device (and all its vrings) is stopped */
> > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > +} VhostDeviceStatePhase;
> >
> > vDPA has:
> >
> >   /* Suspend a device so it does not process virtqueue requests anymore
> >*
> >* After the return of ioctl the device must preserve all the necessary 
> > state
> >* (the virtqueue vring base plus the possible device specific states) 
> > that is
> >* required for restoring in the future. The device must not change its
> >* configuration after that point.
> >*/
> >   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> >
> >   /* Resume a device so it can resume processing virtqueue requests
> >*
> >* After the return of this ioctl the device will have restored all the
> >* necessary states and it is fully operational to continue processing the
> >* virtqueue descriptors.
> >*/
> >   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)
> >
> > I wonder if it makes sense to import these into vhost-user so that the
> > difference between 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Thu, Apr 13, 2023 at 12:14:24PM +0200, Eugenio Perez Martin wrote:
> On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:
> >
> > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > So-called "internal" virtio-fs migration refers to transporting the
> > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > this, we need to be able to transfer virtiofsd's internal state to and
> > > from virtiofsd.
> > >
> > > Because virtiofsd's internal state will not be too large, we believe it
> > > is best to transfer it as a single binary blob after the streaming
> > > phase.  Because this method should be useful to other vhost-user
> > > implementations, too, it is introduced as a general-purpose addition to
> > > the protocol, not limited to vhost-user-fs.
> > >
> > > These are the additions to the protocol:
> > > - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > >   This feature signals support for transferring state, and is added so
> > >   that migration can fail early when the back-end has no support.
> > >
> > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> > >   over which to transfer the state.  The front-end sends an FD to the
> > >   back-end into/from which it can write/read its state, and the back-end
> > >   can decide to either use it, or reply with a different FD for the
> > >   front-end to override the front-end's choice.
> > >   The front-end creates a simple pipe to transfer the state, but maybe
> > >   the back-end already has an FD into/from which it has to write/read
> > >   its state, in which case it will want to override the simple pipe.
> > >   Conversely, maybe in the future we find a way to have the front-end
> > >   get an immediate FD for the migration stream (in some cases), in which
> > >   case we will want to send this to the back-end instead of creating a
> > >   pipe.
> > >   Hence the negotiation: If one side has a better idea than a plain
> > >   pipe, we will want to use that.
> > >
> > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > >   pipe (the end indicated by EOF), the front-end invokes this function
> > >   to verify success.  There is no in-band way (through the pipe) to
> > >   indicate failure, so we need to check explicitly.
> > >
> > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > (which includes establishing the direction of transfer and migration
> > > phase), the sending side writes its data into the pipe, and the reading
> > > side reads it until it sees an EOF.  Then, the front-end will check for
> > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > checking for integrity (i.e. errors during deserialization).
> > >
> > > Suggested-by: Stefan Hajnoczi 
> > > Signed-off-by: Hanna Czenczek 
> > > ---
> > >  include/hw/virtio/vhost-backend.h |  24 +
> > >  include/hw/virtio/vhost.h |  79 
> > >  hw/virtio/vhost-user.c| 147 ++
> > >  hw/virtio/vhost.c |  37 
> > >  4 files changed, 287 insertions(+)
> > >
> > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > b/include/hw/virtio/vhost-backend.h
> > > index ec3fbae58d..5935b32fe3 100644
> > > --- a/include/hw/virtio/vhost-backend.h
> > > +++ b/include/hw/virtio/vhost-backend.h
> > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > >  } VhostSetConfigType;
> > >
> > > +typedef enum VhostDeviceStateDirection {
> > > +/* Transfer state from back-end (device) to front-end */
> > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > +/* Transfer state from front-end to back-end (device) */
> > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > +} VhostDeviceStateDirection;
> > > +
> > > +typedef enum VhostDeviceStatePhase {
> > > +/* The device (and all its vrings) is stopped */
> > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > +} VhostDeviceStatePhase;
> >
> > vDPA has:
> >
> >   /* Suspend a device so it does not process virtqueue requests anymore
> >*
> >* After the return of ioctl the device must preserve all the necessary 
> > state
> >* (the virtqueue vring base plus the possible device specific states) 
> > that is
> >* required for restoring in the future. The device must not change its
> >* configuration after that point.
> >*/
> >   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> >
> >   /* Resume a device so it can resume processing virtqueue requests
> >*
> >* After the return of this ioctl the device will have restored all the
> >* necessary states and it is fully operational to continue processing the
> >* virtqueue descriptors.
> >*/
> >   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)
> >
> > I wonder if it makes sense to import these into vhost-user so that the
> > difference between 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Fri, Apr 14, 2023 at 05:17:02PM +0200, Eugenio Perez Martin wrote:
> On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  wrote:
> >
> > On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  wrote:
> > >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> >  So-called "internal" virtio-fs migration refers to transporting the
> >  back-end's (virtiofsd's) state through qemu's migration stream.  To do
> >  this, we need to be able to transfer virtiofsd's internal state to and
> >  from virtiofsd.
> > 
> >  Because virtiofsd's internal state will not be too large, we believe it
> >  is best to transfer it as a single binary blob after the streaming
> >  phase.  Because this method should be useful to other vhost-user
> >  implementations, too, it is introduced as a general-purpose addition to
> >  the protocol, not limited to vhost-user-fs.
> > 
> >  These are the additions to the protocol:
> >  - New vhost-user protocol feature 
> >  VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> >  This feature signals support for transferring state, and is added 
> >  so
> >  that migration can fail early when the back-end has no support.
> > 
> >  - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> >  over which to transfer the state.  The front-end sends an FD to the
> >  back-end into/from which it can write/read its state, and the 
> >  back-end
> >  can decide to either use it, or reply with a different FD for the
> >  front-end to override the front-end's choice.
> >  The front-end creates a simple pipe to transfer the state, but 
> >  maybe
> >  the back-end already has an FD into/from which it has to write/read
> >  its state, in which case it will want to override the simple pipe.
> >  Conversely, maybe in the future we find a way to have the front-end
> >  get an immediate FD for the migration stream (in some cases), in 
> >  which
> >  case we will want to send this to the back-end instead of creating 
> >  a
> >  pipe.
> >  Hence the negotiation: If one side has a better idea than a plain
> >  pipe, we will want to use that.
> > 
> >  - CHECK_DEVICE_STATE: After the state has been transferred through the
> >  pipe (the end indicated by EOF), the front-end invokes this 
> >  function
> >  to verify success.  There is no in-band way (through the pipe) to
> >  indicate failure, so we need to check explicitly.
> > 
> >  Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> >  (which includes establishing the direction of transfer and migration
> >  phase), the sending side writes its data into the pipe, and the reading
> >  side reads it until it sees an EOF.  Then, the front-end will check for
> >  success via CHECK_DEVICE_STATE, which on the destination side includes
> >  checking for integrity (i.e. errors during deserialization).
> > 
> >  Suggested-by: Stefan Hajnoczi 
> >  Signed-off-by: Hanna Czenczek 
> >  ---
> > include/hw/virtio/vhost-backend.h |  24 +
> > include/hw/virtio/vhost.h |  79 
> > hw/virtio/vhost-user.c| 147 
> >  ++
> > hw/virtio/vhost.c |  37 
> > 4 files changed, 287 insertions(+)
> > 
> >  diff --git a/include/hw/virtio/vhost-backend.h 
> >  b/include/hw/virtio/vhost-backend.h
> >  index ec3fbae58d..5935b32fe3 100644
> >  --- a/include/hw/virtio/vhost-backend.h
> >  +++ b/include/hw/virtio/vhost-backend.h
> >  @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > } VhostSetConfigType;
> > 
> >  +typedef enum VhostDeviceStateDirection {
> >  +/* Transfer state from back-end (device) to front-end */
> >  +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> >  +/* Transfer state from front-end to back-end (device) */
> >  +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> >  +} VhostDeviceStateDirection;
> >  +
> >  +typedef enum VhostDeviceStatePhase {
> >  +/* The device (and all its vrings) is stopped */
> >  +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> >  +} VhostDeviceStatePhase;
> > >>> vDPA has:
> > >>>
> > >>> /* Suspend a device so it does not process virtqueue requests 
> > >>> anymore
> > >>>  *
> > >>>  * After the return of ioctl the device must preserve all the 
> > >>> necessary state
> > >>>  * (the virtqueue vring base plus the possible device specific 
> > >>> states) that is
> > >>>  * required for restoring in the future. The device must not change 
> > >>> its
> 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-17 Thread Stefan Hajnoczi
On Thu, Apr 13, 2023 at 07:31:57PM +0200, Hanna Czenczek wrote:
> On 13.04.23 12:14, Eugenio Perez Martin wrote:
> > On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  
> > wrote:
> > > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > > > So-called "internal" virtio-fs migration refers to transporting the
> > > > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > > > this, we need to be able to transfer virtiofsd's internal state to and
> > > > from virtiofsd.
> > > > 
> > > > Because virtiofsd's internal state will not be too large, we believe it
> > > > is best to transfer it as a single binary blob after the streaming
> > > > phase.  Because this method should be useful to other vhost-user
> > > > implementations, too, it is introduced as a general-purpose addition to
> > > > the protocol, not limited to vhost-user-fs.
> > > > 
> > > > These are the additions to the protocol:
> > > > - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> > > >This feature signals support for transferring state, and is added so
> > > >that migration can fail early when the back-end has no support.
> > > > 
> > > > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> > > >over which to transfer the state.  The front-end sends an FD to the
> > > >back-end into/from which it can write/read its state, and the 
> > > > back-end
> > > >can decide to either use it, or reply with a different FD for the
> > > >front-end to override the front-end's choice.
> > > >The front-end creates a simple pipe to transfer the state, but maybe
> > > >the back-end already has an FD into/from which it has to write/read
> > > >its state, in which case it will want to override the simple pipe.
> > > >Conversely, maybe in the future we find a way to have the front-end
> > > >get an immediate FD for the migration stream (in some cases), in 
> > > > which
> > > >case we will want to send this to the back-end instead of creating a
> > > >pipe.
> > > >Hence the negotiation: If one side has a better idea than a plain
> > > >pipe, we will want to use that.
> > > > 
> > > > - CHECK_DEVICE_STATE: After the state has been transferred through the
> > > >pipe (the end indicated by EOF), the front-end invokes this function
> > > >to verify success.  There is no in-band way (through the pipe) to
> > > >indicate failure, so we need to check explicitly.
> > > > 
> > > > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > > > (which includes establishing the direction of transfer and migration
> > > > phase), the sending side writes its data into the pipe, and the reading
> > > > side reads it until it sees an EOF.  Then, the front-end will check for
> > > > success via CHECK_DEVICE_STATE, which on the destination side includes
> > > > checking for integrity (i.e. errors during deserialization).
> > > > 
> > > > Suggested-by: Stefan Hajnoczi 
> > > > Signed-off-by: Hanna Czenczek 
> > > > ---
> > > >   include/hw/virtio/vhost-backend.h |  24 +
> > > >   include/hw/virtio/vhost.h |  79 
> > > >   hw/virtio/vhost-user.c| 147 ++
> > > >   hw/virtio/vhost.c |  37 
> > > >   4 files changed, 287 insertions(+)
> > > > 
> > > > diff --git a/include/hw/virtio/vhost-backend.h 
> > > > b/include/hw/virtio/vhost-backend.h
> > > > index ec3fbae58d..5935b32fe3 100644
> > > > --- a/include/hw/virtio/vhost-backend.h
> > > > +++ b/include/hw/virtio/vhost-backend.h
> > > > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> > > >   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> > > >   } VhostSetConfigType;
> > > > 
> > > > +typedef enum VhostDeviceStateDirection {
> > > > +/* Transfer state from back-end (device) to front-end */
> > > > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > > > +/* Transfer state from front-end to back-end (device) */
> > > > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > > > +} VhostDeviceStateDirection;
> > > > +
> > > > +typedef enum VhostDeviceStatePhase {
> > > > +/* The device (and all its vrings) is stopped */
> > > > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > > > +} VhostDeviceStatePhase;
> > > vDPA has:
> > > 
> > >/* Suspend a device so it does not process virtqueue requests anymore
> > > *
> > > * After the return of ioctl the device must preserve all the 
> > > necessary state
> > > * (the virtqueue vring base plus the possible device specific states) 
> > > that is
> > > * required for restoring in the future. The device must not change its
> > > * configuration after that point.
> > > */
> > >#define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> > > 
> > >/* Resume a device so it can resume processing virtqueue requests
> > > *
> > > * After the return of this ioctl the device will have restored all the
> > 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-14 Thread Eugenio Perez Martin
On Thu, Apr 13, 2023 at 7:55 PM Hanna Czenczek  wrote:
>
> On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  wrote:
> >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
>  So-called "internal" virtio-fs migration refers to transporting the
>  back-end's (virtiofsd's) state through qemu's migration stream.  To do
>  this, we need to be able to transfer virtiofsd's internal state to and
>  from virtiofsd.
> 
>  Because virtiofsd's internal state will not be too large, we believe it
>  is best to transfer it as a single binary blob after the streaming
>  phase.  Because this method should be useful to other vhost-user
>  implementations, too, it is introduced as a general-purpose addition to
>  the protocol, not limited to vhost-user-fs.
> 
>  These are the additions to the protocol:
>  - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
>  This feature signals support for transferring state, and is added so
>  that migration can fail early when the back-end has no support.
> 
>  - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
>  over which to transfer the state.  The front-end sends an FD to the
>  back-end into/from which it can write/read its state, and the 
>  back-end
>  can decide to either use it, or reply with a different FD for the
>  front-end to override the front-end's choice.
>  The front-end creates a simple pipe to transfer the state, but maybe
>  the back-end already has an FD into/from which it has to write/read
>  its state, in which case it will want to override the simple pipe.
>  Conversely, maybe in the future we find a way to have the front-end
>  get an immediate FD for the migration stream (in some cases), in 
>  which
>  case we will want to send this to the back-end instead of creating a
>  pipe.
>  Hence the negotiation: If one side has a better idea than a plain
>  pipe, we will want to use that.
> 
>  - CHECK_DEVICE_STATE: After the state has been transferred through the
>  pipe (the end indicated by EOF), the front-end invokes this function
>  to verify success.  There is no in-band way (through the pipe) to
>  indicate failure, so we need to check explicitly.
> 
>  Once the transfer pipe has been established via SET_DEVICE_STATE_FD
>  (which includes establishing the direction of transfer and migration
>  phase), the sending side writes its data into the pipe, and the reading
>  side reads it until it sees an EOF.  Then, the front-end will check for
>  success via CHECK_DEVICE_STATE, which on the destination side includes
>  checking for integrity (i.e. errors during deserialization).
> 
>  Suggested-by: Stefan Hajnoczi 
>  Signed-off-by: Hanna Czenczek 
>  ---
> include/hw/virtio/vhost-backend.h |  24 +
> include/hw/virtio/vhost.h |  79 
> hw/virtio/vhost-user.c| 147 ++
> hw/virtio/vhost.c |  37 
> 4 files changed, 287 insertions(+)
> 
>  diff --git a/include/hw/virtio/vhost-backend.h 
>  b/include/hw/virtio/vhost-backend.h
>  index ec3fbae58d..5935b32fe3 100644
>  --- a/include/hw/virtio/vhost-backend.h
>  +++ b/include/hw/virtio/vhost-backend.h
>  @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> } VhostSetConfigType;
> 
>  +typedef enum VhostDeviceStateDirection {
>  +/* Transfer state from back-end (device) to front-end */
>  +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
>  +/* Transfer state from front-end to back-end (device) */
>  +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
>  +} VhostDeviceStateDirection;
>  +
>  +typedef enum VhostDeviceStatePhase {
>  +/* The device (and all its vrings) is stopped */
>  +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
>  +} VhostDeviceStatePhase;
> >>> vDPA has:
> >>>
> >>> /* Suspend a device so it does not process virtqueue requests anymore
> >>>  *
> >>>  * After the return of ioctl the device must preserve all the 
> >>> necessary state
> >>>  * (the virtqueue vring base plus the possible device specific 
> >>> states) that is
> >>>  * required for restoring in the future. The device must not change 
> >>> its
> >>>  * configuration after that point.
> >>>  */
> >>> #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> >>>
> >>> /* Resume a device so it can resume processing virtqueue requests
> >>>  *
> >>>  * After the return of this ioctl the device will have restored all 
> >>> the
> >>>  * 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Stefan Hajnoczi
On Thu, 13 Apr 2023 at 13:55, Hanna Czenczek  wrote:
>
> On 13.04.23 13:38, Stefan Hajnoczi wrote:
> > On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  wrote:
> >> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> >>> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
>  So-called "internal" virtio-fs migration refers to transporting the
>  back-end's (virtiofsd's) state through qemu's migration stream.  To do
>  this, we need to be able to transfer virtiofsd's internal state to and
>  from virtiofsd.
> 
>  Because virtiofsd's internal state will not be too large, we believe it
>  is best to transfer it as a single binary blob after the streaming
>  phase.  Because this method should be useful to other vhost-user
>  implementations, too, it is introduced as a general-purpose addition to
>  the protocol, not limited to vhost-user-fs.
> 
>  These are the additions to the protocol:
>  - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
>  This feature signals support for transferring state, and is added so
>  that migration can fail early when the back-end has no support.
> 
>  - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
>  over which to transfer the state.  The front-end sends an FD to the
>  back-end into/from which it can write/read its state, and the 
>  back-end
>  can decide to either use it, or reply with a different FD for the
>  front-end to override the front-end's choice.
>  The front-end creates a simple pipe to transfer the state, but maybe
>  the back-end already has an FD into/from which it has to write/read
>  its state, in which case it will want to override the simple pipe.
>  Conversely, maybe in the future we find a way to have the front-end
>  get an immediate FD for the migration stream (in some cases), in 
>  which
>  case we will want to send this to the back-end instead of creating a
>  pipe.
>  Hence the negotiation: If one side has a better idea than a plain
>  pipe, we will want to use that.
> 
>  - CHECK_DEVICE_STATE: After the state has been transferred through the
>  pipe (the end indicated by EOF), the front-end invokes this function
>  to verify success.  There is no in-band way (through the pipe) to
>  indicate failure, so we need to check explicitly.
> 
>  Once the transfer pipe has been established via SET_DEVICE_STATE_FD
>  (which includes establishing the direction of transfer and migration
>  phase), the sending side writes its data into the pipe, and the reading
>  side reads it until it sees an EOF.  Then, the front-end will check for
>  success via CHECK_DEVICE_STATE, which on the destination side includes
>  checking for integrity (i.e. errors during deserialization).
> 
>  Suggested-by: Stefan Hajnoczi 
>  Signed-off-by: Hanna Czenczek 
>  ---
> include/hw/virtio/vhost-backend.h |  24 +
> include/hw/virtio/vhost.h |  79 
> hw/virtio/vhost-user.c| 147 ++
> hw/virtio/vhost.c |  37 
> 4 files changed, 287 insertions(+)
> 
>  diff --git a/include/hw/virtio/vhost-backend.h 
>  b/include/hw/virtio/vhost-backend.h
>  index ec3fbae58d..5935b32fe3 100644
>  --- a/include/hw/virtio/vhost-backend.h
>  +++ b/include/hw/virtio/vhost-backend.h
>  @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> } VhostSetConfigType;
> 
>  +typedef enum VhostDeviceStateDirection {
>  +/* Transfer state from back-end (device) to front-end */
>  +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
>  +/* Transfer state from front-end to back-end (device) */
>  +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
>  +} VhostDeviceStateDirection;
>  +
>  +typedef enum VhostDeviceStatePhase {
>  +/* The device (and all its vrings) is stopped */
>  +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
>  +} VhostDeviceStatePhase;
> >>> vDPA has:
> >>>
> >>> /* Suspend a device so it does not process virtqueue requests anymore
> >>>  *
> >>>  * After the return of ioctl the device must preserve all the 
> >>> necessary state
> >>>  * (the virtqueue vring base plus the possible device specific 
> >>> states) that is
> >>>  * required for restoring in the future. The device must not change 
> >>> its
> >>>  * configuration after that point.
> >>>  */
> >>> #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> >>>
> >>> /* Resume a device so it can resume processing virtqueue requests
> >>>  *
> >>>  * After the return of this ioctl the device will have restored all 
> >>> the
> >>>  * 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Hanna Czenczek

On 13.04.23 13:38, Stefan Hajnoczi wrote:

On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  wrote:

On 12.04.23 23:06, Stefan Hajnoczi wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
This feature signals support for transferring state, and is added so
that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
over which to transfer the state.  The front-end sends an FD to the
back-end into/from which it can write/read its state, and the back-end
can decide to either use it, or reply with a different FD for the
front-end to override the front-end's choice.
The front-end creates a simple pipe to transfer the state, but maybe
the back-end already has an FD into/from which it has to write/read
its state, in which case it will want to override the simple pipe.
Conversely, maybe in the future we find a way to have the front-end
get an immediate FD for the migration stream (in some cases), in which
case we will want to send this to the back-end instead of creating a
pipe.
Hence the negotiation: If one side has a better idea than a plain
pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
pipe (the end indicated by EOF), the front-end invokes this function
to verify success.  There is no in-band way (through the pipe) to
indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
   include/hw/virtio/vhost-backend.h |  24 +
   include/hw/virtio/vhost.h |  79 
   hw/virtio/vhost-user.c| 147 ++
   hw/virtio/vhost.c |  37 
   4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
   } VhostSetConfigType;

+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

/* Suspend a device so it does not process virtqueue requests anymore
 *
 * After the return of ioctl the device must preserve all the necessary 
state
 * (the virtqueue vring base plus the possible device specific states) that 
is
 * required for restoring in the future. The device must not change its
 * configuration after that point.
 */
#define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

/* Resume a device so it can resume processing virtqueue requests
 *
 * After the return of this ioctl the device will have restored all the
 * necessary states and it is fully operational to continue processing the
 * virtqueue descriptors.
 */
#define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid
overlapping/duplicated functionality.

(And I hope vDPA will import the device state vhost-user messages
introduced in this series.)

I don’t understand your suggestion.  (Like, I very simply don’t
understand :))

These are vhost messages, right?  What purpose do you have in mind for
them in vhost-user for 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Hanna Czenczek

On 13.04.23 12:14, Eugenio Perez Martin wrote:

On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
   This feature signals support for transferring state, and is added so
   that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
   over which to transfer the state.  The front-end sends an FD to the
   back-end into/from which it can write/read its state, and the back-end
   can decide to either use it, or reply with a different FD for the
   front-end to override the front-end's choice.
   The front-end creates a simple pipe to transfer the state, but maybe
   the back-end already has an FD into/from which it has to write/read
   its state, in which case it will want to override the simple pipe.
   Conversely, maybe in the future we find a way to have the front-end
   get an immediate FD for the migration stream (in some cases), in which
   case we will want to send this to the back-end instead of creating a
   pipe.
   Hence the negotiation: If one side has a better idea than a plain
   pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
   pipe (the end indicated by EOF), the front-end invokes this function
   to verify success.  There is no in-band way (through the pipe) to
   indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
  include/hw/virtio/vhost-backend.h |  24 +
  include/hw/virtio/vhost.h |  79 
  hw/virtio/vhost-user.c| 147 ++
  hw/virtio/vhost.c |  37 
  4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
  } VhostSetConfigType;

+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

   /* Suspend a device so it does not process virtqueue requests anymore
*
* After the return of ioctl the device must preserve all the necessary state
* (the virtqueue vring base plus the possible device specific states) that 
is
* required for restoring in the future. The device must not change its
* configuration after that point.
*/
   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

   /* Resume a device so it can resume processing virtqueue requests
*
* After the return of this ioctl the device will have restored all the
* necessary states and it is fully operational to continue processing the
* virtqueue descriptors.
*/
   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid
overlapping/duplicated functionality.


That's what I had in mind in the first versions. I proposed VHOST_STOP
instead of VHOST_VDPA_STOP for this very reason. Later it did change
to SUSPEND.

Generally it is better if we make the interface less parametrized and
we trust in the messages and its semantics in my opinion. In other
words, instead of

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Stefan Hajnoczi
On Thu, 13 Apr 2023 at 05:24, Hanna Czenczek  wrote:
>
> On 12.04.23 23:06, Stefan Hajnoczi wrote:
> > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> >> So-called "internal" virtio-fs migration refers to transporting the
> >> back-end's (virtiofsd's) state through qemu's migration stream.  To do
> >> this, we need to be able to transfer virtiofsd's internal state to and
> >> from virtiofsd.
> >>
> >> Because virtiofsd's internal state will not be too large, we believe it
> >> is best to transfer it as a single binary blob after the streaming
> >> phase.  Because this method should be useful to other vhost-user
> >> implementations, too, it is introduced as a general-purpose addition to
> >> the protocol, not limited to vhost-user-fs.
> >>
> >> These are the additions to the protocol:
> >> - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> >>This feature signals support for transferring state, and is added so
> >>that migration can fail early when the back-end has no support.
> >>
> >> - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> >>over which to transfer the state.  The front-end sends an FD to the
> >>back-end into/from which it can write/read its state, and the back-end
> >>can decide to either use it, or reply with a different FD for the
> >>front-end to override the front-end's choice.
> >>The front-end creates a simple pipe to transfer the state, but maybe
> >>the back-end already has an FD into/from which it has to write/read
> >>its state, in which case it will want to override the simple pipe.
> >>Conversely, maybe in the future we find a way to have the front-end
> >>get an immediate FD for the migration stream (in some cases), in which
> >>case we will want to send this to the back-end instead of creating a
> >>pipe.
> >>Hence the negotiation: If one side has a better idea than a plain
> >>pipe, we will want to use that.
> >>
> >> - CHECK_DEVICE_STATE: After the state has been transferred through the
> >>pipe (the end indicated by EOF), the front-end invokes this function
> >>to verify success.  There is no in-band way (through the pipe) to
> >>indicate failure, so we need to check explicitly.
> >>
> >> Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> >> (which includes establishing the direction of transfer and migration
> >> phase), the sending side writes its data into the pipe, and the reading
> >> side reads it until it sees an EOF.  Then, the front-end will check for
> >> success via CHECK_DEVICE_STATE, which on the destination side includes
> >> checking for integrity (i.e. errors during deserialization).
> >>
> >> Suggested-by: Stefan Hajnoczi 
> >> Signed-off-by: Hanna Czenczek 
> >> ---
> >>   include/hw/virtio/vhost-backend.h |  24 +
> >>   include/hw/virtio/vhost.h |  79 
> >>   hw/virtio/vhost-user.c| 147 ++
> >>   hw/virtio/vhost.c |  37 
> >>   4 files changed, 287 insertions(+)
> >>
> >> diff --git a/include/hw/virtio/vhost-backend.h 
> >> b/include/hw/virtio/vhost-backend.h
> >> index ec3fbae58d..5935b32fe3 100644
> >> --- a/include/hw/virtio/vhost-backend.h
> >> +++ b/include/hw/virtio/vhost-backend.h
> >> @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> >>   VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> >>   } VhostSetConfigType;
> >>
> >> +typedef enum VhostDeviceStateDirection {
> >> +/* Transfer state from back-end (device) to front-end */
> >> +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> >> +/* Transfer state from front-end to back-end (device) */
> >> +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> >> +} VhostDeviceStateDirection;
> >> +
> >> +typedef enum VhostDeviceStatePhase {
> >> +/* The device (and all its vrings) is stopped */
> >> +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> >> +} VhostDeviceStatePhase;
> > vDPA has:
> >
> >/* Suspend a device so it does not process virtqueue requests anymore
> > *
> > * After the return of ioctl the device must preserve all the necessary 
> > state
> > * (the virtqueue vring base plus the possible device specific states) 
> > that is
> > * required for restoring in the future. The device must not change its
> > * configuration after that point.
> > */
> >#define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
> >
> >/* Resume a device so it can resume processing virtqueue requests
> > *
> > * After the return of this ioctl the device will have restored all the
> > * necessary states and it is fully operational to continue processing 
> > the
> > * virtqueue descriptors.
> > */
> >#define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)
> >
> > I wonder if it makes sense to import these into vhost-user so that the
> > difference between kernel vhost and vhost-user is minimized. It's okay
> > if one of 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Stefan Hajnoczi
On Thu, 13 Apr 2023 at 06:15, Eugenio Perez Martin  wrote:
> On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:
> > On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > (And I hope vDPA will import the device state vhost-user messages
> > introduced in this series.)
> >
>
> I guess they will be needed for vdpa-fs devices? Is there any emulated
> virtio-fs in qemu?

Maybe also virtio-gpu or virtio-crypto, if someone decides to create
hardware or in-kernel implementations.

virtiofs is not built into QEMU, there are only vhost-user implementations.

Stefan



Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Eugenio Perez Martin
On Wed, Apr 12, 2023 at 11:06 PM Stefan Hajnoczi  wrote:
>
> On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> > So-called "internal" virtio-fs migration refers to transporting the
> > back-end's (virtiofsd's) state through qemu's migration stream.  To do
> > this, we need to be able to transfer virtiofsd's internal state to and
> > from virtiofsd.
> >
> > Because virtiofsd's internal state will not be too large, we believe it
> > is best to transfer it as a single binary blob after the streaming
> > phase.  Because this method should be useful to other vhost-user
> > implementations, too, it is introduced as a general-purpose addition to
> > the protocol, not limited to vhost-user-fs.
> >
> > These are the additions to the protocol:
> > - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
> >   This feature signals support for transferring state, and is added so
> >   that migration can fail early when the back-end has no support.
> >
> > - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
> >   over which to transfer the state.  The front-end sends an FD to the
> >   back-end into/from which it can write/read its state, and the back-end
> >   can decide to either use it, or reply with a different FD for the
> >   front-end to override the front-end's choice.
> >   The front-end creates a simple pipe to transfer the state, but maybe
> >   the back-end already has an FD into/from which it has to write/read
> >   its state, in which case it will want to override the simple pipe.
> >   Conversely, maybe in the future we find a way to have the front-end
> >   get an immediate FD for the migration stream (in some cases), in which
> >   case we will want to send this to the back-end instead of creating a
> >   pipe.
> >   Hence the negotiation: If one side has a better idea than a plain
> >   pipe, we will want to use that.
> >
> > - CHECK_DEVICE_STATE: After the state has been transferred through the
> >   pipe (the end indicated by EOF), the front-end invokes this function
> >   to verify success.  There is no in-band way (through the pipe) to
> >   indicate failure, so we need to check explicitly.
> >
> > Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> > (which includes establishing the direction of transfer and migration
> > phase), the sending side writes its data into the pipe, and the reading
> > side reads it until it sees an EOF.  Then, the front-end will check for
> > success via CHECK_DEVICE_STATE, which on the destination side includes
> > checking for integrity (i.e. errors during deserialization).
> >
> > Suggested-by: Stefan Hajnoczi 
> > Signed-off-by: Hanna Czenczek 
> > ---
> >  include/hw/virtio/vhost-backend.h |  24 +
> >  include/hw/virtio/vhost.h |  79 
> >  hw/virtio/vhost-user.c| 147 ++
> >  hw/virtio/vhost.c |  37 
> >  4 files changed, 287 insertions(+)
> >
> > diff --git a/include/hw/virtio/vhost-backend.h 
> > b/include/hw/virtio/vhost-backend.h
> > index ec3fbae58d..5935b32fe3 100644
> > --- a/include/hw/virtio/vhost-backend.h
> > +++ b/include/hw/virtio/vhost-backend.h
> > @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
> >  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
> >  } VhostSetConfigType;
> >
> > +typedef enum VhostDeviceStateDirection {
> > +/* Transfer state from back-end (device) to front-end */
> > +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> > +/* Transfer state from front-end to back-end (device) */
> > +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> > +} VhostDeviceStateDirection;
> > +
> > +typedef enum VhostDeviceStatePhase {
> > +/* The device (and all its vrings) is stopped */
> > +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> > +} VhostDeviceStatePhase;
>
> vDPA has:
>
>   /* Suspend a device so it does not process virtqueue requests anymore
>*
>* After the return of ioctl the device must preserve all the necessary 
> state
>* (the virtqueue vring base plus the possible device specific states) that 
> is
>* required for restoring in the future. The device must not change its
>* configuration after that point.
>*/
>   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)
>
>   /* Resume a device so it can resume processing virtqueue requests
>*
>* After the return of this ioctl the device will have restored all the
>* necessary states and it is fully operational to continue processing the
>* virtqueue descriptors.
>*/
>   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)
>
> I wonder if it makes sense to import these into vhost-user so that the
> difference between kernel vhost and vhost-user is minimized. It's okay
> if one of them is ahead of the other, but it would be nice to avoid
> overlapping/duplicated functionality.
>

That's what I had in mind in the first versions. I proposed VHOST_STOP
instead of VHOST_VDPA_STOP for 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Hanna Czenczek

On 13.04.23 10:50, Eugenio Perez Martin wrote:

On Tue, Apr 11, 2023 at 5:33 PM Hanna Czenczek  wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
   This feature signals support for transferring state, and is added so
   that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
   over which to transfer the state.  The front-end sends an FD to the
   back-end into/from which it can write/read its state, and the back-end
   can decide to either use it, or reply with a different FD for the
   front-end to override the front-end's choice.
   The front-end creates a simple pipe to transfer the state, but maybe
   the back-end already has an FD into/from which it has to write/read
   its state, in which case it will want to override the simple pipe.
   Conversely, maybe in the future we find a way to have the front-end
   get an immediate FD for the migration stream (in some cases), in which
   case we will want to send this to the back-end instead of creating a
   pipe.
   Hence the negotiation: If one side has a better idea than a plain
   pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
   pipe (the end indicated by EOF), the front-end invokes this function
   to verify success.  There is no in-band way (through the pipe) to
   indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
  include/hw/virtio/vhost-backend.h |  24 +
  include/hw/virtio/vhost.h |  79 
  hw/virtio/vhost-user.c| 147 ++
  hw/virtio/vhost.c |  37 
  4 files changed, 287 insertions(+)


[...]


diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 2fe02ed5d4..29449e0fe2 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -346,4 +346,83 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,


[...]


+/**
+ * vhost_set_device_state_fd(): After transferring state from/to the

Nitpick: This function doc is for vhost_check_device_state not
vhost_set_device_state_fd.

Thanks!


Oops, right, thanks!

Hanna


+ * back-end via vhost_set_device_state_fd(), i.e. once the sending end
+ * has closed the pipe, inquire the back-end to report any potential
+ * errors that have occurred on its side.  This allows to sense errors
+ * like:
+ * - During outgoing migration, when the source side had already started
+ *   to produce its state, something went wrong and it failed to finish
+ * - During incoming migration, when the received state is somehow
+ *   invalid and cannot be processed by the back-end
+ *
+ * @dev: The vhost device
+ * @errp: Potential error description
+ *
+ * Returns 0 when the back-end reports successful state transfer and
+ * processing, and -errno when an error occurred somewhere.
+ */
+int vhost_check_device_state(struct vhost_dev *dev, Error **errp);
+





Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Hanna Czenczek

On 12.04.23 23:06, Stefan Hajnoczi wrote:

On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
   This feature signals support for transferring state, and is added so
   that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
   over which to transfer the state.  The front-end sends an FD to the
   back-end into/from which it can write/read its state, and the back-end
   can decide to either use it, or reply with a different FD for the
   front-end to override the front-end's choice.
   The front-end creates a simple pipe to transfer the state, but maybe
   the back-end already has an FD into/from which it has to write/read
   its state, in which case it will want to override the simple pipe.
   Conversely, maybe in the future we find a way to have the front-end
   get an immediate FD for the migration stream (in some cases), in which
   case we will want to send this to the back-end instead of creating a
   pipe.
   Hence the negotiation: If one side has a better idea than a plain
   pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
   pipe (the end indicated by EOF), the front-end invokes this function
   to verify success.  There is no in-band way (through the pipe) to
   indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
  include/hw/virtio/vhost-backend.h |  24 +
  include/hw/virtio/vhost.h |  79 
  hw/virtio/vhost-user.c| 147 ++
  hw/virtio/vhost.c |  37 
  4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
  } VhostSetConfigType;
  
+typedef enum VhostDeviceStateDirection {

+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;

vDPA has:

   /* Suspend a device so it does not process virtqueue requests anymore
*
* After the return of ioctl the device must preserve all the necessary state
* (the virtqueue vring base plus the possible device specific states) that 
is
* required for restoring in the future. The device must not change its
* configuration after that point.
*/
   #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

   /* Resume a device so it can resume processing virtqueue requests
*
* After the return of this ioctl the device will have restored all the
* necessary states and it is fully operational to continue processing the
* virtqueue descriptors.
*/
   #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid
overlapping/duplicated functionality.

(And I hope vDPA will import the device state vhost-user messages
introduced in this series.)


I don’t understand your suggestion.  (Like, I very simply don’t 
understand :))


These are vhost messages, right?  What purpose do you have in mind for 
them in vhost-user for internal migration?  They’re different from the 
state transfer messages, because they don’t transfer state to/from the 
front-end.  

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-13 Thread Eugenio Perez Martin
On Tue, Apr 11, 2023 at 5:33 PM Hanna Czenczek  wrote:
>
> So-called "internal" virtio-fs migration refers to transporting the
> back-end's (virtiofsd's) state through qemu's migration stream.  To do
> this, we need to be able to transfer virtiofsd's internal state to and
> from virtiofsd.
>
> Because virtiofsd's internal state will not be too large, we believe it
> is best to transfer it as a single binary blob after the streaming
> phase.  Because this method should be useful to other vhost-user
> implementations, too, it is introduced as a general-purpose addition to
> the protocol, not limited to vhost-user-fs.
>
> These are the additions to the protocol:
> - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
>   This feature signals support for transferring state, and is added so
>   that migration can fail early when the back-end has no support.
>
> - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
>   over which to transfer the state.  The front-end sends an FD to the
>   back-end into/from which it can write/read its state, and the back-end
>   can decide to either use it, or reply with a different FD for the
>   front-end to override the front-end's choice.
>   The front-end creates a simple pipe to transfer the state, but maybe
>   the back-end already has an FD into/from which it has to write/read
>   its state, in which case it will want to override the simple pipe.
>   Conversely, maybe in the future we find a way to have the front-end
>   get an immediate FD for the migration stream (in some cases), in which
>   case we will want to send this to the back-end instead of creating a
>   pipe.
>   Hence the negotiation: If one side has a better idea than a plain
>   pipe, we will want to use that.
>
> - CHECK_DEVICE_STATE: After the state has been transferred through the
>   pipe (the end indicated by EOF), the front-end invokes this function
>   to verify success.  There is no in-band way (through the pipe) to
>   indicate failure, so we need to check explicitly.
>
> Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> (which includes establishing the direction of transfer and migration
> phase), the sending side writes its data into the pipe, and the reading
> side reads it until it sees an EOF.  Then, the front-end will check for
> success via CHECK_DEVICE_STATE, which on the destination side includes
> checking for integrity (i.e. errors during deserialization).
>
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: Hanna Czenczek 
> ---
>  include/hw/virtio/vhost-backend.h |  24 +
>  include/hw/virtio/vhost.h |  79 
>  hw/virtio/vhost-user.c| 147 ++
>  hw/virtio/vhost.c |  37 
>  4 files changed, 287 insertions(+)
>
> diff --git a/include/hw/virtio/vhost-backend.h 
> b/include/hw/virtio/vhost-backend.h
> index ec3fbae58d..5935b32fe3 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
>  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
>  } VhostSetConfigType;
>
> +typedef enum VhostDeviceStateDirection {
> +/* Transfer state from back-end (device) to front-end */
> +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> +/* Transfer state from front-end to back-end (device) */
> +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> +} VhostDeviceStateDirection;
> +
> +typedef enum VhostDeviceStatePhase {
> +/* The device (and all its vrings) is stopped */
> +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> +} VhostDeviceStatePhase;
> +
>  struct vhost_inflight;
>  struct vhost_dev;
>  struct vhost_log;
> @@ -133,6 +145,15 @@ typedef int (*vhost_set_config_call_op)(struct vhost_dev 
> *dev,
>
>  typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
>
> +typedef bool (*vhost_supports_migratory_state_op)(struct vhost_dev *dev);
> +typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
> +VhostDeviceStateDirection 
> direction,
> +VhostDeviceStatePhase phase,
> +int fd,
> +int *reply_fd,
> +Error **errp);
> +typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error 
> **errp);
> +
>  typedef struct VhostOps {
>  VhostBackendType backend_type;
>  vhost_backend_init vhost_backend_init;
> @@ -181,6 +202,9 @@ typedef struct VhostOps {
>  vhost_force_iommu_op vhost_force_iommu;
>  vhost_set_config_call_op vhost_set_config_call;
>  vhost_reset_status_op vhost_reset_status;
> +vhost_supports_migratory_state_op vhost_supports_migratory_state;
> +vhost_set_device_state_fd_op vhost_set_device_state_fd;
> +vhost_check_device_state_op vhost_check_device_state;
>  } VhostOps;
>
>  int 

Re: [PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-12 Thread Stefan Hajnoczi
On Tue, Apr 11, 2023 at 05:05:13PM +0200, Hanna Czenczek wrote:
> So-called "internal" virtio-fs migration refers to transporting the
> back-end's (virtiofsd's) state through qemu's migration stream.  To do
> this, we need to be able to transfer virtiofsd's internal state to and
> from virtiofsd.
> 
> Because virtiofsd's internal state will not be too large, we believe it
> is best to transfer it as a single binary blob after the streaming
> phase.  Because this method should be useful to other vhost-user
> implementations, too, it is introduced as a general-purpose addition to
> the protocol, not limited to vhost-user-fs.
> 
> These are the additions to the protocol:
> - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
>   This feature signals support for transferring state, and is added so
>   that migration can fail early when the back-end has no support.
> 
> - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
>   over which to transfer the state.  The front-end sends an FD to the
>   back-end into/from which it can write/read its state, and the back-end
>   can decide to either use it, or reply with a different FD for the
>   front-end to override the front-end's choice.
>   The front-end creates a simple pipe to transfer the state, but maybe
>   the back-end already has an FD into/from which it has to write/read
>   its state, in which case it will want to override the simple pipe.
>   Conversely, maybe in the future we find a way to have the front-end
>   get an immediate FD for the migration stream (in some cases), in which
>   case we will want to send this to the back-end instead of creating a
>   pipe.
>   Hence the negotiation: If one side has a better idea than a plain
>   pipe, we will want to use that.
> 
> - CHECK_DEVICE_STATE: After the state has been transferred through the
>   pipe (the end indicated by EOF), the front-end invokes this function
>   to verify success.  There is no in-band way (through the pipe) to
>   indicate failure, so we need to check explicitly.
> 
> Once the transfer pipe has been established via SET_DEVICE_STATE_FD
> (which includes establishing the direction of transfer and migration
> phase), the sending side writes its data into the pipe, and the reading
> side reads it until it sees an EOF.  Then, the front-end will check for
> success via CHECK_DEVICE_STATE, which on the destination side includes
> checking for integrity (i.e. errors during deserialization).
> 
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: Hanna Czenczek 
> ---
>  include/hw/virtio/vhost-backend.h |  24 +
>  include/hw/virtio/vhost.h |  79 
>  hw/virtio/vhost-user.c| 147 ++
>  hw/virtio/vhost.c |  37 
>  4 files changed, 287 insertions(+)
> 
> diff --git a/include/hw/virtio/vhost-backend.h 
> b/include/hw/virtio/vhost-backend.h
> index ec3fbae58d..5935b32fe3 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
>  VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
>  } VhostSetConfigType;
>  
> +typedef enum VhostDeviceStateDirection {
> +/* Transfer state from back-end (device) to front-end */
> +VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
> +/* Transfer state from front-end to back-end (device) */
> +VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
> +} VhostDeviceStateDirection;
> +
> +typedef enum VhostDeviceStatePhase {
> +/* The device (and all its vrings) is stopped */
> +VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
> +} VhostDeviceStatePhase;

vDPA has:

  /* Suspend a device so it does not process virtqueue requests anymore
   *
   * After the return of ioctl the device must preserve all the necessary state
   * (the virtqueue vring base plus the possible device specific states) that is
   * required for restoring in the future. The device must not change its
   * configuration after that point.
   */
  #define VHOST_VDPA_SUSPEND  _IO(VHOST_VIRTIO, 0x7D)

  /* Resume a device so it can resume processing virtqueue requests
   *
   * After the return of this ioctl the device will have restored all the
   * necessary states and it is fully operational to continue processing the
   * virtqueue descriptors.
   */
  #define VHOST_VDPA_RESUME   _IO(VHOST_VIRTIO, 0x7E)

I wonder if it makes sense to import these into vhost-user so that the
difference between kernel vhost and vhost-user is minimized. It's okay
if one of them is ahead of the other, but it would be nice to avoid
overlapping/duplicated functionality.

(And I hope vDPA will import the device state vhost-user messages
introduced in this series.)

> +
>  struct vhost_inflight;
>  struct vhost_dev;
>  struct vhost_log;
> @@ -133,6 +145,15 @@ typedef int (*vhost_set_config_call_op)(struct vhost_dev 
> *dev,
>  
>  typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
>  
> +typedef bool 

[PATCH 2/4] vhost-user: Interface for migration state transfer

2023-04-11 Thread Hanna Czenczek
So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream.  To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase.  Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
  This feature signals support for transferring state, and is added so
  that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
  over which to transfer the state.  The front-end sends an FD to the
  back-end into/from which it can write/read its state, and the back-end
  can decide to either use it, or reply with a different FD for the
  front-end to override the front-end's choice.
  The front-end creates a simple pipe to transfer the state, but maybe
  the back-end already has an FD into/from which it has to write/read
  its state, in which case it will want to override the simple pipe.
  Conversely, maybe in the future we find a way to have the front-end
  get an immediate FD for the migration stream (in some cases), in which
  case we will want to send this to the back-end instead of creating a
  pipe.
  Hence the negotiation: If one side has a better idea than a plain
  pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
  pipe (the end indicated by EOF), the front-end invokes this function
  to verify success.  There is no in-band way (through the pipe) to
  indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF.  Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Hanna Czenczek 
---
 include/hw/virtio/vhost-backend.h |  24 +
 include/hw/virtio/vhost.h |  79 
 hw/virtio/vhost-user.c| 147 ++
 hw/virtio/vhost.c |  37 
 4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
 VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
 } VhostSetConfigType;
 
+typedef enum VhostDeviceStateDirection {
+/* Transfer state from back-end (device) to front-end */
+VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+/* Transfer state from front-end to back-end (device) */
+VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+/* The device (and all its vrings) is stopped */
+VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;
+
 struct vhost_inflight;
 struct vhost_dev;
 struct vhost_log;
@@ -133,6 +145,15 @@ typedef int (*vhost_set_config_call_op)(struct vhost_dev 
*dev,
 
 typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
 
+typedef bool (*vhost_supports_migratory_state_op)(struct vhost_dev *dev);
+typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
+VhostDeviceStateDirection 
direction,
+VhostDeviceStatePhase phase,
+int fd,
+int *reply_fd,
+Error **errp);
+typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error 
**errp);
+
 typedef struct VhostOps {
 VhostBackendType backend_type;
 vhost_backend_init vhost_backend_init;
@@ -181,6 +202,9 @@ typedef struct VhostOps {
 vhost_force_iommu_op vhost_force_iommu;
 vhost_set_config_call_op vhost_set_config_call;
 vhost_reset_status_op vhost_reset_status;
+vhost_supports_migratory_state_op vhost_supports_migratory_state;
+vhost_set_device_state_fd_op vhost_set_device_state_fd;
+vhost_check_device_state_op vhost_check_device_state;
 } VhostOps;
 
 int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 2fe02ed5d4..29449e0fe2 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -346,4 +346,83 @@ int