Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-28 Thread Peter Maydell
On 28 November 2014 at 11:47, Greg Kurz  wrote:
> FWIW I think it should. Commit 3902d49e13c2428bd6381cfdf183103ca4477c1f is
> clearly bad: virtio-rng does not need the .load callback obviously... and
> the fact it breaks migration makes it even worse... :(
>
> Please apply to 2.2.

Applied to master, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-28 Thread Greg Kurz
On Fri, 28 Nov 2014 09:14:46 +
Peter Maydell  wrote:
> On 27 November 2014 at 09:26, Markus Armbruster  wrote:
> > David Gibson  writes:
> >
> >> VirtIO devices now remember which endianness they're operating in in order
> >> to support targets which may have guests of either endianness, such as
> >> powerpc.  This endianness state is transferred in a subsection of the
> >> virtio device's information.
> >>
> >> With virtio-rng this can lead to an abort after a loadvm hitting the
> >> assert() in virtio_is_big_endian().  This can be reproduced by doing a
> >> migrate and load from file on a bi-endian target with a virtio-rng device.
> >> The actual guest state isn't particularly important to triggering this.
> >>
> >> The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> >> accesses the ring and thus needs the endianness.  However,
> >> virtio_rng_process() is called via virtio_load() before it loads the
> >> subsections.  Essentially the ->load callback in VirtioDeviceClass should
> >> only be used for actually reading the device state from the stream, not for
> >> post-load re-initialization.
> >>
> >> This patch fixes the bug by moving the virtio_rng_process() after the call
> >> to virtio_load().  Better yet would be to convert virtio to use vmsd and
> >> have the virtio_rng_process() as a post_load callback, but that's a bigger
> >> project for another day.
> >>
> >> This is bugfix, and should be considered for the 2.2 branch.
> >
> > "[PATCH for-2.2]" would have been a good idea then.  Next time :)
> 
> So do you want this patch in 2.2? I was planning to put in the
> virtio-vs-xen fixes today and tag rc4, so it's not too late if you're
> confident this patch is good. Let me know if you think it should go in,
> and I can apply it to master directly.
> 
> -- PMM
> 

Peter,

FWIW I think it should. Commit 3902d49e13c2428bd6381cfdf183103ca4477c1f is
clearly bad: virtio-rng does not need the .load callback obviously... and
the fact it breaks migration makes it even worse... :(

Please apply to 2.2.

Cheers.

--
Greg




Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-28 Thread David Gibson
On Fri, Nov 28, 2014 at 09:14:46AM +, Peter Maydell wrote:
> On 27 November 2014 at 09:26, Markus Armbruster  wrote:
> > David Gibson  writes:
> >
> >> VirtIO devices now remember which endianness they're operating in in order
> >> to support targets which may have guests of either endianness, such as
> >> powerpc.  This endianness state is transferred in a subsection of the
> >> virtio device's information.
> >>
> >> With virtio-rng this can lead to an abort after a loadvm hitting the
> >> assert() in virtio_is_big_endian().  This can be reproduced by doing a
> >> migrate and load from file on a bi-endian target with a virtio-rng device.
> >> The actual guest state isn't particularly important to triggering this.
> >>
> >> The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> >> accesses the ring and thus needs the endianness.  However,
> >> virtio_rng_process() is called via virtio_load() before it loads the
> >> subsections.  Essentially the ->load callback in VirtioDeviceClass should
> >> only be used for actually reading the device state from the stream, not for
> >> post-load re-initialization.
> >>
> >> This patch fixes the bug by moving the virtio_rng_process() after the call
> >> to virtio_load().  Better yet would be to convert virtio to use vmsd and
> >> have the virtio_rng_process() as a post_load callback, but that's a bigger
> >> project for another day.
> >>
> >> This is bugfix, and should be considered for the 2.2 branch.
> >
> > "[PATCH for-2.2]" would have been a good idea then.  Next time :)
> 
> So do you want this patch in 2.2? I was planning to put in the
> virtio-vs-xen fixes today and tag rc4, so it's not too late if you're
> confident this patch is good. Let me know if you think it should go in,
> and I can apply it to master directly.

Yes, I think it should be applied.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpCGqBKs2Qmi.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-28 Thread Peter Maydell
On 27 November 2014 at 09:26, Markus Armbruster  wrote:
> David Gibson  writes:
>
>> VirtIO devices now remember which endianness they're operating in in order
>> to support targets which may have guests of either endianness, such as
>> powerpc.  This endianness state is transferred in a subsection of the
>> virtio device's information.
>>
>> With virtio-rng this can lead to an abort after a loadvm hitting the
>> assert() in virtio_is_big_endian().  This can be reproduced by doing a
>> migrate and load from file on a bi-endian target with a virtio-rng device.
>> The actual guest state isn't particularly important to triggering this.
>>
>> The cause is that virtio_rng_load_device() calls virtio_rng_process() which
>> accesses the ring and thus needs the endianness.  However,
>> virtio_rng_process() is called via virtio_load() before it loads the
>> subsections.  Essentially the ->load callback in VirtioDeviceClass should
>> only be used for actually reading the device state from the stream, not for
>> post-load re-initialization.
>>
>> This patch fixes the bug by moving the virtio_rng_process() after the call
>> to virtio_load().  Better yet would be to convert virtio to use vmsd and
>> have the virtio_rng_process() as a post_load callback, but that's a bigger
>> project for another day.
>>
>> This is bugfix, and should be considered for the 2.2 branch.
>
> "[PATCH for-2.2]" would have been a good idea then.  Next time :)

So do you want this patch in 2.2? I was planning to put in the
virtio-vs-xen fixes today and tag rc4, so it's not too late if you're
confident this patch is good. Let me know if you think it should go in,
and I can apply it to master directly.

-- PMM



Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-27 Thread Amit Shah
On (Fri) 28 Nov 2014 [11:50:51], David Gibson wrote:
> On Thu, Nov 27, 2014 at 02:38:42PM +0530, Amit Shah wrote:
> > On (Thu) 27 Nov 2014 [16:48:10], David Gibson wrote:
> > > VirtIO devices now remember which endianness they're operating in in order
> > > to support targets which may have guests of either endianness, such as
> > > powerpc.  This endianness state is transferred in a subsection of the
> > > virtio device's information.
> > > 
> > > With virtio-rng this can lead to an abort after a loadvm hitting the
> > > assert() in virtio_is_big_endian().  This can be reproduced by doing a
> > > migrate and load from file on a bi-endian target with a virtio-rng device.
> > > The actual guest state isn't particularly important to triggering this.
> > > 
> > > The cause is that virtio_rng_load_device() calls virtio_rng_process() 
> > > which
> > > accesses the ring and thus needs the endianness.  However,
> > > virtio_rng_process() is called via virtio_load() before it loads the
> > > subsections.  Essentially the ->load callback in VirtioDeviceClass should
> > > only be used for actually reading the device state from the stream, not 
> > > for
> > > post-load re-initialization.
> > 
> > Agreed.
> > 
> > > This patch fixes the bug by moving the virtio_rng_process() after the call
> > > to virtio_load().  Better yet would be to convert virtio to use vmsd and
> > > have the virtio_rng_process() as a post_load callback, but that's a bigger
> > > project for another day.
> > > 
> > > This is bugfix, and should be considered for the 2.2 branch.
> > 
> > This is undoing most of 3902d49e13c2428bd6381cfdf183103ca4477c1f ,
> > added Greg to CC list.
> > 
> > Did you try this on x86 guests, or with multiple rng devices?
> 
> Not so far, I'll see what I can do.

Thanks.

Amit



Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-27 Thread David Gibson
On Thu, Nov 27, 2014 at 02:38:42PM +0530, Amit Shah wrote:
> On (Thu) 27 Nov 2014 [16:48:10], David Gibson wrote:
> > VirtIO devices now remember which endianness they're operating in in order
> > to support targets which may have guests of either endianness, such as
> > powerpc.  This endianness state is transferred in a subsection of the
> > virtio device's information.
> > 
> > With virtio-rng this can lead to an abort after a loadvm hitting the
> > assert() in virtio_is_big_endian().  This can be reproduced by doing a
> > migrate and load from file on a bi-endian target with a virtio-rng device.
> > The actual guest state isn't particularly important to triggering this.
> > 
> > The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> > accesses the ring and thus needs the endianness.  However,
> > virtio_rng_process() is called via virtio_load() before it loads the
> > subsections.  Essentially the ->load callback in VirtioDeviceClass should
> > only be used for actually reading the device state from the stream, not for
> > post-load re-initialization.
> 
> Agreed.
> 
> > This patch fixes the bug by moving the virtio_rng_process() after the call
> > to virtio_load().  Better yet would be to convert virtio to use vmsd and
> > have the virtio_rng_process() as a post_load callback, but that's a bigger
> > project for another day.
> > 
> > This is bugfix, and should be considered for the 2.2 branch.
> 
> This is undoing most of 3902d49e13c2428bd6381cfdf183103ca4477c1f ,
> added Greg to CC list.
> 
> Did you try this on x86 guests, or with multiple rng devices?

Not so far, I'll see what I can do.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp2kZwigOWnF.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-27 Thread Greg Kurz
On Thu, 27 Nov 2014 14:38:42 +0530
Amit Shah  wrote:
> On (Thu) 27 Nov 2014 [16:48:10], David Gibson wrote:
> > VirtIO devices now remember which endianness they're operating in in order
> > to support targets which may have guests of either endianness, such as
> > powerpc.  This endianness state is transferred in a subsection of the
> > virtio device's information.
> > 
> > With virtio-rng this can lead to an abort after a loadvm hitting the
> > assert() in virtio_is_big_endian().  This can be reproduced by doing a
> > migrate and load from file on a bi-endian target with a virtio-rng device.
> > The actual guest state isn't particularly important to triggering this.
> > 
> > The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> > accesses the ring and thus needs the endianness.  However,
> > virtio_rng_process() is called via virtio_load() before it loads the
> > subsections.  Essentially the ->load callback in VirtioDeviceClass should
> > only be used for actually reading the device state from the stream, not for
> > post-load re-initialization.
> 
> Agreed.
> 
> > This patch fixes the bug by moving the virtio_rng_process() after the call
> > to virtio_load().  Better yet would be to convert virtio to use vmsd and
> > have the virtio_rng_process() as a post_load callback, but that's a bigger
> > project for another day.
> > 

I remember discussions on IRC last spring where I agreed I would work on it. :)

> > This is bugfix, and should be considered for the 2.2 branch.
> 
> This is undoing most of 3902d49e13c2428bd6381cfdf183103ca4477c1f ,
> added Greg to CC list.
> 

This commit is indeed completely wrong: the load callback only makes sense
when there's something to read which is obviously not the case here... This
is definitely post load stuff :-\

Thanks ! :)

Reviewed-by: Greg Kurz 

--
Greg

> Did you try this on x86 guests, or with multiple rng devices?
> 
> (keeping context for Greg)
> 
> > Signed-off-by: David Gibson 
> > ---
> >  hw/virtio/virtio-rng.c | 15 ---
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
> > index e85a979..473c044 100644
> > --- a/hw/virtio/virtio-rng.c
> > +++ b/hw/virtio/virtio-rng.c
> > @@ -113,20 +113,22 @@ static void virtio_rng_save(QEMUFile *f, void *opaque)
> >  
> >  static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
> >  {
> > +VirtIORNG *vrng = opaque;
> > +int ret;
> > +
> >  if (version_id != 1) {
> >  return -EINVAL;
> >  }
> > -return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
> > -}
> > +ret = virtio_load(VIRTIO_DEVICE(vrng), f, version_id);
> > +if (ret != 0) {
> > +return ret;
> > +}
> >  
> > -static int virtio_rng_load_device(VirtIODevice *vdev, QEMUFile *f,
> > -  int version_id)
> > -{
> >  /* We may have an element ready but couldn't process it due to a quota
> >   * limit.  Make sure to try again after live migration when the quota 
> > may
> >   * have been reset.
> >   */
> > -virtio_rng_process(VIRTIO_RNG(vdev));
> > +virtio_rng_process(vrng);
> >  
> >  return 0;
> >  }
> > @@ -231,7 +233,6 @@ static void virtio_rng_class_init(ObjectClass *klass, 
> > void *data)
> >  vdc->realize = virtio_rng_device_realize;
> >  vdc->unrealize = virtio_rng_device_unrealize;
> >  vdc->get_features = get_features;
> > -vdc->load = virtio_rng_load_device;
> >  }
> >  
> >  static void virtio_rng_initfn(Object *obj)
> 
> 
> Thanks,
> 
>   Amit
> 




Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-27 Thread Amit Shah
(actually adding Greg)

On (Thu) 27 Nov 2014 [14:38:42], Amit Shah wrote:
> On (Thu) 27 Nov 2014 [16:48:10], David Gibson wrote:
> > VirtIO devices now remember which endianness they're operating in in order
> > to support targets which may have guests of either endianness, such as
> > powerpc.  This endianness state is transferred in a subsection of the
> > virtio device's information.
> > 
> > With virtio-rng this can lead to an abort after a loadvm hitting the
> > assert() in virtio_is_big_endian().  This can be reproduced by doing a
> > migrate and load from file on a bi-endian target with a virtio-rng device.
> > The actual guest state isn't particularly important to triggering this.
> > 
> > The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> > accesses the ring and thus needs the endianness.  However,
> > virtio_rng_process() is called via virtio_load() before it loads the
> > subsections.  Essentially the ->load callback in VirtioDeviceClass should
> > only be used for actually reading the device state from the stream, not for
> > post-load re-initialization.
> 
> Agreed.
> 
> > This patch fixes the bug by moving the virtio_rng_process() after the call
> > to virtio_load().  Better yet would be to convert virtio to use vmsd and
> > have the virtio_rng_process() as a post_load callback, but that's a bigger
> > project for another day.
> > 
> > This is bugfix, and should be considered for the 2.2 branch.
> 
> This is undoing most of 3902d49e13c2428bd6381cfdf183103ca4477c1f ,
> added Greg to CC list.
> 
> Did you try this on x86 guests, or with multiple rng devices?
> 
> (keeping context for Greg)
> 
> > Signed-off-by: David Gibson 
> > ---
> >  hw/virtio/virtio-rng.c | 15 ---
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
> > index e85a979..473c044 100644
> > --- a/hw/virtio/virtio-rng.c
> > +++ b/hw/virtio/virtio-rng.c
> > @@ -113,20 +113,22 @@ static void virtio_rng_save(QEMUFile *f, void *opaque)
> >  
> >  static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
> >  {
> > +VirtIORNG *vrng = opaque;
> > +int ret;
> > +
> >  if (version_id != 1) {
> >  return -EINVAL;
> >  }
> > -return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
> > -}
> > +ret = virtio_load(VIRTIO_DEVICE(vrng), f, version_id);
> > +if (ret != 0) {
> > +return ret;
> > +}
> >  
> > -static int virtio_rng_load_device(VirtIODevice *vdev, QEMUFile *f,
> > -  int version_id)
> > -{
> >  /* We may have an element ready but couldn't process it due to a quota
> >   * limit.  Make sure to try again after live migration when the quota 
> > may
> >   * have been reset.
> >   */
> > -virtio_rng_process(VIRTIO_RNG(vdev));
> > +virtio_rng_process(vrng);
> >  
> >  return 0;
> >  }
> > @@ -231,7 +233,6 @@ static void virtio_rng_class_init(ObjectClass *klass, 
> > void *data)
> >  vdc->realize = virtio_rng_device_realize;
> >  vdc->unrealize = virtio_rng_device_unrealize;
> >  vdc->get_features = get_features;
> > -vdc->load = virtio_rng_load_device;
> >  }
> >  
> >  static void virtio_rng_initfn(Object *obj)
> 
> 
> Thanks,
> 
>   Amit

Amit



Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-27 Thread Markus Armbruster
David Gibson  writes:

> VirtIO devices now remember which endianness they're operating in in order
> to support targets which may have guests of either endianness, such as
> powerpc.  This endianness state is transferred in a subsection of the
> virtio device's information.
>
> With virtio-rng this can lead to an abort after a loadvm hitting the
> assert() in virtio_is_big_endian().  This can be reproduced by doing a
> migrate and load from file on a bi-endian target with a virtio-rng device.
> The actual guest state isn't particularly important to triggering this.
>
> The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> accesses the ring and thus needs the endianness.  However,
> virtio_rng_process() is called via virtio_load() before it loads the
> subsections.  Essentially the ->load callback in VirtioDeviceClass should
> only be used for actually reading the device state from the stream, not for
> post-load re-initialization.
>
> This patch fixes the bug by moving the virtio_rng_process() after the call
> to virtio_load().  Better yet would be to convert virtio to use vmsd and
> have the virtio_rng_process() as a post_load callback, but that's a bigger
> project for another day.
>
> This is bugfix, and should be considered for the 2.2 branch.

"[PATCH for-2.2]" would have been a good idea then.  Next time :)



Re: [Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-27 Thread Amit Shah
On (Thu) 27 Nov 2014 [16:48:10], David Gibson wrote:
> VirtIO devices now remember which endianness they're operating in in order
> to support targets which may have guests of either endianness, such as
> powerpc.  This endianness state is transferred in a subsection of the
> virtio device's information.
> 
> With virtio-rng this can lead to an abort after a loadvm hitting the
> assert() in virtio_is_big_endian().  This can be reproduced by doing a
> migrate and load from file on a bi-endian target with a virtio-rng device.
> The actual guest state isn't particularly important to triggering this.
> 
> The cause is that virtio_rng_load_device() calls virtio_rng_process() which
> accesses the ring and thus needs the endianness.  However,
> virtio_rng_process() is called via virtio_load() before it loads the
> subsections.  Essentially the ->load callback in VirtioDeviceClass should
> only be used for actually reading the device state from the stream, not for
> post-load re-initialization.

Agreed.

> This patch fixes the bug by moving the virtio_rng_process() after the call
> to virtio_load().  Better yet would be to convert virtio to use vmsd and
> have the virtio_rng_process() as a post_load callback, but that's a bigger
> project for another day.
> 
> This is bugfix, and should be considered for the 2.2 branch.

This is undoing most of 3902d49e13c2428bd6381cfdf183103ca4477c1f ,
added Greg to CC list.

Did you try this on x86 guests, or with multiple rng devices?

(keeping context for Greg)

> Signed-off-by: David Gibson 
> ---
>  hw/virtio/virtio-rng.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
> index e85a979..473c044 100644
> --- a/hw/virtio/virtio-rng.c
> +++ b/hw/virtio/virtio-rng.c
> @@ -113,20 +113,22 @@ static void virtio_rng_save(QEMUFile *f, void *opaque)
>  
>  static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
>  {
> +VirtIORNG *vrng = opaque;
> +int ret;
> +
>  if (version_id != 1) {
>  return -EINVAL;
>  }
> -return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
> -}
> +ret = virtio_load(VIRTIO_DEVICE(vrng), f, version_id);
> +if (ret != 0) {
> +return ret;
> +}
>  
> -static int virtio_rng_load_device(VirtIODevice *vdev, QEMUFile *f,
> -  int version_id)
> -{
>  /* We may have an element ready but couldn't process it due to a quota
>   * limit.  Make sure to try again after live migration when the quota may
>   * have been reset.
>   */
> -virtio_rng_process(VIRTIO_RNG(vdev));
> +virtio_rng_process(vrng);
>  
>  return 0;
>  }
> @@ -231,7 +233,6 @@ static void virtio_rng_class_init(ObjectClass *klass, 
> void *data)
>  vdc->realize = virtio_rng_device_realize;
>  vdc->unrealize = virtio_rng_device_unrealize;
>  vdc->get_features = get_features;
> -vdc->load = virtio_rng_load_device;
>  }
>  
>  static void virtio_rng_initfn(Object *obj)


Thanks,

Amit