Re: [Qemu-devel] [RFC PATCH v2 19/23] qcow2: Add error handling to the l2meta coroutine

Kevin Wolf Thu, 21 Feb 2013 01:42:46 -0800

On Mon, Feb 18, 2013 at 04:42:55PM +0100, Stefan Hajnoczi wrote:
> On Wed, Feb 13, 2013 at 02:22:09PM +0100, Kevin Wolf wrote:
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index 57552aa..2819336 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -774,11 +774,33 @@ static void coroutine_fn process_l2meta(void *opaque)
> >          m->sleeping = false;
> >      }
> >  
> > +again:
> >      qemu_co_mutex_lock(&s->lock);
> >  
> >      ret = qcow2_alloc_cluster_link_l2(bs, m);
> >      if (ret < 0) {
> > -        /* FIXME */
> > +        /*
> > +         * This is a nasty situation: We have already completed the allocation
> > +         * write request and returned success, so just failing it isn't
> > +         * possible. We need to make sure to return an error during the next
> > +         * flush.
> > +         *
> > +         * However, we still can't drop the l2meta because we want I/O errors
> > +         * to be recoverable e.g. after the block device has been grown or the
> > +         * network connection restored. Sleep until the next flush comes and
> > +         * then retry.
> > +         */
> 
> A failed flush is live migrated by hw/virtio-blk.c but what happens when
> we fail during drain?


That's a very good question. Looks like things become rather hairy...
This would be a case where we really need a VMState for block drivers
(which is in fact how the whole rerror/werror handling would have been
implemented best).

Juan, any chance to introduce such a thing without breaking everything?
Is there something like optional top-level sections?

Kevin
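
For illustration only, here is a minimal standalone model of the
retry-on-flush behaviour that the quoted comment describes. It is not QEMU
code and does not use any QEMU API: PendingUpdate, try_link_l2(),
complete_in_background() and model_flush() are hypothetical names, and the
failures_left counter exists only to simulate an I/O error that later goes
away. It also assumes that a flush retries the pending update first and
reports the result of that retry, which is a simplification and may differ
from what the patch actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending L2 table update (not QCowL2Meta). */
typedef struct PendingUpdate {
    bool linked;        /* true once the L2 update has been written */
    int  failures_left; /* simulation knob: attempts that still fail */
    int  sticky_error;  /* error remembered for the next flush, 0 if none */
} PendingUpdate;

/* Hypothetical stand-in for the L2 link step: returns 0 or -errno. */
static int try_link_l2(PendingUpdate *m)
{
    assert(m != NULL);
    if (m->failures_left > 0) {
        m->failures_left--;
        return -EIO;
    }
    m->linked = true;
    return 0;
}

/*
 * Background completion. The guest write has already been reported as
 * successful, so a failure here can only be remembered, not returned.
 * The update stays pending instead of being dropped.
 */
static void complete_in_background(PendingUpdate *m)
{
    int ret = try_link_l2(m);
    if (ret < 0) {
        m->sticky_error = ret;
    }
}

/*
 * Flush (assumed behaviour): retry a still-pending update once and
 * report the outcome of that retry to the caller.
 */
static int model_flush(PendingUpdate *m)
{
    int ret;

    if (m->linked) {
        return 0;
    }
    ret = try_link_l2(m);
    m->sticky_error = ret; /* becomes 0 again once the retry succeeds */
    return ret;
}
```

With failures_left set to 2, the background attempt fails silently, the
first flush returns -EIO, and the second flush succeeds. What this model
leaves out is exactly the problem raised above: the pending update lives
only in memory, so nothing here says what happens to it across a drain or
a live migration.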
