Re: [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event

Luiz Capitulino Thu, 14 Aug 2014 06:07:04 -0700

On Tue, 5 Aug 2014 11:13:39 +0200
Kevin Wolf <kw...@redhat.com> wrote:


> Am 23.07.2014 um 15:17 hat Luiz Capitulino geschrieben:
> > 
> > Management software, such as OpenStack and RHEV's vdsm, wants to be able
> > to allocate VM disk space on demand. The basic use case is to start a VM
> > with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
> > condition.
> > 
> > To this end, the management software has to be notified when QEMU
> > encounters ENOSPC. The most straightforward solution is to extend QMP's
> > BLOCK_IO_ERROR event with that information.
> > 
> > This series does exactly that. The approach taken is the simplest possible:
> > the BLOCK_IO_ERROR event is extended to contain a "nospace" key, which
> > will be true whenever the guest runs out of space *and* werror=stop|enospc.
> > Here's an example:
> > 
> > { "event": "BLOCK_IO_ERROR",
> >     "data": { "device": "ide0-hd1",
> >               "operation": "write",
> >                       "action": "stop",
> >                       "nospace": true },
> >     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> > 
> > There are three important things to observe:
> > 
> >  1. query-block already supports querying the event by means of the
> >     "io-status" key. Actually, "nospace" and "io-status" keys share
> >     the same semantics. This is a big advantage of this approach, no
> >     further extension of query-block is needed
> > 
> >  2. The event could also contain an error message key for debugging,
> >     But if we add it to the event, should we add it to query-block too?
> 
> I don't think it's strictly necessary, but I can imagine that it would
> be a very nice feature for debugging if you could check after that fact
> what caused the VM stop even if you don't have a QMP log with the event.
> 
> >  3. I'm not extending BLOCK_JOB_ERROR. The reason is that it seems
> >     that BLOCK_IO_ERROR is also emitted on BLOCK_JOB_ERROR
> 
> Hm, I can't see this in the code. Where do I need to look?
> 
> Or did you get both a BLOCK_JOB_ERROR and a BLOCK_IO_ERROR because the
> guest tried to access the image, too, and caused a separate error?

Yes. I saw this behavior while testing my code where BLOCK_IO_ERROR always
followed BLOCK_JOB_ERROR. I assumed this was not a coincidence. If I'm
wrong I can just extend BLOCK_JOB_ERROR too.

> 
> > Now, this series is an RFC because there's an alternative solution for
> > this problem: instead of extending the BLOCK_IO_ERROR event with no-space
> > indicator, we could have a stringfied errno. This way management apps
> > would also be able to distinguish among other errors.
> 
> I don't think sending errnos is a good approach (but if we took it,
> we should use an enum rather than strings) and prefer exposing the
> exact information that is actually needed.
> 
> > For example, we could have a "error-details" dict containing a
> > "reason" and a "message" key:
> > 
> > { "event": "BLOCK_IO_ERROR",
> >     "data": { "device": "ide0-hd1",
> >               "operation": "write",
> >               "action": "stop",
> >                       "error-details": { "reason": "eio", "message": "I/O 
> > error" },
> >     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> > 
> > And then query-block would have to be extended to contain the same
> > information.
> > 
> > IMO, this series implementation is good enough for the requirement we
> > currently have but I'm open to go complex if needed.
> 
> Agreed. I would like to see the human-readable strerror() string added,
> but that doesn't make this series any worse as a first step:
> 
> Acked-by: Kevin Wolf <kw...@redhat.com>
>

Re: [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event

Reply via email to