On Fri, 9 Sep 2016 10:53:05 +0200 Cornelia Huck <cornelia.h...@de.ibm.com> wrote:
> On Fri, 9 Sep 2016 10:46:25 +0200
> Greg Kurz <gr...@kaod.org> wrote:
>
> > On Fri, 9 Sep 2016 10:30:53 +0200
> > Cornelia Huck <cornelia.h...@de.ibm.com> wrote:
> >
> > > On Thu, 8 Sep 2016 19:55:16 +0300
> > > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > >
> > > > On Thu, Sep 08, 2016 at 06:26:52PM +0200, Greg Kurz wrote:
> > > > > On Thu, 8 Sep 2016 18:19:27 +0300
> > > > > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > > > >
> > > > > > On Thu, Sep 08, 2016 at 05:04:47PM +0200, Cornelia Huck wrote:
> > > > > > > On Thu, 8 Sep 2016 18:00:28 +0300
> > > > > > > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > > > > > >
> > > > > > > > On Thu, Sep 08, 2016 at 11:12:16AM +0200, Greg Kurz wrote:
> > >
> > > > > If it continues
> > > > > execution, this means we're expecting the guest or the host to do something
> > > > > to fix the error condition. This requires QEMU to emit an event of some
> > > > > sort, but not necessarily to log an error message in a file. I guess this
> > > > > depends if QEMU is run by some tooling, or by a human.
> > > >
> > > > I'm not sure we need an event if tools are not expected to
> > > > do anything with it. If we limit # of times error
> > > > is printed, tools will need to reset this counter,
> > > > so we will need an event on overflow.
> > >
> > > If the device goes into a broken state, it should be discoverable from
> > > outside. I'm not sure we need an actual event signalling this if this
> > > happens due to the guest doing something wrong: That would be a task
> > > for tools monitoring _inside_ the guest.
> >
> > Well, in case of a virtio device being broken, section 2.1.2 in the spec
> > suggests to set the status to DEVICE_NEEDS_RESET and to notify it to
> > the guest (aka. event signalling). I'll send a patch shortly.
>
> Stefan had already sent
> <1460467534-29147-4-git-send-email-stefa...@redhat.com> ages ago, but
> it has not yet made it anywhere...
>

I don't know what to do with this message-id :\

> Anyhow, I was concerned with host signalling (sorry for being unclear),
> and I still do not think we need to alert host monitoring software to
> guest stupidity.
>

I agree. Sorry if my poor wording made you (and others) think I was
suggesting that :)

My point was that if QEMU exits because of guest stupidity, you are
forced to error_report() something to the host, but this is really
suboptimal (even if BUG_ON is worse)... then there was that discussion
about log files getting too big, but I don't even know how we got there,
as it does not really make sense when QEMU exits.

> >
> > > For tools monitoring the
> > > health of the machine (from the host perspective), the discovery
> > > interface would probably be enough?
> > >
> >
> > Yeah, probably.
> >
> > Cheers.
> >
> > --
> > Greg
> >
>