Re: [libvirt] [RFC 0/5]: QMP: add balloon-get-memory-stats command

2012-01-20 Thread Luiz Capitulino
On Thu, 19 Jan 2012 10:15:55 -0700
Eric Blake ebl...@redhat.com wrote:

 On 01/19/2012 08:56 AM, Luiz Capitulino wrote:
  Long ago, commit 625a5be added the guest provided memory statistics to
  the query-balloon command. Unfortunately, it also introduced a severe
  bug: query-balloon would hang if the guest didn't respond. This, in turn,
  would also cause a hang in libvirt.
  
  Because of that, we decided to disable the guest memory stats feature
  (commit 11724ff).
  
  As we decided to let commands implement ad-hoc async mechanisms until we
  get a proper way to do it, I decided to try to re-enable that feature.
  
  My idea is to have a command and an event. The command gets the process
  started by sending a request to guest and returns. Later, when the guest
  makes the memory stats info available, it's sent to the client by means
  of a QMP event (please take a look at patch 05/05 for full details).
  
  I'm not sure if that approach is good for libvirt though, so it would be
  very helpful to get their input (Eric, I'm CC'ing you here, but feel free
  to route this to someone else).
 
 [I went ahead and cc'd the libvirt list]
 
 Yes, libvirt can live with this approach.  And having this in parallel
 to a qemu-ga verb is nice, since, as it was pointed out, this would
 allow interaction with guests that have a balloon device but not a guest
 agent.
 
 You may want to read this thread [1], for thoughts on the impact of
 making another existing blocking command be extended into one that
 starts an async event and ends when an event is raised; libvirt can
 expose both a blocking and an asynchronous implementation to the user on
 top of the qemu model being just asynchronous.
 [1] https://www.redhat.com/archives/libvir-list/2012-January/msg00562.html
 
 Thinking aloud - do we need a means to poll the state of the
 balloon-stat query?

We could have a query-balloon-memory-stats command that returns the last
available stats (or none, if balloon-get-memory-stats wasn't issued). I
think it would also be better to move the stats info from the event to
the query command; that way the event would just signal that the stats
info is available.

I find that approach a bit more complicated though.
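
To make the alternative concrete, here is a rough sketch of the state it
would require. It is only a model, not QEMU code: the class, the event
name BALLOON_MEMORY_STATS_READY and the "free_mem" field are all made up
for illustration.

```python
class BalloonStatsCache:
    """Toy model: the event is a bare signal, the query returns the data."""

    def __init__(self):
        # None until the guest has answered at least once.
        self._last_stats = None

    def on_guest_reply(self, stats):
        """Store the guest's stats and return a payload-free event."""
        self._last_stats = dict(stats)
        # Assumed event name; the event carries no stats itself.
        return {"event": "BALLOON_MEMORY_STATS_READY"}

    def query(self):
        """Model of query-balloon-memory-stats: last stats, or None."""
        return self._last_stats


cache = BalloonStatsCache()
before = cache.query()                            # nothing requested yet
event = cache.on_guest_reply({"free_mem": 1024})  # hypothetical field
after = cache.query()
```

The extra state is what makes this more complicated, but it would also let
a restarted client pick up stats whose event it missed.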

  On the one hand, if libvirtd issues the start
 command, then gets stopped, then the event occurs, then libvirtd is
 restarted, then libvirt won't know that the event was missed.  On the
 other hand, since this involves guest interaction, libvirt already has
 to assume that the guest may be malicious and refuse to report stats
 and/or report invalid stats, so libvirt would already have to be
 prepared to give up if no event has arrived in a fixed amount of time,
 and that also means that restarting libvirtd can just ignore any balloon
 query that was in flight before the restart.

Yes, there's no guarantee the event will ever be sent. If it doesn't
arrive after a fixed amount of time, the best thing to do is to issue
the start command again.

 So I guess I'm okay with just a start and an event, with no poll of the
 last-known guest response.  But it does mean that qemu has to gracefully
 handle if libvirt makes two start requests in a row without any
 intervening events, and conversely that libvirt has to be prepared for
 an event that happens even when libvirt doesn't remember triggering the
 start command.

There could be intervening events. Anything can happen between the
start command and the event (I/O error, VM stop, etc.). Libvirt has to be
prepared for that.
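
As a rough illustration of what a client would have to do, here is a
minimal sketch of the wait loop: issue the start command, skip unrelated
events, and reissue the command if nothing arrives in time. The command
name comes from this series; the event name, the event layout and the
send_command/events plumbing are assumptions made for the sketch, not
libvirt or QEMU code.

```python
import queue
import time

START_COMMAND = "balloon-get-memory-stats"
STATS_EVENT = "BALLOON_MEMORY_STATS"  # assumed event name


def wait_for_stats(send_command, events, timeout=1.0, max_attempts=3):
    """Start a stats request and wait for the matching event.

    send_command: callable that takes the command name (hypothetical).
    events: queue.Queue of dicts shaped like {"event": ..., "data": ...}.
    Returns the event data, or None if the guest never answered.
    """
    for _ in range(max_attempts):
        send_command(START_COMMAND)
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # timed out: fall through and reissue the command
            try:
                event = events.get(timeout=remaining)
            except queue.Empty:
                break
            if event.get("event") == STATS_EVENT:
                return event.get("data")
            # Any other event (I/O error, VM stop, ...) is not ours; keep waiting.
    return None
```

A caller that wants a blocking interface can wrap this directly, while an
asynchronous one would run the same logic from its event loop.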

 
  Another interesting point is that there's another way of doing this:
  using qemu-ga instead. That is, qemu-ga could read that information
  from proc and return it. This is easier and simpler, as it doesn't involve
  guest communication. We also could return a lot more information if needed.
  The only disadvantage I can see is the dependency on qemu-ga...
 
 Most likely, we would want to teach libvirt to use both methods, and
 give the choice to the user on which approach to use when the guest
 supports both.
 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC 0/5]: QMP: add balloon-get-memory-stats command

2012-01-19 Thread Eric Blake
On 01/19/2012 08:56 AM, Luiz Capitulino wrote:
 Long ago, commit 625a5be added the guest provided memory statistics to
 the query-balloon command. Unfortunately, it also introduced a severe
 bug: query-balloon would hang if the guest didn't respond. This, in turn,
 would also cause a hang in libvirt.
 
 Because of that, we decided to disable the guest memory stats feature
 (commit 11724ff).
 
 As we decided to let commands implement ad-hoc async mechanisms until we
 get a proper way to do it, I decided to try to re-enable that feature.
 
 My idea is to have a command and an event. The command gets the process
 started by sending a request to guest and returns. Later, when the guest
 makes the memory stats info available, it's sent to the client by means
 of a QMP event (please take a look at patch 05/05 for full details).
 
 I'm not sure if that approach is good for libvirt though, so it would be
 very helpful to get their input (Eric, I'm CC'ing you here, but feel free
 to route this to someone else).

[I went ahead and cc'd the libvirt list]

Yes, libvirt can live with this approach.  And having this in parallel
to a qemu-ga verb is nice, since, as it was pointed out, this would
allow interaction with guests that have a balloon device but not a guest
agent.

You may want to read this thread [1], for thoughts on the impact of
making another existing blocking command be extended into one that
starts an async event and ends when an event is raised; libvirt can
expose both a blocking and an asynchronous implementation to the user on
top of the qemu model being just asynchronous.
[1] https://www.redhat.com/archives/libvir-list/2012-January/msg00562.html

Thinking aloud - do we need a means to poll the state of the
balloon-stat query?  On the one hand, if libvirtd issues the start
command, then gets stopped, then the event occurs, then libvirtd is
restarted, then libvirt won't know that the event was missed.  On the
other hand, since this involves guest interaction, libvirt already has
to assume that the guest may be malicious and refuse to report stats
and/or report invalid stats, so libvirt would already have to be
prepared to give up if no event has arrived in a fixed amount of time,
and that also means that restarting libvirtd can just ignore any balloon
query that was in flight before the restart.

So I guess I'm okay with just a start and an event, with no poll of the
last-known guest response.  But it does mean that qemu has to gracefully
handle if libvirt makes two start requests in a row without any
intervening events, and conversely that libvirt has to be prepared for
an event that happens even when libvirt doesn't remember triggering the
start command.

 Another interesting point is that there's another way of doing this:
 using qemu-ga instead. That is, qemu-ga could read that information
 from proc and return it. This is easier and simpler, as it doesn't involve
 guest communication. We also could return a lot more information if needed.
 The only disadvantage I can see is the dependency on qemu-ga...

Most likely, we would want to teach libvirt to use both methods, and
give the choice to the user on which approach to use when the guest
supports both.

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
