Re: [RFC] High availability in KVM

2010-07-13 Thread Takuya Yoshikawa
On Mon, 12 Jul 2010 02:49:55 -0700 (PDT)
da...@lang.hm wrote:

 On Mon, 12 Jul 2010, Takuya Yoshikawa wrote:
 

[...]

  1: Pacemaker starts Qemu.
 
  2: Pacemaker checks the state of Qemu via RA.
RA checks the state of Qemu using virsh(libvirt).
Qemu replies to RA with RUNNING (normally executing), (*1)
and RA reports to Pacemaker that it is running correctly.
 
   (*1): libvirt defines the following domain states:
 
 enum virDomainState {
 
 VIR_DOMAIN_NOSTATE  = 0 : no state
 VIR_DOMAIN_RUNNING  = 1 : the domain is running
 VIR_DOMAIN_BLOCKED  = 2 : the domain is blocked on resource
 VIR_DOMAIN_PAUSED   = 3 : the domain is paused by user
 VIR_DOMAIN_SHUTDOWN = 4 : the domain is being shut down
 VIR_DOMAIN_SHUTOFF  = 5 : the domain is shut off
 VIR_DOMAIN_CRASHED  = 6 : the domain is crashed
 
 }
 
 We took the most common case, RUNNING, as an example, but it could be any
 other state except for the failover targets, SHUTOFF and CRASHED?
 
   --- SOME ERROR HAPPENS ---
 
  3: Pacemaker checks the state of Qemu via RA.
RA checks the state of Qemu using virsh(libvirt).
Qemu replies to RA with SHUTOFF, (*2)
 
 why would it return 'shutoff' if an error happened instead of 'crashed'?


Yes, it would be 'crashed'.

But I think 'shutoff' may also be returned: it depends on the type of error
and how KVM/qemu handles it.

  I have in mind not only hardware errors but also virtualization-specific
  errors like emulation errors.


 
and RA reports to Pacemaker that it has already stopped.
 
   (*2): Currently we are checking for a 'shut off' answer from the domstate command.
 Yes, we should care about both SHUTOFF and CRASHED if possible.
 
  4: Pacemaker finally tries to confirm whether it can safely start failover by
    sending a stop command. After killing Qemu, RA replies OK to Pacemaker
    so that Pacemaker can start failover.
 
  Problem: we lose debugging information about the VM, such as the contents of
    guest memory.
 
 the OCF interface has start, stop, status (running or not) or an error 
 (plus API info)
 
 what I would do in this case is have the script notice that it's in 
 crashed status and return an error if it's told to start it. This will 
 cause pacemaker to start the service on another system.


I see.
So the key point is how to check for the target status (crashed, in this case).

From the HA point of view, we need qemu to guarantee that:
 - the guest never starts again
 - the VM never modifies external resources

But I'm not sure whether qemu currently guarantees such conditions in a
generic manner.
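
For reference, a minimal sketch of how an RA-style monitor could map the libvirt
domain state onto OCF return codes, using the Python libvirt bindings (the
domain name "guest1" and the exact mapping policy are assumptions, not something
the thread prescribes):

    import sys
    import libvirt   # Python bindings for libvirt

    # OCF return codes defined by the resource agent API
    OCF_SUCCESS = 0
    OCF_ERR_GENERIC = 1
    OCF_NOT_RUNNING = 7

    def monitor(domain_name="guest1"):           # placeholder domain name
        conn = libvirt.openReadOnly(None)
        try:
            dom = conn.lookupByName(domain_name)
        except libvirt.libvirtError:
            return OCF_NOT_RUNNING               # domain is not defined/active
        state = dom.info()[0]                    # first field is virDomainState
        if state in (libvirt.VIR_DOMAIN_RUNNING, libvirt.VIR_DOMAIN_BLOCKED):
            return OCF_SUCCESS
        if state in (libvirt.VIR_DOMAIN_SHUTOFF, libvirt.VIR_DOMAIN_CRASHED):
            return OCF_NOT_RUNNING               # failover target states
        return OCF_ERR_GENERIC                   # paused/unknown: treat as error

    if __name__ == "__main__":
        sys.exit(monitor())

The open question above is precisely whether reporting "not running" for CRASHED
is enough, i.e. whether qemu guarantees the two conditions at that point.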



In general I agree that we would always start the guest on another node for
failover.  But are there any benefits to starting the guest on the
same node?


 
 if it's told to stop it, do whatever you can to save state, but definitely
 pause/freeze the instance and return 'stopped'
 
 
 
 no need to define some additional state. As far as pacemaker is concerned 
 it's safe as long as there is no chance of it changing the state of any 
 shared resources that the other system would use, so simply pausing the 
 instance will make it safe. It will be interesting when someone wants to 
 investigate what's going on inside the instance (you need to have it be 
 functional, but not able to use the network or any shared 
 drives/filesystems), but I don't believe that you can get that right in a 
 generic manner, the details of what will cause grief and what won't will 
 vary from site to site.


If we cannot say in a generic manner, we usually choose the most conservative
one: memory and ... preservation only.

What concerns us the most is whether qemu actually guarantees the conditions we
are talking about in this thread.
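
As a rough illustration of the 'pause instead of kill' stop action suggested
above, a sketch with the Python libvirt bindings (the domain name is a
placeholder; whether a merely suspended qemu really satisfies the two guarantees
is exactly the point under discussion):

    import libvirt

    OCF_SUCCESS = 0

    def stop(domain_name="guest1"):
        conn = libvirt.open(None)                # read-write connection
        try:
            dom = conn.lookupByName(domain_name)
        except libvirt.libvirtError:
            return OCF_SUCCESS                   # nothing left to stop
        if dom.info()[0] == libvirt.VIR_DOMAIN_RUNNING:
            dom.suspend()                        # freeze vCPUs, keep guest memory
        # Report 'stopped' to Pacemaker even though the process still exists,
        # so failover can proceed while the frozen guest is kept for analysis.
        return OCF_SUCCESS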



 
 
  B. Our proposal: introduce a new domain state to indicate failover-safe
 
    Pacemaker ...(OCF)... RA ...(libvirt)... Qemu
        |                 |                   |
        |                 |                   |
 1:     + start ----------+-------------------+  state=RUNNING
        |                 |                   |
        + monitor --------+-- domstate -------+
 2:     |                 |                   |
        + OK -------------+---- RUNNING ------+
        |                 |                   |
        |                 |                   |
        |                 |                   *  Error: state=FROZEN
        |                 |                   |   Qemu releases resources
        |                 |                   |   and VM gets frozen. (*3)
        + monitor --------+-- domstate -------+
 3:     |                 |                   |
        +-- STOPPED ------+---- FROZEN -------+
        |                 |                   |
        + stop -----------+-- domstate -------+
 4:     |                 |                   |
        + OK -------------+---- FROZEN -------+
        |                 |                   |
        |                 |                   |
 
 
  1: Pacemaker starts Qemu.
 
  2: Pacemaker checks the state of Qemu via RA.
RA checks the state of Qemu using 

Re: [RFC] High availability in KVM

2010-07-13 Thread david

On Tue, 13 Jul 2010, Takuya Yoshikawa wrote:


On Mon, 12 Jul 2010 02:49:55 -0700 (PDT)
da...@lang.hm wrote:


On Mon, 12 Jul 2010, Takuya Yoshikawa wrote:



  and RA reports to Pacemaker that it has already stopped.

 (*2): Currently we are checking for a 'shut off' answer from the domstate command.
  Yes, we should care about both SHUTOFF and CRASHED if possible.

4: Pacemaker finally tries to confirm whether it can safely start failover by
  sending a stop command. After killing Qemu, RA replies OK to Pacemaker
  so that Pacemaker can start failover.

Problem: we lose debugging information about the VM, such as the contents of
  guest memory.


the OCF interface has start, stop, status (running or not) or an error
(plus API info)

what I would do in this case is have the script notice that it's in
crashed status and return an error if it's told to start it. This will
cause pacemaker to start the service on another system.



I see.
So the key point is how to check for the target status (crashed, in this case).

From the HA point of view, we need qemu to guarantee that:
- the guest never starts again
- the VM never modifies external resources

But I'm not sure whether qemu currently guarantees such conditions in a
generic manner.


you don't have to depend on the return from qemu. there are many OCF 
scripts that maintain state internally (look at the e-mail script as an 
example), if your OCF script thinks that it should be running and it 
isn't, mark it as crashed and don't try to start it again until external 
actions clear the status (and you can have a boot do so in case you have 
an unclean shutdown)
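
A minimal sketch of that 'remember the crash in the agent itself' idea; a real
OCF agent would normally be a shell script, so this Python version and the
state-file path are only illustrative:

    import os

    OCF_SUCCESS = 0
    OCF_ERR_GENERIC = 1
    OCF_NOT_RUNNING = 7

    STATE_FILE = "/var/run/vm-guest1.expected"       # hypothetical path

    def start():
        if os.path.exists(STATE_FILE + ".crashed"):
            # An external action (or a clean boot) must clear this marker first.
            return OCF_ERR_GENERIC
        # ... actually start qemu here, e.g. via virsh/libvirt ...
        open(STATE_FILE, "w").close()                # remember it should be running
        return OCF_SUCCESS

    def monitor(vm_is_alive):
        if os.path.exists(STATE_FILE) and not vm_is_alive:
            # We think it should be running but it isn't: mark it as crashed.
            open(STATE_FILE + ".crashed", "w").close()
            return OCF_NOT_RUNNING
        return OCF_SUCCESS if vm_is_alive else OCF_NOT_RUNNING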



In general I agree that we would always start the guest on another node for
failover.  But are there any benefits to starting the guest on the
same node?


I don't believe that pacemaker supports this concept.

however, if you wanted to you could have the OCF script know that there is 
a 'crashed' instance and instead of trying to start it, start a fresh copy.






if it's told to stop it, do whatever you can to save state, but definitely
pause/freeze the instance and return 'stopped'



no need to define some additional state. As far as pacemaker is concerned
it's safe as long as there is no chance of it changing the state of any
shared resources that the other system would use, so simply pausing the
instance will make it safe. It will be interesting when someone wants to
investigate what's going on inside the instance (you need to have it be
functional, but not able to use the network or any shared
drives/filesystems), but I don't believe that you can get that right in a
generic manner, the details of what will cause grief and what won't will
vary from site to site.



If we cannot say in a generic manner, we usually choose the most conservative
one: memory and ... preservation only.

What concerns us the most is whether qemu actually guarantees the conditions we
are talking about in this thread.


I'll admit that I'm not familiar with using qemu/KVM, but VMware/VirtualBox/Xen 
all have an option to freeze all activity and save the RAM to a disk file for a 
future restart. The OCF script can trigger such an action 
easily.
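
For what it's worth, qemu/KVM can do much the same through libvirt's save
operation (virsh save), which stops the guest and writes its memory image to a
file for a later restore. A minimal sketch with the Python bindings; the domain
name and file path are placeholders:

    import libvirt

    def save_to_disk(domain_name="guest1",
                     path="/var/lib/libvirt/save/guest1.img"):
        conn = libvirt.open(None)
        dom = conn.lookupByName(domain_name)
        # virDomainSave: stops the domain and dumps its memory to 'path';
        # it can later be brought back with conn.restore(path) / 'virsh restore'.
        dom.save(path)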



B. Our proposal: introduce a new domain state to indicate failover-safe

    Pacemaker ...(OCF)... RA ...(libvirt)... Qemu
        |                 |                   |
        |                 |                   |
 1:     + start ----------+-------------------+  state=RUNNING
        |                 |                   |
        + monitor --------+-- domstate -------+
 2:     |                 |                   |
        + OK -------------+---- RUNNING ------+
        |                 |                   |
        |                 |                   |
        |                 |                   *  Error: state=FROZEN
        |                 |                   |   Qemu releases resources
        |                 |                   |   and VM gets frozen. (*3)
        + monitor --------+-- domstate -------+
 3:     |                 |                   |
        +-- STOPPED ------+---- FROZEN -------+
        |                 |                   |
        + stop -----------+-- domstate -------+
 4:     |                 |                   |
        + OK -------------+---- FROZEN -------+
        |                 |                   |
        |                 |                   |


1: Pacemaker starts Qemu.

2: Pacemaker checks the state of Qemu via RA.
  RA checks the state of Qemu using virsh(libvirt).
  Qemu replies to RA with RUNNING (normally executing), (*1)
  and RA reports to Pacemaker that it is running correctly.

  --- SOME ERROR HAPPENS ---

3: Pacemaker checks the state of Qemu via RA.
  RA checks the state of Qemu using virsh(libvirt).
  Qemu replies to RA with FROZEN (VM stopped in a failover-safe state), (*3)
  and RA remembers this, then reports STOPPED to Pacemaker.

 (*3): this is what we want to introduce as a new state. Failover-safe means
   that Qemu has released the external resources, including some namespaces,
   so that they are available to another instance.
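
Just to make the proposal concrete, an RA monitor would roughly treat such a
state as follows. VIR_DOMAIN_FROZEN is the proposed state, it does not exist in
libvirt today, and the numeric value and handling here are only a sketch:

    # Hypothetical: the proposed failover-safe state, not defined by libvirt yet.
    VIR_DOMAIN_FROZEN = 7        # would extend enum virDomainState

    OCF_SUCCESS = 0
    OCF_NOT_RUNNING = 7

    def monitor_with_frozen(state):
        if state == VIR_DOMAIN_FROZEN:
            # Failover-safe: external resources released, guest memory preserved.
            # Report STOPPED so Pacemaker can fail over without fencing, while a
            # frozen instance is kept around for postmortem analysis.
            return OCF_NOT_RUNNING
        # ... otherwise fall back to the existing RUNNING/SHUTOFF/CRASHED handling ...
        return OCF_SUCCESS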


it doesn't need to release the resources. It just 

Re: [RFC] High availability in KVM

2010-07-12 Thread Takuya Yoshikawa

(2010/07/11 7:36), da...@lang.hm wrote:

On Thu, 17 Jun 2010, Fernando Luis Vazquez Cao wrote:


Existing open source HA stacks such as pacemaker/corosync and Red
Hat Cluster Suite rely on software clustering techniques to detect
both hardware failures and software failures, and employ fencing to
avoid split-brain situations which, in turn, makes it possible to
perform failover safely. However, when applied to virtualization
environments these solutions show some limitations:

- Hardware detection relies on polling mechanisms (for example
pinging a network interface to check for network connectivity),
imposing a trade off between failover time and the cost of
polling. The alternative is having the failing system send an
alarm to the HA software to trigger failover. The latter
approach is preferable but it is not always applicable when
dealing with bare-metal; depending on the failure type the
hardware may not be able to get a message out to notify the HA
software. However, when it comes to virtualization environments
we can certainly do better. If a hardware failure, be it real
hardware or virtual hardware, is fully contained within a
virtual machine the host or hypervisor can detect that and
notify the HA software safely using clean resources.


you still need to detect failures that you won't be notified of.

what if a network cable goes bad and your data isn't getting through?
you won't get any notification of this without doing polling, even in a
virtualized environment.



I agree that we need polling anyway.




also, in a virtualized environment you may have firewall rules between
virtual hosts, if those get misconfigured you may have 'virtual physical
connectivity' still, but not the logical connectivity that you need.


- In most cases, when a hardware failure is detected the state of
the failing node is not known which means that some kind of
fencing is needed to lock resources away from that
node. Depending on the hardware and the cluster configuration
fencing can be a pretty expensive operation that contributes to
system downtime. Virtualization can help here. Upon failure
detection the host or hypervisor could put the virtual machine
in a quiesced state and release its hardware resources before
notifying the HA software, so that it can start failover
immediately without having to mingle with the failing virtual
machine (we now know that it is in a known quiesced state). Of
course this only makes sense in the event-driven failover case
described above.

- Fencing operations commonly involve killing the virtual machine,
thus depriving us of potentially critical debugging information:
a dump of the virtual machine itself. This issue could be solved
by providing a virtual machine control that puts the virtual
machine in a known quiesced state, releases its hardware
resources, but keeps the guest and device model in memory so
that forensics can be conducted offline after failover. Polling
HA resource agents should use this new command if postmortem
analysis is important.


I don't see this as the job of the virtualization hypervisor. the
software HA stacks include the ability to run external scripts to
perform these tasks. These scripts can perform whatever calls to the
hypervisor that are appropriate to freeze, shutdown, or disconnect the
virtual server (and what is appropriate will vary from implementation to
implementation)



I see that it can be done with HA plus external scripts.

But don't you think we need a way to confirm that the vm is in a known quiesced
state?

Although it might not be the exact same scenario, here is what we are planning
as one possible next step (polling case):

==
A. Current management: Qemu/KVM + HA using libvirt interface

- Pacemaker interacts with RA(Resource Agent) through OCF interface.
- RA interacts with Qemu using virsh commands, IOW through libvirt interface.

    Pacemaker ...(OCF)... RA ...(libvirt)... Qemu
        |                 |                   |
        |                 |                   |
 1:     + start ----------+-------------------+  state=RUNNING
        |                 |                   |
        + monitor --------+-- domstate -------+
 2:     |                 |                   |
        + OK -------------+---- RUNNING ------+
        |                 |                   |
        |                 |                   |
        |                 |                   *  Error: state=SHUTOFF, or ...
        |                 |                   |
        |                 |                   |
        + monitor --------+-- domstate -------+
 3:     |                 |                   |
        +-- STOPPED ------+---- SHUTOFF ------+
        |                 |                   |
        + stop -----------+-- shutdown -------+  VM killed (if still alive)
 4:     |                 |                   |
        + OK -------------+---- SHUTOFF ------+
        |                 |                   |
        |                 |                   |

1: Pacemaker 

Re: [RFC] High availability in KVM

2010-07-12 Thread david

On Mon, 12 Jul 2010, Takuya Yoshikawa wrote:



I see that it can be done with HA plus external scripts.

But don't you think we need a way to confirm that the vm is in a known quiesced
state?

Although it might not be the exact same scenario, here is what we are planning
as one possible next step (polling case):

==
A. Current management: Qemu/KVM + HA using libvirt interface

- Pacemaker interacts with RA(Resource Agent) through OCF interface.
- RA interacts with Qemu using virsh commands, IOW through libvirt interface.

    Pacemaker ...(OCF)... RA ...(libvirt)... Qemu
        |                 |                   |
        |                 |                   |
 1:     + start ----------+-------------------+  state=RUNNING
        |                 |                   |
        + monitor --------+-- domstate -------+
 2:     |                 |                   |
        + OK -------------+---- RUNNING ------+
        |                 |                   |
        |                 |                   |
        |                 |                   *  Error: state=SHUTOFF, or ...
        |                 |                   |
        |                 |                   |
        + monitor --------+-- domstate -------+
 3:     |                 |                   |
        +-- STOPPED ------+---- SHUTOFF ------+
        |                 |                   |
        + stop -----------+-- shutdown -------+  VM killed (if still alive)
 4:     |                 |                   |
        + OK -------------+---- SHUTOFF ------+
        |                 |                   |
        |                 |                   |

1: Pacemaker starts Qemu.

2: Pacemaker checks the state of Qemu via RA.
  RA checks the state of Qemu using virsh(libvirt).
  Qemu replies to RA with RUNNING (normally executing), (*1)
  and RA reports to Pacemaker that it is running correctly.

 (*1): libvirt defines the following domain states:

   enum virDomainState {

   VIR_DOMAIN_NOSTATE  = 0 : no state
   VIR_DOMAIN_RUNNING  = 1 : the domain is running
   VIR_DOMAIN_BLOCKED  = 2 : the domain is blocked on resource
   VIR_DOMAIN_PAUSED   = 3 : the domain is paused by user
   VIR_DOMAIN_SHUTDOWN = 4 : the domain is being shut down
   VIR_DOMAIN_SHUTOFF  = 5 : the domain is shut off
   VIR_DOMAIN_CRASHED  = 6 : the domain is crashed

   }

   We took the most common case, RUNNING, as an example, but it could be any
   other state except for the failover targets, SHUTOFF and CRASHED?

 --- SOME ERROR HAPPENS ---

3: Pacemaker checks the state of Qemu via RA.
  RA checks the state of Qemu using virsh(libvirt).
  Qemu replies to RA with SHUTOFF, (*2)


why would it return 'shutoff' if an error happened instead of 'crashed'?


  and RA reports to Pacemaker that it has already stopped.

 (*2): Currently we are checking for a 'shut off' answer from the domstate command.
  Yes, we should care about both SHUTOFF and CRASHED if possible.

4: Pacemaker finally tries to confirm whether it can safely start failover by
  sending a stop command. After killing Qemu, RA replies OK to Pacemaker
  so that Pacemaker can start failover.

Problem: we lose debugging information about the VM, such as the contents of
  guest memory.


the OCF interface has start, stop, status (running or not) or an error 
(plus API info)


what I would do in this case is have the script notice that it's in 
crashed status and return an error if it's told to start it. This will 
cause pacemaker to start the service on another system.


if it's told to stop it, do whatever you can to save state, but definitely 
pause/freeze the instance and return 'stopped'




no need to define some additional state. As far as pacemaker is concerned 
it's safe as long as there is no chance of it changing the state of any 
shared resources that the other system would use, so simply pausing the 
instance will make it safe. It will be interesting when someone wants to 
investigate what's going on inside the instance (you need to have it be 
functional, but not able to use the network or any shared 
drives/filesystems), but I don't believe that you can get that right in a 
generic manner, the details of what will cause grief and what won't will 
vary from site to site.




B. Our proposal: introduce a new domain state to indicate failover-safe

    Pacemaker ...(OCF)... RA ...(libvirt)... Qemu
        |                 |                   |
        |                 |                   |
 1:     + start ----------+-------------------+  state=RUNNING
        |                 |                   |
        + monitor --------+-- domstate -------+
 2:     |                 |                   |
        + OK -------------+---- RUNNING ------+
        |                 |                   |
        |                 |                   |
        |                 |                   *  Error: state=FROZEN
        |                 |                   |   Qemu releases resources
        |                 |                   |   and VM gets frozen. (*3)
        + monitor --------+-- domstate -------+
 3:     |                 |

Re: [RFC] High availability in KVM

2010-07-10 Thread david

On Thu, 17 Jun 2010, Fernando Luis Vazquez Cao wrote:


Existing open source HA stacks such as pacemaker/corosync and Red
Hat Cluster Suite rely on software clustering techniques to detect
both hardware failures and software failures, and employ fencing to
avoid split-brain situations which, in turn, makes it possible to
perform failover safely. However, when applied to virtualization
environments these solutions show some limitations:

  - Hardware detection relies on polling mechanisms (for example
pinging a network interface to check for network connectivity),
imposing a trade off between failover time and the cost of
polling. The alternative is having the failing system send an
alarm to the HA software to trigger failover. The latter
approach is preferable but it is not always applicable when
dealing with bare-metal; depending on the failure type the
hardware may not be able to get a message out to notify the HA
software. However, when it comes to virtualization environments
we can certainly do better. If a hardware failure, be it real
hardware or virtual hardware, is fully contained within a
virtual machine the host or hypervisor can detect that and
notify the HA software safely using clean resources.


you still need to detect failures that you won't be notified of.

what if a network cable goes bad and your data isn't getting through? you 
won't get any notification of this without doing polling, even in a 
virtualized environment.


also, in a virtualized environment you may have firewall rules between 
virtual hosts, if those get misconfigured you may have 'virtual physical 
connectivity' still, but not the logical connectivity that you need.



  - In most cases, when a hardware failure is detected the state of
the failing node is not known which means that some kind of
fencing is needed to lock resources away from that
node. Depending on the hardware and the cluster configuration
fencing can be a pretty expensive operation that contributes to
system downtime. Virtualization can help here. Upon failure
detection the host or hypervisor could put the virtual machine
in a quiesced state and release its hardware resources before
notifying the HA software, so that it can start failover
immediately without having to mingle with the failing virtual
machine (we now know that it is in a known quiesced state). Of
course this only makes sense in the event-driven failover case
described above.

  - Fencing operations commonly involve killing the virtual machine,
thus depriving us of potentially critical debugging information:
a dump of the virtual machine itself. This issue could be solved
by providing a virtual machine control that puts the virtual
machine in a known quiesced state, releases its hardware
resources, but keeps the guest and device model in memory so
that forensics can be conducted offline after failover. Polling
HA resource agents should use this new command if postmortem
analysis is important.


I don't see this as the job of the virtualization hypervisor. the software 
HA stacks include the ability to run external scripts to perform these 
tasks. These scripts can perform whatever calls to the hypervisor that 
are appropriate to freeze, shutdown, or disconnect the virtual server (and 
what is appropriate will vary from implementation to implementation)


providing sample scripts that do this for the various HA stacks makes 
sense as it gives people examples of what can be done and lets them tailor 
exactly what does happen to their needs.



We are pursuing a scenario where current polling-based HA resource
agents are complemented with an event-driven failure notification
mechanism that allows for faster failover times by eliminating the
delay introduced by polling and by doing without fencing. This would
benefit traditional software clustering stacks and bring a feature
that is essential for fault tolerance solutions such as Kemari.


heartbeat/pacemaker has been able to do sub-second failovers for several 
years, I'm not sure that notification is really needed.


that being said the HA stacks do allow for commands to be fed into the HA 
system to tell a machine to go active/passive already, so why don't you 
have your notification just call scripts to make the appropriate calls?
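
As a very rough illustration of that wiring, a notification handler could simply
shell out to whatever administrative command the local cluster uses. The
crm_resource invocation below is only an example of the kind of call such a hook
might make; the exact command and options depend on the Pacemaker version and
are an assumption here:

    import subprocess

    def on_hardware_error(resource="vm-guest1"):     # placeholder resource name
        # Site-specific hook: ask the cluster manager to move the resource away.
        subprocess.call(["crm_resource", "--resource", resource, "--move"])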



Additionally, for those who want or need to stick with a polling
model we would like to provide a virtual machine control that
freezes a virtual machine into a failover-safe state without killing
it, so that postmortem analysis is still possible.


how is this different from simply pausing the virtual machine?


In the following sections we discuss the RAS-HA integration
challenges and the changes that need to be made to each component of
the qemu-KVM stack to realize this vision. While at it we will also
delve into some of the limitations of the current hardware error
subsystems of the 

Re: [RFC] High availability in KVM

2010-06-21 Thread Luiz Capitulino
On Thu, 17 Jun 2010 12:15:20 +0900
Fernando Luis Vazquez Cao ferna...@oss.ntt.co.jp wrote:

   * qemu-kvm
 
   Currently KVM is only notified about memory errors detected by the
   MCE subsystem. When running on newer x86 hardware, if MCE detects an
   error on user-space it signals the corresponding process with
   SIGBUS. Qemu, upon receiving the signal, checks the problematic
   address which the kernel stored in siginfo and decides whether to
   inject the MCE to the virtual machine.
 
   An obvious limitation is that we would like to be notified about
   other types of error too and, as suggested before, a file-based
   interface that can be sys_poll'ed might be needed for that.  
 
   On a different note, in a HA environment the qemu policy described
   above is not adequate; when a notification of a hardware error that
   our policy determines to be serious arrives the first thing we want
   to do is to put the virtual machine in a quiesced state to avoid
   further wreckage. If we injected the error into the guest we would
   risk a guest panic that might be detectable only by polling or, worse,
   being killed by the kernel, which means that postmortem analysis of
   the guest is not possible. Once we had the guests in a quiesced
   state, where all the buffers have been flushed and the hardware
   resources released, we would have two modes of operation that can be
   used together and complement each other.
 
 - Proactive: A qmp event describing the error (severity, topology,
   etc) is emitted. The HA software would have to register to
   receive hardware error events, possibly using the libvirt
   bindings. Upon receiving the event the HA software would know
   that the guest is in a failover-safe quiesced state so it could
   do without fencing and proceed to the failover stage directly.

This seems to match the BLOCK_IO_ERROR event we have today: when a disk error
happens, an event is emitted and the virtual machine can be automatically
stopped (there's a configuration option for this).

On the other hand, there's a number of ways to do this differently. I think
the first thing to do is to agree on what qemu's behavior is going to be, then
we decide how to expose this info to qmp clients.
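
For context, QMP is a JSON protocol over a socket, so an HA agent could
subscribe to such events (BLOCK_IO_ERROR today, a hardware-error event in this
proposal) roughly like this. The socket path and the HW_ERROR event name are
assumptions, not existing qemu interfaces:

    import json
    import socket

    def watch_qmp_events(path="/var/run/qemu-guest1.qmp"):    # placeholder path
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(path)
        f = sock.makefile()
        f.readline()                                          # greeting: {"QMP": {...}}
        sock.sendall(b'{"execute": "qmp_capabilities"}\r\n')  # leave capability mode
        f.readline()                                          # {"return": {}}
        for line in f:
            msg = json.loads(line)
            if msg.get("event") in ("BLOCK_IO_ERROR", "HW_ERROR"):
                # HW_ERROR is the hypothetical hardware-error event; on receipt
                # the guest would already be in a known quiesced state.
                handle_error(msg)

    def handle_error(msg):
        print("qemu reported an error event:", msg)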

 - Passive: Polling resource agents that need to check the state of
   the guest generally use libvirt or a wrapper such as virsh. When
   the state is SHUTOFF or CRASHED the resource agent proceeds to
   the fencing stage, which might be expensive and usually involves
   killing the qemu process. We propose adding a new state that
   indicates the failover-safe state described before. In this
   state the HA software would not need to use fencing techniques
   and since the qemu process is not killed postmortem analysis of
   the virtual machine is still possible.

It wouldn't be polling, I guess. We already have events for most state changes.
So, when the machine stops, reboots, etc.. the client would be notified and
then it could inspect the virtual machine by using query commands.

This method would be preferable in case we also want this information available
in the user Monitor and/or if the event gets too messy because of the amount of
information we want to put in it.


Re: [RFC] High availability in KVM

2010-06-21 Thread Takuya Yoshikawa

(2010/06/21 23:19), Luiz Capitulino wrote:

   On a different note, in a HA environment the qemu policy described
   above is not adequate; when a notification of a hardware error that
   our policy determines to be serious arrives the first thing we want
   to do is to put the virtual machine in a quiesced state to avoid
   further wreckage. If we injected the error into the guest we would
   risk a guest panic that might be detectable only by polling or, worse,
   being killed by the kernel, which means that postmortem analysis of
   the guest is not possible. Once we had the guests in a quiesced
   state, where all the buffers have been flushed and the hardware
   resources released, we would have two modes of operation that can be
   used together and complement each other.

 - Proactive: A qmp event describing the error (severity, topology,
   etc) is emitted. The HA software would have to register to
   receive hardware error events, possibly using the libvirt
   bindings. Upon receiving the event the HA software would know
   that the guest is in a failover-safe quiesced state so it could
   do without fencing and proceed to the failover stage directly.


This seems to match the BLOCK_IO_ERROR event we have today: when a disk error
happens, an event is emitted and the virtual machine can be automatically
stopped (there's a configuration option for this).

On the other hand, there's a number of ways to do this differently. I think
the first thing to do is to agree on what qemu's behavior is going to be, then
we decide how to expose this info to qmp clients.


I would like the same framework to cover qemu/KVM bugs too.

Even though there are other ways to debug, the easiest and most reliable one
would be to use the frozen state of the guest at the moment the bug happened.


We've already experienced some qemu crashes which seemed to be caused by a KVM
emulation failure in our test environment. Although we could guess what happened
by checking some messages like the exit reason, the guest state might have been
more helpful.

So what I want to get is:

 - a new qemu/KVM mode in which guests are automatically stopped in a
   failover-safe state if qemu/KVM cannot continue,

 - a new interface between qemu and HA to handle the failover-safe state,

Although I personally don't mind whether the interface is event based or polling
based, one important problem from the HA's point of view would be:

 * how to treat errors which can be caused in different layers uniformly.

E.g. if the problem is caused by the guest side, qemu may exit normally without
sending any events to HA. So an interface for polling may be helpful even when
we choose an event-driven one.
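
A rough sketch of how an HA-side watcher might combine the two, with event
handling done elsewhere (for example a QMP event listener) and a polling
fallback for the case where qemu exits without telling anyone; the interval,
domain name and failover hook are illustrative only:

    import time
    import libvirt

    def guest_running(conn, name="guest1"):              # placeholder domain name
        try:
            dom = conn.lookupByName(name)
        except libvirt.libvirtError:
            return False                                  # domain gone: qemu exited
        return dom.info()[0] == libvirt.VIR_DOMAIN_RUNNING

    def watch(poll_interval=2.0):
        conn = libvirt.openReadOnly(None)
        while True:
            # The event-driven path is handled elsewhere; this loop is the
            # polling fallback that catches silent qemu exits.
            if not guest_running(conn):
                trigger_failover()                        # site-specific action
                return
            time.sleep(poll_interval)

    def trigger_failover():
        print("guest is not running any more: start failover")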


Takuya





 - Passive: Polling resource agents that need to check the state of
   the guest generally use libvirt or a wrapper such as virsh. When
   the state is SHUTOFF or CRASHED the resource agent proceeds to
  the fencing stage, which might be expensive and usually involves
   killing the qemu process. We propose adding a new state that
   indicates the failover-safe state described before. In this
   state the HA software would not need to use fencing techniques
   and since the qemu process is not killed postmortem analysis of
   the virtual machine is still possible.


It wouldn't be polling, I guess. We already have events for most state changes.
So, when the machine stops, reboots, etc.. the client would be notified and
then it could inspect the virtual machine by using query commands.

This method would be preferable in case we also want this information available
in the user Monitor and/or if the event gets too messy because of the amount of
information we want to put in it.

