On Jun 27, 2008, at 2:18 PM, Keisuke MORI wrote:

Hi,

just about topic 4) in this mail...

Andrew Beekhof <[EMAIL PROTECTED]> writes:
4) node fencing without the poweroff
 (this is a kind of a new feature request)
Node fencing is just simple and good enough in most of our cases but
 we hesitate to use STONITH(poweroff/reboot) as the first action
 of a failure, because:
 - we want to shut down the services gracefully whenever possible.
 - rebooting the failed node may lose the evidence of the
   real cause of a failure. We want to preserve it as far as possible
   to investigate it later and to ensure that all the problems are
   resolved.

 We think that, ideally, when a resource failed the node would
 try to go to 'standby' state, and only when that attempt failed
 would it escalate to STONITH to power off.
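
The escalation described above can be sketched roughly as follows. This is only an illustrative model in Python, not Pacemaker code: `recover_node`, `stop_resource`, `fence_node` and `Outcome` are hypothetical names invented for the sketch, and a real cluster manager would also have to handle timeouts and loss of communication with the node.

```python
from enum import Enum
from typing import Callable, Iterable


class Outcome(Enum):
    STANDBY = "standby"  # every resource stopped cleanly, node left powered on
    FENCED = "fenced"    # a stop could not be confirmed, node was powered off


def recover_node(
    resources: Iterable[str],
    stop_resource: Callable[[str], bool],  # hypothetical: True if the stop is confirmed
    fence_node: Callable[[], None],        # hypothetical: STONITH poweroff/reboot
) -> Outcome:
    """Try a graceful standby first; escalate to fencing if any stop fails."""
    for name in resources:
        if not stop_resource(name):
            # The resource may still be running, so it is unsafe to start it
            # elsewhere until the node is confirmed dead.
            fence_node()
            return Outcome.FENCED
    return Outcome.STANDBY


if __name__ == "__main__":
    fenced = []
    # Simulate a node where the filesystem resource refuses to stop.
    result = recover_node(
        ["pgsql", "Filesystem"],
        stop_resource=lambda name: name != "Filesystem",
        fence_node=lambda: fenced.append(True),
    )
    print(result, fenced)
```

In this model the node is powered off only when a stop cannot be confirmed, which is the case where starting the service elsewhere would be unsafe.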

The problem with this is that it directly (and negatively) impacts
service availability.
It is unsafe to start services elsewhere until they are confirmed dead
on the existing node.

So relying on manual shutdowns greatly increases failover time.


Right, but I think it depends on applications.

In the case of database applications such as pgsql or oracle,
the most dominant factor of failover time is the recovery time.
Shutting down a node in the middle of a transaction will cause a
rollback action and will increase the recovery time further.
We estimate 3-5 minutes at most for the recovery time in our configuration.

Another case is a Filesystem on shared storage.
You should run fsck before mounting it on the failover node
for the safety of the data if the filesystem was not unmounted cleanly.
That would take a very long time, particularly if the filesystem
is very large, as is typical for one used by a database.

In addition to this, there may be a risk of data loss if the power
suddenly goes down.  Such risks may be negligible, but if there's
anything we can do to avoid or minimize such risks then we want
to take the steps for that.

I think you want on_fail=block.
The cluster won't do anything itself but will instead wait for human intervention.





One thing we used to do (but had to disable because we couldn't get it
100% right at the time) was move off the healthy resources before
shooting the node.  I think resurrecting this feature is a better
approach.

Yes, that sounds good to me.
One thing I'm wondering is that if the cluster manager was able
to confirm all the resources were stopped on the failed node, it
does not necessarily need to be turned off, does it?

If it could do that - then it wouldn't have tried to shoot it in the first place :-)


_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker
