On Feb 16, 2009, at 11:24 AM, Glory Smith wrote:




we kill the node with STONITH.
it's very hard for a machine to write to shared media when it's powered off.


we can kill nodes when:
- nodes become unresponsive
- nodes are not part of the cluster that has quorum
- resources fail to stop when instructed
- resources fail in any way (optional)
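
as a rough illustration only, the four conditions above could be written as a single decision function. this is a minimal sketch in Python, not Pacemaker's actual code or API; `NodeState`, `should_fence` and `fence_on_any_failure` are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical view of one node; field names are illustrative only."""
    responsive: bool = True            # node still answers the cluster
    in_quorate_partition: bool = True  # node belongs to the partition with quorum
    stop_failed: bool = False          # a resource failed to stop when instructed
    any_resource_failed: bool = False  # a resource failed in any other way


def should_fence(node: NodeState, fence_on_any_failure: bool = False) -> bool:
    """Return True if the node meets one of the conditions listed above."""
    if not node.responsive:
        return True
    if not node.in_quorate_partition:
        return True
    if node.stop_failed:
        return True
    # The last condition is optional, so it is gated by a flag (an assumption
    # about how such an option might be exposed).
    if fence_on_any_failure and node.any_resource_failed:
        return True
    return False
```

for example, `should_fence(NodeState(stop_failed=True))` is true, while a node whose resource merely failed is only fenced when the optional flag is set.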

1) well, if STONITH somehow fails to kill the errant node and the node is still alive,

having an unreliable stonith mechanism is worse than not having one at all.

what if your resource fencing has a bug? it's the same problem.

reliable fencing is a fundamental requirement of the cluster.

it will be able to do I/O on the shared disk. this can cause data integrity issues, right?

2) suppose we have set the STONITH action to reboot; then the errant node can come up and still write to the shared disk, even though it is not supposed to.

1) well, don't configure it like that then

2) no, it can't.
it won't have quorum and therefore isn't allowed to start cluster resources
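
to make the quorum argument concrete, here is a minimal sketch in Python. it assumes quorum means a strict majority of the configured votes, which is a simplification; `has_quorum` and `may_start_resources` are hypothetical helpers, not functions from openais or Pacemaker.

```python
def has_quorum(votes_seen: int, total_votes: int) -> bool:
    """Strict-majority quorum (an assumption made for this sketch)."""
    return 2 * votes_seen > total_votes


def may_start_resources(votes_seen: int, total_votes: int) -> bool:
    """A partition without quorum is not allowed to start cluster resources."""
    return has_quorum(votes_seen, total_votes)


# A rebooted node that cannot rejoin its peers only sees its own vote.
lone_node_in_three_node_cluster = may_start_resources(votes_seen=1, total_votes=3)
surviving_pair = may_start_resources(votes_seen=2, total_votes=3)
```

under that assumption the lone rebooted node is refused, while the two nodes that can still see each other keep running resources.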


if openais-pacemaker provided something for resource fencing, we would have completely ruled out the above possibilities. Please share your view.



_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
