Never mind, I am all set. I have figured out the syntax for fence_vmware and it works beautifully now.
Here it is, in case anyone else racks their brains over this in the future:

...
<clusternodes>
  <clusternode name="node1.localdomain" nodeid="1" votes="1">
    <fence>
      <method name="fence_vmware">
        <device name="vmware" port="node1.localdomain"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node2.localdomain" nodeid="2" votes="1">
    <fence>
      <method name="fence_vmware">
        <device name="vmware" port="node2.localdomain"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
...
<fencedevices>
  <fencedevice agent="fence_vmware" ipaddr="a.b.c.d" login="xxx" name="vmware" passwd="xxx"/>
</fencedevices>
...

I will be running a series of fail-over scenarios (node and service failures have worked very well so far) and will get back with the results if there are any concerns. Thanks for helping me thus far; I really appreciate it.

Param

On Thu, Aug 30, 2012 at 1:37 PM, PARAM KRISH <mkpa...@gmail.com> wrote:
> *Background:*
> I am using two VMs hosted in my internal lab, each with two interfaces:
> one configured with a valid IP and the other down. I have kept the VIP
> in the same network. My intention is to have Apache configured as a
> cluster service on these two nodes and have it fail over when the node
> or the interface goes down. I am trying to use fence_vmware as the
> fencing device. The two VMs run on an ESX 4.1 host, and the guest OS in
> both VMs is RHEL 6.0 32-bit.
>
> I am seeing the following problems in my setup:
>
> 1. When I start the Apache service from luci, it starts fine on a node.
> But if I kill the httpd process on that node manually, the cluster does
> not detect that the service is down, so it never restarts or relocates
> it.
> 2. The same happens if I do "ip addr del <VIP>"; the cluster detects
> that the node is down but does not restart or relocate the service.
> 3. Whenever I reboot the nodes, they come online, the service starts
> fine on one of the nodes, and both nodes are properly in quorum, but
> fail-over never happens if I stop the active node.
> 4.
> I am not sure what fence configuration I must put in cluster.conf,
> since there is no way I can test whether it actually works.
>
> Manual tests:
> 1. I manually run something like
> "fence_vmware --action=status --ip=10.72.145.145 --username=<login>
> --password=<password> --plug=<vm-name>", which works fine on both nodes.
> 2. Apache starts/stops perfectly fine on both nodes when I do
> "rg_test test /etc/cluster/cluster.conf start service WEB".
>
> cluster.conf is attached.
> rgmanager.log is attached.
>
> Please let me know any specific debug commands I can run manually to
> track down what is going on here, particularly the "relocation" of the
> service and the "fencing"; both consistently fail.
>
> Please help. I have spent more than 10 days setting this up in my
> internal lab as a proof of concept, to show my business heads that RHEL
> clustering indeed works for our production requirement.
>
> -Param
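For anyone adapting the stanza above: a common source of "fencing silently does nothing" is a <device name="..."> under a node's <method> that does not match any <fencedevice name="..."> definition. Below is a minimal, illustrative Python sketch of that cross-check (not a cluster tool; the embedded XML mirrors the config posted above, and in practice you would parse /etc/cluster/cluster.conf instead):

```python
# Cross-check fence <device> references against <fencedevice> definitions.
# Illustrative sketch only; a real check would read /etc/cluster/cluster.conf.
import xml.etree.ElementTree as ET

CLUSTER_CONF = """\
<cluster name="demo" config_version="1">
  <clusternodes>
    <clusternode name="node1.localdomain" nodeid="1" votes="1">
      <fence>
        <method name="fence_vmware">
          <device name="vmware" port="node1.localdomain"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.localdomain" nodeid="2" votes="1">
      <fence>
        <method name="fence_vmware">
          <device name="vmware" port="node2.localdomain"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_vmware" ipaddr="a.b.c.d" login="xxx"
                 name="vmware" passwd="xxx"/>
  </fencedevices>
</cluster>
"""

def unresolved_fence_devices(conf_xml):
    """Return device names referenced in <fence> blocks but not defined
    under <fencedevices>. An empty list means every reference resolves."""
    root = ET.fromstring(conf_xml)
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    referenced = {dev.get("name")
                  for node in root.iter("clusternode")
                  for dev in node.iter("device")}
    return sorted(referenced - defined)

print(unresolved_fence_devices(CLUSTER_CONF))  # [] -> all references resolve
```

If the list comes back non-empty, the named <device> entries point at nothing, and fencing for those nodes cannot work until the matching <fencedevice> is added.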
-- Linux-cluster mailing list Linux-cluster@redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster