Now that ricci is sorted out, I am running into some issues with fencing.

It seems VMware fencing (fence_vmware_soap) works very well, but our GFS2 volume is not 
available until the fence operation returns a "success" status. That gives us maybe 30-60 
seconds during which we cannot access the GFS2 volumes, which equates to downtime. SCSI 
fencing seems faster but very unreliable: if I fence a node, it returns "fence somenode 
success". Great. But the node can still access the GFS2 volume.
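
The only fence_scsi tweak I have been wondering about is pointing the agent at the shared 
LUN explicitly instead of letting it auto-detect devices; something like the line below is 
what I have in mind (the devices path and log file are placeholders I made up, not anything 
from my current config, and I have not tried this yet):

        <fencedevice agent="fence_scsi" name="SCSI_Fence"
                     devices="/dev/mapper/gfs2_lun"
                     logfile="/var/log/cluster/fence_scsi.log"/>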

I am also seeing conflicting information on using qdisk with fence_scsi; it seems to be a 
no-no, yet I could swear I saw a note somewhere that qdisk and fence_scsi work together in 
newer versions of RHEL.

So what is my best bet for keeping GFS2 as available as possible when a node fails… or when 
simply rebooting a node to apply, say, a software patch, which is an even bigger concern?
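
In case it helps frame the question, the layered setup I have been toying with is VMware 
fencing as the first method with SCSI fencing as a fallback, roughly like the snippet below 
for each node (the port value is only a guess at how the VM name maps in vSphere, and I 
have not tested any of this):

        <clusternode name="xanadunode1" nodeid="1">
            <fence>
                <method name="VMWare">
                    <device name="VMWare_Fence" port="xanadunode1" ssl="on"/>
                </method>
                <method name="SCSI">
                    <device name="SCSI_Fence"/>
                </method>
            </fence>
            <unfence>
                <device action="on" name="SCSI_Fence"/>
            </unfence>
        </clusternode>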


Cluster.conf as it stands now:

<?xml version="1.0"?>
<cluster config_version="34" name="Xanadu">
    <clusternodes>
        <clusternode name="xanadunode1" nodeid="1">
            <fence>
                <method name="Method2">
                    <device name="SCSI_Fence"/>
                </method>
            </fence>
            <unfence>
                <device action="on" name="SCSI_Fence"/>
            </unfence>
        </clusternode>
        <clusternode name="xanadunode2" nodeid="2">
            <fence>
                <method name="Method2">
                    <device name="SCSI_Fence"/>
                </method>
            </fence>
            <unfence>
                <device action="on" name="SCSI_Fence"/>
            </unfence>
        </clusternode>
        <clusternode name="xanadunode3" nodeid="3">
            <fence>
                <method name="Method2">
                    <device name="SCSI_Fence"/>
                </method>
            </fence>
            <unfence>
                <device action="on" name="SCSI_Fence"/>
            </unfence>
        </clusternode>
    </clusternodes>
    <fencedevices>
        <fencedevice agent="fence_vmware_soap" ipaddr="vsphere.innova.local"
                     login="vmwarefence" name="VMWare_Fence" passwd="XXXXXXXX"/>
        <fencedevice agent="fence_scsi" name="SCSI_Fence"/>
    </fencedevices>
    <cman expected_votes="5"/>
    <quorumd label="quorum"/>
    <rm>
        <failoverdomains>
            <failoverdomain name="Cluster Management">
                <failoverdomainnode name="xanadunode1"/>
                <failoverdomainnode name="xanadunode2"/>
                <failoverdomainnode name="xanadunode3"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <ip address="192.168.30.78" sleeptime="2"/>
        </resources>
        <service domain="Cluster Management" name="Cluster Management"
                 recovery="relocate">
            <ip ref="192.168.30.78"/>
        </service>
    </rm>
</cluster>

________________________________________
Chip Burke

