On 06/29/2016 11:00 AM, Pavlov, Vladimir wrote:
> Thanks a lot.
> We also thought to use fencing (stonith).
> But the production cluster runs in the cloud; node1 and node2 are virtual
> machines without any hardware fencing devices.

But there are fence agents that do fencing via the hypervisor (e.g. fence_xvm).

> We looked in the direction of SBD, but its use, as far as we understand, is
> not justified without shared storage in a two-node cluster:
> http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit

Using SBD with a watchdog (provided your virtual environment provides a
watchdog device inside VMs) for self-fencing is probably better than no
fencing at all.

Regards,
Klaus
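[For illustration, a minimal sketch of what the fence_xvm route could look
like with pcs. It assumes fence_virtd is already running on the hypervisor,
/etc/cluster/fence_xvm.key has been copied into both guests, and the VM
domain names match the cluster node names node1/node2 -- none of that is
taken from this thread.]

    # sanity-check that the guests can reach fence_virtd on the host
    fence_xvm -o list

    # one stonith resource per guest; "port" is the VM domain name
    pcs stonith create fence_node1 fence_xvm port="node1" pcmk_host_list="node1"
    pcs stonith create fence_node2 fence_xvm port="node2" pcmk_host_list="node2"

[The SBD/watchdog alternative, on stacks that support it, usually comes down
to setting SBD_WATCHDOG_DEV in /etc/sysconfig/sbd and the
stonith-watchdog-timeout cluster property; whether that is available on this
cman-based stack would need checking.]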
> Are there any ways to do fencing?
> Specifically for our situation, we have found another workaround - use DR
> instead of NAT in IPVS.
> In the case of DR, even if both servers are active at the same time, it
> does not matter which of them serves the connection from the client. The
> web servers respond to the client directly.
> Does this workaround have a right to life?
>
> Kind regards,
>
> Vladimir Pavlov
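[For illustration, a minimal sketch of the DR ("direct routing") idea above,
using plain ipvsadm. The VIP 192.168.0.100 and the real-server addresses are
made up, not taken from this thread, and the usual LVS-DR caveat applies:
each web server must also carry the VIP on a non-ARPing interface (e.g. lo,
with arp_ignore/arp_announce set) so it will accept the forwarded traffic.]

    # virtual service on the VIP, round-robin scheduling
    ipvsadm -A -t 192.168.0.100:80 -s rr

    # real servers added with -g (gatewaying = direct routing, instead of -m/NAT);
    # with ldirectord this corresponds to the "gate" forwarding method on the
    # real= lines in ldirectord.cf
    ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.11:80 -g
    ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.12:80 -g

    # the table should then show the real servers with "Route" forwarding
    ipvsadm -L -n

[As the replies below point out, though, this only softens the impact of a
split; it does not replace fencing.]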
> Message: 2
> Date: Tue, 28 Jun 2016 18:53:38 +0300
> From: "Pavlov, Vladimir" <[email protected]>
> To: "'[email protected]'" <[email protected]>
> Subject: [ClusterLabs] Default Behavior
>
> Hello!
> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7), with
> the resources IPaddr2 and ldirectord.
> Cluster Properties:
> cluster-infrastructure: cman
> dc-version: 1.1.11-97629de
> no-quorum-policy: ignore
> stonith-enabled: false
> The cluster has been configured following this documentation:
> http://clusterlabs.org/quickstart-redhat-6.html
> Recently there was a communication failure between the cluster nodes, and
> the behavior was like this:
> - During the network failure, each server became the Master.
> - After the restoration of the network, one node killed the Pacemaker
> services on the second node.
> - The second node was not available to the cluster, but all resources
> remained active (ldirectord, ipvs, IP address). That is, both nodes
> continued to be active.
> We decided to create a test stand and replay the situation, but with the
> current version of Pacemaker in the CentOS repos the cluster behaves
> differently:
> - During a network failure, each server becomes the Master.
> - After the restoration of the network, all resources are stopped.
> - Then the resources are started on only one node. This behavior seems
> more logical.
> Current Cluster Properties on the test stand:
> cluster-infrastructure: cman
> dc-version: 1.1.14-8.el6-70404b0
> have-watchdog: false
> no-quorum-policy: ignore
> stonith-enabled: false
> Has the behavior of the cluster changed in the new version, or was the
> incident not fully emulated?
> Thank you.
>
> Kind regards,
>
> Vladimir Pavlov
>
> ------------------------------
>
> Message: 3
> Date: Tue, 28 Jun 2016 12:07:36 -0500
> From: Ken Gaillot <[email protected]>
> To: [email protected]
> Subject: Re: [ClusterLabs] Default Behavior
>
> On 06/28/2016 10:53 AM, Pavlov, Vladimir wrote:
>> Hello!
>>
>> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7), with
>> the resources IPaddr2 and ldirectord.
>>
>> Cluster Properties:
>> cluster-infrastructure: cman
>> dc-version: 1.1.11-97629de
>> no-quorum-policy: ignore
>> stonith-enabled: false
>>
>> The cluster has been configured following this documentation:
>> http://clusterlabs.org/quickstart-redhat-6.html
>>
>> Recently there was a communication failure between the cluster nodes, and
>> the behavior was like this:
>>
>> - During the network failure, each server became the Master.
>> - After the restoration of the network, one node killed the Pacemaker
>> services on the second node.
>> - The second node was not available to the cluster, but all resources
>> remained active (ldirectord, ipvs, IP address). That is, both nodes
>> continued to be active.
>>
>> We decided to create a test stand and replay the situation, but with the
>> current version of Pacemaker in the CentOS repos the cluster behaves
>> differently:
>>
>> - During a network failure, each server becomes the Master.
>> - After the restoration of the network, all resources are stopped.
>> - Then the resources are started on only one node. This behavior seems
>> more logical.
>>
>> Current Cluster Properties on the test stand:
>> cluster-infrastructure: cman
>> dc-version: 1.1.14-8.el6-70404b0
>> have-watchdog: false
>> no-quorum-policy: ignore
>> stonith-enabled: false
>>
>> Has the behavior of the cluster changed in the new version, or was the
>> incident not fully emulated?
>
> If I understand your description correctly, the situation was not
> identical. The difference I see is that, in the original case, the
> second node is not responding to the cluster even after the network is
> restored. Thus, the cluster cannot communicate to carry out the behavior
> observed in the test situation.
>
> Fencing (stonith) is the cluster's only recovery mechanism in such a
> case. When the network splits, or a node becomes unresponsive, it can
> only safely recover resources if it can ensure the other node is powered
> off. Pacemaker supports both physical fencing devices such as an
> intelligent power switch, and hardware watchdog devices for self-fencing
> using sbd.
>
>> Thank you.
>>
>> Kind regards,
>>
>> Vladimir Pavlov
>
> ------------------------------
>
> Message: 4
> Date: Tue, 28 Jun 2016 16:51:50 -0400
> From: Digimer <[email protected]>
> To: Cluster Labs - All topics related to open-source clustering
>  welcomed <[email protected]>
> Subject: Re: [ClusterLabs] Default Behavior
>
> On 28/06/16 11:53 AM, Pavlov, Vladimir wrote:
>> Hello!
>>
>> We have a two-node Active/Backup Pacemaker cluster (OS CentOS 6.7), with
>> the resources IPaddr2 and ldirectord.
>>
>> Cluster Properties:
>> cluster-infrastructure: cman
>> dc-version: 1.1.11-97629de
>> no-quorum-policy: ignore
>> stonith-enabled: false
>
> You need fencing to be enabled and configured. This is always true, but
> particularly so on RHEL 6 because it uses the cman plugin. Please
> configure and test stonith, and then repeat your tests to see if the
> behavior is more predictable.
>
>> The cluster has been configured following this documentation:
>> http://clusterlabs.org/quickstart-redhat-6.html
>>
>> Recently there was a communication failure between the cluster nodes, and
>> the behavior was like this:
>>
>> - During the network failure, each server became the Master.
>> - After the restoration of the network, one node killed the Pacemaker
>> services on the second node.
>> - The second node was not available to the cluster, but all resources
>> remained active (ldirectord, ipvs, IP address). That is, both nodes
>> continued to be active.
>>
>> We decided to create a test stand and replay the situation, but with the
>> current version of Pacemaker in the CentOS repos the cluster behaves
>> differently:
>>
>> - During a network failure, each server becomes the Master.
>> - After the restoration of the network, all resources are stopped.
>> - Then the resources are started on only one node. This behavior seems
>> more logical.
>>
>> Current Cluster Properties on the test stand:
>> cluster-infrastructure: cman
>> dc-version: 1.1.14-8.el6-70404b0
>> have-watchdog: false
>> no-quorum-policy: ignore
>> stonith-enabled: false
>>
>> Has the behavior of the cluster changed in the new version, or was the
>> incident not fully emulated?
>>
>> Thank you.
>>
>> Kind regards,
>>
>> Vladimir Pavlov
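[Following up on the advice above to configure and test stonith: a minimal
sketch of turning fencing on and exercising it once a stonith device (for
example the fence_xvm resources sketched earlier) is defined. The node name
node2 is an assumption, and the exact pcs subcommands available depend on the
pcs version shipped with the distribution.]

    # turn fencing on; in a two-node cluster quorum cannot break ties, so
    # fencing is what actually resolves a split (no-quorum-policy=ignore
    # stays as in the quickstart)
    pcs property set stonith-enabled=true

    # deliberately fence the standby node and confirm it really gets
    # rebooted / cut off
    pcs stonith fence node2
    # or, at a lower level:
    stonith_admin --reboot node2

    # watch the surviving node recover the resources
    crm_mon -1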
