Re: [ClusterLabs] Pacemaker failover failure
Unfortunately I have nothing yet ... There's something I don't quite
understand, though. What is the role of STONITH if the other machine
crashes unexpectedly and totally unclean? Is it to reboot that machine
and re-form the cluster, thus making the DRBD volume available again, or
is it something else? The way I see it, even if STONITH is functional,
the other node's DRBD filesystem will not be accessible until the
crashed node is back up. Is this correct?

Alex

On Thu, Jul 2, 2015 at 2:56 PM, Ken Gaillot wrote:
> - Original Message -
> > Thank you!
> >
> > However, what is proper fencing in this situation?
>
> For virtual machines, there is a fence agent called fence_virt/fence_xvm,
> but it requires a daemon to be installed and configured on the underlying
> physical machine(s). If that's not a possibility, you need some other
> means of shutting the VM down. Whoever's providing your VM might also
> provide an API to start and stop it, or if your VMs have access to some
> shared external storage, it might be possible to control it via that.
> [...]
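For reference, the setup Ken describes pairs fence_virtd on the physical
host with a fence_xvm stonith resource in the cluster. A minimal sketch
in crm shell syntax, assuming hypothetical guest names and the stock key
location (not part of the original thread):

    # Hypothetical mapping of cluster node names to libvirt guest names;
    # fence_virtd must already be running and keyed on the physical host.
    primitive fencing-xvm stonith:fence_xvm \
            params pcmk_host_map="host1:guest1;host2:guest2" \
                   key_file="/etc/cluster/fence_xvm.key" \
            op monitor interval=60s

With a device like this in place, fencing answers the question above: the
surviving node power-cycles (or verifiably kills) the crashed peer, and
only after that confirmation does the cluster treat the peer's resources
as safe to take over elsewhere.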
Re: [ClusterLabs] Pacemaker failover failure
Thank you! However, what is proper fencing in this situation?

Kind regards,

Alex

On Wed, Jul 1, 2015 at 11:30 PM, Ken Gaillot wrote:
> On 07/01/2015 09:39 AM, alex austin wrote:
> > This is what crm_mon shows
> > [...]
>
> If you can't upgrade to corosync 2 (which has many improvements), you'll
> need to set the no-quorum-policy=ignore cluster option.
>
> Proper fencing is necessary to avoid a split-brain situation, which can
> corrupt your data.
> [...]
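For readers following along: on a corosync 1.x stack, the option Ken
names is an ordinary cluster property. A minimal sketch using the crm
shell already in use in this thread:

    # Keep resources running on the surviving node even after quorum is
    # lost; only safe in a two-node cluster that also has working fencing.
    crm configure property no-quorum-policy=ignore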
Re: [ClusterLabs] Pacemaker failover failure
I am running version 1.4.7 of corosync.

On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot wrote:
> On 07/01/2015 08:57 AM, alex austin wrote:
> > I have now configured stonith-enabled=true. What device should I use
> > for fencing, given the fact that it's a virtual machine but I don't
> > have access to its configuration? Would fence_pcmk do? If so, what
> > parameters should I configure for it to work properly?
>
> No, fence_pcmk is not for use in Pacemaker, but for use in RHEL 6's
> CMAN to redirect its fencing requests to Pacemaker.
>
> For a virtual machine, ideally you'd use fence_virtd running on the
> physical host, but I'm guessing from your comment that you can't do
> that. Does whoever provides your VM also provide an API for controlling
> it (starting/stopping/rebooting)?
>
> Regarding your original problem, it sounds like the surviving node
> doesn't have quorum. What version of corosync are you using? If you're
> using corosync 2, you need "two_node: 1" in corosync.conf, in addition
> to configuring fencing in Pacemaker.
> [...]
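Were this cluster upgraded to corosync 2, the "two_node: 1" setting Ken
refers to lives in the quorum section of corosync.conf. A minimal sketch:

    quorum {
        provider: corosync_votequorum
        # Grants quorum to a single surviving node in a two-node cluster;
        # this also enables wait_for_all by default.
        two_node: 1
    }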
Re: [ClusterLabs] Pacemaker failover failure
This is what crm_mon shows:

Last updated: Wed Jul 1 10:35:40 2015
Last change: Wed Jul 1 09:52:46 2015
Stack: classic openais (with plugin)
Current DC: host2 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
4 Resources configured

Online: [ host1 host2 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started host2
 Master/Slave Set: redis_clone [redis]
     Masters: [ host2 ]
     Slaves: [ host1 ]
pcmk-fencing (stonith:fence_pcmk): Started host2

On Wed, Jul 1, 2015 at 3:37 PM, alex austin wrote:
> I am running version 1.4.7 of corosync
>
> On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot wrote:
>> [...]
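A status snapshot like the one above can also be captured
non-interactively, which is handy when posting to a list; a small sketch
using flags from the crm_mon of that era:

    # Print the cluster status once and exit (instead of the live view):
    crm_mon -1
    # Same, but also show inactive resources and fail counts:
    crm_mon -1rf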
Re: [ClusterLabs] Pacemaker failover failure
I have now configured stonith-enabled=true. What device should I use for
fencing, given the fact that it's a virtual machine but I don't have
access to its configuration? Would fence_pcmk do? If so, what parameters
should I configure for it to work properly?

This is my new config:

node dcwbpvmuas004.edc.nam.gm.com \
        attributes standby=off
node dcwbpvmuas005.edc.nam.gm.com \
        attributes standby=off
primitive ClusterIP IPaddr2 \
        params ip=198.208.86.242 cidr_netmask=23 \
        op monitor interval=1s timeout=20s \
        op start interval=0 timeout=20s \
        op stop interval=0 timeout=20s \
        meta is-managed=true target-role=Started resource-stickiness=500
primitive pcmk-fencing stonith:fence_pcmk \
        params pcmk_host_list="dcwbpvmuas004.edc.nam.gm.com dcwbpvmuas005.edc.nam.gm.com" \
        op monitor interval=10s \
        meta target-role=Started
primitive redis redis \
        meta target-role=Master is-managed=true \
        op monitor interval=1s role=Master timeout=5s on-fail=restart
ms redis_clone redis \
        meta notify=true is-managed=true ordered=false interleave=false globally-unique=false target-role=Master migration-threshold=1
colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
colocation ip-on-redis inf: ClusterIP redis_clone:Master
colocation pcmk-fencing-on-redis inf: pcmk-fencing redis_clone:Master
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=true
property redis_replication: \
        redis_REPL_INFO=dcwbpvmuas005.edc.nam.gm.com

On Wed, Jul 1, 2015 at 2:53 PM, Nekrasov, Alexander <
alexander.nekra...@emc.com> wrote:
> stonith-enabled=false
>
> this might be the issue. The way peer node death is resolved, the
> surviving node must call STONITH on the peer. If it's disabled, it
> might not be able to resolve the event.
>
> Alex
>
> *From:* alex austin [mailto:alexixa...@gmail.com]
> *Sent:* Wednesday, July 01, 2015 9:51 AM
> *To:* Users@clusterlabs.org
> *Subject:* Re: [ClusterLabs] Pacemaker failover failure
> [...]
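One way to check whether a stonith device such as the pcmk-fencing
resource above can actually do anything is to request a fence manually.
A hedged sketch using Pacemaker's stonith_admin, with the node name
taken from the config above:

    # Ask the fencing subsystem to reboot this node; if the target is
    # not really power-cycled, the device is not usable (and as Ken
    # notes, fence_pcmk is not a real device from Pacemaker's side).
    stonith_admin --reboot dcwbpvmuas004.edc.nam.gm.com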
Re: [ClusterLabs] Pacemaker failover failure
So I did another test.

Two nodes: node1 and node2.
Case: node1 is the active node, node2 is passive.

If I "killall -9 pacemakerd corosync" on node1, the services do not fail
over to node2; but if I then start corosync and pacemaker on node1
again, they do fail over to node2. Where am I going wrong?

Alex

On Wed, Jul 1, 2015 at 12:42 PM, alex austin wrote:
> Hi all,
>
> I have configured a virtual IP and redis in master-slave with
> corosync/pacemaker. If redis fails, the failover is successful, and
> redis gets promoted on the other node. However, if pacemaker itself
> fails on the active node, the failover is not performed. Is there
> anything I missed in the configuration?
> [...]
Re: [ClusterLabs] Pacemaker failover failure
So I noticed that if I kill redis on one node, it starts on the other,
no problem; but if I actually kill pacemaker itself on one node, the
other doesn't "sense" it, so it doesn't fail over.

On Wed, Jul 1, 2015 at 12:42 PM, alex austin wrote:
> Hi all,
>
> I have configured a virtual IP and redis in master-slave with
> corosync/pacemaker. If redis fails, the failover is successful, and
> redis gets promoted on the other node. However, if pacemaker itself
> fails on the active node, the failover is not performed. Is there
> anything I missed in the configuration?
> [...]
[ClusterLabs] Pacemaker failover failure
Hi all,

I have configured a virtual IP and redis in master-slave with
corosync/pacemaker. If redis fails, the failover is successful, and
redis gets promoted on the other node. However, if pacemaker itself
fails on the active node, the failover is not performed. Is there
anything I missed in the configuration?

Here's my configuration (I have hashed the IP address out):

node host1.com
node host2.com
primitive ClusterIP IPaddr2 \
        params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \
        op monitor interval=1s timeout=20s \
        op start interval=0 timeout=20s \
        op stop interval=0 timeout=20s \
        meta is-managed=true target-role=Started resource-stickiness=500
primitive redis redis \
        meta target-role=Master is-managed=true \
        op monitor interval=1s role=Master timeout=5s on-fail=restart
ms redis_clone redis \
        meta notify=true is-managed=true ordered=false interleave=false globally-unique=false target-role=Master migration-threshold=1
colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
colocation ip-on-redis inf: ClusterIP redis_clone:Master
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false
property redis_replication: \
        redis_REPL_INFO=host.com

thank you in advance

Kind regards,

Alex
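Taking the thread as a whole, the replies above amount to two changes to
the configuration shown here. A minimal sketch in crm shell, assuming
the corosync 1.4.7 stack mentioned later in the thread:

    # Enable fencing, and configure a real stonith device (fence_xvm or
    # a provider API; fence_pcmk does not count):
    crm configure property stonith-enabled=true
    # Let the surviving node keep its resources after losing quorum,
    # which is what happens when the peer's corosync/pacemaker die:
    crm configure property no-quorum-policy=ignore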