[ClusterLabs] pacemaker apache and umask on CentOS 7

2016-04-20 Thread fatcharly
Hi, I'm running a 2-node apache webcluster on a fully patched CentOS 7 (pacemaker-1.1.13-10.el7_2.2.x86_64 pcs-0.9.143-15.el7.x86_64). Some files which are generated by apache are created with a umask of 137, but I need these files created with a umask of 117. To change this I first tried to add

[ClusterLabs] Pacemaker resource for mysql 5.7 still calling mysqld_safe

2016-04-15 Thread Luke Varnadore
I am setting up a pacemaker resource for Percona 5.7 on CentOS 7 latest. According to the release notes for 5.7 systemd handles what safe_mysqld did. I'm wondering if I can update the resource appropriately. Do I need to revert to 5.6 until this is supported? * mysql_start_0 on localhost 'not

Re: [ClusterLabs] Pacemaker on-fail standby recovery does not start DRBD slave resource

2016-03-30 Thread Sam Gardner
One other note: Manually standby-ing and unstandby-ing a node gives the behavior I want (eg, after the node is unstandby-ed, the DRBDSlave resource works). -- Sam Gardner Trustwave | SMART SECURITY ON DEMAND On 3/30/16, 11:46 AM, "Ken Gaillot" wrote: >On 03/30/2016 11:20

Re: [ClusterLabs] Pacemaker on-fail standby recovery does not start DRBD slave resource

2016-03-30 Thread Ken Gaillot
On 03/30/2016 11:20 AM, Sam Gardner wrote: > I have configured some network resources to automatically standby their node > if the system detects a failure on them. However, the DRBD slave that I have > configured does not automatically restart after the node is "unstandby-ed" > after the

[ClusterLabs] Pacemaker on-fail standby recovery does not start DRBD slave resource

2016-03-30 Thread Sam Gardner
I have configured some network resources to automatically standby their node if the system detects a failure on them. However, the DRBD slave that I have configured does not automatically restart after the node is "unstandby-ed" after the failure-timeout expires. Is there any way to make the

Re: [ClusterLabs] Pacemaker connectivity loss to ISP

2016-03-24 Thread S0ke
Ended up setting up 2 static routes and then using those as the ips designated in the ocf:ping host_list. Works like a charm. Original Message Subject: Pacemaker connectivity loss to ISP Local Time: March 24, 2016 12:12 PM UTC Time: March 24, 2016 5:12 PM From:

[ClusterLabs] Pacemaker connectivity loss to ISP

2016-03-24 Thread S0ke
So I'm trying to figure out the best method to accomplish this. We have a 2 node cluster. We have multiple WANs connected to 2 different ISPs. Generally everything is forced out eth0, eth1 is the backup. ISP1 ISP2 ISP2 ISP1 | | | | | | | | eth0 eth1 eth0 eth1 -- --- | HA1 |

Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Ferenc Wágner
Andrei Borzenkov writes: > On Wed, Mar 16, 2016 at 2:22 PM, Ferenc Wágner wrote: > >> Pacemaker explained says about this cluster option: >> >> Advanced Use Only: Should the cluster shoot unseen nodes? Not using >> the default is very unsafe! >> >> 1.

Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Andrei Borzenkov
On Wed, Mar 16, 2016 at 4:18 PM, Lars Ellenberg wrote: > On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote: >> >> And some more about fencing: >> >> >> >> 3. What's the difference in cluster behavior between >> >>- stonith-enabled=FALSE (9.3.2: how often

[ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Ferenc Wágner
Hi, Pacemaker explained says about this cluster option: Advanced Use Only: Should the cluster shoot unseen nodes? Not using the default is very unsafe! 1. What are those "unseen" nodes? And a possibly related question: 2. If I've got UNCLEAN (offline) nodes, is there a way to clean

Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Lars Ellenberg
On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote: > >> And some more about fencing: > >> > >> 3. What's the difference in cluster behavior between > >>- stonith-enabled=FALSE (9.3.2: how often will the stop operation be > >> retried?) > >>- having no configured STONITH

Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-18 Thread Ferenc Wágner
takeover happens without further protection. > > Oh! Actually it is not quite clear from documentation; documentation > does not explain what happens in case of stonith-enabled=false at all. Yes, this is a crucially important piece of information, which should be prominently announc

Re: [ClusterLabs] pacemaker remote configuration on ubuntu 14.04

2016-03-11 Thread Ken Gaillot
On 03/10/2016 11:36 PM, Сергей Филатов wrote: > This one is the right log Something in the cluster configuration and state (for example, an unsatisfied constraint) is preventing the cluster from starting the resource: Mar 10 04:00:53 [11785] controller-1.domain.compengine: info:

Re: [ClusterLabs] pacemaker remote configuration on ubuntu 14.04

2016-03-09 Thread Ken Gaillot
On 03/08/2016 11:38 PM, Сергей Филатов wrote: > ssh -p 3121 compute-1 > ssh_exchange_identification: read: Connection reset by peer > > That’s what I get in /var/log/pacemaker.log after restarting pacemaker_remote: > Mar 09 05:30:27 [28031] compute-1.domain.com lrmd: info: >

Re: [ClusterLabs] pacemaker remote configuration on ubuntu 14.04

2016-03-08 Thread Сергей Филатов
ssh -p 3121 compute-1 ssh_exchange_identification: read: Connection reset by peer That’s what I get in /var/log/pacemaker.log after restarting pacemaker_remote: Mar 09 05:30:27 [28031] compute-1.domain.com lrmd: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated Mar

Re: [ClusterLabs] pacemaker remote configuration on ubuntu 14.04

2016-03-08 Thread Ken Gaillot
On 03/07/2016 09:10 PM, Сергей Филатов wrote: > Thanks for an answer. Turned out the problem was not in ipv6. > Remote node is listening on 3121 port and its name is resolving fine. > Got authkey file at /etc/pacemaker on both remote and cluster nodes. > What can I check in addition? Is there any

Re: [ClusterLabs] pacemaker remote configuration on ubuntu 14.04

2016-03-07 Thread Сергей Филатов
Thanks for an answer. Turned out the problem was not in ipv6. Remote node is listening on 3121 port and its name is resolving fine. Got authkey file at /etc/pacemaker on both remote and cluster nodes. What can I check in addition? Is there any walkthrough for ubuntu? > On 07 Mar 2016, at 09:40,

Re: [ClusterLabs] Pacemaker issue lsb service

2016-03-07 Thread Kristoffer Grönlund
Thorsten Stremetzne writes: > Hello all, > > > I have built an HA setup for an OpenVPN server. > In my setup there are two hosts, running Ubuntu Linux, pacemaker & corosync. > Also both hosts have a virtual IP which migrates to the host that is active, > when the other

Re: [ClusterLabs] pacemaker remote configuration on ubuntu 14.04

2016-03-07 Thread Ken Gaillot
On 03/06/2016 07:43 PM, Сергей Филатов wrote: > Hi, > I’m trying to set up pacemaker_remote resource on ubuntu 14.04 > I followed "remote node walkthrough” guide > (http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#idm140473081667280 > >

Re: [ClusterLabs] Pacemaker issue lsb service

2016-03-05 Thread emmanuel segura
If you need help, the first thing that you need to do is show your cluster logs. 2016-03-05 15:17 GMT+01:00 Thorsten Stremetzne : > Hello all, > > I have built an HA setup for an OpenVPN server. > In my setup there are two hosts, running Ubuntu Linux, pacemaker & >

[ClusterLabs] Pacemaker issue lsb service

2016-03-05 Thread Thorsten Stremetzne
Hello all, I have built an HA setup for an OpenVPN server. In my setup there are two hosts, running Ubuntu Linux, pacemaker & corosync. Also both hosts have a virtual IP which migrates to the host that is active, when the other fails. This works well, but I also configured a primitive for

Re: [ClusterLabs] Pacemaker for 389 directory server with multi-master replication

2016-02-20 Thread Bernie Jones
pcs constraint colocation add dirsrv-clone with LDAP_Cluster_IP INFINITY Regards, Bernie From: Andreas Kurz [mailto:andreas.k...@gmail.com] Sent: 20 February 2016 14:02 To: Cluster Labs - All topics related to open-source clustering welcomed Subject: Re: [ClusterLabs] Pacemaker for 389 dire

[ClusterLabs] Pacemaker for 389 directory server with multi-master replication

2016-02-20 Thread Bernie Jones
Hi all, I'm new to this list and fairly new to pacemaker and have just spent a couple of days trying unsuccessfully to solve a configuration challenge. I have seen a relevant post on this list from around four years ago but it doesn't seem to have helped - here's what I want to do. I

[ClusterLabs] Pacemaker issue when ethernet interface is pulled down

2016-02-14 Thread Debabrata Pani
Hi, We ran into some problems when we pull down the ethernet interface using “ifconfig eth0 down” Our cluster has the following configurations and resources * Two network interfaces : eth0 and lo(cal) * 3 nodes with one node put in maintenance mode * No-quorum-policy=stop *

Re: [ClusterLabs] Pacemaker issue when ethernet interface is pulled down

2016-02-14 Thread emmanuel segura
Use fencing, and after you have configured fencing, use iptables to test your cluster; with iptables you can block ports 5404 and 5405. 2016-02-14 14:09 GMT+01:00 Debabrata Pani : > Hi, > We ran into some problems when we pull down the ethernet interface

Re: [ClusterLabs] Pacemaker (remote) component relations

2016-02-08 Thread Ken Gaillot
On 02/08/2016 07:55 AM, Ferenc Wágner wrote: > Hi, > > I'm looking for information about the component interdependencies, > because I'd like to split the Pacemaker packages in Debian properly. > The current idea is to create two daemon packages, pacemaker and > pacemaker-remote, which exclude

Re: [ClusterLabs] Pacemaker 1.1.14 released

2016-01-24 Thread Andrew Beekhof
cluster resource manager, version 1.1.14. The source code is available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.14 > > This version introduces some valuable new features: > > * Resources will now start as soon as their state has been confirmed

Re: [ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Ken Gaillot
On 01/19/2016 10:30 AM, Kostiantyn Ponomarenko wrote: > The resource that wasn't running, but was reported as running, is > "adminServer". > > Here are a brief chronological description: > > [Jan 19 23:42:16] The first time Pacemaker triggers its monitor function at > line #1107. (those lines

Re: [ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Ken Gaillot
On 01/19/2016 11:02 AM, Kostiantyn Ponomarenko wrote: > Just in case, this is the monitor function from the resource agent: > ra_monitor() { > # ocf_log info "$RA: [monitor]" > systemctl status ${service} > rc=$? > if [ "$rc" -eq "0" ]; then > return $OCF_SUCCESS > fi >

Re: [ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Ken Gaillot
On 01/19/2016 12:20 PM, Kostiantyn Ponomarenko wrote: > I've put the wrong entry from "journalctl --since="2016-01-19" > --until="2016-01-20"". > The correct one is: > > Jan 19 23:42:24 A2-2U12-302-LS ntpd[2204]: 0.0.0.0 c61c 0c clock_step > -43194.111405 s > Jan 19 11:42:29 A2-2U12-302-LS

[ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Kostiantyn Ponomarenko
One of the resources in my cluster is not actually running, but "crm_mon" shows it with the "Started" status. Its resource agent's monitor function returns "$OCF_NOT_RUNNING", but Pacemaker doesn't react to this at all - crm_mon shows the resource as Started. I couldn't find an explanation for this

Re: [ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Bogdan Dobrelya
On 19.01.2016 13:49, Kostiantyn Ponomarenko wrote: > One of resources in my cluster is not actually running, but "crm_mon" > shows it with the "Started" status. > Its resource agent's monitor function returns "$OCF_NOT_RUNNING", but > Pacemaker doesn't react on this anyhow - crm_mon show the

Re: [ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Bogdan Dobrelya
On 19.01.2016 16:13, Ken Gaillot wrote: > On 01/19/2016 06:49 AM, Kostiantyn Ponomarenko wrote: >> One of resources in my cluster is not actually running, but "crm_mon" shows >> it with the "Started" status. >> Its resource agent's monitor function returns "$OCF_NOT_RUNNING", but >> Pacemaker

Re: [ClusterLabs] Pacemaker shows false status of a resource and doesn't react on OCF_NOT_RUNNING rc.

2016-01-19 Thread Kostiantyn Ponomarenko
Just in case, this is the monitor function from the resource agent: ra_monitor() { # ocf_log info "$RA: [monitor]" systemctl status ${service} rc=$? if [ "$rc" -eq "0" ]; then return $OCF_SUCCESS fi ocf_log warn "$RA: [monitor] : got rc=$rc" return

Re: [ClusterLabs] Pacemaker license

2016-01-19 Thread Jan Pokorný
On 12/01/16 11:27 +1100, Andrew Beekhof wrote: >> On 6 Oct 2015, at 9:39 AM, santosh_bidara...@dell.com wrote: >> As per the given link http://clusterlabs.org/wiki/License, it is >> mentioned that “Pacemaker programs are licensed under the GPLv2+ >> (version 2 or later of the GPL) and its headers

Re: [ClusterLabs] Pacemaker 1.1.14 released

2016-01-14 Thread Digimer
Congrats!! On 14/01/16 04:49 PM, Ken Gaillot wrote: > ClusterLabs is proud to announce the latest release of the Pacemaker > cluster resource manager, version 1.1.14. The source code is available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.14 >

[ClusterLabs] Pacemaker 1.1.14 released

2016-01-14 Thread Ken Gaillot
ClusterLabs is proud to announce the latest release of the Pacemaker cluster resource manager, version 1.1.14. The source code is available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.14 This version introduces some valuable new features: * Resources will now start

Re: [ClusterLabs] Pacemaker license

2016-01-11 Thread Andrew Beekhof
> On 6 Oct 2015, at 9:39 AM, santosh_bidara...@dell.com wrote: > > Dell - Internal Use - Confidential > > Hello Pacemaker Admins, > > We have a query regarding licensing for pacemaker header files > > As per the given link http://clusterlabs.org/wiki/License, it is mentioned > that

Re: [ClusterLabs] Pacemaker documentation license clarification

2016-01-04 Thread Ferenc Wagner
se on the plate. Well, it's not pretty to say the least, but I don't think I have to touch that part. > You're welcome to submit a pull request to change it to use the local > brand directory. Done, it's part of https://github.com/ClusterLabs/pacemaker/pull/876. That pull request contains th

Re: [ClusterLabs] [Pacemaker] Beginner | Resources stuck unloading

2015-12-18 Thread Tyler Hampton
>But generally, stonith-enabled=false can lead to error recovery problems and make trouble harder to diagnose. If you can take the time to get stonith working, it should at least stop your first problem from causing further problems. Yeah, I feel like a lot of my post-failover cluster state is

Re: [ClusterLabs] [Pacemaker] Beginner | Resources stuck unloading

2015-12-16 Thread Ken Gaillot
On 12/14/2015 12:18 AM, Tyler Hampton wrote: > Hi! > > I'm currently trying to semi-follow Sebastien Han's blog post on > implementing HA with Ceph rbd volumes and I am hitting some walls. The > difference between what I'm trying to do and the blog post is that I'm > trying to implement an

Re: [ClusterLabs] Pacemaker documentation license clarification

2015-12-14 Thread Ken Gaillot
On 12/13/2015 06:56 PM, Ferenc Wagner wrote: > Ken Gaillot writes: > >> On 12/11/2015 10:07 AM, Ferenc Wagner wrote: >> >>> [...] the "Legal Notice" >>> section of the generated Publican documentation (for example >>> Pacemaker_Explained/desktop/en-US/index.html) says that

[ClusterLabs] Pacemaker 1.1.14 - Release Candidate 3

2015-12-14 Thread Ken Gaillot
The source code for the latest Pacemaker release candidate is available at https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.14-rc3 This is a bugfix release: * When deleting an attribute from a fence device, the entire device would sometimes be deleted. * 0f9a4eb0 introduced

Re: [ClusterLabs] Pacemaker documentation license clarification

2015-12-13 Thread Ferenc Wagner
Ken Gaillot writes: > On 12/11/2015 10:07 AM, Ferenc Wagner wrote: > >> [...] the "Legal Notice" >> section of the generated Publican documentation (for example >> Pacemaker_Explained/desktop/en-US/index.html) says that the material may >> only be distributed under

[ClusterLabs] Pacemaker documentation license clarification

2015-12-11 Thread Ferenc Wagner
Hi, We're packaging Pacemaker for Debian and this requires a clear picture of all licenses relevant to the package. The software part is clearly under GPL-2+ and LGPL-2+, which is fine. However, the "Legal Notice" section of the generated Publican documentation (for example

Re: [ClusterLabs] Pacemaker documentation license clarification

2015-12-11 Thread Ken Gaillot
On 12/11/2015 10:07 AM, Ferenc Wagner wrote: > Hi, > > We're packaging Pacemaker for Debian and this requires a clear picture > of all licenses relevant to the package. The software part is clearly > under GPL-2+ and LGPL-2+, which is fine. However, the "Legal Notice" > section of the generated

[ClusterLabs] Pacemaker 1.1.14 - Release Candidate (try it out!)

2015-12-08 Thread Ken Gaillot
The release cycle for Pacemaker 1.1.14 has begun! The source code for a release candidate is available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.14-rc2 This release candidate introduces some valuable new features: * Resources will now start as soon as their state

Re: [ClusterLabs] Pacemaker crash and fencing failure

2015-11-30 Thread Ken Gaillot
On 11/20/2015 06:38 PM, Brian Campbell wrote: > I've been trying to debug and do a root cause analysis for a cascading > series of failures that a customer hit a couple of days ago, that > caused their filesystem to be unavailable for a couple of hours. > > The original failure was in our own

[ClusterLabs] Pacemaker crash and fencing failure

2015-11-20 Thread Brian Campbell
I've been trying to debug and do a root cause analysis for a cascading series of failures that a customer hit a couple of days ago, that caused their filesystem to be unavailable for a couple of hours. The original failure was in our own distributed filesystem backend, a fork of LizardFS, which

Re: [ClusterLabs] [Pacemaker] large cluster - failure recovery

2015-11-19 Thread Cédric Dufour - Idiap Research Institute
[coming over from the old mailing list pacema...@oss.clusterlabs.org; sorry for any thread discrepancy] Hello, We've also set up a fairly large cluster - 24 nodes / 348 resources (pacemaker 1.1.12, corosync 1.4.7) - and pacemaker 1.1.12 is definitely the minimum version you'll want, thanks to

Re: [ClusterLabs] Pacemaker build error

2015-11-04 Thread Ken Gaillot
On 11/03/2015 11:10 PM, Jim Van Oosten wrote: > > > I am getting a compile error when building Pacemaker on Linux version > 2.6.32-431.el6.x86_64. > > The build commands: > > git clone git://github.com/ClusterLabs/pacemaker.git > cd pacemaker > ./autogen.sh && ./configure --prefix=/usr

Re: [ClusterLabs] Pacemaker build error

2015-11-04 Thread Ken Gaillot
On 11/04/2015 09:31 AM, Ken Gaillot wrote: > On 11/03/2015 11:10 PM, Jim Van Oosten wrote: >> >> >> I am getting a compile error when building Pacemaker on Linux version >> 2.6.32-431.el6.x86_64. >> >> The build commands: >> >> git clone git://github.com/ClusterLabs/pacemaker.git >> cd pacemaker

[ClusterLabs] Pacemaker build error

2015-11-03 Thread Jim Van Oosten
I am getting a compile error when building Pacemaker on Linux version 2.6.32-431.el6.x86_64. The build commands: git clone git://github.com/ClusterLabs/pacemaker.git cd pacemaker ./autogen.sh && ./configure --prefix=/usr --sysconfdir=/etc make make install The compile error: Making install

Re: [ClusterLabs] Pacemaker process 10-15% CPU

2015-11-02 Thread Ken Gaillot
sort of loop. > Thanks, > Karthik. > -Original Message- > From: Ken Gaillot [mailto:kgail...@redhat.com] > Sent: 31 October 2015 03:33 > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Pacemaker process 10-15% CPU > > On 10/30/2015 05:14 AM, Karthikeyan

Re: [ClusterLabs] Pacemaker process 10-15% CPU

2015-10-30 Thread Ken Gaillot
On 10/30/2015 05:14 AM, Karthikeyan Ramasamy wrote: > Hello, > We are using Pacemaker to manage the services that run on a node, as part > of a service management framework, and manage the nodes running the services > as a cluster. One service will be running as 1+1 and other services will be

[ClusterLabs] Pacemaker process 10-15% CPU

2015-10-30 Thread Karthikeyan Ramasamy
Hello, We are using Pacemaker to manage the services that run on a node, as part of a service management framework, and manage the nodes running the services as a cluster. One service will be running as 1+1 and other services will be N+1. During our testing, we see that the pacemaker

[ClusterLabs] Pacemaker: Custom Health Checks possible?

2015-09-23 Thread Sebish
Dear all on clusterlabs mailing list, there remains a question, even google could not deliver the answer for: * *Does Pacemaker + Heartbeat 2 / Corosync (/openAIS) provide the possibility to use custom health checks?* * *Which part of the constellation must it be added to?* Especially I

Re: [ClusterLabs] Pacemaker: Custom Health Checks possible?

2015-09-23 Thread Kai Dupke
On 09/23/2015 02:10 PM, Sebish wrote: > * *Does Pacemaker + Heartbeat 2 / Corosync (/openAIS) provide the >possibility to use custom health checks?* Yes: write a resource agent for this. However, there is a possibility to use Nagios/Icinga probes, which are available for a wide range of

Re: [ClusterLabs] Pacemaker: Custom Health Checks possible?

2015-09-23 Thread Michael Schwartzkopff
On Wednesday, 23 September 2015, 14:10:52, Sebish wrote: > Dear all on clusterlabs mailing list, > > there remains a question, even google could not deliver the answer for: > > * *Does Pacemaker + Heartbeat 2 / Corosync (/openAIS) provide the > possibility to use custom health checks?*

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Jason Gress
clusterlabs.org <mailto:users@clusterlabs.org>> >> Date: Friday, September 18, 2015 at 3:03 PM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> <users@clusterlabs.org <mailto:users@clusterlabs.org>> >> Subject: Re: [Clus

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Jason Gress
<jgr...@accertify.com <mailto:jgr...@accertify.com>> >>>> Reply-To: Cluster Labs - All topics related to open-source clustering >>>> welcomed <users@clusterlabs.org <mailto:users@clusterlabs.org>> >>>> Date: Friday, September 18, 2015 at

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Digimer
On 21/09/15 11:23 AM, Jason Gress wrote: > Yeah, I had problems, which I am thinking might be firewall related. In a > previous place of employment I had ipmi working great (but with > heartbeat), so I do have some experience with IPMI STONITH. If you can query the IPMI sensor data, you should

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Digimer
All topics related to open-source clustering >>> welcomed <users@clusterlabs.org <mailto:users@clusterlabs.org>> >>> Date: Friday, September 18, 2015 at 3:03 PM >>> To: Cluster Labs - All topics related to open-source clustering welcomed >>>

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Jason Gress
@clusterlabs.org>> Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped) The only difference in the DRBD resource between yours and mine that I can see is the monitoring parameters (mine works nicely, but is Centos 6). Here's mine: Master: ms_

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Luke Pascoe
s - All topics related to open-source clustering welcomed < > users@clusterlabs.org> > Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary > node to Slave (always Stopped) > > I may be wrong, but shouldn't "clone-node-max" be 2 on the ms_drbd_vmfs >

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Jason Gress
<mailto:users@clusterlabs.org>> Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped) I may be wrong, but shouldn't "clone-node-max" be 2 on the ms_drbd_vmfs resource? Luke Pascoe [http://osnz.co.nz/logo_blue_80.png] E l...@osnz.

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Luke Pascoe
ursday, September 17, 2015 at 6:36 PM > To: Cluster Labs - All topics related to open-source clustering welcomed < > users@clusterlabs.org> > Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary > node to Slave (always Stopped) > > I can't say whether or

Re: [ClusterLabs] Pacemaker tries to demote resource that isn't running and returns OCF_FAILED_MASTER

2015-08-20 Thread Andrei Borzenkov
21.08.2015 00:35, Brian Campbell wrote: I have a master/slave resource (with a custom resource agent) which, if it uncleanly shut down, will return OCF_FAILED_MASTER on the next monitor operation. This seems to be what

[ClusterLabs] Pacemaker tries to demote resource that isn't running and returns OCF_FAILED_MASTER

2015-08-20 Thread Brian Campbell
I have a master/slave resource (with a custom resource agent) which, if it uncleanly shut down, will return OCF_FAILED_MASTER on the next monitor operation. This seems to be what http://www.linux-ha.org/doc/dev-guides/_literal_ocf_failed_master_literal_9.html suggests that exit code should be used

[ClusterLabs] Pacemaker serializes start operations

2015-08-11 Thread Nekrasov, Alexander
Hello, I have a dependency tree starting with PgsqlShared as root, and two sub trees: 1. PgsqlShared-apache-apl-c4fastvpa 2. PgsqlShared-iproute With the version on Pacemaker that came with SLES11, the starting was done in parallel, such as 1. PgsqlShared 2. apache and

[ClusterLabs] pacemaker doesn't correctly handle a resource after time/date change

2015-07-23 Thread Kostiantyn Ponomarenko
Hi, If you do: # date --set=1990-01-01 01:00:00 when only one node is present in the cluster and while the cluster is working, and then stop a resource (any resource), the cluster fails the resource once, shows it as Started, but the resource actually is still stopped. Is it the expected

Re: [ClusterLabs] Pacemaker failover failure

2015-07-14 Thread Digimer
As said before, fencing. On 01/07/15 06:54 AM, alex austin wrote: so did another test: two nodes: node1 and node2 Case: node1 is the active node node2: is passive if I killall -9 pacemakerd corosync on node 1 the services do not fail over to node2, but if I start corosync and pacemaker

Re: [ClusterLabs] Pacemaker failover failure

2015-07-02 Thread alex austin
. If it’s disabled it might not be able to resolve the event Alex *From:* alex austin [mailto:alexixa...@gmail.com] *Sent:* Wednesday, July 01, 2015 9:51 AM *To:* Users@clusterlabs.org *Subject:* Re: [ClusterLabs] Pacemaker failover failure So I noticed that if I kill

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread alex austin
so did another test: two nodes: node1 and node2 Case: node1 is the active node node2: is passive if I killall -9 pacemakerd corosync on node 1 the services do not fail over to node2, but if I start corosync and pacemaker on node1 then it fails over to node 2. Where am I going wrong? Alex On

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread Ken Gaillot
to resolve the event Alex *From:* alex austin [mailto:alexixa...@gmail.com] *Sent:* Wednesday, July 01, 2015 9:51 AM *To:* Users@clusterlabs.org *Subject:* Re: [ClusterLabs] Pacemaker failover failure So I noticed that if I kill redis on one node, it starts on the other, no problem

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread alex austin
So I noticed that if I kill redis on one node, it starts on the other, no problem, but if I actually kill pacemaker itself on one node, the other doesn't sense it so it doesn't fail over. On Wed, Jul 1, 2015 at 12:42 PM, alex austin alexixa...@gmail.com wrote: Hi all, I have configured a

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread alex austin
*To:* Users@clusterlabs.org *Subject:* Re: [ClusterLabs] Pacemaker failover failure So I noticed that if I kill redis on one node, it starts on the other, no problem, but if I actually kill pacemaker itself on one node, the other doesn't sense it so it doesn't fail over

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread Ken Gaillot
STONITH on the peer. If it’s disabled it might not be able to resolve the event Alex *From:* alex austin [mailto:alexixa...@gmail.com] *Sent:* Wednesday, July 01, 2015 9:51 AM *To:* Users@clusterlabs.org *Subject:* Re: [ClusterLabs] Pacemaker failover failure So I noticed that if I
