Re: [Pacemaker] Question about OCF RA reload behaviour

2014-10-29 Thread Felix Zachlod
On 29.10.2014 01:39, Andrew Beekhof wrote: I just have one question about how a resource agent should behave when reload is invoked while the resource is currently stopped. Should the resource be started or remain stopped? I don't think that is defined. Certainly pacemaker won't trigger that case

[Pacemaker] Split brain and STONITH behavior (VMware fencing)

2014-10-29 Thread Ariel S
Hello, I'm trying to understand how this STONITH works. I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each has 2 NICs assigned: eth0 for heartbeat, while eth1 is used for everything else. This is my testing configuration: node $id=168428034 moon1a node $id=168428035 moon1b

Re: [Pacemaker] Split brain and STONITH behavior (VMware fencing)

2014-10-29 Thread Sven Moeller
You can try to force a split brain by shutting down the heartbeat NICs and keeping corosync running on both nodes. Regards, Sven. Ariel S ariel_bis2...@yahoo.co.id wrote: Hello, I'm trying to understand how this STONITH works. I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each

Re: [Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

2014-10-29 Thread kamal kishi
Thanks for the info, I was trying to configure IPMI on the servers. Can you please suggest a procedure for enabling and configuring IPMI (which you might have referred to)? The sites I came across are hard to understand. The servers I'm using are Dell PowerEdge R320. On Tue, Oct 28,

Re: [Pacemaker] Split brain and STONITH behavior (VMware fencing)

2014-10-29 Thread Andrei Borzenkov
On Wed, Oct 29, 2014 at 10:46 AM, Ariel S ariel_bis2...@yahoo.co.id wrote: Hello, I'm trying to understand how this STONITH works. I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each has 2 NICs assigned: eth0 for heartbeat, while eth1 is used for everything else. This is my

Re: [Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

2014-10-29 Thread Arjun Pandey
Googling on fencing agent IPMI helps :) This link might be useful. https://fedorahosted.org/cluster/wiki/IPMI_FencingConfig Regards Arjun On Wed, Oct 29, 2014 at 2:11 PM, kamal kishi kamal.ki...@gmail.com wrote: Thanks for the info, was trying to configure IPMI in the servers. Can you

[Pacemaker] Unusual crmd log

2014-10-29 Thread Arjun Pandey
Hi, I have a 2 node active-standby cluster setup. Pacemaker packages that I have on CentOS 6.5: pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64 pacemaker-cluster-libs-1.1.10-14.el6_5.3.x86_64 I saw the following logs a few days
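When several Pacemaker packages are listed like this, it can help to confirm they all come from the same build before reading the logs. The following is only a sketch in Python: it assumes the conventional name-version-release.arch layout of rpm package names, and `split_nvra` is a hypothetical helper, not part of Pacemaker or rpm.

```python
def split_nvra(package):
    """Split 'name-version-release.arch' into its four parts.

    Hypothetical helper; assumes the conventional rpm naming layout.
    """
    # The architecture is everything after the last dot.
    rest, arch = package.rsplit(".", 1)
    # Version and release are the last two dash-separated fields;
    # the name itself may contain dashes.
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


# Package names as quoted in the message above.
packages = [
    "pacemaker-1.1.10-14.el6_5.3.x86_64",
    "pacemaker-libs-1.1.10-14.el6_5.3.x86_64",
    "pacemaker-cli-1.1.10-14.el6_5.3.x86_64",
    "pacemaker-cluster-libs-1.1.10-14.el6_5.3.x86_64",
]

# Collect the distinct (version, release, arch) combinations.
builds = {split_nvra(p)[1:] for p in packages}
print("consistent" if len(builds) == 1 else "mixed", sorted(builds))
```

A single entry in `builds` means every package shares the same version, release and architecture.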

Re: [Pacemaker] Split brain and STONITH behavior (VMware fencing)

2014-10-29 Thread Ariel S
Thank you for the reply, I tried severing the heartbeat NIC and set `no-quorum-policy` to `ignore` (I forgot that one) like this: # crm configure property no-quorum-policy=ignore and it works nicely. Thank you, Ariel On 10/29/2014 03:27 PM, Sven Moeller wrote: You can try to force

Re: [Pacemaker] Antwort: Re: fencing with multiple node cluster

2014-10-29 Thread Dejan Muhamedagic
Hi, On Tue, Oct 28, 2014 at 05:32:09PM +0100, philipp.achmuel...@arz.at wrote: hi, From: Dejan Muhamedagic deja...@fastmail.fm To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Date: 28.10.2014 16:45 Subject: Re: [Pacemaker] fencing with

Re: [Pacemaker] [Linux-HA] Announcing crmsh release 2.1.1

2014-10-29 Thread Dejan Muhamedagic
Hi Kristoffer, On Wed, Oct 29, 2014 at 12:33:55AM +0100, Kristoffer Grönlund wrote: Today we are proud to announce the release of `crmsh` version 2.1.1! This version primarily fixes all known issues found since the release of `crmsh` 2.1 in June. We recommend that all users of crmsh upgrade

[Pacemaker] pacemaker counts probe failure twice

2014-10-29 Thread Andrei Borzenkov
I observe strange behavior that I cannot understand. Pacemaker 1.1.11-3ca8c3b. There is master/slave resource running. Maintenance-mode was set, pacemaker restarted, maintenance-mode reset. This specific RA returns Slave instead of Master for the first probe. But what happens later is rather

[Pacemaker] No demote after pre demote.

2014-10-29 Thread Arjun Pandey
Including the package details pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64 pacemaker-cluster-libs-1.1.10-14.el6_5.3.x86_64 Regards Arjun -- Forwarded message -- From: Arjun Pandey apandepub...@gmail.com Date:

Re: [Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

2014-10-29 Thread Digimer
On 29/10/14 04:41 AM, kamal kishi wrote: Thanks for the info, I was trying to configure IPMI on the servers. Can you please suggest a procedure for enabling and configuring IPMI (which you might have referred to)? The sites I came across are hard to understand. The servers I'm

[Pacemaker] mysql resource agent

2014-10-29 Thread Keith Ouellette
I am running two servers (Ubuntu 14.04 LTS) as an HA setup for an application that uses mysql. I am using DRBD to replicate the data between the servers and am able to manually start mysql on each server. The DRBD replication is also working fine and Pacemaker (version 1.1.10) seems to handle

Re: [Pacemaker] mysql resource agent

2014-10-29 Thread Andrew Beekhof
On 30 Oct 2014, at 5:57 am, Keith Ouellette kei...@fibermountain.com wrote: I am running two servers (Ubuntu 14.04 LTS) as an HA setup for an application that uses mysql. I am using DRBD to replicate the data between the servers and am able to manually start mysql on each server. The DRBD

Re: [Pacemaker] Unusual crmd log

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 8:02 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi, I have a 2 node active-standby cluster setup. Pacemaker packages that I have on CentOS 6.5: pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64

Re: [Pacemaker] Split brain and STONITH behavior (VMware fencing)

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 7:48 pm, Andrei Borzenkov arvidj...@gmail.com wrote: On Wed, Oct 29, 2014 at 10:46 AM, Ariel S ariel_bis2...@yahoo.co.id wrote: Hello, I'm trying to understand how this STONITH works. I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each has 2 NICs

Re: [Pacemaker] pacemaker counts probe failure twice

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 10:01 pm, Andrei Borzenkov arvidj...@gmail.com wrote: I observe strange behavior that I cannot understand. Pacemaker 1.1.11-3ca8c3b. There is master/slave resource running. Maintenance-mode was set, pacemaker restarted, maintenance-mode reset. This specific RA returns

Re: [Pacemaker] No demote after pre demote.

2014-10-29 Thread Andrew Beekhof
I can't do much without logs and PE files (i.e. crm_report) On 29 Oct 2014, at 11:32 pm, Arjun Pandey apandepub...@gmail.com wrote: Including the package details pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64

[Pacemaker] Help understanding what went wrong

2014-10-29 Thread Alex Samad - Yieldbroker
Hi rpm -qa | grep pace pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cluster-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64 pacemaker-1.1.10-14.el6_5.3.x86_64 centos 6.5 I have a 2 node cluster pcs config Cluster Name: ybrp Corosync Nodes: Pacemaker Nodes: wwwrp1

Re: [Pacemaker] Help understanding what went wrong

2014-10-29 Thread Alex Samad - Yieldbroker
Hi Seems like I had the same problem on my prod cluster Oct 30 01:21:53 prodrp1 lrmd[2536]: warning: child_timeout_callback: ybrpstat_monitor_5000 process (PID 17918) timed out Oct 30 01:21:53 prodrp1 lrmd[2536]: warning: operation_finished: ybrpstat_monitor_5000:17918 - timed out after
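To see how often a monitor operation times out across a log file, the lrmd warnings can be pulled out with a short script. This is a minimal sketch in Python, assuming log lines shaped like the `child_timeout_callback` line quoted above; the regular expression is inferred from that single line, and `timed_out_ops` is a hypothetical helper name. Real syslog formats vary, so the pattern may need adjusting.

```python
import re

# Pattern inferred from the one lrmd line quoted above (an assumption).
TIMEOUT_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+ [\d:]{8}) (?P<host>\S+) lrmd\[\d+\]:\s+warning: "
    r"child_timeout_callback: (?P<op>\S+) process \(PID (?P<pid>\d+)\) timed out"
)


def timed_out_ops(lines):
    """Yield (timestamp, host, operation, pid) for each lrmd timeout line."""
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            yield (
                match.group("when"),
                match.group("host"),
                match.group("op"),
                int(match.group("pid")),
            )


sample = [
    "Oct 30 01:21:53 prodrp1 lrmd[2536]: warning: child_timeout_callback: "
    "ybrpstat_monitor_5000 process (PID 17918) timed out",
]

for when, host, op, pid in timed_out_ops(sample):
    print(when, host, op, pid)
```

The same generator can be fed an open log file object instead of the `sample` list, since it only iterates over lines.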

Re: [Pacemaker] Help understanding what went wrong

2014-10-29 Thread Alex Samad - Yieldbroker
Another cluster, another set of logs. This is more interesting: all the clusters are set up the same. On this cluster none of the nodes failed, but I still got Oct 30 01:20:40 uatrp2 crmd[2595]: error: process_lrm_event: LRM operation ybrpstat_monitor_5000 (17) Timed Out (timeout=2ms) Oct

Re: [Pacemaker] Help understanding what went wrong

2014-10-29 Thread Andrew Beekhof
On 30 Oct 2014, at 11:42 am, Alex Samad - Yieldbroker alex.sa...@yieldbroker.com wrote: Am I reading this right. The reason the resource didn't fail over (ip address) was because there was no ybrpstat running on wwwrp2 and the reason for that was the monitor action failed/timed out

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 1:01 pm, Sihan Goi gois...@gmail.com wrote: Hi, I've never used crm_report before. I just read the man file and generated a tarball from 1-2 hours before I reconfigured all the DRBD related resources. I've put the tarball here -

Re: [Pacemaker] Help understanding what went wrong

2014-10-29 Thread Alex Samad - Yieldbroker
Thanks, more investigating to do -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Thursday, 30 October 2014 1:28 PM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Help understanding what went wrong On 30 Oct 2014, at 11:42 am,

[Pacemaker] ocf-tester

2014-10-29 Thread Alex Samad - Yieldbroker
Hi, where can I find this? I looked here https://github.com/ClusterLabs/pacemaker/tree/1.1 and it's not in the resource-agents rpm. A

Re: [Pacemaker] ocf-tester

2014-10-29 Thread Andrew Beekhof
On 30 Oct 2014, at 2:46 pm, Alex Samad - Yieldbroker alex.sa...@yieldbroker.com wrote: Hi, where can I find this? I looked here https://github.com/ClusterLabs/pacemaker/tree/1.1 and it's not in the resource-agents rpm https://github.com/ClusterLabs/resource-agents/tree/master/tools A

Re: [Pacemaker] pacemaker counts probe failure twice

2014-10-29 Thread Andrei Borzenkov
On Thu, 30 Oct 2014 08:32:24 +1100 Andrew Beekhof and...@beekhof.net wrote: On 29 Oct 2014, at 10:01 pm, Andrei Borzenkov arvidj...@gmail.com wrote: I observe strange behavior that I cannot understand. Pacemaker 1.1.11-3ca8c3b. There is master/slave resource running.

Re: [Pacemaker] pacemaker counts probe failure twice

2014-10-29 Thread Andrew Beekhof
On 30 Oct 2014, at 2:51 pm, Andrei Borzenkov arvidj...@gmail.com wrote: On Thu, 30 Oct 2014 08:32:24 +1100 Andrew Beekhof and...@beekhof.net wrote: On 29 Oct 2014, at 10:01 pm, Andrei Borzenkov arvidj...@gmail.com wrote: I observe strange behavior that I cannot understand. Pacemaker