Re: [Pacemaker] Pacemaker Corosync Issue Exiting Automatically

2016-01-06 Thread Andrew Beekhof
"crmd:error: plugin_dispatch:" You’re using the plugin on a RHEL 6-based install. Don’t do that. You’ll need to use Pacemaker with CMAN instead. This happens automatically if you use 'pcs cluster create ...' > On 14 Dec 2015, at 6:46 PM, Sahil Aggarwal wrote: > >

Re: [Pacemaker] Cluster node getting stopped from other node(resending mail)

2015-08-03 Thread Andrew Beekhof
We need a crm_report archive to be able to comment on this sort of thing. A handful of logs from one of the nodes isn’t anywhere near enough. On 29 Jun 2015, at 4:42 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi I am running a 2 node cluster with this config on centos 6.5/6.6

Re: [Pacemaker] How can I wait for a device to be ready?

2015-05-19 Thread Andrew Beekhof
On 15 May 2015, at 3:55 am, Carlos Xavier cbas...@connection.com.br wrote: Hi. I am doing some tests with OCFS2 running on an AoE shared disk. The tests are running on openSUSE 12.3 with the following packages: ocfs2-tools-o2cb-1.8.2-4.8.1.x86_64 ocfs2-tools-1.8.2-4.8.1.x86_64

Re: [Pacemaker] [ClusterLabs] Unexpected behaviour of PEngine Recheck Timer while in maintenance mode

2015-05-17 Thread Andrew Beekhof
On 29 Apr 2015, at 5:40 am, Rolf Weber rolf.we...@asamnet.de wrote: Hi! On 07:27 Mon 27 Apr , Andrew Beekhof wrote: What exactly were you doing at this point? resizing a filesystem. fs was unexported and unmounted. as I understand maintenance mode this should have worked

Re: [Pacemaker] Centos 70-71 update fails with Application of an update diff failed (rc=-206)

2015-04-27 Thread Andrew Beekhof
On 27 Apr 2015, at 6:35 pm, Patrick Zwahlen p...@navixia.com wrote: Apart from those scary logs, does anything actually break? What you're seeing is probably just ignorable noise from the older version - I would expect the underlying cib to resolve things correctly. Thanks Andrew for the

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-04-26 Thread Andrew Beekhof
On 17 Apr 2015, at 4:19 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 17.04.2015 00:48, Andrew Beekhof wrote: On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub

Re: [Pacemaker] Add new node in the case of unicast with corosync v1.4.2

2015-04-26 Thread Andrew Beekhof
On 26 Apr 2015, at 5:58 am, Kadlecsik József kadlecsik.joz...@wigner.mta.hu wrote: Hi, We have a unicast setup with corosync v1.4.2 which is unable to reload its configuration file, and want to add a new node. Our plan is to enable maintenance mode in pacemaker and restart

Re: [Pacemaker] [ClusterLabs] Unexpected behaviour of PEngine Recheck Timer while in maintenance mode

2015-04-26 Thread Andrew Beekhof
On 21 Apr 2015, at 6:24 am, Rolf Weber rolf.we...@asamnet.de wrote: Hi all! I encountered some strange behaviour of the PEngine Recheck Timer while in maintenance mode ending in a reboot. Setup is a 2 node cluster, 1 resource group consisting of several drbds and filesystems that are

Re: [Pacemaker] stonith

2015-04-26 Thread Andrew Beekhof
On 19 Apr 2015, at 11:37 pm, Andrei Borzenkov arvidj...@gmail.com wrote: On Sun, 19 Apr 2015 14:23:27 +0200, Andreas Kurz andreas.k...@gmail.com wrote: On 2015-04-17 12:36, Thomas Manninger wrote: Hi list, I have a pacemaker/corosync2 setup with 4 nodes, stonith configured over ipmi

Re: [Pacemaker] Centos 70-71 update fails with Application of an update diff failed (rc=-206)

2015-04-26 Thread Andrew Beekhof
On 26 Apr 2015, at 7:27 pm, Patrick Zwahlen p...@navixia.com wrote: map your ip cluster to hostname using /etc/hosts and try to use an example like this http://clusterlabs.org/doc/fr/Pacemaker/1.1-pcs/html/Clusters_from_Scratch/_sample_corosync_configuration.html I've added name: fqdn

Re: [Pacemaker] coronosyc 1.2.1 with pacemaker and openais is suse11sp1

2015-04-19 Thread Andrew Beekhof
On 15 Apr 2015, at 10:06 pm, Timi alia...@gmail.com wrote: Hi guys, we have a cluster setup with: corosync 1.2.1 with pacemaker and openais on suse11sp1, on two nodes connected via direct cable for heartbeat. We checked the connection and it's OK. We are seeing this in the logs:

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-04-16 Thread Andrew Beekhof
On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-04-13 Thread Andrew Beekhof
On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub

Re: [Pacemaker] One node thinks everyone is online, but the other node doesn't think so

2015-03-29 Thread Andrew Beekhof
On 11 Mar 2015, at 2:21 am, Dmitry Koterov dmitry.kote...@gmail.com wrote: On Tue, Feb 24, 2015 at 2:07 AM, Andrew Beekhof and...@beekhof.net wrote: I have a 3-node cluster where node1 and node2 are running corosync+pacemaker and node3 is running corosync only (for quorum). Corosync

Re: [Pacemaker] why sometimes pengine seems lazy

2015-03-29 Thread Andrew Beekhof
On 10 Feb 2015, at 6:50 pm, d tbsky tbs...@gmail.com wrote: hi: I was using pacemaker and drbd with sl linux 6.5/6.6. all are fine. now I am testing sl linux 7.0 and I notice when I want to promote the drbd resource with pcs resource meta my-ms-drbd master-max=2. sometimes

Re: [Pacemaker] Master-Slave role stickiness

2015-03-29 Thread Andrew Beekhof
On 23 Jan 2015, at 9:13 am, brook davis brook.da...@nimboxx.com wrote: snip It sounds like default-resource-stickiness does not kick in; and with default resource-stickiness=1 it is expected (10 6). Documentation says default-resource-stickiness is deprecated so maybe it is ignored

Re: [Pacemaker] Colocating with unmanaged resource

2015-03-29 Thread Andrew Beekhof
On 28 Feb 2015, at 6:00 am, Покотиленко Костик cas...@meteor.dp.ua wrote: On Thu, 22/01/2015 at 14:59 +1100, Andrew Beekhof wrote: On 15 Jan 2015, at 12:54 am, Покотиленко Костик cas...@meteor.dp.ua wrote: On Tue, 06/01/2015 at 16:27 +1100, Andrew Beekhof wrote: On 20 Dec 2014, at 6:21 am

Re: [Pacemaker] Pre/post-action notifications for master-slave and clone resources

2015-03-29 Thread Andrew Beekhof
On 27 Jan 2015, at 9:57 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, Playing with two-week old git master on a two-node cluster I discovered that only limited set of notify operations is performed for clone and master-slave instances when all of them are being

Re: [Pacemaker] pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors

2015-03-18 Thread Andrew Beekhof
On 18 Mar 2015, at 7:01 pm, Nikola Ciprich nikola.cipr...@linuxbox.cz wrote: Hello Andrew, It certainly explains the log message. Do you have a lot of these resources querying the CIB? Perhaps it's overloaded. well, it keeps happening when I try to start many of those resources in one

Re: [Pacemaker] pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors

2015-03-15 Thread Andrew Beekhof
On 14 Mar 2015, at 5:53 pm, Nikola Ciprich nikola.cipr...@linuxbox.cz wrote: Hello Andrew, I'm really sorry for replying this late.. The python command has nothing to do with the cluster and no reason to connect to the cib? well, python script actually executes crm_mon to do some

Re: [Pacemaker] Suggestions for managing HA of containers from within a Pacemaker container?

2015-02-25 Thread Andrew Beekhof
. What's the reason in isolating something and then giving it all permissions on a host machine? Probably because someone realised that they wanted to container-ize the software for creating containers and nesting them was too horrible to contemplate. On Mon, Feb 23, 2015 at 5:20 PM, Andrew

Re: [Pacemaker] Suggestions for managing HA of containers from within a Pacemaker container?

2015-02-25 Thread Andrew Beekhof
On 26 Feb 2015, at 8:51 am, Digimer li...@alteeve.ca wrote: On 25/02/15 04:45 PM, David Vossel wrote: Pacemaker as a scheduler in Mesos or Kubernetes does sound like a very interesting idea. Packaging

Re: [Pacemaker] One more globally-unique clone question

2015-02-25 Thread Andrew Beekhof
On 24 Feb 2015, at 4:35 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.02.2015 01:58, Andrew Beekhof wrote: On 21 Jan 2015, at 5:08 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.01.2015 03:51, Andrew Beekhof wrote: On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub

Re: [Pacemaker] Running pacemaker as non-root user

2015-02-24 Thread Andrew Beekhof
On 24 Feb 2015, at 10:36 pm, N, Ravikiran ravikira...@hp.com wrote: Hi all, I was trying to find out whether it is possible to START/STOP pacemaker, and also run PCS commands as non-root user (in my case it is ‘admin’ user). I did add the user(‘admin’) to haclient group, but it is of

Re: [Pacemaker] Issues Migrating from 12.04 to 14.04 with resource-stickiness

2015-02-23 Thread Andrew Beekhof
It looks like jiravip1 has failed in a lot of places. Is this the complete configuration? I would have expected some colocation constraints from the behaviour. Also, you understand what symmetric-cluster=false does? On 13 Feb 2015, at 4:29 am, Krakowitzer, Merritt mkrakowit...@fnb.co.za

Re: [Pacemaker] Suggestions for managing HA of containers from within a Pacemaker container?

2015-02-23 Thread Andrew Beekhof
On 10 Feb 2015, at 1:45 pm, Serge Dubrouski serge...@gmail.com wrote: Hello Steve, Are you sure that Pacemaker is the right product for your project? Have you checked Mesos/Marathon or Kubernetes? Those are frameworks being developed for managing containers. And in a few years they'll

Re: [Pacemaker] Suggestions for managing HA of containers from within a Pacemaker container?

2015-02-23 Thread Andrew Beekhof
On 8 Feb 2015, at 7:09 am, Steven Dake (stdake) std...@cisco.com wrote: Hi, I am working on Containerizing OpenStack in the Kolla project (http://launchpad.net/kolla). One of the key things we want to do over the next few months is add H/A support to our container tech. David Vossel

Re: [Pacemaker] One node thinks everyone is online, but the other node doesn't think so

2015-02-23 Thread Andrew Beekhof
On 26 Jan 2015, at 10:53 am, Dmitry Koterov dmitry.kote...@gmail.com wrote: Hello. I have a 3-node cluster where node1 and node2 are running corosync+pacemaker and node3 is running corosync only (for quorum). Corosync 2.3.3, pacemaker 1.1.10. Everything worked fine the first couple of

Re: [Pacemaker] Colocation constraint getting removed

2015-02-23 Thread Andrew Beekhof
On 22 Jan 2015, at 4:39 pm, Arjun Pandey apandepub...@gmail.com wrote: Any pointers on this would be helpful. Constraints don't get removed automatically unless someone asked for a resource that it references to be deleted. Other possibilities include, someone asked to delete the constraint

Re: [Pacemaker] Pacemaker won't start after node was fenced

2015-02-23 Thread Andrew Beekhof
On 27 Jan 2015, at 5:23 pm, Jake Smith jsm...@argotec.com wrote: Had a failover of my active/passive cluster and now the passive node will not rejoin the cluster. 2 nodes running Ubuntu 12.04 coro 1.4.2-2, openais 1.1.4-4, pcmk 1.1.6-2ubuntu3 Corosync ring membership is fine on

Re: [Pacemaker] pacemaker/corosync: a resource is started on 2 nodes

2015-02-23 Thread Andrew Beekhof
On 28 Jan 2015, at 9:20 pm, Sergey Arlashin sergeyarl.maill...@gmail.com wrote: Hi! I have a small corosync/pacemaker based cluster which consists of 4 nodes. 2 nodes are in standby mode, another 2 actually handle all the resources. corosync ver. 1.4.7-1. pacemaker ver 1.1.11.

Re: [Pacemaker] One more globally-unique clone question

2015-02-23 Thread Andrew Beekhof
On 21 Jan 2015, at 5:08 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.01.2015 03:51, Andrew Beekhof wrote: On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub

Re: [Pacemaker] Querying resource status programatically

2015-02-22 Thread Andrew Beekhof
On 7 Feb 2015, at 3:01 pm, Brian Campbell brian.campb...@editshare.com wrote: Hi all! I'm writing a domain-specific frontend for Pacemaker, which can set up a few different pre-configured stacks of resources, and provide simplified monitoring and administration of those stacks. One

Re: [Pacemaker] pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors

2015-02-22 Thread Andrew Beekhof
On 9 Feb 2015, at 8:06 pm, Nikola Ciprich nikola.cipr...@linuxbox.cz wrote: Hello, I'd like to ask about the following problem, which has troubled me for some time and which I wasn't able to find a solution for: I've got a cluster with quite a lot of resources, and when I try to do multiple operations at

Re: [Pacemaker] pacemaker does not start after cman config

2015-02-22 Thread Andrew Beekhof
On 20 Feb 2015, at 9:24 pm, Lukas Kostyan lukas.kost...@gmail.com wrote: 2015-02-19 22:49 GMT+01:00 Andrew Beekhof and...@beekhof.net: On 10 Feb 2015, at 11:53 pm, Lukas Kostyan lukas.kost...@gmail.com wrote: Hi all, was following the guide from clusterlab but use debian

Re: [Pacemaker] 'stop' operation passes outdated set of instance attributes to RA

2015-02-22 Thread Andrew Beekhof
On 14 Feb 2015, at 1:10 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I believe that is a bug that 'stop' operation uses set of instance attributes from the original 'start' op, not what successful 'reload' had. Corresponding pe-input has correct set of attributes, and pre-stop

Re: [Pacemaker] Cannot fail-over Master/Slave resource collocated with ping resource at the HDD crash

2015-02-22 Thread Andrew Beekhof
On 16 Feb 2015, at 8:15 pm, NAKAHIRA Kazutomo nakahira_kazutomo...@lab.ntt.co.jp wrote: Hi all, I ran into a problem where a Master/Slave resource colocated with a ping resource cannot fail over after an HDD crash. After the HDD crash, the stop operation of the ping resource is looping and

Re: [Pacemaker] Multiple live cib in one pacemaker

2015-02-19 Thread Andrew Beekhof
On 20 Feb 2015, at 1:52 am, Kristoffer Grönlund kgronl...@suse.com wrote: Adam Błaszczykowski adam.blaszczykow...@gmail.com writes: Hello, I am using Pacemaker 1.1.12 together with Corosync 2.4.3 in my cluster environment. I have two nodes in cluster that are in different LAN locations.

Re: [Pacemaker] RHEL 6 to RHEL 7 upgrade

2015-02-19 Thread Andrew Beekhof
On 11 Feb 2015, at 9:36 pm, Alex Samad - Yieldbroker alex.sa...@yieldbroker.com wrote: Hi, I am doing some planning for a CentOS 6 to CentOS 7 upgrade. Just wondering if there are any gotchas with pacemaker/cman? There is no cman in 7. You'll need to configure corosync2 directly

Re: [Pacemaker] [Linux-HA] Announcing the Heartbeat 3.0.6 Release

2015-02-19 Thread Andrew Beekhof
On 11 Feb 2015, at 8:24 am, Lars Ellenberg lars.ellenb...@linbit.com wrote: TL;DR: If you intend to set up a new High Availability cluster using the Pacemaker cluster manager, you typically should not care for Heartbeat, but use recent releases (2.3.x) of Corosync. If you

Re: [Pacemaker] pacemaker does not start after cman config

2015-02-19 Thread Andrew Beekhof
On 10 Feb 2015, at 11:53 pm, Lukas Kostyan lukas.kost...@gmail.com wrote: Hi all, was following the guide from clusterlab but use debian wheezy. corosync 1.4.2-3 pacemaker 1.1.7-1 cman 3.0.12-3.2+deb7u2 configured the active/passive with no

[Pacemaker] This mailing list will go away soon

2015-02-19 Thread Andrew Beekhof
One of the things that was agreed at the recent cluster summit was a consolidation of cluster-related IRC channels, mailing lists, and websites. In keeping with this, the pacemaker mailing list should now be considered deprecated and will no longer accept messages as of March 1st 2015. We still

Re: [Pacemaker] Stop single resource on quorum loss but not others

2015-01-22 Thread Andrew Beekhof
vague in my previous email. On January 22, 2015 6:21:34 PM MST, Andrew Beekhof and...@beekhof.net wrote: On 23 Jan 2015, at 8:50 am, Rahim Millious rmilli...@gmail.com wrote: Hello, I am hoping someone can help me. I have a custom resource agent which requires access (via ssh

Re: [Pacemaker] (no subject)

2015-01-22 Thread Andrew Beekhof
On 23 Jan 2015, at 8:50 am, Rahim Millious rmilli...@gmail.com wrote: Hello, I am hoping someone can help me. I have a custom resource agent which requires access (via ssh) to the passive node in order to function correctly. Is it possible to stop the resource when quorum is lost and

Re: [Pacemaker] Colocating with unmanaged resource

2015-01-21 Thread Andrew Beekhof
On 15 Jan 2015, at 12:54 am, Покотиленко Костик cas...@meteor.dp.ua wrote: On Tue, 06/01/2015 at 16:27 +1100, Andrew Beekhof wrote: On 20 Dec 2014, at 6:21 am, Покотиленко Костик cas...@meteor.dp.ua wrote: Here are behaviors of different versions of pacemaker: 1.1.12: - stopping nginx

Re: [Pacemaker] breaking resource dependencies by replacing resource group by co-location constrains

2015-01-20 Thread Andrew Beekhof
group grp_application res_mount-application res_Service-IP res_mount-CIFSshareData res_mount-CIFSshareData2 res_mount-CIFSshareData3 res_MyApplication is just a shortcut for colocation res_Service-IP with res_mount-application colocation res_mount-CIFSshareData with res_Service-IP ... and
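The expansion described above can be sketched in Python. This is a minimal illustration, not Pacemaker's own implementation: the function name `expand_group` is hypothetical, the ordering lines rest on the assumption that a group also starts its members in listed order, and the generated strings are descriptive rather than exact crm shell syntax.

```python
def expand_group(members):
    """Return the pairwise constraints an ordered group stands for.

    Each member is colocated with, and (by assumption) started after,
    the member listed immediately before it.
    """
    colocations = []
    orders = []
    for earlier, later in zip(members, members[1:]):
        # Same direction as in the message: the later member follows the earlier one.
        colocations.append(f"colocation {later} with {earlier}")
        # Assumed ordering half of the shortcut.
        orders.append(f"order {earlier} then {later}")
    return colocations, orders


# Member list taken from the group definition quoted in the message.
members = [
    "res_mount-application",
    "res_Service-IP",
    "res_mount-CIFSshareData",
    "res_mount-CIFSshareData2",
    "res_mount-CIFSshareData3",
    "res_MyApplication",
]

colocations, orders = expand_group(members)
for line in colocations + orders:
    print(line)
```

Because every member depends on its predecessor, a failure early in the list affects everything after it, which is the dependency chain the thread subject is asking how to break.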

Re: [Pacemaker] One more globally-unique clone question

2015-01-20 Thread Andrew Beekhof
On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce problem with early stop of globally-unique clone instances

Re: [Pacemaker] One more globally-unique clone question

2015-01-19 Thread Andrew Beekhof
On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce problem with early stop of globally-unique clone instances during move to another node I found one more interesting problem. Due to the different order of resources in the CIB and

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-01-19 Thread Andrew Beekhof
On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-01-15 Thread Andrew Beekhof
On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found a little bit strange operation ordering during transition

Re: [Pacemaker] Avoid one node from being a target for resources migration

2015-01-14 Thread Andrew Beekhof
. 'quorum.two_node:1' is only sane for a 2 node cluster On Wednesday, January 14, 2015, Andrew Beekhof and...@beekhof.net wrote: On 14 Jan 2015, at 12:06 am, Dmitry Koterov dmitry.kote...@gmail.com wrote: Then I see that, although node2 clearly knows it's isolated (it doesn't see other 2
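A simplified Python model may help show why 'quorum.two_node:1' does not belong in a 3-node cluster. It is a sketch under stated assumptions: the function `has_quorum` is hypothetical, every node is assumed to carry one vote, and other votequorum options (such as wait_for_all) are ignored.

```python
def has_quorum(expected_votes, live_votes, two_node=False):
    """Simplified quorum check, one vote per node (assumption).

    Without two_node, a partition needs a strict majority of the
    expected votes. With two_node, a single surviving vote is enough,
    which is only meaningful when exactly two votes are expected.
    """
    if two_node:
        if expected_votes != 2:
            raise ValueError("two_node only makes sense with exactly 2 nodes")
        return live_votes >= 1
    return live_votes >= expected_votes // 2 + 1


# 3-node cluster: an isolated node sees 1 of 3 votes and loses quorum.
print(has_quorum(3, 1))
# The other two nodes see 2 of 3 votes and keep quorum.
print(has_quorum(3, 2))
# 2-node cluster with the special case enabled: the survivor keeps quorum.
print(has_quorum(2, 1, two_node=True))
```

In this model, letting a lone node keep quorum in a 3-node cluster would allow both sides of a split to believe they are quorate, which is why fencing matters so much in the genuine 2-node case.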

Re: [Pacemaker] Setting cib-bootstrap-options parameters before DC election

2015-01-13 Thread Andrew Beekhof
On 14 Jan 2015, at 12:19 am, AWeber - Ryan Steele ry...@aweber.com wrote: Hi folks, For testing scenarios in which I’m only spinning up nodes on my laptop (test-kitchen), I don’t really need the full 20 seconds for dc-deadtime. However, I haven’t been successful in finding a way to

Re: [Pacemaker] Avoid one node from being a target for resources migration

2015-01-12 Thread Andrew Beekhof
On 13 Jan 2015, at 4:25 am, David Vossel dvos...@redhat.com wrote: - Original Message - Hello. I have 3-node cluster managed by corosync+pacemaker+crm. Node1 and Node2 are DRBD master-slave, also they have a number of other services installed (postgresql, nginx, ...). Node3

Re: [Pacemaker] Avoid one node from being a target for resources migration

2015-01-12 Thread Andrew Beekhof
On 13 Jan 2015, at 7:56 am, Dmitry Koterov dmitry.kote...@gmail.com wrote: 1. install the resource related packages on node3 even though you never want them to run there. This will allow the resource-agents to verify the resource is in fact inactive. Thanks, your advice helped: I

Re: [Pacemaker] Help needed to configure MySQL Cluster using Pacemaker, Corosync, DRBD and PCS

2015-01-11 Thread Andrew Beekhof
On 11 Jan 2015, at 5:39 pm, Shameer Babu shameerbab...@gmail.com wrote: Hi, I have configured an Apache cluster by following your document http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf and it was good and working. Now I would like to configure a simple MySQL cluster using

Re: [Pacemaker] Clustermon issue

2015-01-11 Thread Andrew Beekhof
On 9 Jan 2015, at 6:23 pm, Marco Querci mquerc...@gmail.com wrote: Sorry ... it was my error. My CentOS is 6.6: same as I said last time, except s/6.6/6.7/ [root@langate1 ~]# cat /etc/redhat-release CentOS release 6.6 (Final) fully upgraded. On 08/01/2015 23:00, Andrew Beekhof

Re: [Pacemaker] BUG: crm_mon prints status of clone instances being started as 'Started'

2015-01-11 Thread Andrew Beekhof
I'll push this up soon: diff --git a/lib/pengine/clone.c b/lib/pengine/clone.c index 596f701..b83798a 100644 --- a/lib/pengine/clone.c +++ b/lib/pengine/clone.c @@ -438,6 +438,10 @@ clone_print(resource_t * rsc, const char *pre_text, long options, void *print_da /* Unique, unmanaged

Re: [Pacemaker] Clustermon issue

2015-01-08 Thread Andrew Beekhof
maintainers aren't allowed to fix it until 6.6). Many Thanks. On 08/01/2015 03:39, Andrew Beekhof wrote: On 8 Jan 2015, at 1:31 pm, Andrew Beekhof and...@beekhof.net wrote: And there is no indication this is being called? Doh. I know this one... you're actually using 1.1.12-rc3

Re: [Pacemaker] Patches: RFC before pull request

2015-01-07 Thread Andrew Beekhof
They all look sane to me. Please proceed with a pull request :-) We should probably start thinking about .13 (or .14 for the superstitious), there have been quite a few important patches arrive since .12 was released. On 10 Dec 2014, at 1:33 am, Lars Ellenberg lars.ellenb...@linbit.com

Re: [Pacemaker] pgsql troubles.

2015-01-07 Thread Andrew Beekhof
On 5 Dec 2014, at 4:16 am, steve st...@unliketea.com wrote: Good Afternoon, I am having loads of trouble with pacemaker/corosync/postgres. Defining the symptoms is rather difficult. The primary being that postgres starts as slave on both nodes. I have tested the pgsqlRA

Re: [Pacemaker] Clustermon issue

2015-01-07 Thread Andrew Beekhof
mquerc...@gmail.com Thanks. On 06/01/2015 01:21, Andrew Beekhof wrote: On 6 Jan 2015, at 3:37 am, Marco Querci mquerc...@gmail.com wrote: Hi All. Any news for my problem? Maybe post your /home/administrator/clustermonitor_notification.sh script? Many thanks. On 19/12

Re: [Pacemaker] Clustermon issue

2015-01-07 Thread Andrew Beekhof
On 8 Jan 2015, at 1:31 pm, Andrew Beekhof and...@beekhof.net wrote: And there is no indication this is being called? Doh. I know this one... you're actually using 1.1.12-rc3. You need this patch which landed after 1.1.12 shipped: https://github.com/beekhof/pacemaker/commit/3df6aff

Re: [Pacemaker] Corosync 1.4.7: zombie (defunct)

2015-01-05 Thread Andrew Beekhof
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux -- Best regards, Sergey Arlashin On Jan 5, 2015, at 7:59 AM, Andrew Beekhof and...@beekhof.net wrote: pacemaker version? it looks familiar but it depends on the version number. On 29 Dec 2014, at 10:24 pm, Sergey Arlashin sergeyarl.maill

Re: [Pacemaker] Clustermon issue

2015-01-05 Thread Andrew Beekhof
On 6 Jan 2015, at 3:37 am, Marco Querci mquerc...@gmail.com wrote: Hi All. Any news for my problem? Maybe post your /home/administrator/clustermonitor_notification.sh script? Many thanks. On 19/12/2014 12:13, Marco Querci wrote: Many thanks for your reply. Here is my

Re: [Pacemaker] Colocating with unmanaged resource

2015-01-05 Thread Andrew Beekhof
On 20 Dec 2014, at 6:21 am, Покотиленко Костик cas...@meteor.dp.ua wrote: Hi, Simple scenario, several floating IPs should be living on front nodes only if there is working Nginx. There are several reasons against Nginx being controlled by Pacemaker. So, decided to colocate FIPs with

Re: [Pacemaker] Corosync 1.4.7: zombie (defunct)

2015-01-04 Thread Andrew Beekhof
pacemaker version? it looks familiar but it depends on the version number. On 29 Dec 2014, at 10:24 pm, Sergey Arlashin sergeyarl.maill...@gmail.com wrote: Hi! Recently I've noticed that one of my nodes had OFFLINE status in 'crm status' output. But it actually was not. I could ssh on

[Pacemaker] Where is Beekhof?

2014-12-14 Thread Andrew Beekhof
Just a courtesy email for anyone looking for me or waiting on a reply from me specifically... I have been sucked into openstack hell and probably won't re-emerge until early January at the earliest. Until then it's unlikely that I'll be able to deal with anything except the lowest of the low

Re: [Pacemaker] Suicide fencing and watchdog questions

2014-11-30 Thread Andrew Beekhof
On 29 Nov 2014, at 5:36 pm, Andrei Borzenkov arvidj...@gmail.com wrote: On Thu, 27 Nov 2014 08:24:56 +0300, Vladislav Bogdanov bub...@hoster-ok.com wrote: 27.11.2014 03:43, Andrew Beekhof wrote: On 25 Nov 2014, at 10:37 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi

Re: [Pacemaker] [ha-wg-technical] [ha-wg] [Cluster-devel] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-11-26 Thread Andrew Beekhof
On 27 Nov 2014, at 2:41 am, Lars Marowsky-Bree l...@suse.com wrote: On 2014-11-25T16:46:01, David Vossel dvos...@redhat.com wrote: Okay, okay, apparently we have got enough topics to discuss. I'll grumble a bit more about Brno, but let's get the organisation of that thing on track ...

Re: [Pacemaker] Suicide fencing and watchdog questions

2014-11-26 Thread Andrew Beekhof
On 25 Nov 2014, at 10:37 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, Is there any information how watchdog integration is intended to work? What are currently-evaluated use-cases for that? It seems to be forcibly disabled if SBD is not detected... Are you referring to

Re: [Pacemaker] [Cluster-devel] [Linux-HA] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-25 Thread Andrew Beekhof
On 25 Nov 2014, at 8:54 pm, Lars Marowsky-Bree l...@suse.com wrote: On 2014-11-24T16:16:05, Fabio M. Di Nitto fdini...@redhat.com wrote: Yeah, well, devconf.cz is not such an interesting event for those who do not wear the fedora ;-) That would be the perfect opportunity for you to

Re: [Pacemaker] [ha-wg-technical] [Cluster-devel] [Linux-HA] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-25 Thread Andrew Beekhof
On 26 Nov 2014, at 10:06 am, Digimer li...@alteeve.ca wrote: On 25/11/14 04:31 PM, Andrew Beekhof wrote: Yeah, but you're already bringing him for your personal conference. That's a bit different. ;-) OK, let's switch tracks a bit. What *topics* do we actually have? Can we fill two days

Re: [Pacemaker] [ha-wg-technical] [ha-wg] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-11-25 Thread Andrew Beekhof
On 26 Nov 2014, at 4:51 pm, Fabio M. Di Nitto fabbi...@fabbione.net wrote: On 11/25/2014 10:54 AM, Lars Marowsky-Bree wrote: On 2014-11-24T16:16:05, Fabio M. Di Nitto fdini...@redhat.com wrote: Yeah, well, devconf.cz is not such an interesting event for those who do not wear the

Re: [Pacemaker] Globally-unique clone cleanup on remote nodes

2014-11-20 Thread Andrew Beekhof
On 20 Nov 2014, at 5:44 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.11.2014 09:25, Andrew Beekhof wrote: On 20 Nov 2014, at 5:12 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.11.2014 01:57, Andrew Beekhof wrote: On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub

Re: [Pacemaker] Globally-unique clone cleanup on remote nodes

2014-11-19 Thread Andrew Beekhof
On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster consisting of one cluster node and 3 remote bare-metal nodes. When I run crm_resource -C -r 'g_u_clone' (crmsh does

Re: [Pacemaker] Globally-unique clone cleanup on remote nodes

2014-11-19 Thread Andrew Beekhof
On 20 Nov 2014, at 5:12 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.11.2014 01:57, Andrew Beekhof wrote: On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster

Re: [Pacemaker] Configuring Dependencies and Groups in CRM

2014-11-16 Thread Andrew Beekhof
On 17 Nov 2014, at 12:54 am, Stephan step...@projectlabs.de wrote: Hello, I've a cluster with two nodes with a master-slave DRBD and filesystem. On top of the filesystem there are several OpenVZ containers which are managed by ocf:heartbeat:ManageVE. Currently I've configured the

Re: [Pacemaker] resource-stickiness not working?

2014-11-16 Thread Andrew Beekhof
On 14 Nov 2014, at 5:52 am, Scott Donoho sdon...@cray.com wrote: Here is a simple Active/Passive configuration with a single Dummy resource (see end of message). The resource-stickiness default is set to 100. I was assuming that this would be enough to keep the Dummy resource on the

Re: [Pacemaker] Long failover

2014-11-16 Thread Andrew Beekhof
On 14 Nov 2014, at 10:57 pm, Dmitry Matveichev d.matveic...@mfisoft.ru wrote: Hello, We have a cluster configured via pacemaker+corosync+crm. The configuration is: node master node slave primitive HA-VIP1 IPaddr2 \ params ip=192.168.22.71 nic=bond0 \ op monitor

Re: [Pacemaker] doubt when cloning a resource

2014-11-16 Thread Andrew Beekhof
On 15 Nov 2014, at 3:17 am, david escartin descar...@systemonenoc.com wrote: Hello all, we are trying to have in a 2 node cluster one resource TEST (LSB type) cloned That's your problem. LSB resources cannot be cloned with globally-unique=true Why do you think you need

Re: [Pacemaker] Reset failcount for resources

2014-11-16 Thread Andrew Beekhof
On 13 Nov 2014, at 10:08 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi I am running a 2 node cluster with this config Master/Slave Set: foo-master [foo] Masters: [ bharat ] Slaves: [ ram ] AC_FLT (ocf::pw:IPaddr): Started bharat CR_CP_FLT (ocf::pw:IPaddr): Started bharat

Re: [Pacemaker] Intermittent Failovers: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)

2014-11-16 Thread Andrew Beekhof
On 11 Nov 2014, at 1:32 am, Zach Wolf zw...@doublepositive.com wrote: Hey Team, I’m receiving some strange intermittent failovers on a two-node cluster (happens once every week or two). When this happens, both nodes are unavailable; one node will be marked offline and the other will be

Re: [Pacemaker] Long failover

2014-11-16 Thread Andrew Beekhof
On 17 Nov 2014, at 6:17 pm, Andrei Borzenkov arvidj...@gmail.com wrote: On Mon, Nov 17, 2014 at 9:34 AM, Andrew Beekhof and...@beekhof.net wrote: On 14 Nov 2014, at 10:57 pm, Dmitry Matveichev d.matveic...@mfisoft.ru wrote: Hello, We have a cluster configured via pacemaker+corosync

Re: [Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread Andrew Beekhof
there was something I had missed. Andreas -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Thursday, 13 November 2014 11:13 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Notes on pacemaker installation on OmniOS

Re: [Pacemaker] Notes on pacemaker installation on OmniOS

2014-11-13 Thread Andrew Beekhof
branch upstream4 with the changes against the current master. Excellent -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Thursday, 13 November 2014 12:11 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Notes on pacemaker

Re: [Pacemaker] Loosing corosync communication clusterwide

2014-11-11 Thread Andrew Beekhof
On 11 Nov 2014, at 10:12 pm, Daniel Dehennin daniel.dehen...@baby-gnu.org wrote: Andrew Beekhof and...@beekhof.net writes: [...] I have fencing configured and working, modulo fencing VMs on dead host[1]. Are you saying that the host and the VMs running inside it are both part

Re: [Pacemaker] Loosing corosync communication clusterwide

2014-11-10 Thread Andrew Beekhof
On 11 Nov 2014, at 4:39 am, Daniel Dehennin daniel.dehen...@baby-gnu.org wrote: emmanuel segura emi2f...@gmail.com writes: I think, you don't have fencing configured in your cluster. I have fencing configured and working, modulo fencing VMs on dead host[1]. Are you saying that the

Re: [Pacemaker] batch-limit with many resources

2014-11-09 Thread Andrew Beekhof
On 8 Nov 2014, at 10:22 pm, Matthias Teege matthias-gm...@mteege.de wrote: Hello, I have a cluster with 300 resources. A lot of them are using the same monitoring intervals. Is it necessary to increase the batch-limit to allow pacemaker to run all monitoring scripts in parallel? Short

Re: [Pacemaker] colocate three resources

2014-11-09 Thread Andrew Beekhof
On 9 Nov 2014, at 9:28 pm, Matthias Teege matthias-gm...@mteege.de wrote: Hello, On a cluster I have to place three resources on the same node. ms ms_disk_R p_disk_R ms ms_disk_S p_disk_S primitive vm_srv ocf:heartbeat:VirtualDomain The colocation constraints look like this:

Re: [Pacemaker] How to avoid CRM sending stop when ha.cf gets 2nd node configured

2014-11-09 Thread Andrew Beekhof
On 8 Nov 2014, at 11:58 am, aridh bose ari...@yahoo.com wrote: Hi, While using heartbeat and pacemaker, is it possible to bring up the first node, which can become Master, followed by the second node, which should become Slave, without causing any issues for the first node? Currently, I see a couple

Re: [Pacemaker] One direction switchover not working

2014-11-04 Thread Andrew Beekhof
On 4 Nov 2014, at 3:46 am, Andrei Borzenkov arvidj...@gmail.com wrote: On Mon, 3 Nov 2014 15:05:39 +0530 Arjun Pandey apandepub...@gmail.com wrote: Hi I have a 2 node CentOS -6.5 based cluster. This is being run in active standby mode. When I want to trigger switchover from blackwidow to

Re: [Pacemaker] stonith q

2014-11-04 Thread Andrew Beekhof
On 5 Nov 2014, at 9:39 am, Alex Samad - Yieldbroker alex.sa...@yieldbroker.com wrote: I read to mean that demorp2 killed this node Nov 4 23:21:37 demorp1 corosync[23415]: cman killed by node 2 because we were killed by cman_tool or other application Nov 4 23:21:37 demorp1

Re: [Pacemaker] [Problem] Error message of crm_failcount is not right.

2014-11-04 Thread Andrew Beekhof
On 5 Nov 2014, at 1:05 pm, renayama19661...@ybb.ne.jp wrote: Hi All, The next error is displayed when I carry out crm_failcount of Pacemaker. [root@rh70-node1 ~]# crm_failcount error: crm_abort:read_attr_delegate: Triggered assert at cib_attrs.c:342 : attr_name != NULL ||

Re: [Pacemaker] Occasional nonsensical resource agent errors, redux

2014-11-02 Thread Andrew Beekhof
On 1 Nov 2014, at 11:03 pm, Patrick Kane p...@wawd.com wrote: Hi all: In July, list member Ken Gaillot reported occasional nonsensical resource agent errors using Pacemaker (http://oss.clusterlabs.org/pipermail/pacemaker/2014-July/022231.html). We're seeing similar issues with our

Re: [Pacemaker] mysql resource agent

2014-10-29 Thread Andrew Beekhof
On 30 Oct 2014, at 5:57 am, Keith Ouellette kei...@fibermountain.com wrote: I am running two servers (Ubuntu 14.04 LTS) as an HA setup for an application that uses mysql. I am using DRBD to replicate the data between the servers and am able to manually start mysql on each server. The DRBD

Re: [Pacemaker] Unusual crmd log

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 8:02 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi I have a 2 node active-standby cluster setup. Pacemaker packages that I have on CentOS 6.5: pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64

Re: [Pacemaker] Split brain and STONITH behavior (VMware fencing)

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 7:48 pm, Andrei Borzenkov arvidj...@gmail.com wrote: On Wed, Oct 29, 2014 at 10:46 AM, Ariel S ariel_bis2...@yahoo.co.id wrote: Hello, I'm trying to understand how this STONITH works. I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each have 2 nic

Re: [Pacemaker] pacemaker counts probe failure twice

2014-10-29 Thread Andrew Beekhof
On 29 Oct 2014, at 10:01 pm, Andrei Borzenkov arvidj...@gmail.com wrote: I observe strange behavior that I cannot understand. Pacemaker 1.1.11-3ca8c3b. There is master/slave resource running. Maintenance-mode was set, pacemaker restarted, maintenance-mode reset. This specific RA returns

Re: [Pacemaker] No demote after pre demote.

2014-10-29 Thread Andrew Beekhof
I can't do much without logs and PE files (i.e. crm_report) On 29 Oct 2014, at 11:32 pm, Arjun Pandey apandepub...@gmail.com wrote: Including the package details pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64
