Re: [Pacemaker] recover cib from raw file

2013-11-12 Thread s.oreilly
Brilliant, thanks Andrew. I was looking for a pcs option. Should have thought about cibadmin. Hopefully I will never break things badly enough to have to use it :-) Regards Sean O'Reilly On Mon 11/11/13 10:03 PM, Andrew Beekhof and...@beekhof.net sent: On 11 Nov 2013, at 9:41 pm, s.oreilly

Re: [Pacemaker] recover cib from raw file

2013-11-12 Thread Andrew Beekhof
I wouldn't be surprised to see a relevant pcs command in the future ;-) On 12 Nov 2013, at 8:51 pm, s.oreilly s.orei...@linnovations.co.uk wrote: Brilliant, thanks Andrew. I was looking for a pcs option. Should have thought about cibadmin. Hopefully I will never break things badly enough to

[Pacemaker] Follow up: Colocation constraint to External Managed Resource (cluster-recheck-interval=5m ignored after 1.1.10 update?)

2013-11-12 Thread Robert H.
Hello, for Pacemaker 1.1.8 (CentOS Version) the thread http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg18048.html was solved by adding cluster-recheck-interval=5m, causing the LRM to be executed every 5 minutes and detecting externally managed resources as started (in this

Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node

2013-11-12 Thread Vladislav Bogdanov
12.11.2013 09:56, Vladislav Bogdanov wrote: ... Ah, then in_ccm will be set to false only when corosync (2) is stopped on a node, not when pacemaker is stopped? Thus, current drbd agent/fencing logic does not (well) support just stop of pacemaker in my use-case, messaging layer should be

[Pacemaker] Network outage debugging

2013-11-12 Thread Sean Lutner
The folks testing the cluster I've been building have run a script which blocks all traffic except SSH on one node of the cluster for 15 seconds to mimic a network failure. During this time, the network being down seems to cause some odd behavior from pacemaker resulting in it dying. The

Re: [Pacemaker] recover cib from raw file

2013-11-12 Thread Lars Marowsky-Bree
On 2013-11-12T09:51:02, s.oreilly s.orei...@linnovations.co.uk wrote: Brilliant, thanks Andrew. I was looking for a pcs option. Should have thought about cibadmin. Hopefully I will never break things badly enough to have to use it :-) crm configure load xml ... ;-) Regards, Lars --

Re: [Pacemaker] Network outage debugging

2013-11-12 Thread Andrew Beekhof
On 13 Nov 2013, at 6:10 am, Sean Lutner s...@rentul.net wrote: The folks testing the cluster I've been building have run a script which blocks all traffic except SSH on one node of the cluster for 15 seconds to mimic a network failure. During this time, the network being down seems to

Re: [Pacemaker] Follow up: Colocation constraint to External Managed Resource (cluster-recheck-interval=5m ignored after 1.1.10 update?)

2013-11-12 Thread Andrew Beekhof
On 13 Nov 2013, at 12:06 am, Robert H. pacema...@elconas.de wrote: Hello, for Pacemaker 1.1.8 (CentOS Version) the thread http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg18048.html was solved by adding cluster-recheck-interval=5m, causing the LRM It's the policy engine

Re: [Pacemaker] why pacemaker does not control the resources

2013-11-12 Thread Andrew Beekhof
On 12 Nov 2013, at 4:42 pm, Andrey Groshev gre...@yandex.ru wrote: 11.11.2013, 03:44, Andrew Beekhof and...@beekhof.net: On 8 Nov 2013, at 7:49 am, Andrey Groshev gre...@yandex.ru wrote: Hi, PPL! I need help. I do not understand... Why has stopped working. This configuration work

Re: [Pacemaker] The larger cluster is tested.

2013-11-12 Thread Andrew Beekhof
Did you look at the load numbers in the logs? The CPUs are being slammed for over 20 minutes. The automatic tuning can only help so much; you're simply asking the cluster to do more work than it is capable of. Giving more priority to cib operations that come via IPC is one option, but as I

Re: [Pacemaker] Network outage debugging

2013-11-12 Thread Sean Lutner
On Nov 12, 2013, at 6:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 13 Nov 2013, at 6:10 am, Sean Lutner s...@rentul.net wrote: The folks testing the cluster I've been building have run a script which blocks all traffic except SSH on one node of the cluster for 15 seconds to

Re: [Pacemaker] asymmetric clusters, remote nodes, and monitor operations

2013-11-12 Thread Andrew Beekhof
On 12 Sep 2013, at 3:44 am, Lindsay Todd rltodd@gmail.com wrote: What I am seeing in the syslog are messages like: Sep 11 13:19:52 db02 pacemaker_remoted[1736]: notice: operation_finished: p-mysql_monitor_2:19398:stderr [ 2013/09/11_13:19:52 INFO: MySQL monitor succeeded ]

Re: [Pacemaker] Network outage debugging

2013-11-12 Thread Andrew Beekhof
On 13 Nov 2013, at 11:22 am, Sean Lutner s...@rentul.net wrote: On Nov 12, 2013, at 6:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 13 Nov 2013, at 6:10 am, Sean Lutner s...@rentul.net wrote: The folks testing the cluster I've been building have run a script which blocks all

Re: [Pacemaker] Network outage debugging

2013-11-12 Thread Sean Lutner
On Nov 12, 2013, at 7:33 PM, Andrew Beekhof and...@beekhof.net wrote: On 13 Nov 2013, at 11:22 am, Sean Lutner s...@rentul.net wrote: On Nov 12, 2013, at 6:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 13 Nov 2013, at 6:10 am, Sean Lutner s...@rentul.net wrote: The

Re: [Pacemaker] Network outage debugging

2013-11-12 Thread Andrew Beekhof
On 13 Nov 2013, at 11:49 am, Sean Lutner s...@rentul.net wrote: On Nov 12, 2013, at 7:33 PM, Andrew Beekhof and...@beekhof.net wrote: On 13 Nov 2013, at 11:22 am, Sean Lutner s...@rentul.net wrote: On Nov 12, 2013, at 6:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 13

Re: [Pacemaker] Question about the resource to fence a node

2013-11-12 Thread Andrew Beekhof
On 16 Oct 2013, at 8:51 am, Andrew Beekhof and...@beekhof.net wrote: On 15/10/2013, at 8:24 PM, Kazunori INOUE kazunori.ino...@gmail.com wrote: Hi, I'm using pacemaker-1.1 (the latest devel). I started resource (f1 and f2) which fence vm3 on vm1. $ crm_mon -1 Last updated: Tue Oct