Re: [Pacemaker] CRM property mysql_replication

2015-07-17 Thread Vladislav Bogdanov
17.07.2015 10:46, Dejan Muhamedagic wrote: [...] The attribute is not listed in the PE meta-data and therefore considered an error. You can make crmsh less strict like this: # crm option set core.check_mode relaxed It'll still print the error, but your changes are going to be committed.

Re: [Pacemaker] Filesystem resource killing innocent processes on stop

2015-05-19 Thread Vladislav Bogdanov
19.05.2015 11:46, Dejan Muhamedagic wrote: On Mon, May 18, 2015 at 07:34:38PM +0300, Vladislav Bogdanov wrote: 18.05.2015 18:57, Nikola Ciprich wrote: Hi Vladislav, Isn't that a bind-mount? nope, but your question led me to a possible culprit.. it's a cephfs mount, when I try to some local

Re: [Pacemaker] Filesystem resource killing innocent processes on stop

2015-05-19 Thread Vladislav Bogdanov
19.05.2015 11:44, Dejan Muhamedagic wrote: On Mon, May 18, 2015 at 05:14:14PM +0200, Nikola Ciprich wrote: Hi Dejan, The list below seems too extensive. Which version of resource-agents do you run? $ grep 'Build version:' /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs yes, it's definitely

Re: [Pacemaker] Filesystem resource killing innocent processes on stop

2015-05-18 Thread Vladislav Bogdanov
18.05.2015 18:57, Nikola Ciprich wrote: Hi Vladislav, Isn't that a bind-mount? nope, but your question led me to a possible culprit.. it's a cephfs mount, when I try to some local filesystem, I don't see this weird fuser behaviour.. so maybe fuser does not work correctly on cephfs? yep, for

Re: [Pacemaker] Filesystem resource killing innocent processes on stop

2015-05-18 Thread Vladislav Bogdanov
18.05.2015 13:20, Nikola Ciprich wrote: Hi, I noticed very annoying bug (or so I think), that resource-agents-3.9.5 in RHEL / centos 6 Filesystem OCF resource seems to be killing completely unrelated processes on shutdown although they're not using anything on mounted filesystem... Isn't

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-04-17 Thread Vladislav Bogdanov
17.04.2015 00:48, Andrew Beekhof wrote: On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015

Re: [Pacemaker] 'stop' operation passes outdated set of instance attributes to RA

2015-02-24 Thread Vladislav Bogdanov
23.02.2015 21:50, David Vossel wrote: - Original Message - On 14 Feb 2015, at 1:10 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I believe that is a bug that 'stop' operation uses set of instance attributes from the original 'start' op, not what successful 'reload' had

Re: [Pacemaker] One more globally-unique clone question

2015-02-23 Thread Vladislav Bogdanov
24.02.2015 01:58, Andrew Beekhof wrote: On 21 Jan 2015, at 5:08 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.01.2015 03:51, Andrew Beekhof wrote: On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015

[Pacemaker] 'stop' operation passes outdated set of instance attributes to RA

2015-02-13 Thread Vladislav Bogdanov
Hi, I believe that is a bug that 'stop' operation uses set of instance attributes from the original 'start' op, not what successful 'reload' had. Corresponding pe-input has correct set of attributes, and pre-stop 'notify' op uses updated set of attributes too. This is easily reproducible with

[Pacemaker] Gracefully failing reload operation

2015-01-28 Thread Vladislav Bogdanov
Hi, is there a way for resource agent to tell pacemaker that in some cases reload operation is insufficient to apply new resource definition and restart is required? I tried to return OCF_ERR_GENERIC, but that prevents resource from being started until failure-timeout lapses and cluster is

[Pacemaker] Pre/post-action notifications for master-slave and clone resources

2015-01-27 Thread Vladislav Bogdanov
Hi, Playing with two-week old git master on a two-node cluster I discovered that only limited set of notify operations is performed for clone and master-slave instances when all of them are being started/stopped. Clones (anonymous): * post-start * pre-stop M/S: * post-start * post-promote *

Re: [Pacemaker] new release date for resource-agents release 3.9.6

2015-01-26 Thread Vladislav Bogdanov
Hi Dejan, if it is not too late, would it be possible to add output of environment into resource trace file when tracing is enabled? --- ocf-shellfuncs.orig 2015-01-26 15:50:34.435001364 + +++ ocf-shellfuncs 2015-01-26 15:49:19.707001542 + @@ -822,6 +822,7 @@ fi

Re: [Pacemaker] [crmsh][Question] The order of resources is changed.

2015-01-21 Thread Vladislav Bogdanov
Hi, 21.01.2015 12:50, Kristoffer Grönlund wrote: Hello, renayama19661...@ybb.ne.jp writes: Hi All, We confirmed a function of crmsh by the next combination. * corosync-2.3.4 * pacemaker-Pacemaker-1.1.12 * crmsh-2.1.0 With the new crmsh, does "options sort-elements no" not work? Is there the

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-01-21 Thread Vladislav Bogdanov
20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13

Re: [Pacemaker] One more globally-unique clone question

2015-01-20 Thread Vladislav Bogdanov
21.01.2015 03:51, Andrew Beekhof wrote: On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce problem with early stop

Re: [Pacemaker] One more globally-unique clone question

2015-01-19 Thread Vladislav Bogdanov
20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce problem with early stop of globally-unique clone instances during move to another node I found one more interesting problem. Due to the different

[Pacemaker] One more globally-unique clone question

2015-01-16 Thread Vladislav Bogdanov
Hi all, Trying to reproduce problem with early stop of globally-unique clone instances during move to another node I found one more interesting problem. Due to the different order of resources in the CIB and extensive use of constraints between other resources (odd number of resources

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-01-15 Thread Vladislav Bogdanov
16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found a little bit strange

Re: [Pacemaker] BUG: crm_mon prints status of clone instances being started as 'Started'

2015-01-14 Thread Vladislav Bogdanov
; + } else if (child_rsc->fns->active(child_rsc, TRUE)) { /* Fully active anonymous clone */ node_t *location = child_rsc->fns->location(child_rsc, NULL, TRUE); On 10 Jan 2015, at 12:16 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, It seems like lib

Re: [Pacemaker] Unique clone instance is stopped too early on move

2015-01-14 Thread Vladislav Bogdanov
13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found a little bit strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh

[Pacemaker] Unique clone instance is stopped too early on move

2015-01-12 Thread Vladislav Bogdanov
Hi Andrew, David, all. I found a little bit strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh syntax)? === ... clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \

[Pacemaker] BUG: crm_mon prints status of clone instances being started as 'Started'

2015-01-09 Thread Vladislav Bogdanov
Hi all, It seems like lib/pengine/clone.c/clone_print() doesn't respect pending state of clone/ms resource instances in the cumulative output: short_print(list_text, child_text, rsc->variant == pe_master ? "Slaves" : "Started", options, print_data); in by-node output

Re: [Pacemaker] A secondary DRBD is not brought online after crm resource online in Ubuntu 14.04

2015-01-03 Thread Vladislav Bogdanov
03.01.2015 20:35, Dmitry Koterov wrote: Hello. Ubuntu 14.04, corosync 2.3.3, pacemaker 1.1.10. The cluster consists of 2 nodes (node1 and node2), when I run crm node standby node2 and then, in a minute, crm node online node2, DRBD secondary on node2 does not start. Logs say that drbdadm -c

Re: [Pacemaker] Fencing of bare-metal remote nodes

2014-12-11 Thread Vladislav Bogdanov
25.11.2014 23:41, David Vossel wrote: - Original Message - Hi! is subj implemented? Trying echo c > /proc/sysrq-trigger on remote nodes and no fencing occurs. Yes, fencing remote-nodes works. Are you certain your fencing devices can handle fencing the remote-node? Fencing a

Re: [Pacemaker] Avoid monitoring of resources on nodes

2014-11-26 Thread Vladislav Bogdanov
26.11.2014 14:21, Daniel Dehennin wrote: Daniel Dehennin daniel.dehen...@baby-gnu.org writes: I'll try find how to make the change directly in XML. Ok, looking at git history this feature seems only available on master branch and not yet released. I do not have that feature on my

Re: [Pacemaker] Fencing of bare-metal remote nodes

2014-11-26 Thread Vladislav Bogdanov
26.11.2014 18:36, David Vossel wrote: - Original Message - 25.11.2014 23:41, David Vossel wrote: - Original Message - Hi! is subj implemented? Trying echo c > /proc/sysrq-trigger on remote nodes and no fencing occurs. Yes, fencing remote-nodes works. Are you certain

Re: [Pacemaker] [Linux-HA] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-26 Thread Vladislav Bogdanov
25.11.2014 12:54, Lars Marowsky-Bree wrote:... OK, let's switch tracks a bit. What *topics* do we actually have? Can we fill two days? Where would we want to collect them? Just my 2c. - It would be interesting to get some bird-view information on what C APIs corosync and pacemaker currently

Re: [Pacemaker] Suicide fencing and watchdog questions

2014-11-26 Thread Vladislav Bogdanov
27.11.2014 03:43, Andrew Beekhof wrote: On 25 Nov 2014, at 10:37 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, Is there any information how watchdog integration is intended to work? What are currently-evaluated use-cases for that? It seems to be forcibly disabled if SBD

[Pacemaker] Fencing of bare-metal remote nodes

2014-11-25 Thread Vladislav Bogdanov
Hi! is subj implemented? Trying echo c > /proc/sysrq-trigger on remote nodes and no fencing occurs. Best, Vladislav

[Pacemaker] Suicide fencing and watchdog questions

2014-11-25 Thread Vladislav Bogdanov
Hi, Is there any information how watchdog integration is intended to work? What are currently-evaluated use-cases for that? It seems to be forcibly disabled if SBD is not detected... Also, is there any way to make node (in one-node cluster ;) ) to suicide if it detects fencing is required?

Re: [Pacemaker] Avoid monitoring of resources on nodes

2014-11-25 Thread Vladislav Bogdanov
25.11.2014 23:36, David Vossel wrote: - Original Message - Daniel Dehennin daniel.dehen...@baby-gnu.org writes: Hello, Hello, I have a 4 nodes cluster and some resources are only installed on 2 of them. I set cluster asymmetry and infinity location: primitive Mysqld

Re: [Pacemaker] Globally-unique clone cleanup on remote nodes

2014-11-19 Thread Vladislav Bogdanov
20.11.2014 01:57, Andrew Beekhof wrote: On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster consisting of one cluster node and 3 remote bare-metal nodes. When I run

Re: [Pacemaker] Globally-unique clone cleanup on remote nodes

2014-11-19 Thread Vladislav Bogdanov
20.11.2014 09:25, Andrew Beekhof wrote: On 20 Nov 2014, at 5:12 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.11.2014 01:57, Andrew Beekhof wrote: On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, just an observation. I have a globally-unique

[Pacemaker] Globally-unique clone cleanup on remote nodes

2014-11-18 Thread Vladislav Bogdanov
Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster consisting of one cluster node and 3 remote bare-metal nodes. When I run crm_resource -C -r 'g_u_clone' (crmsh does that for me), it writes that it expects to receive 50 answers (although it writes 4 lines

[Pacemaker] resource-discovery question

2014-11-12 Thread Vladislav Bogdanov
Hi David, all, I'm trying to get resource-discovery=never working with cd7c9ab, but still get "Not installed" probe failures from nodes which do not have corresponding resource agents installed. The only difference in my location constraints compared to what is committed in #589 is that they

Re: [Pacemaker] resource-discovery question

2014-11-12 Thread Vladislav Bogdanov
12.11.2014 22:04, Vladislav Bogdanov wrote: Hi David, all, I'm trying to get resource-discovery=never working with cd7c9ab, but still get "Not installed" probe failures from nodes which do not have corresponding resource agents installed. The only difference in my location constraints

Re: [Pacemaker] resource-discovery question

2014-11-12 Thread Vladislav Bogdanov
12.11.2014 22:57, David Vossel wrote: - Original Message - 12.11.2014 22:04, Vladislav Bogdanov wrote: Hi David, all, I'm trying to get resource-discovery=never working with cd7c9ab, but still get "Not installed" probe failures from nodes which do not have corresponding

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-11-11 Thread Vladislav Bogdanov
11.11.2014 07:27, Sihan Goi wrote: Hi, DocumentRoot is still set to /var/www/html ls -al /var/www/html shows different things on the 2 nodes node01: total 28 drwxr-xr-x. 3 root root 4096 Nov 11 12:25 . drwxr-xr-x. 6 root root 4096 Jul 23 22:18 .. -rw-r--r--. 1 root root 50 Oct 28

Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-23 Thread Vladislav Bogdanov
23.10.2014 22:39, David Vossel wrote: - Original Message - 21.10.2014 06:25, Vladislav Bogdanov wrote: 21.10.2014 05:15, Andrew Beekhof wrote: On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, It seems like #kind was introduced

Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-21 Thread Vladislav Bogdanov
21.10.2014 06:25, Vladislav Bogdanov wrote: 21.10.2014 05:15, Andrew Beekhof wrote: On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against

[Pacemaker] #kind eq container matches bare-metal nodes

2014-10-20 Thread Vladislav Bogdanov
Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against cluster and container. Bare-metal remote nodes match container (they are remote), but strictly speaking they are not containers. Could/should that attribute be extended

Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-20 Thread Vladislav Bogdanov
21.10.2014 05:15, Andrew Beekhof wrote: On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against cluster and container. Bare-metal remote nodes

Re: [Pacemaker] Corosync and Pacemaker Hangs

2014-09-14 Thread Vladislav Bogdanov
: [ node01 node02 ] Thank you, Norbert On Fri, Sep 12, 2014 at 12:06 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 12.09.2014 05:00, Norbert Kiam Maclang wrote: Hi, After adding resource level fencing on drbd, I still ended up having

Re: [Pacemaker] Corosync and Pacemaker Hangs

2014-09-11 Thread Vladislav Bogdanov
11.09.2014 05:57, Norbert Kiam Maclang wrote: Is this something to do with quorum? But I already set You'd need to configure fencing at the drbd resources level. http://www.drbd.org/users-guide-emb/s-pacemaker-fencing.html#s-pacemaker-fencing-cib property no-quorum-policy=ignore \

Re: [Pacemaker] Corosync and Pacemaker Hangs

2014-09-11 Thread Vladislav Bogdanov
(restarting both nodes alternately - this always works). Will wait for a couple of hours after doing a failover test again (which always failed on my previous setup). Thank you! Kiam On Thu, Sep 11, 2014 at 2:14 PM, Vladislav Bogdanov bub...@hoster-ok.com

Re: [Pacemaker] Configuration recommandations for (very?) large cluster

2014-08-14 Thread Vladislav Bogdanov
14.08.2014 10:35, Andrew Beekhof wrote: ... The load from the crmd is mostly from talking to the lrmd, which is dependant on resource placement rather than being (or not being) the DC. I've seen the different picture with 1024 unique clone instances. crmd's CPU load on DC is much higher

Re: [Pacemaker] Configuration recommandations for (very?) large cluster

2014-08-13 Thread Vladislav Bogdanov
14.08.2014 05:24, Andrew Beekhof wrote: On 14 Aug 2014, at 12:05 am, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Aug 13, 2014 at 10:33:55AM +1000, Andrew Beekhof wrote: On 13 Aug 2014, at 2:02 am, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: On

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-25 Thread Vladislav Bogdanov
25.07.2014 02:20, Andrew Beekhof wrote: ... On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue I am facing is that I am starting the pacemaker service remotely from a wrapper and thus pacemakerd dies when the wrapper exits. nohup solves the

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-23 Thread Vladislav Bogdanov
24.07.2014 03:39, Andrew Beekhof wrote: On 23 Jul 2014, at 2:46 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 23.07.2014 05:56, Andrew Beekhof wrote: On 21 Jul 2014, at 3:45 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-22 Thread Vladislav Bogdanov
23.07.2014 05:56, Andrew Beekhof wrote: On 21 Jul 2014, at 3:45 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014, at 2:50 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014

Re: [Pacemaker] Managing big number of globally-unique clone instances

2014-07-21 Thread Vladislav Bogdanov
21.07.2014 13:37, Andrew Beekhof wrote: On 21 Jul 2014, at 3:09 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:21, Andrew Beekhof wrote: On 18 Jul 2014, at 5:16 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I have a task which seems to be easily

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-20 Thread Vladislav Bogdanov
21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-20 Thread Vladislav Bogdanov
21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014, at 2:50 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof

[Pacemaker] Managing big number of globally-unique clone instances

2014-07-18 Thread Vladislav Bogdanov
Hi Andrew, all, I have a task which seems to be easily solvable with the use of globally-unique clone: start huge number of specific virtual machines to provide a load to a connection multiplexer. I decided to look how pacemaker behaves in such setup with Dummy resource agent, and found that

Re: [Pacemaker] Pacemaker Digest, Vol 80, Issue 32

2014-07-13 Thread Vladislav Bogdanov
digest... Today's Topics: 1. Re: crm resourse (lsb:apache2) not starting (Michael Monette) 2. Re: crm resourse (lsb:apache2) not starting (W Forum W) 3. Re: crm resourse (lsb:apache2) not starting (Vladislav Bogdanov

Re: [Pacemaker] crm resourse (lsb:apache2) not starting

2014-07-11 Thread Vladislav Bogdanov
08.07.2014 16:15, W Forum W wrote: Hi, I have a two node cluster with a DRBD, heartbeat and pacemaker (on Debian Wheezy) The cluster is working fine. 2 DRBD resources, Shared IP, 2 File systems and a postgresql database start, stop, migrate, ... correctly. Now the problem is with the

Re: [Pacemaker] 2-node active/active cluster serving virtual machines (KVM via libvirt)

2014-06-30 Thread Vladislav Bogdanov
30.06.2014 15:45, Tony Atkinson wrote: Hi all, I'd really appreciate a helping hand here I'm so close to getting what I need, but just seem to be falling short at the last hurdle. 2-node active/active cluster serving virtual machines (KVM via libvirt) Virtual machines need to be able to

Re: [Pacemaker] votequorum for 2 node cluster

2014-06-11 Thread Vladislav Bogdanov
11.06.2014 16:26, Kostiantyn Ponomarenko wrote: Hi guys, I am trying to deal somehow with split brain situation in 2 node cluster using votequorum. Here is a quorum section in my corosync.conf: provider: corosync_votequorum expected_votes: 2 Just a side note, not an answer to your

Re: [Pacemaker] location anti affinity rule not working

2014-05-01 Thread Vladislav Bogdanov
02.05.2014 03:47, ESWAR RAO wrote: Hi All, I am working on 3 node cluster with corosync+pacemaker on RHEL 6. Even though I applied anti-affinity the app is still trying to run on 3rd node. # rpm -qa|grep corosync corosync-1.4.1-17.el6_5.1.x86_64 corosynclib-1.4.1-17.el6_5.1.x86_64 # rpm

Re: [Pacemaker] The next release

2014-04-27 Thread Vladislav Bogdanov
23.04.2014 13:29, Andrew Beekhof wrote: I'd like to get a release out in the next month or so, so expect a release candidate RealSoonNow(tm). The last item on my personal todo list is updating the ACL syntax to make the terms a little more generic (since it won't just be users anymore).

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-12 Thread Vladislav Bogdanov
12.03.2014 00:40, Andrew Beekhof wrote: On 11 Mar 2014, at 6:23 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Vladislav Bogdanov
07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Vladislav Bogdanov
12.03.2014 00:37, Andrew Beekhof wrote: ... I'm somewhat confused at this point if crmsh is using --replace, then why is it doing diff calculations? Or are replace operations only for the load operation? It uses one of two methods depending on pacemaker version.

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-06 Thread Vladislav Bogdanov
18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following combinations. Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. stopped the

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-06 Thread Vladislav Bogdanov
07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following

Re: [Pacemaker] Migrating resources on custom conditions

2014-02-21 Thread Vladislav Bogdanov
21.02.2014 13:45, Lars Marowsky-Bree wrote: On 2014-02-21T13:02:23, Vladislav Bogdanov bub...@hoster-ok.com wrote: It could be nice feature to have kind of general SLA concept (it could be very similar to the utilization one from the resource configuration perspective), so resources try

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-19 Thread Vladislav Bogdanov
Hi Fabio, 19.02.2014 12:32, Fabio M. Di Nitto wrote: On 2/19/2014 9:39 AM, Fabio M. Di Nitto wrote: On 2/18/2014 9:24 PM, Asgaroth wrote: Just a guess. Do you have startup fencing enabled in dlm-controld (I actually do not remember if it is applicable to cman's version, but it exists in

Re: [Pacemaker] Order of resources in a group and crm_diff

2014-02-18 Thread Vladislav Bogdanov
29.01.2014 08:44, Andrew Beekhof wrote: ... That's a known deficiency in the v1 diff format (and why we need costly digests to detect ordering changes). Happily .12 will have a new and improved diff format that will handle this correctly. Does your recent cib-performance rewrite address

Re: [Pacemaker] Manual resource reload

2014-02-18 Thread Vladislav Bogdanov
17.02.2014 04:11, Andrew Beekhof wrote: On 11 Feb 2014, at 2:49 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, cannot find anywhere (am I blind?), is it possible to manually inject 'reload' op for a given resource? Background for this is if some configuration files are edited

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-02-18 Thread Vladislav Bogdanov
18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following combinations. Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. stopped the

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-18 Thread Vladislav Bogdanov
18.02.2014 19:49, Asgaroth wrote: i sometimes have the same situation. sleep ~30 seconds between startup cman and clvmd helps a lot. Thanks for the tip, I just tried this (added sleep 30 in the start section of case statement in cman script, but this did not resolve the issue for me), for

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-18 Thread Vladislav Bogdanov
18.02.2014 23:01, David Vossel wrote: - Original Message - From: Vladislav Bogdanov bub...@hoster-ok.com To: pacemaker@oss.clusterlabs.org Sent: Tuesday, February 18, 2014 1:02:09 PM Subject: Re: [Pacemaker] node1 fencing itself after node2 being fenced 18.02.2014 19:49

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-10 Thread Vladislav Bogdanov
10.02.2014 14:46, Asgaroth wrote: Hi All, OK, here is my testing using cman/clvmd enabled on system startup and clvmd outside of pacemaker control. I still seem to be getting the clvmd hang/fail situation even when running outside of pacemaker control, I cannot see off-hand where the

[Pacemaker] Manual resource reload

2014-02-10 Thread Vladislav Bogdanov
Hi, cannot find anywhere (am I blind?), is it possible to manually inject 'reload' op for a given resource? Background for this is if some configuration files are edited, and resource-agent (or LSB script) supports 'reload' operation, then it would be nice to have a way to request that reload to

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-10 Thread Vladislav Bogdanov
10.02.2014 18:54, Asgaroth wrote: -Original Message- From: Vladislav Bogdanov [mailto:bub...@hoster-ok.com] Sent: 10 February 2014 13:27 To: pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] node1 fencing itself after node2 being fenced I cannot really recall

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-07 Thread Vladislav Bogdanov
07.02.2014 14:22, Asgaroth wrote: ... Thanks for the explanation, this is interesting for me as I need a volume manager in the cluster to manage the shared file systems in case I need to resize for some reason. I think I may be coming up against something similar now that I am testing cman

Re: [Pacemaker] node1 fencing itself after node2 being fenced

2014-02-05 Thread Vladislav Bogdanov
05.02.2014 20:10, Asgaroth wrote: On 05/02/2014 16:12, Digimer wrote: You say it's working now? If so, excellent. If you have any troubles though, please share your cluster.conf and 'pcs config show'. Hi Digimer, no, it's not working as I expect it to when I test a crash of node 2, clvmd

[Pacemaker] Order of resources in a group and crm_diff

2014-01-28 Thread Vladislav Bogdanov
Hi all, Just discovered, that when I add resource to a middle of (running) group, it is added to the end. I mean, if I update following (crmsh syntax) group dhcp-server vip-10-5-200-244 dhcpd with group dhcp-server vip-10-5-200-244 vip-10-5-201-244 dhcpd with 'crm configure load update',

[Pacemaker] [PATCH] Downgrade probe log message for promoted ms resources

2014-01-12 Thread Vladislav Bogdanov
Hi, This is the only one message I see in logs in otherwise static cluster (with rechecks enabled), probably it is good idea to downgrade it to info. diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c index 97e114f..6dbcf19 100644 --- a/lib/pengine/unpack.c +++ b/lib/pengine/unpack.c @@

Re: [Pacemaker] again return code, now in crm_attribute

2014-01-09 Thread Vladislav Bogdanov
10.01.2014 08:00, Andrew Beekhof wrote: On 10 Jan 2014, at 3:51 pm, Andrey Groshev gre...@yandex.ru wrote: 10.01.2014, 03:28, Andrew Beekhof and...@beekhof.net: On 9 Jan 2014, at 4:44 pm, Andrey Groshev gre...@yandex.ru wrote: 09.01.2014, 02:39, Andrew Beekhof and...@beekhof.net: On

Re: [Pacemaker] dc-version and cluster-infrastructure cluster options may be lost when editing

2014-01-02 Thread Vladislav Bogdanov
02.01.2014 12:53, Kristoffer Grönlund wrote: On Tue, 24 Dec 2013 14:41:20 +0300 Vladislav Bogdanov bub...@hoster-ok.com wrote: I would expect that only one option is modified, but crmsh intend to remove all others. May be it is possible to fix it by one-line crmsh patch? Kristoffer

Re: [Pacemaker] Unable to start cloned apache service on node 2

2014-01-02 Thread Vladislav Bogdanov
03.01.2014 07:46, Digimer wrote: Hi all, While trying to test to answer questions from my previous thread, I hit another problem. Since posting the first thread, I moved on in the Cluster from Scratch tutorial and got to the point where I was running Active/Active. Here I have a

Re: [Pacemaker] Odd issues with apache on RHEL 7 beta

2013-12-26 Thread Vladislav Bogdanov
27.12.2013 09:34, Digimer wrote: ... 3. I know I mentioned this on IRC before, but I thought I should mention it here again. In the pcs CfS, it shows to set: <Location /server-status> SetHandler server-status Order deny,allow Deny from all Allow from 127.0.0.1 </Location>

Re: [Pacemaker] Odd issues with apache on RHEL 7 beta

2013-12-26 Thread Vladislav Bogdanov
27.12.2013 09:45, Digimer wrote: On 27/12/13 01:44 AM, Vladislav Bogdanov wrote: 27.12.2013 09:34, Digimer wrote: ... 3. I know I mentioned this on IRC before, but I thought I should mention it here again. In the pcs CfS, it shows to set: <Location /server-status> SetHandler server

Re: [Pacemaker] dc-version and cluster-infrastructure cluster options may be lost when editing

2013-12-24 Thread Vladislav Bogdanov
20.12.2013 14:15, Vladislav Bogdanov wrote: Hi all, I just discovered that the options named in the subject can be removed by an admin action (at least with cibadmin --patch). This is notably annoying when updating the CIB with 'crm configure load update' (which uses crm_diff to create the patch) when

[Pacemaker] dc-version and cluster-infrastructure cluster options may be lost when editing

2013-12-20 Thread Vladislav Bogdanov
Hi all, I just discovered that the options named in the subject can be removed by an admin action (at least with cibadmin --patch). This is notably annoying when updating the CIB with 'crm configure load update' (which uses crm_diff to create the patch) while updating other cluster options. While most (all?)

Re: [Pacemaker] crmsh: New syntax for location constraints, suggestions / comments

2013-12-18 Thread Vladislav Bogdanov
18.12.2013 23:21, Rainer Brestan wrote: Hi Lars, maybe a little off topic. What I really miss in crmsh is the possibility to specify resource parameters that differ between nodes, so that the parameter is node-dependent. This already exists in the XML syntax; Andrew gave me the hint as

Re: [Pacemaker] crmsh: New syntax for location constraints, suggestions / comments

2013-12-13 Thread Vladislav Bogdanov
13.12.2013 15:39, Lars Marowsky-Bree wrote: On 2013-12-13T13:11:30, Rainer Brestan rainer.bres...@gmx.net wrote: Please do not merge colocation and order together in a way that only none or both is present. This was never the plan. The idea was to offer an additional construct that

Re: [Pacemaker] Reg. clone and order attributes

2013-12-08 Thread Vladislav Bogdanov
in latter two. But I couldn't understand why pacemaker is giving errors with this type of configuration. Thanks Eswar On Sat, Dec 7, 2013 at 2:47 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 06.12.2013 14:07, ESWAR RAO wrote: Hi Vladislav

Re: [Pacemaker] Pacemaker 1.1.10 and pacemaker-remote

2013-12-07 Thread Vladislav Bogdanov
06.12.2013 19:13, James Oakley wrote: On Thursday, December 5, 2013 9:55:47 PM Vladislav Bogdanov bub...@hoster-ok.com wrote: Does 'db0' resolve to a correct IP address? If not, then you probably want to either fix that or use the remote-addr option as well. I saw that you can ping/ssh

Re: [Pacemaker] Reg. clone and order attributes

2013-12-07 Thread Vladislav Bogdanov
...@gmail.com wrote: Thanks Vladislav. I will work on that. Thanks Eswar On Fri, Dec 6, 2013 at 11:05 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 06.12.2013 07:58, ESWAR RAO wrote: Hi

Re: [Pacemaker] Pacemaker 1.1.10 and pacemaker-remote

2013-12-06 Thread Vladislav Bogdanov
06.12.2013 11:41, Lars Marowsky-Bree wrote: On 2013-12-06T08:55:47, Vladislav Bogdanov bub...@hoster-ok.com wrote: BTW, the pacemaker CIB accepts any meta attributes (and that is a very convenient way for me to store some 'meta' information), while crmsh limits them to a pre-defined list. While

Re: [Pacemaker] Reg. cone and order attributes

2013-12-05 Thread Vladislav Bogdanov
06.12.2013 07:58, ESWAR RAO wrote: Hi All, Can someone help me with the configuration below? I have a 3-node HB setup (node1, node2, node3) which runs HB+pacemaker. I have 3 apps, dummy1, dummy2 and dummy3, which need to run on only 2 of the 3 nodes. By using the below

Re: [Pacemaker] Pacemaker 1.1.10 and pacemaker-remote

2013-12-05 Thread Vladislav Bogdanov
05.12.2013 19:57, James Oakley wrote: I have a Pacemaker 1.1.10 cluster running on openSUSE 13.1 and I am trying to get pacemaker-remote working so I can manage resources in LXC containers. I have pacemaker_remoted running in the containers. However, I can't seem to get crm configured to talk

Re: [Pacemaker] Where the heck is Beekhof?

2013-11-28 Thread Vladislav Bogdanov
28.11.2013 04:04, Andrew Beekhof wrote: If you find yourself asking $subject at some point in the next couple of months, the answer is that I'm taking leave to look after our new son (Lawson Tiberius Beekhof) who was born on Tuesday. I will be dropping in occasionally to see how things are

Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node

2013-11-12 Thread Vladislav Bogdanov
12.11.2013 09:56, Vladislav Bogdanov wrote: ... Ah, then in_ccm will be set to false only when corosync (2) is stopped on a node, not when pacemaker is stopped? Thus, current drbd agent/fencing logic does not (well) support just stop of pacemaker in my use-case, messaging layer should

Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node

2013-11-11 Thread Vladislav Bogdanov
11.11.2013 09:00, Vladislav Bogdanov wrote: ... Looking at crm-fence-peer.sh script, it would determine peer state as offline immediately if node state (all of) * doesn't contain expected tag or has it set to down * has in_ccm tag set to false * has crmd tag set to anything except online

Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node

2013-11-11 Thread Vladislav Bogdanov
12.11.2013 03:05, Andrew Beekhof wrote: On 12 Nov 2013, at 10:29 am, Andrew Beekhof and...@beekhof.net wrote: On 12 Nov 2013, at 2:46 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 11.11.2013 09:00, Vladislav Bogdanov wrote: ... Looking at crm-fence-peer.sh script, it would

Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node

2013-11-11 Thread Vladislav Bogdanov
12.11.2013 03:15, Andrew Beekhof wrote: Can you try with these two patches please? + Andrew Beekhof (4 seconds ago) fec946a: Fix: crmd: When the DC gracefully shuts down, record the new expected state into the cib (HEAD, master) + Andrew Beekhof (10 seconds ago) 740122a: Fix: crmd: When a

Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node

2013-11-10 Thread Vladislav Bogdanov
11.11.2013 02:30, Andrew Beekhof wrote: On 5 Nov 2013, at 2:22 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, I just found an interesting fact; I don't know whether it is a bug or not. When doing 'service pacemaker stop' on a node which has a drbd resource promoted
