Re: [Pacemaker] RFC: Any interesting in 2.0.0 betas?

2012-10-26 Thread Vladislav Bogdanov
26.10.2012 12:43, Andrew Beekhof wrote: ... Maybe also set it forcibly to uname if uname contains the full lexeme found in the DNS name? Run that past me again? I mean that if the IP address resolves to an FQDN, and that FQDN begins with what the uname call returns (so both node itself and DNS agree on a node
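
The check proposed in this message can be sketched roughly as below. This is only an illustration, assuming that "full lexem" means a whole dot-separated label at the start of the FQDN; the function name dns_agrees_with_uname is hypothetical and is not part of pacemaker.

```python
def dns_agrees_with_uname(fqdn: str, uname: str) -> bool:
    """Return True if the FQDN begins with uname on a label boundary.

    Assumption: a match must cover whole DNS labels, so "node1" matches
    "node1.example.com" but not "node10.example.com".
    """
    fqdn = fqdn.rstrip(".").lower()
    uname = uname.rstrip(".").lower()
    if not uname:
        return False
    # Either the names are identical, or uname is followed by a dot.
    return fqdn == uname or fqdn.startswith(uname + ".")
```

Comparing on a label boundary rather than with a plain prefix test avoids treating node1 and node10 as the same host.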

Re: [Pacemaker] RFC: Any interesting in 2.0.0 betas?

2012-10-26 Thread Vladislav Bogdanov
26.10.2012 13:38, Vladislav Bogdanov wrote: 26.10.2012 12:43, Andrew Beekhof wrote: ... Maybe also set it forcibly to uname if uname contains the full lexeme found in the DNS name? Run that past me again? I mean that if the IP address resolves to an FQDN, and that FQDN begins with what the uname call

Re: [Pacemaker] RFC: Any interesting in 2.0.0 betas?

2012-10-25 Thread Vladislav Bogdanov
26.10.2012 04:06, Andrew Beekhof wrote: On Thu, Oct 25, 2012 at 7:42 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 25.10.2012 07:50, Andrew Beekhof wrote: On Thu, Oct 25, 2012 at 3:08 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 25.10.2012 04:47, Andrew Beekhof wrote: Does anyone

Re: [Pacemaker] RFC: Any interesting in 2.0.0 betas?

2012-10-24 Thread Vladislav Bogdanov
25.10.2012 04:47, Andrew Beekhof wrote: Does anyone out there have the capacity and interest to test betas of 2.0.0 if I release them? Sure. If so, for what distro and version? Git tag would be enough for me. (Anyone looking for heartbeat to be supported in 2.0.0 would be highly

Re: [Pacemaker] cloned resources show as stopped

2012-10-06 Thread Vladislav Bogdanov
06.10.2012 14:49, James Harper wrote: This is just a cosmetic thing, but I have a cloned resource that shows up like this: Clone Set: c_ping_X [p_ping_X] Started: [ node1 node2 ] Stopped: [ p_ping_X:0 p_ping_X:1 p_ping_X:4 ] That's correct in that the location restriction on

Re: [Pacemaker] crmsh resource update issue

2012-10-03 Thread Vladislav Bogdanov
03.10.2012 08:57, Vladislav Bogdanov wrote: 03.10.2012 01:53, Andrew Beekhof wrote: ... Do you mean get the LSB and/or systemd metadata from the new lrmd? That should work already. Is there a utility which implements the new lrm API? Answering myself: lrmd_test. It allows to do

Re: [Pacemaker] crmsh resource update issue

2012-10-03 Thread Vladislav Bogdanov
03.10.2012 20:11, Dejan Muhamedagic wrote: On Wed, Oct 03, 2012 at 11:57:47AM +0300, Vladislav Bogdanov wrote: 03.10.2012 11:48, Dejan Muhamedagic wrote: On Wed, Oct 03, 2012 at 09:21:41AM +0300, Vladislav Bogdanov wrote: 03.10.2012 08:57, Vladislav Bogdanov wrote: 03.10.2012 01:53, Andrew

Re: [Pacemaker] crmsh resource update issue

2012-10-02 Thread Vladislav Bogdanov
Hi Dejan, 28.09.2012 14:07, Dejan Muhamedagic wrote: ... OK. Silly me. Sorry for the non-vi users ;-) Thanks for the patch! Dejan One more patch to fix lrmadmin dysfunction with 1.1.8. It is a little bit intrusive, but I'm definitely not a python coder :) You can just take it as a

Re: [Pacemaker] crmsh resource update issue

2012-10-02 Thread Vladislav Bogdanov
02.10.2012 15:41, Vladislav Bogdanov wrote: ... for req_op in self.required_ops: if req_op not in n_ops: -n_ops[req_op] = {} +if not (self.ra_class == stonith and op in (start, stop)): s/ op/ req_op/ +n_ops[req_op
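
The corrected loop from this patch (with the s/ op/ req_op/ fix applied) might look like the sketch below. It is a standalone illustration, not the crmsh code: the function name fill_required_ops and its parameters are hypothetical, the quoting of "stonith", "start" and "stop" is assumed to have been lost by the archive, and the truncated last line is assumed to assign an empty dict as in the removed line.

```python
def fill_required_ops(n_ops, required_ops, ra_class):
    """Add missing required operations, except start/stop for stonith.

    n_ops maps operation names to their attribute dicts and is
    modified in place; it is also returned for convenience.
    """
    for req_op in required_ops:
        if req_op not in n_ops:
            # Stonith resources do not get implicit start/stop ops.
            if not (ra_class == "stonith" and req_op in ("start", "stop")):
                n_ops[req_op] = {}
    return n_ops
```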

Re: [Pacemaker] crmsh resource update issue

2012-09-28 Thread Vladislav Bogdanov
27.09.2012 20:03, Dejan Muhamedagic wrote: On Thu, Sep 27, 2012 at 05:02:24PM +0200, Dejan Muhamedagic wrote: Hi Vladimir, On Thu, Sep 27, 2012 at 10:22:43AM +0300, Vladislav Bogdanov wrote: Hi Dejan, list, It looks like shell 1.2.0 (I use b58a3398bf11621fe7811380f00245dac52d34c6

[Pacemaker] crmsh resource update issue

2012-09-27 Thread Vladislav Bogdanov
Hi Dejan, list, It looks like shell 1.2.0 (I use b58a3398bf11621fe7811380f00245dac52d34c6 with patch you sent recently) incorrectly replaces the whole cib with just resources section, so all node state sections are dropped. Logs and code analysis show that crm calls 'cibadmin -p -R'

Re: [Pacemaker] Pacemaker 1.1.8 is out now

2012-09-25 Thread Vladislav Bogdanov
20.09.2012 13:25, Andrew Beekhof wrote: This has been coming for a while now, the final hold-up was some stonith interactions in the presence of multiple clients that while strictly correct, wasn't really good enough. 1.1.8 seems to be in a very good shape. Didn't you think about making it a

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-14 Thread Vladislav Bogdanov
13.09.2012 15:18, Vladislav Bogdanov wrote: ... and now it runs on my testing cluster. Ipc-related memory problems seem to be completely fixed now, processes own memory (RES-SHR in terms of htop) does not grow any longer (after 40 minutes). Although I see that both RES and SHR counters
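
The "RES-SHR" figure mentioned here can be approximated without htop. The sketch below assumes a Linux /proc filesystem and assumes that htop's RES and SHR columns correspond to the second and third fields of /proc/<pid>/statm (both counted in pages); the helper name own_memory_kib is made up for illustration.

```python
import os


def own_memory_kib(statm_line, page_size=None):
    """Return (resident - shared) in KiB from one /proc/<pid>/statm line."""
    if page_size is None:
        # Page size of the running system, in bytes.
        page_size = os.sysconf("SC_PAGE_SIZE")
    fields = statm_line.split()
    resident_pages = int(fields[1])
    shared_pages = int(fields[2])
    return (resident_pages - shared_pages) * page_size // 1024
```

To sample a live process, read the line with open("/proc/%d/statm" % pid).read() and pass it in; logging the result periodically gives a growth series comparable to the one discussed in this thread.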

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-14 Thread Vladislav Bogdanov
14.09.2012 09:54, Vladislav Bogdanov wrote: 13.09.2012 15:18, Vladislav Bogdanov wrote: ... and now it runs on my testing cluster. Ipc-related memory problems seem to be completely fixed now, processes own memory (RES-SHR in terms of htop) does not grow any longer (after 40 minutes

[Pacemaker] crm shell and corosync2 node id

2012-09-13 Thread Vladislav Bogdanov
Hi Dejan, all, current crm shell (b58a3398bf11) can not parse node 'id' attribute when running on corosync 2: # crm configure show |grep xml INFO: object 1074005258 cannot be represented in the CLI notation INFO: object 1090782474 cannot be represented in the CLI notation INFO: object 1107559690

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-13 Thread Vladislav Bogdanov
13.09.2012 06:16, Andrew Beekhof wrote: On Thu, Sep 13, 2012 at 12:30 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 12.09.2012 10:35, Andrew Beekhof wrote: On Tue, Sep 11, 2012 at 6:14 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 07.09.2012 09:25, Vladislav Bogdanov wrote

Re: [Pacemaker] crm shell and corosync2 node id

2012-09-13 Thread Vladislav Bogdanov
13.09.2012 15:37, Dejan Muhamedagic wrote: Hi, On Thu, Sep 13, 2012 at 02:57:54PM +0300, Vladislav Bogdanov wrote: Hi Dejan, all, current crm shell (b58a3398bf11) can not parse node 'id' attribute when running on corosync 2: # crm configure show |grep xml INFO: object 1074005258 cannot

Re: [Pacemaker] crm shell and corosync2 node id

2012-09-13 Thread Vladislav Bogdanov
13.09.2012 15:42, Dejan Muhamedagic wrote: Hi again, On Thu, Sep 13, 2012 at 02:57:54PM +0300, Vladislav Bogdanov wrote: Hi Dejan, all, current crm shell (b58a3398bf11) can not parse node 'id' attribute when running on corosync 2: # crm configure show |grep xml INFO: object 1074005258

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-12 Thread Vladislav Bogdanov
12.09.2012 10:35, Andrew Beekhof wrote: On Tue, Sep 11, 2012 at 6:14 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 07.09.2012 09:25, Vladislav Bogdanov wrote: 06.09.2012 12:58, Vladislav Bogdanov wrote: ... lrmd seems not to clean up gio channels properly: I prefer to call

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-07 Thread Vladislav Bogdanov
06.09.2012 12:58, Vladislav Bogdanov wrote: ... lrmd seems not to clean up gio channels properly: I prefer to call g_io_channel_unref() right after g_io_add_watch_full() instead of doing so when deleting descriptor (g_source_remove() is enough there). So channel will be automatically freed after

Re: [Pacemaker] problem starting new instance of pacemaker (via corosync)

2012-09-07 Thread Vladislav Bogdanov
07.09.2012 18:28, John White wrote: An odd update to this. We run in a stateless environment (nodes are pxe booted and have NFS roots, etc). Trying the same install on a VM works just fine. I wonder if anyone has experience with pacemaker and stateless nodes. I run it with iso image loaded

[Pacemaker] pacemaker processes RSS growth

2012-09-06 Thread Vladislav Bogdanov
Hi, I noticed that some pacemaker processes grow during operation (commit 8535316). Running on top of corosync 2.0.1. I recorded RSS size (RES as htop reports) at intervals of ~18 hours. First column was recorded after ~1 hour of operation. Results are: pengine 23568 23780 crmd

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-06 Thread Vladislav Bogdanov
06.09.2012 10:19, Andrew Beekhof wrote: On Thu, Sep 6, 2012 at 5:14 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I noticed that some pacemaker processes grow during operation (commit 8535316). Running on top of corosync 2.0.1. I recorded RSS size (RES as htop reports

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-06 Thread Vladislav Bogdanov
06.09.2012 12:39, Andrew Beekhof wrote: On Thu, Sep 6, 2012 at 5:33 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 06.09.2012 10:19, Andrew Beekhof wrote: On Thu, Sep 6, 2012 at 5:14 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I noticed that some pacemaker processes grow

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-06 Thread Vladislav Bogdanov
06.09.2012 12:58, Vladislav Bogdanov wrote: 06.09.2012 12:39, Andrew Beekhof wrote: On Thu, Sep 6, 2012 at 5:33 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 06.09.2012 10:19, Andrew Beekhof wrote: On Thu, Sep 6, 2012 at 5:14 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I

Re: [Pacemaker] pacemaker processes RSS growth

2012-09-06 Thread Vladislav Bogdanov
07.09.2012 03:03, Andrew Beekhof wrote: ... cib shows tons of reachable memory in finished child processes. Not important but log is huge, so it is very hard to find real errors. I found two minor (very slow) leaks in cib too: ==1732== 80 bytes in 20 blocks are still reachable in loss

[Pacemaker] Two c72f5ca stonithd coredumps

2012-09-03 Thread Vladislav Bogdanov
Hi Andrew, all, as I wrote before, I caught two paths where stonithd (c72f5ca) dumps core. Here are gdb backtraces for them (sorry for posting them inline, I was requested to do that ASAP and I hope it is not yet too late for 1.1.8 ;) ). Some vars are optimized out, but I hope that doesn't

Re: [Pacemaker] Compilation problem in centos 6.3

2012-09-03 Thread Vladislav Bogdanov
04.09.2012 03:46, Keisuke MORI wrote: Hi, I've seen a similar problem. It was caused by an unseen escape sequence produced by the crm shell (readline library in particular) when TERM=xterm. Try export TERM=vt100 and rebuild it. Or grab the latest crm shell. Shouldn't it be listed as

Re: [Pacemaker] getnameinfo() vs uname()

2012-08-31 Thread Vladislav Bogdanov
31.08.2012 05:43, Andrew Beekhof wrote: On Wed, Aug 29, 2012 at 8:57 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 29.08.2012 13:33, Andrew Beekhof wrote: On Wed, Aug 29, 2012 at 4:22 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, It looks like pacemaker (current master

Re: [Pacemaker] getnameinfo() vs uname()

2012-08-29 Thread Vladislav Bogdanov
29.08.2012 13:33, Andrew Beekhof wrote: On Wed, Aug 29, 2012 at 4:22 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, It looks like pacemaker (current master) current master changes quite rapidly, could you be specific? c72f5ca does not always work nicely on top of corosync2

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-30 Thread Vladislav Bogdanov
30.07.2012 09:30, Andrew Beekhof wrote: On Mon, Jul 30, 2012 at 2:21 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 30.07.2012 02:39, Andrew Beekhof wrote: On Tue, Jul 24, 2012 at 2:25 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.07.2012 04:50, Andrew Beekhof wrote: On Tue, Jul

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-29 Thread Vladislav Bogdanov
30.07.2012 02:39, Andrew Beekhof wrote: On Tue, Jul 24, 2012 at 2:25 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.07.2012 04:50, Andrew Beekhof wrote: On Tue, Jul 24, 2012 at 5:38 AM, David Barchas d...@barchas.com wrote: On Monday, July 23, 2012 at 7:48 AM, David Barchas wrote

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-24 Thread Vladislav Bogdanov
24.07.2012 14:23, Vadym Chepkov wrote: On Jul 24, 2012, at 12:25 AM, Vladislav Bogdanov wrote: 24.07.2012 04:50, Andrew Beekhof wrote: On Tue, Jul 24, 2012 at 5:38 AM, David Barchas d...@barchas.com wrote: On Monday, July 23, 2012 at 7:48 AM, David Barchas wrote: Date: Mon, 23 Jul 2012

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-23 Thread Vladislav Bogdanov
23.07.2012 08:06, David Barchas wrote: Hello. I have been working on this for 3 days now, and must be so stressed out that I am being blinded to what is probably an obvious cause of this. In a word, HELP. setenforce 0 ? ___ Pacemaker mailing

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-23 Thread Vladislav Bogdanov
24.07.2012 04:50, Andrew Beekhof wrote: On Tue, Jul 24, 2012 at 5:38 AM, David Barchas d...@barchas.com wrote: On Monday, July 23, 2012 at 7:48 AM, David Barchas wrote: Date: Mon, 23 Jul 2012 14:15:27 +0300 From: Vladislav Bogdanov 23.07.2012 08:06, David Barchas wrote: Hello. I have

Re: [Pacemaker] resources not migrating when some are not runnable on one node, maybe because of groups or master/slave clones?

2012-06-18 Thread Vladislav Bogdanov
18.06.2012 16:39, Phil Frost wrote: I'm attempting to configure an NFS cluster, and I've observed that under some failure conditions, resources that depend on a failed resource simply stop, and no migration to another node is attempted, even though a manual migration demonstrates the other

Re: [Pacemaker] Advisory ordering and Cannot migrate

2012-06-01 Thread Vladislav Bogdanov
01.06.2012 20:15, David Vossel wrote: - Original Message - From: Vladislav Bogdanov bub...@hoster-ok.com To: pacemaker@oss.clusterlabs.org Sent: Tuesday, May 29, 2012 11:26:35 PM Subject: Re: [Pacemaker] Advisory ordering and Cannot migrate 30.05.2012 01:37, David Vossel wrote

[Pacemaker] Advisory ordering and Cannot migrate

2012-05-29 Thread Vladislav Bogdanov
Hi Andrew, David, all, It seems that advisory ordering is honored when pengine wants to move two advisory-ordered resources in one transition, and one of the resources (then) is migratable. I have advisory ordering configured for two resources, mgs and drbd-testfs-stacked: order

Re: [Pacemaker] Advisory ordering and Cannot migrate

2012-05-29 Thread Vladislav Bogdanov
29.05.2012 18:51, David Vossel wrote: - Original Message - From: Vladislav Bogdanov bub...@hoster-ok.com To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Sent: Tuesday, May 29, 2012 7:27:12 AM Subject: [Pacemaker] Advisory ordering and Cannot migrate Hi

Re: [Pacemaker] Advisory ordering and Cannot migrate

2012-05-29 Thread Vladislav Bogdanov
30.05.2012 01:37, David Vossel wrote: - Original Message - From: Vladislav Bogdanov bub...@hoster-ok.com To: pacemaker@oss.clusterlabs.org Sent: Tuesday, May 29, 2012 3:48:12 PM Subject: Re: [Pacemaker] Advisory ordering and Cannot migrate 29.05.2012 18:51, David Vossel wrote

Re: [Pacemaker] crm_mon on Node-2 shows both Node-1 Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

2012-04-20 Thread Vladislav Bogdanov
20.04.2012 03:09, Andrew Beekhof wrote: On Thu, Apr 19, 2012 at 11:51 PM, Dan Frincu df.clus...@gmail.com wrote: Hi, On Thu, Apr 19, 2012 at 3:56 PM, Parshvi parshvi...@gmail.com wrote: 1) What is the use of ssh without pass key between cluster nodes in pacemaker ? a. Use case: i. Two

Re: [Pacemaker] Convenience Groups - WAS Re: [Linux-HA] Unordered groups (was Re: Is 'resource_set' still experimental?)

2012-04-20 Thread Vladislav Bogdanov
20.04.2012 03:21, Andrew Beekhof wrote: On Fri, Apr 20, 2012 at 7:41 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 19.04.2012 20:48, David Vossel wrote: - Original Message - From: Alan Robertson al...@unix.sh To: pacemaker@oss.clusterlabs.org, Andrew Beekhof and...@beekhof.net

Re: [Pacemaker] Periodically appear non-existent nodes

2012-04-19 Thread Vladislav Bogdanov
19.04.2012 11:24, Andreas Kurz wrote: On 04/18/2012 11:46 PM, ruslan usifov wrote: 2012/4/18 Andreas Kurz andr...@hastexo.com mailto:andr...@hastexo.com On 04/17/2012 09:31 PM, ruslan usifov wrote: 2012/4/17 Proskurin Kirill k.prosku...@corp.mail.ru

Re: [Pacemaker] Convenience Groups - WAS Re: [Linux-HA] Unordered groups (was Re: Is 'resource_set' still experimental?)

2012-04-19 Thread Vladislav Bogdanov
19.04.2012 20:48, David Vossel wrote: - Original Message - From: Alan Robertson al...@unix.sh To: pacemaker@oss.clusterlabs.org, Andrew Beekhof and...@beekhof.net Cc: Dejan Muhamedagic de...@hello-penguin.com Sent: Thursday, April 19, 2012 10:22:48 AM Subject: [Pacemaker] Convenience

Re: [Pacemaker] Issue with ordering

2012-04-04 Thread Vladislav Bogdanov
30.03.2012 02:20, Andrew Beekhof wrote: On Thu, Mar 29, 2012 at 7:07 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm continuing experiments with lustre on stacked drbd, and see the following problem: I have one drbd resource (ms-drbd-testfs-mdt) stacked on top

Re: [Pacemaker] Migration of lower resource causes dependent resources to restart

2012-04-04 Thread Vladislav Bogdanov
04.04.2012 02:12, Andrew Beekhof wrote: On Fri, Mar 30, 2012 at 7:10 PM, Florian Haas flor...@hastexo.com wrote: On Thu, Mar 29, 2012 at 8:35 AM, Andrew Beekhof and...@beekhof.net wrote: On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all

[Pacemaker] Migration of lower resource causes dependent resources to restart

2012-03-29 Thread Vladislav Bogdanov
Hi Andrew, all, Pacemaker restarts resources when resource they depend on (ordering only, no colocation) is migrated. I mean that when I do crm resource migrate lustre, I get LogActions: Migrate lustre#011(Started lustre03-left - lustre04-left) LogActions: Restart mgs#011(Started lustre01-left)

Re: [Pacemaker] Migration of lower resource causes dependent resources to restart

2012-03-29 Thread Vladislav Bogdanov
29.03.2012 09:35, Andrew Beekhof wrote: On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, Pacemaker restarts resources when resource they depend on (ordering only, no colocation) is migrated. I mean that when I do crm resource migrate lustre, I

Re: [Pacemaker] Migration of lower resource causes dependent resources to restart

2012-03-29 Thread Vladislav Bogdanov
29.03.2012 09:43, Vladislav Bogdanov wrote: 29.03.2012 09:35, Andrew Beekhof wrote: On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, Pacemaker restarts resources when resource they depend on (ordering only, no colocation) is migrated. I mean

Re: [Pacemaker] Migration of lower resource causes dependent resources to restart

2012-03-29 Thread Vladislav Bogdanov
29.03.2012 10:07, Andrew Beekhof wrote: On Thu, Mar 29, 2012 at 5:43 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 29.03.2012 09:35, Andrew Beekhof wrote: On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, Pacemaker restarts resources when

[Pacemaker] Issue with ordering

2012-03-29 Thread Vladislav Bogdanov
Hi Andrew, all, I'm continuing experiments with lustre on stacked drbd, and see the following problem: I have one drbd resource (ms-drbd-testfs-mdt) stacked on top of another (ms-drbd-testfs-mdt-left), and have the following constraints between them: colocation

Re: [Pacemaker] Issue with ordering

2012-03-29 Thread Vladislav Bogdanov
Hi Florian, 29.03.2012 11:54, Florian Haas wrote: On Thu, Mar 29, 2012 at 10:07 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm continuing experiments with lustre on stacked drbd, and see following problem: At the risk of going off topic, can you explain *why* you

Re: [Pacemaker] offtopic scalable block-device

2012-03-16 Thread Vladislav Bogdanov
16.03.2012 12:13, ruslan usifov wrote: Hello, I am searching for a solution for a scalable block device (a disk that can grow as we add machines to the cluster). The only thing I found acceptable for my task is ceph + RBD, but ceph in my tests is very unstable (regular crashes of all its daemons) + has poor

Re: [Pacemaker] Migration atomicity

2012-03-16 Thread Vladislav Bogdanov
16.03.2012 05:46, Andrew Beekhof wrote: On Thu, Mar 15, 2012 at 3:22 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 15.03.2012 01:49, Andreas Kurz wrote: On 03/14/2012 08:40 AM, Vladislav Bogdanov wrote: Hi, I'm observing a little bit unintuitive behavior of migration logic when

[Pacemaker] Migration atomicity

2012-03-14 Thread Vladislav Bogdanov
Hi, I'm observing a little bit unintuitive behavior of migration logic when transition is aborted (due to CIB change) in the middle of the resource migration. That is: 1. nodea: migrate_to nodeb 2. transition abort 3. nodeb: stop 4. nodea: migrate_to nodec 5. nodec: migrate_from nodea (note: no

Re: [Pacemaker] Migration atomicity

2012-03-14 Thread Vladislav Bogdanov
15.03.2012 01:49, Andreas Kurz wrote: On 03/14/2012 08:40 AM, Vladislav Bogdanov wrote: Hi, I'm observing a little bit unintuitive behavior of migration logic when transition is aborted (due to CIB change) in the middle of the resource migration. That is: 1. nodea: migrate_to nodeb 2

Re: [Pacemaker] Problems with resource scaling

2012-02-28 Thread Vladislav Bogdanov
25.02.2012 06:35, Atif Faheem wrote: Hi. I have been experimenting with resource scalability in Pacemaker. I started with no resources, and attempted to configure and start a few hundred dummy resources (a dummy ocf script that does not load the CPU) on a cluster of 4 virtual machines using crm

Re: [Pacemaker] Advisory ordering on clones not working?

2012-02-23 Thread Vladislav Bogdanov
24.02.2012 02:26, Andrew Beekhof wrote: On Tue, Feb 21, 2012 at 3:12 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.02.2012 02:40, Andrew Beekhof wrote: On Mon, Feb 20, 2012 at 11:49 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.02.2012 14:36, Andrew Beekhof wrote: On Mon

Re: [Pacemaker] Upstart resources

2012-02-22 Thread Vladislav Bogdanov
22.02.2012 20:55, Ante Karamatic wrote: On 16.02.2012 05:23, Vladislav Bogdanov wrote: Newer versions of pacemaker and lrmd are able to deal with upstart resources via dbus. However I do not like this way, so please find the resource agent attached, which is able to manage an arbitrary upstart job

Re: [Pacemaker] Upstart resources

2012-02-22 Thread Vladislav Bogdanov
22.02.2012 22:44, Ante Karamatic wrote: On 22.02.2012 19:45, Vladislav Bogdanov wrote: I looked at that RAexec very early, just after it was committed, and I understand that it requires a running dbus daemon to operate. I prefer to simplify operation chains, so that is really not an option

Re: [Pacemaker] Advisory ordering on clones not working?

2012-02-20 Thread Vladislav Bogdanov
20.02.2012 14:36, Andrew Beekhof wrote: On Mon, Feb 20, 2012 at 10:26 PM, Adrian Fita adrian.f...@gmail.com wrote: Thanks, I figured it out by now. But the real problem I'm facing is explained in http://oss.clusterlabs.org/pipermail/pacemaker/2012-February/013124.html . Please also take a

Re: [Pacemaker] Advisory ordering on clones not working?

2012-02-20 Thread Vladislav Bogdanov
21.02.2012 02:40, Andrew Beekhof wrote: On Mon, Feb 20, 2012 at 11:49 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.02.2012 14:36, Andrew Beekhof wrote: On Mon, Feb 20, 2012 at 10:26 PM, Adrian Fita adrian.f...@gmail.com wrote: Thanks, I figured it out by now. But the real problem I'm

Re: [Pacemaker] Upstart resources

2012-02-15 Thread Vladislav Bogdanov
to libvirt ml). Best, Vladislav #!/bin/bash # # OCF resource agent which manages upstart jobs. # # Copyright (c) 2011 Vladislav Bogdanov bub...@hoster-ok.com # # OCF instance parameters: #OCF_RESKEY_job_name: name of upstart job #OCF_RESKEY_process_name: name of process # # Initialization

Re: [Pacemaker] cLVM stuck

2012-02-10 Thread Vladislav Bogdanov
Hi Karl, 10.02.2012 10:56, Karl Rößmann wrote: Quoting Andreas Kurz andr...@hastexo.com: Hello, On 02/09/2012 03:29 PM, Karl Rößmann wrote: Hi all, we run a three Node HA Cluster using cLVM and Xen. After installing some online updates node by node I was struggling with this problem a

Re: [Pacemaker] Proposed new stonith topology syntax

2012-02-06 Thread Vladislav Bogdanov
. Something else needs to have failed before fencing has even a chance to do so. Unless you put all the nodes on the same PDU... but that would be silly. On Mon, Feb 6, 2012 at 3:29 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 06.02.2012 01:55, Andrew Beekhof wrote: On Sat, Feb 4

Re: [Pacemaker] Proposed new stonith topology syntax

2012-02-05 Thread Vladislav Bogdanov
06.02.2012 01:55, Andrew Beekhof wrote: On Sat, Feb 4, 2012 at 5:50 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, Dejan, all, 25.01.2012 03:24, Andrew Beekhof wrote: [snip] If they're for the same host but different devices, then at most you'll get the commands sent

Re: [Pacemaker] don't want to restart clone resource

2012-02-01 Thread Vladislav Bogdanov
01.02.2012 20:57, Lars Ellenberg wrote: On Wed, Feb 01, 2012 at 03:43:55PM +0100, Andreas Kurz wrote: Hello, On 02/01/2012 10:39 AM, Fanghao Sha wrote: Hi Lars, Yes, you are right. But how to prevent the orphaned resources from stopping by default, please? crm configure property

[Pacemaker] current pacemaker tree and flatiron

2012-01-19 Thread Vladislav Bogdanov
Hi Andrew, all, corosync plugin in current github tree at beekhof/pacemaker does not load in flatiron due to two chunks in lib/ais/utils.c and lib/ais/utils.h which reference qb_log_from_external_source(). Best, Vladislav

Re: [Pacemaker] current pacemaker tree and flatiron

2012-01-19 Thread Vladislav Bogdanov
20, 2012 at 1:39 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, corosync plugin in current github tree at beekhof/pacemaker does not load in flatiron due to two chunks in lib/ais/utils.c and lib/ais/utils.h which reference qb_log_from_external_source(). Best, Vladislav

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-17 Thread Vladislav Bogdanov
17.01.2012 07:27, Andrew Beekhof wrote: On Tue, Jan 17, 2012 at 3:04 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 17.01.2012 04:01, Andrew Beekhof wrote: On Mon, Jan 16, 2012 at 5:45 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2012 09:20, Andrew Beekhof wrote: [snip

[Pacemaker] [PATCH] compile without libqb

2012-01-16 Thread Vladislav Bogdanov
Hi Andrew, pacemaker from your private repo does not compile without libqb again (b456827). The following patch fixes that. --- a/lib/common/utils.c 2012-01-16 09:23:36.0 + +++ b/lib/common/utils.c 2012-01-16 09:23:49.681899538 + @@ -674,13 +674,13 @@

[Pacemaker] rsc_ticket and 1.2 rng

2012-01-16 Thread Vladislav Bogdanov
Hi Andrew, is it intentional that the 1.2 schema, which is now the default, lacks rsc_ticket, which now not only works but is even well documented by SUSE? Vladislav

Re: [Pacemaker] rsc_ticket and 1.2 rng

2012-01-16 Thread Vladislav Bogdanov
16.01.2012 19:24, Dan Frincu wrote: Hi, On Mon, Jan 16, 2012 at 5:58 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, is it intentional that the 1.2 schema, which is now the default, lacks rsc_ticket, which now not only works but is even well documented by SUSE? Sorry to barge

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-16 Thread Vladislav Bogdanov
17.01.2012 04:01, Andrew Beekhof wrote: On Mon, Jan 16, 2012 at 5:45 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2012 09:20, Andrew Beekhof wrote: [snip] At the same time, stonith_admin -B succeeds. The main difference I see is st_opt_sync_call in a latter case. Will try

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-16 Thread Vladislav Bogdanov
17.01.2012 04:02, Andrew Beekhof wrote: On Mon, Jan 16, 2012 at 5:49 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2012 09:17, Andrew Beekhof wrote: [snip] At the same time, stonith_admin -B succeeds. The main difference I see is st_opt_sync_call in a latter case. Will try

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-15 Thread Vladislav Bogdanov
16.01.2012 09:20, Andrew Beekhof wrote: [snip] At the same time, stonith_admin -B succeeds. The main difference I see is st_opt_sync_call in the latter case. Will try to experiment with it. Yes!!! Now I see the following: Dec 19 11:53:34 vd01-a cluster-dlm: [2474]: info:

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-15 Thread Vladislav Bogdanov
16.01.2012 09:17, Andrew Beekhof wrote: [snip] At the same time, stonith_admin -B succeeds. The main difference I see is st_opt_sync_call in a latter case. Will try to experiment with it. /Shouldn't/ matter. It really looks like it matters. Can't discuss it at more depth though because of

Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets

2012-01-12 Thread Vladislav Bogdanov
12.01.2012 15:01, Florian Haas wrote: On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas flor...@hastexo.com wrote: Florian Haas (2): extra: add rsyslog configuration snippet extra: add logrotate configuration snippet configure.ac |4 +++ extra/Makefile.am

Re: [Pacemaker] Feature request: cleanup resource on primitive definition change

2011-12-20 Thread Vladislav Bogdanov
21.12.2011 06:21, Andrew Beekhof wrote: On Tue, Dec 13, 2011 at 11:32 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm now testing the latest changes in git, everything goes much cleaner than a week ago, I'll (hopefully) make a report later. One feature came to mind

Re: [Pacemaker] Feature request: cleanup resource on primitive definition change

2011-12-20 Thread Vladislav Bogdanov
21.12.2011 09:04, Andrew Beekhof wrote: On Wed, Dec 21, 2011 at 3:24 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.12.2011 06:21, Andrew Beekhof wrote: On Tue, Dec 13, 2011 at 11:32 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm now testing latest changes

Re: [Pacemaker] Feature request: cleanup resource on primitive definition change

2011-12-20 Thread Vladislav Bogdanov
21.12.2011 09:11, Rasto Levrinc wrote: On Wed, Dec 21, 2011 at 5:24 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.12.2011 06:21, Andrew Beekhof wrote: On Tue, Dec 13, 2011 at 11:32 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm now testing latest changes

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-12-19 Thread Vladislav Bogdanov
09.12.2011 08:44, Andrew Beekhof wrote: On Fri, Dec 9, 2011 at 3:16 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 09.12.2011 03:11, Andrew Beekhof wrote: On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, I investigated on my test cluster what

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-12-19 Thread Vladislav Bogdanov
19.12.2011 14:39, Vladislav Bogdanov wrote: 09.12.2011 08:44, Andrew Beekhof wrote: On Fri, Dec 9, 2011 at 3:16 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 09.12.2011 03:11, Andrew Beekhof wrote: On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-12-08 Thread Vladislav Bogdanov
09.12.2011 03:15, Andrew Beekhof wrote: On Thu, Nov 24, 2011 at 6:21 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.11.2011 08:49, Andrew Beekhof wrote: On Thu, Nov 24, 2011 at 3:58 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.11.2011 07:33, Andrew Beekhof wrote: On Tue, Nov

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-12-08 Thread Vladislav Bogdanov
09.12.2011 03:11, Andrew Beekhof wrote: On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, I investigated on my test cluster what actually happens with dlm and fencing. I added more debug messages to dlm dump, and also did a re-kick of nodes after

Re: [Pacemaker] CLVM Pacemaker Corosync on Ubuntu Omeiric Server

2011-12-02 Thread Vladislav Bogdanov
02.12.2011 11:06, Vadim Bulst wrote: [snip] Now I run into new problems: I created a cloneset for managing the volume groups: node bbzclnode04 node bbzclnode06 node bbzclnode07 primitive clvm ocf:lvm2:clvmd \ params daemon_timeout=30 \ meta target-role=Started primitive dlm

[Pacemaker] Excessive migrate_from is run after migrate_to failed

2011-12-01 Thread Vladislav Bogdanov
Hi Andrew, all, I found that pacemaker runs migrate_from on the migration destination node even if the preceding migrate_to command failed (github master). Is it intentional? hb_report? Best, Vladislav

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-12-01 Thread Vladislav Bogdanov
Hi Andrew, I investigated on my test cluster what actually happens with dlm and fencing. I added more debug messages to dlm dump, and also did a re-kick of nodes after some time. Results are that stonith history actually doesn't contain any information until pacemaker decides to fence node

Re: [Pacemaker] CLVM Pacemaker Corosync on Ubuntu Omeiric Server

2011-11-30 Thread Vladislav Bogdanov
30.11.2011 14:08, Vadim Bulst wrote: Hello, first of all I'd like to ask you a general question: Does somebody successfully set up a clvm cluster with pacemaker and run it in productive mode? I will say yes after I finally resolve remaining dlm/fencing issues. Now back to the concrete

Re: [Pacemaker] CLVM Pacemaker Corosync on Ubuntu Omeiric Server

2011-11-30 Thread Vladislav Bogdanov
over accidentally found one, but it does its function for me. Set 'avoid_lck' to 1. Best, Vladislav On 30.11.2011 13:10, Vadim Bulst wrote: On 30.11.2011 12:22, Vladislav Bogdanov wrote: 30.11.2011 14:08, Vadim Bulst wrote: Hello, first of all I'd like to ask you a general question

Re: [Pacemaker] Fencing libvirt/KVM nodes running on different hosts?

2011-11-28 Thread Vladislav Bogdanov
28.11.2011 22:55, Andreas Ntaflos wrote: Hi, Scenario: two physical virtualisation hosts run various KVM-based virtual machines, managed by Libvirt. Two VMs, one on each host, form a Pacemaker cluster, say for a simple database server, using DRBD and a virtual/cluster IP address. Using

[Pacemaker] Reload does not work with current github tree (2d8fad5)

2011-11-23 Thread Vladislav Bogdanov
Hi Andrew, all Just noticed that reload action does not happen when resource definition changes: Nov 23 08:16:08 v03-a pengine: [2091]: CRIT: check_action_definition: Parameters to c5-x64-devel.vds-ok.com-vm_monitor_1 on v03-b changed: recorded 94f8fd587de8d9dd8454443cbde11b4e vs.

Re: [Pacemaker] Reload does not work with current github tree (2d8fad5)

2011-11-23 Thread Vladislav Bogdanov
Hi Andreas, 23.11.2011 13:13, Andreas Kurz wrote: On 11/23/2011 09:30 AM, Vladislav Bogdanov wrote: Hi Andrew, all Just noticed that reload action does not happen when resource definition changes: Nov 23 08:16:08 v03-a pengine: [2091]: CRIT: check_action_definition: Parameters to c5-x64

Re: [Pacemaker] Reload does not work with current github tree (2d8fad5)

2011-11-23 Thread Vladislav Bogdanov
24.11.2011 07:37, Andrew Beekhof wrote: On Wed, Nov 23, 2011 at 7:30 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all Just noticed that reload action does not happen when resource definition changes: Nov 23 08:16:08 v03-a pengine: [2091]: CRIT: check_action_definition

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-11-23 Thread Vladislav Bogdanov
24.11.2011 08:49, Andrew Beekhof wrote: On Thu, Nov 24, 2011 at 3:58 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.11.2011 07:33, Andrew Beekhof wrote: On Tue, Nov 15, 2011 at 7:36 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, I just found another problem

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-11-14 Thread Vladislav Bogdanov
be you remember? Best, Vladislav 28.09.2011 17:41, Vladislav Bogdanov wrote: Hi Andrew, All the more reason to start using the stonith api directly. I was playing around last night with the dlm_controld.pcmk code: https://github.com/beekhof/dlm/commit

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2011-11-14 Thread Vladislav Bogdanov
to zero if the node has been seen after it was fenced and then appeared again? 14.11.2011 23:36, Vladislav Bogdanov wrote: Hi Andrew, I just found another problem with dlm_controld.pcmk (with your latest patch from github applied and also my fixes to actually build it - they are included

Re: [Pacemaker] [Linux-HA] pcmk + corosync + cman for dlm support?

2011-11-03 Thread Vladislav Bogdanov
03.11.2011 15:37, Nick Khamis wrote: Hello Vlad, Thank you so much for your response. I am experiencing the same hang as well. Did you have better luck with GFS2, or any other network file system? If you see almost simultaneous kernel panic on all cluster nodes, then you probably hit the

Re: [Pacemaker] [Linux-HA] pcmk + corosync + cman for dlm support?

2011-11-02 Thread Vladislav Bogdanov
02.11.2011 16:36, Nick Khamis wrote: Vladislav, Thank you so much for your response. Just to make sure, all I need is to: * Apply the three patches to cman. Found here http://www.gossamer-threads.com/lists/linuxha/pacemaker/75164?do=post_view_threaded;. * Recompile CMAN * Do I have to

Re: [Pacemaker] pcmk + corosync + cman for dlm support?

2011-10-28 Thread Vladislav Bogdanov
28.10.2011 04:04, Nick Khamis wrote: Hello Everyone, I just want to make sure this is still the case before I go through with it. I am trying to setup an active/active using: Corosync 1.4.2 Pacemaker 1.1.6 Cluster3 DRBD 8.3.7 OCFS2 The only reason I installed Cluster3 was for dlm
