Re: [Pacemaker] Project updates
On Tue, Nov 16, 2010 at 8:07 AM, Keisuke MORI keisuke.mori...@gmail.com wrote:

Hi Andrew,

On Fri, Nov 12, 2010 at 9:32 AM, Andrew Beekhof and...@beekhof.net wrote:

For those that aren't using RSS readers, I wanted to draw people's attention to a couple of updates that went out today.

Congratulations on the 1.0.10 release! It is very exciting for us too, but there is one thing I would like to comment on...

2010/11/13 Andrew Beekhof and...@beekhof.net:

On Fri, Nov 12, 2010 at 5:49 PM, Vadym Chepkov vchep...@gmail.com wrote:

    STABLE_SERIES = stable-1.0
    RPM_ROOT = $(shell pwd)
    diff -r 99f5a1e61667 configure.ac
    --- a/configure.ac Fri Nov 12 09:12:32 2010 +0100
    +++ b/configure.ac Fri Nov 12 11:47:28 2010 -0500
    @@ -19,7 +19,7 @@
     dnl checks for library functions
     dnl checks for system services
    -AC_INIT(pacemaker, 1.0.9, pacemaker@oss.clusterlabs.org)
    +AC_INIT(pacemaker, 1.0.10, pacemaker@oss.clusterlabs.org)

that's kinda annoying but not crucial. thanks for pointing it out

This would make it confusing for users to tell which version they are actually using when they report a problem, because all the logs and the crm_mon output show the version as 1.0.9. Any chance of releasing another set of RPMs with this fix?

Oh, I forgot about crm_mon. I'll see what I can do.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
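The mismatch discussed in this thread (a 1.0.10 release whose configure.ac still says 1.0.9) is the kind of thing a small pre-release check could catch. A minimal sketch, assuming the version is always the second argument of AC_INIT; the function name and the release_tag variable are hypothetical and not part of the Pacemaker build:

```python
import re

def ac_init_version(configure_ac_text):
    """Return the version given in an AC_INIT(name, version, ...) line, or None."""
    match = re.search(r"AC_INIT\(\s*[^,]+,\s*([^,\s)]+)", configure_ac_text)
    return match.group(1) if match else None

# The line as it shipped, taken from the diff in this thread.
sample = "AC_INIT(pacemaker, 1.0.9, pacemaker@oss.clusterlabs.org)"
release_tag = "1.0.10"  # hypothetical: the version the release is meant to carry

found = ac_init_version(sample)
if found != release_tag:
    print("configure.ac says %s but the release is %s" % (found, release_tag))
```

Something like this could run before tagging, so that the logs and crm_mon report the same version as the package name.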
[Pacemaker] how to use stonith external/vmware
Dear All,

What is the external/vmware stonith plugin used for? Does anybody know how to use it?

Thanks.

Best wishes,
Dika.Ye
Re: [Pacemaker] options listed more than once
On Mon, Nov 15, 2010 at 4:35 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

Hi,

When I have multiple values for a cluster option, how do I check which value is currently being used by the cluster? In the Configuration Explained document there is a reference to the rules chapter, but I couldn't find an answer in that chapter.

Good question, I don't think we have a way to do that directly at the moment. You might be able to use crm_resource to infer it with -g. Not a bad feature request though, could you add it to bugzilla?

Here is what I have and I want to get the current value of resource-stickiness

    [r...@node-03 log]# crm_attribute --type rsc_defaults --name resource-stickiness --query
    Multiple attributes match name=resource-stickiness
      Value: INFINITY (id=working-hours-stickiness)
      Value: 0 (id=after-hours-stickiness)

Cheers,
Pavlos
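Until the tools can report the effective value directly, the ambiguous query output can at least be turned into structured data. A minimal sketch, assuming the output always has the "Value: X (id=Y)" shape shown in the message above; parse_matches is a hypothetical helper, and it only lists the candidates, it does not evaluate the rules that decide which one is active:

```python
import re

# Matches one "Value: <value> (id=<id>)" entry from the query output.
VALUE_ENTRY = re.compile(r"Value:\s*(\S+)\s*\(id=([^)]+)\)")

def parse_matches(output):
    """Return a list of (id, value) pairs found in the query output."""
    return [(m.group(2), m.group(1)) for m in VALUE_ENTRY.finditer(output)]

sample = """Multiple attributes match name=resource-stickiness
  Value: INFINITY (id=working-hours-stickiness)
  Value: 0 (id=after-hours-stickiness)"""

for attr_id, value in parse_matches(sample):
    print("%s = %s" % (attr_id, value))
```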
Re: [Pacemaker] Balancing of clone resources (globally-unique=true)
On Mon, Nov 15, 2010 at 2:38 PM, Chris Picton ch...@ecntelecoms.com wrote:

On Mon, 15 Nov 2010 08:37:52 +0100, Andrew Beekhof wrote:

On Fri, Nov 12, 2010 at 7:41 AM, Chris Picton ch...@ecntelecoms.com wrote:

I have attached the output as requested

Normally it would get balanced, but it's being pushed to 01 because there are so many resources on 02

    sort_node_weight: slb-test-02.ecntelecoms.za.net (12)  slb-test-01.ecntelecoms.za.net (2) : resources

So the cluster is trying to balance out the resources, just not at the level you were expecting.

I agree with the above. However, how would I weight the clusterip clone so that it is preferentially balanced across the nodes, even in the presence of many other resources on a single node?

A reasonable request, could you create a bugzilla for that?
Re: [Pacemaker] options listed more than once
On 16 November 2010 10:49, Andrew Beekhof and...@beekhof.net wrote:

On Mon, Nov 15, 2010 at 4:35 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

Hi,

When I have multiple values for a cluster option, how do I check which value is currently being used by the cluster? In the Configuration Explained document there is a reference to the rules chapter, but I couldn't find an answer in that chapter.

Good question, I don't think we have a way to do that directly at the moment. You might be able to use crm_resource to infer it with -g.

Nope

    [r...@node-01 ~]# crm_resource -g resource-stickiness -r ip_01
    Error performing operation: The object/attribute does not exist
    [r...@node-01 ~]# crm_resource -g resource-stickiness -t group -r pbx_service_01
    Error performing operation: The object/attribute does not exist
    [r...@node-01 ~]# crm_resource -g resource-stickiness -t primitive -r ip_01
    Error performing operation: The object/attribute does not exist

Not a bad feature request though, could you add it to bugzilla?

Done, http://developerbugs.linux-foundation.org/show_bug.cgi?id=2521

Here is what I have and I want to get the current value of resource-stickiness

    [r...@node-03 log]# crm_attribute --type rsc_defaults --name resource-stickiness --query
    Multiple attributes match name=resource-stickiness
      Value: INFINITY (id=working-hours-stickiness)
      Value: 0 (id=after-hours-stickiness)

Cheers,
Pavlos
Re: [Pacemaker] options listed more than once
On Tue, Nov 16, 2010 at 11:04 AM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

On 16 November 2010 10:49, Andrew Beekhof and...@beekhof.net wrote:

On Mon, Nov 15, 2010 at 4:35 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

Hi,

When I have multiple values for a cluster option, how do I check which value is currently being used by the cluster? In the Configuration Explained document there is a reference to the rules chapter, but I couldn't find an answer in that chapter.

Good question, I don't think we have a way to do that directly at the moment. You might be able to use crm_resource to infer it with -g.

Nope

did you try with -m too?

    [r...@node-01 ~]# crm_resource -g resource-stickiness -r ip_01
    Error performing operation: The object/attribute does not exist
    [r...@node-01 ~]# crm_resource -g resource-stickiness -t group -r pbx_service_01
    Error performing operation: The object/attribute does not exist
    [r...@node-01 ~]# crm_resource -g resource-stickiness -t primitive -r ip_01
    Error performing operation: The object/attribute does not exist

Not a bad feature request though, could you add it to bugzilla?

Done, http://developerbugs.linux-foundation.org/show_bug.cgi?id=2521

Here is what I have and I want to get the current value of resource-stickiness

    [r...@node-03 log]# crm_attribute --type rsc_defaults --name resource-stickiness --query
    Multiple attributes match name=resource-stickiness
      Value: INFINITY (id=working-hours-stickiness)
      Value: 0 (id=after-hours-stickiness)

Cheers,
Pavlos
Re: [Pacemaker] options listed more than once
On 16 November 2010 11:12, Andrew Beekhof and...@beekhof.net wrote:

On Tue, Nov 16, 2010 at 11:04 AM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

On 16 November 2010 10:49, Andrew Beekhof and...@beekhof.net wrote:

On Mon, Nov 15, 2010 at 4:35 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

Hi,

When I have multiple values for a cluster option, how do I check which value is currently being used by the cluster? In the Configuration Explained document there is a reference to the rules chapter, but I couldn't find an answer in that chapter.

Good question, I don't think we have a way to do that directly at the moment. You might be able to use crm_resource to infer it with -g.

Nope

did you try with -m too?

    [r...@node-01 ~]# crm_resource -g resource-stickiness -r ip_01
    Error performing operation: The object/attribute does not exist
    [r...@node-01 ~]# crm_resource -g resource-stickiness -t group -r pbx_service_01
    Error performing operation: The object/attribute does not exist
    [r...@node-01 ~]# crm_resource -g resource-stickiness -t primitive -r ip_01
    Error performing operation: The object/attribute does not exist

oops, no

    # crm_resource -m -g resource-stickiness -t primitive -r ip_01
    1000

Not a bad feature request though, could you add it to bugzilla?

Done, http://developerbugs.linux-foundation.org/show_bug.cgi?id=2521

Here is what I have and I want to get the current value of resource-stickiness

    [r...@node-03 log]# crm_attribute --type rsc_defaults --name resource-stickiness --query
    Multiple attributes match name=resource-stickiness
      Value: INFINITY (id=working-hours-stickiness)
      Value: 0 (id=after-hours-stickiness)

Cheers,
Pavlos
Re: [Pacemaker] Stonith Device APC AP7900
Hi,

On Mon, Nov 15, 2010 at 10:41:22AM -0700, Devin Reade wrote:

--On Monday, November 15, 2010 08:40:45 AM -0700 Rick Cone rc...@securepaymentsystems.com wrote:

In production I am planning to have 2 separate AP7900 units each plugged into 2 different APC UPS units to achieve that. I would then have the single node name on each, for each of the 2 PS's on the individual systems.

So for this setup, you would have to trigger two stonith devices, one for each AP7900, with identical node names. Only after both succeeded would you be able to consider the node to be dead. Correct?

I don't recall reading anything in the pacemaker et al documentation that would cover this case. I was under the impression that after one stonith resource is successfully invoked, the node would be considered to be offline. If so, I'd be suspicious about assuming both PDUs would get activated without further investigation and testing. (I don't think that you could consider two node names on one PDU to be equivalent to one on each of two PDUs.)

Right, there's currently no way to do a simultaneous reset on two distinct fencing devices.

I think that in such a case you'd also have to ensure that your stonith action is poweroff rather than reset, or your node may not actually lose power (although you could mitigate that likelihood by configuring a longer reset time in the PDU).

Defining more than one stonith resource wouldn't work in this case either, because as soon as one of them reports success, the node is considered fenced.

Thanks,
Dejan

However, while I've written RAs before, I've never looked at the stonith logic, so I could be completely out to lunch. It sounds like an interesting edge case, and edge cases make me nervous :)

Devin
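The gap described in this thread can be stated in a few lines: the cluster treats the first successful device as proof that the node is fenced, while a node with two power feeds only loses power once every feed is off. A minimal sketch of that difference; both functions and the device names are hypothetical illustrations, not Pacemaker or stonith code:

```python
def fenced_first_success(results):
    """Behaviour described in the thread: one successful device is enough."""
    return any(results.values())

def fenced_all_feeds_off(results):
    """What a dual-power-supply node would need: every feed switched off."""
    return all(results.values())

# Hypothetical outcome: the first PDU switched its outlet off, the second did not.
results = {"pdu-a": True, "pdu-b": False}

print("considered fenced:", fenced_first_success(results))
print("actually powerless:", fenced_all_feeds_off(results))
```

With the outcome above the two functions disagree, which is exactly the case where the node would be declared dead while still running on its second supply.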
Re: [Pacemaker] understanding scores
Hi,

On Mon, Nov 15, 2010 at 05:01:27PM +0100, Pavlos Parissis wrote:

On 15 November 2010 16:43, Andrew Beekhof and...@beekhof.net wrote:

On Mon, Nov 15, 2010 at 3:18 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

On 15 November 2010 08:07, Andrew Beekhof and...@beekhof.net wrote:

On Fri, Nov 12, 2010 at 7:54 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:

Hi,

I am trying to understand how the scores are calculated based on the output of ptest -sL and I have a few questions. Below are my scores with a line number column, and at the bottom you will find my configuration.

So, let's start

     1 group_color: pbx_service_01 allocation score on node-01: 200
     2 group_color: pbx_service_01 allocation score on node-03: 10
     3 group_color: ip_01 allocation score on node-01: 1200
     4 group_color: ip_01 allocation score on node-03: 10

so far so good, ip_01 has 1000 due to resource-stickiness=1000 plus 200 from the group location constraint

     5 group_color: fs_01 allocation score on node-01: 1000
     6 group_color: fs_01 allocation score on node-03: 0
     7 group_color: pbx_01 allocation score on node-01: 1000
     8 group_color: pbx_01 allocation score on node-03: 0
     9 group_color: sshd_01 allocation score on node-01: 1000
    10 group_color: sshd_01 allocation score on node-03: 0
    11 group_color: mailAlert-01 allocation score on node-01: 1000
    12 group_color: mailAlert-01 allocation score on node-03: 0

hold on now, why do all the above resources have 1000 on node-01 and not 1200 as fs_01

it's only applied to ip_01, the rest inherit it from there

    13 native_color: ip_01 allocation score on node-01: 5200

5 resources x 1000 from resource-stickiness=1000 plus, right? what is the difference between native and group?

Many things, can you be specific?

In principle, what are the differences? If my question sounds stupid then it is because I don't understand the terminology.

well, groups are an ordered collection of natives.

    14 native_color: ip_01 allocation score on node-03: 10
    15 clone_color: ms-drbd_01 allocation score on node-01: 4100

why 4100?

probably the promotion score

I have

    order pbx_service_01-after-drbd_01 inf: ms-drbd_01:promote pbx_service_01:start

does the promotion score you mention come out of the above constraint?

only from colocation constraints

then it comes from

    colocation fs_01-on-drbd_01 inf: fs_01 ms-drbd_01:Master

which has score inf, whereas at line 15 the score is 4100.

I think my issue here is how I look at the numbers. I assume that every time I see a score for a resource, that score also includes any scores mentioned before. Is my assumption correct?

often, it depends on what colocation constraints you have set up

Ok, let me ask it differently: how, by looking at the output of ptest -sL, can I find the score of a resource for a specific node? Since the score of a resource for a specific node is mentioned in several lines, it is not that easy, at least to me.

You can try with showscores.sh. It should do exactly what you want. It is not packaged, but you can get it from the pacemaker repository.

Thanks,
Dejan

Cheers,
Pavlos
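Because the same resource and node pair shows up on several lines of the ptest -sL output, it can help to group the lines before reading them. A minimal sketch, assuming every relevant line has the "stage: resource allocation score on node: value" shape quoted in this thread; collect_scores is a hypothetical helper and is not a reimplementation of showscores.sh:

```python
import re
from collections import defaultdict

# One allocation line, e.g. "group_color: ip_01 allocation score on node-01: 1200"
SCORE_LINE = re.compile(
    r"^(\w+): (\S+) allocation score on (\S+): (-?\d+|-?INFINITY)$"
)

def collect_scores(ptest_output):
    """Group allocation scores as {(resource, node): [(stage, score), ...]}."""
    scores = defaultdict(list)
    for line in ptest_output.splitlines():
        match = SCORE_LINE.match(line.strip())
        if match:
            stage, resource, node, score = match.groups()
            scores[(resource, node)].append((stage, score))
    return dict(scores)

sample = """group_color: ip_01 allocation score on node-01: 1200
group_color: ip_01 allocation score on node-03: 10
native_color: ip_01 allocation score on node-01: 5200
native_color: ip_01 allocation score on node-03: 10"""

for (resource, node), entries in sorted(collect_scores(sample).items()):
    print(resource, node, entries)
```

The sketch only collects the values per stage; how the stages combine into a final placement still depends on the constraints, as discussed above.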
[Pacemaker] DRBD MC 0.8.4 / Pacemaker GUI
Hi,

This is the next DRBD MC release 0.8.4. DRBD MC is a Java application that helps to configure DRBD, Pacemaker, VMs or any combination of them. It uses SSH to connect to the cluster from a desktop computer.

The focus of this release was on performance, profiling, fixing leaks and stuff like that. Thanks to this, DRBD MC went straight from being a bloated and sloppy application to lightweight and lean, making the source code a textbook of concentrated good-coding practices.

It turned out that after a couple of changes it is now possible to run DRBD MC as an applet without any performance or functionality loss. See here:

http://oss.linbit.com/drbd-mc/img/drbd-mc-0.8.4.png

It just sits in the browser and makes all the everything-must-be-web people happy. For now, if you want to enable the applet functionality, you'd have to compile and sign it yourself.

You can get DRBD MC here:

http://www.drbd.org/mc/management-console/
http://oss.linbit.com/drbd-mc/DMC-0.8.4.jar
http://oss.linbit.com/drbd-mc/drbd-mc-0.8.4.tar.gz

1. Download the DMC-0.8.4.jar file.
2. Make sure you use SUN Java not the OpenJDK 1.6.
3. Start it: java -jar DMC-0.8.4.jar
4. It connects to the cluster via SSH.

DRBD MC is compatible with Heartbeat 2.1.3 to the Pacemaker 1.1.3 with Corosync or Heartbeat and DRBD 8.

Here are the most important changes:

* add clone-node-max meta attribute
* fix defaults in IPaddr/IPaddr2 RAs
* remove useless node name and DNS check in host dialog wizard
* add --no-upgrade-check option
* fix applying of clones
* fix leaks with groups and clones
* graph fixes
* fix graph resizing
* fix leak with DRBD resources
* make it possible to run as an applet
* upgrade Jung library to 2.0.1

Rasto Levrinc

--
: Dipl-Ing Rastislav Levrinc
: DRBD-MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.
Re: [Pacemaker] AP9606 fencing device
--On Wednesday, October 27, 2010 09:47:14 AM +0200 Pavlos Parissis pavlos.paris...@gmail.com wrote:

I have a APC AP9606 PDU and I am trying to find a stonith agent which works with that PDU.

I know that this is an old thread, but I'll reply anyway.

I have one cluster that uses an old APC AP9606 for which I've not been able to obtain a flash update. In particular, it is:

    hardware revision: J13
    APP version 2.2.0
    AOS version 3.0.3

It is running just fine (see caveat below) with the following configuration, and I can attest that it has properly stonith'd nodes many times.

    primitive msw stonith:apcmastersnmp \
        operations $id=msw-operations \
        op monitor interval=15 timeout=15 start-delay=15 \
        params ipaddr=IPADDR port=161 community=COMMUNITY
    clone msw-clone msw \
        meta clone-max=2 target-role=started

(yeah, that monitor interval is probably a little quick ...)

That particular cluster is getting long in the tooth:

    pacemaker-1.0.5-4.6.x86_64
    openais-0.80.5-15.1.x86_64

The caveat is that this PDU used to work with the default implementation, however at some point someone updated the OIDs in apcmastersnmp to match newer firmware. Therefore, I had to reverse patch that RA:

    ===
    --- apcmastersnmp.c.orig 2009-09-26 16:12:27.0 -0600
    +++ apcmastersnmp.c 2009-09-28 16:46:17.0 -0600
    @@ -137,12 +137,12 @@
     #define OUTLET_NO_CMD_PEND 2

     /* oids */
    -#define OID_IDENT .1.3.6.1.4.1.318.1.1.12.1.5.0
    -#define OID_NUM_OUTLETS .1.3.6.1.4.1.318.1.1.12.1.8.0
    -#define OID_OUTLET_NAMES .1.3.6.1.4.1.318.1.1.12.3.4.1.1.2.%i
    -#define OID_OUTLET_STATE .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.%i
    -#define OID_OUTLET_COMMAND_PENDING .1.3.6.1.4.1.318.1.1.12.3.5.1.1.5.%i
    -#define OID_OUTLET_REBOOT_DURATION .1.3.6.1.4.1.318.1.1.12.3.4.1.1.6.%i
    +#define OID_IDENT .1.3.6.1.4.1.318.1.1.4.1.4.0
    +#define OID_NUM_OUTLETS .1.3.6.1.4.1.318.1.1.4.4.1.0
    +#define OID_OUTLET_NAMES .1.3.6.1.4.1.318.1.1.4.5.2.1.3.%i
    +#define OID_OUTLET_STATE .1.3.6.1.4.1.318.1.1.4.4.2.1.3.%i
    +#define OID_OUTLET_COMMAND_PENDING .1.3.6.1.4.1.318.1.1.4.4.2.1.2.%i
    +#define OID_OUTLET_REBOOT_DURATION .1.3.6.1.4.1.318.1.1.4.5.2.1.5.%i

     /* snmpset -c private -v1 172.16.0.32:161
    ===
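The per-outlet OIDs in the patch are templates: the trailing %i is replaced by the outlet number before the SNMP request is made. A minimal sketch of that expansion, using the two OID_OUTLET_STATE strings from the patch; the Python constant names and the outlet_oid helper are hypothetical and only mirror the C defines for illustration:

```python
# Template that works with the old AP9606 firmware (the "+" lines of the patch).
AP9606_OUTLET_STATE = ".1.3.6.1.4.1.318.1.1.4.4.2.1.3.%i"
# Template used for newer firmware (the "-" lines of the patch).
NEWER_OUTLET_STATE = ".1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.%i"

def outlet_oid(template, outlet):
    """Expand an OID template for a given 1-based outlet number."""
    return template % outlet

print(outlet_oid(AP9606_OUTLET_STATE, 3))
print(outlet_oid(NEWER_OUTLET_STATE, 3))
```

The two results live in different subtrees (318.1.1.4 versus 318.1.1.12), which is consistent with the older unit no longer working once the defaults were changed.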
Re: [Pacemaker] colocation that doesn't
On Nov 15, 2010, at 2:18 AM, Andrew Beekhof wrote:

On Fri, Nov 5, 2010 at 4:07 AM, Vadym Chepkov vchep...@gmail.com wrote:

On Nov 4, 2010, at 12:53 PM, Alan Jones wrote:

If I understand you correctly, the role of the second resource in the colocation command was defaulting to that of the first (Master), which is not defined or is untested for non-ms resources. Unfortunately, after I changed that line to:

    colocation mystateful-ms-loc inf: mystateful-ms:Master myprim:Started

...it still doesn't work:

    myprim (ocf::pacemaker:DummySlow): Started node6.acme.com
    Master/Slave Set: mystateful-ms
        Masters: [ node5.acme.com ]
        Slaves: [ node6.acme.com ]

And after:

    location myprim-loc myprim -inf: node5.acme.com

    myprim (ocf::pacemaker:DummySlow): Started node6.acme.com
    Master/Slave Set: mystateful-ms
        Masters: [ node6.acme.com ]
        Slaves: [ node5.acme.com ]

What I would like to do is enable logging for the code that calculates the weights, etc. It is obvious to me that the weights are calculated differently for mystateful-ms based on the weights used in myprim. Can you enable more verbose logging online or do you have to recompile? My version is 1.0.9-89bd754939df5150de7cd76835f98fe90851b677 which is different from Vadym's.

BTW: Is there another release planned for the stable branch? 1.0.9.1 is now 4 months old. I understand that I could take the top of tree, but I would like to believe that others are running the same version. ;)

Thank you!
Alan

On Thu, Nov 4, 2010 at 8:22 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:

Hi,

On Thu, Nov 04, 2010 at 06:51:59AM -0400, Vadym Chepkov wrote:

On Thu, Nov 4, 2010 at 5:37 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:

This should be:

    colocation mystateful-ms-loc inf: mystateful-ms:Master myprim:Started

Interesting, so in this case it is not necessary?

    colocation fs_on_drbd inf: WebFS WebDataClone:Master

(taken from Cluster_from_Scratch) but the other way around it is?

Yes, the role of the second resource defaults to the role of the first. Ditto for order and actions. A bit confusing, I know.

Thanks,
Dejan

I did it a bit differently this time and I observe the same anomaly. First I started a stateful clone

    primitive s1 ocf:pacemaker:Stateful
    ms ms1 s1 meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

Then a primitive:

    primitive d1 ocf:pacemaker:Dummy

Made sure Master and primitive are running on different hosts

    location ld1 d1 10: xen-12

and then I added the constraint

    colocation c1 inf: ms1:Master d1:Started

    Master/Slave Set: ms1
        Masters: [ xen-11 ]
        Slaves: [ xen-12 ]
    d1 (ocf::pacemaker:Dummy): Started xen-12

It seems a colocation constraint is not enough to promote a clone. Looks like a bug.

    # ptest -sL|grep s1
    clone_color: ms1 allocation score on xen-11: 0
    clone_color: ms1 allocation score on xen-12: 0
    clone_color: s1:0 allocation score on xen-11: 11
    clone_color: s1:0 allocation score on xen-12: 0
    clone_color: s1:1 allocation score on xen-11: 0
    clone_color: s1:1 allocation score on xen-12: 6
    native_color: s1:0 allocation score on xen-11: 11
    native_color: s1:0 allocation score on xen-12: 0
    native_color: s1:1 allocation score on xen-11: -100
    native_color: s1:1 allocation score on xen-12: 6
    s1:0 promotion score on xen-11: 20
    s1:1 promotion score on xen-12: 20

Vadym

Could you attach the result of cibadmin -Ql when the cluster is in this state please?

I created http://developerbugs.linux-foundation.org/show_bug.cgi?id=2522 with hb_report included

Thank you,
Vadym
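The defaulting rule Dejan describes ("the role of the second resource defaults to the role of the first") explains why the two constraints in this thread behave differently. A minimal sketch of that rule as a plain function; resolve_roles is a hypothetical model for illustration, with None standing for a role that was not written in the constraint, and it is not how the crm shell is implemented:

```python
def resolve_roles(first_role=None, second_role=None):
    """Return (first, second) roles; an unspecified second role takes the first's."""
    if second_role is None:
        second_role = first_role
    return first_role, second_role

# colocation mystateful-ms-loc inf: mystateful-ms:Master myprim
# The primitive silently inherits Master, a role it can never have.
print(resolve_roles("Master", None))

# colocation mystateful-ms-loc inf: mystateful-ms:Master myprim:Started
print(resolve_roles("Master", "Started"))

# colocation fs_on_drbd inf: WebFS WebDataClone:Master
# The first resource has no role, so nothing is inherited by the second.
print(resolve_roles(None, "Master"))
```

Under this model only the first form needs the explicit :Started, which matches the advice given in the thread.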
Re: [Pacemaker] drbd-xen and fencing
On Nov 15, 2010, at 2:08 AM, Andrew Beekhof wrote:

Don't use init.d/drbd, use the ocf script that comes with the drbd packages

Well, that doesn't help with live migration, unfortunately. This is a quote from /etc/xen/scripts/block-drbd

    # This script will not load the DRBD kernel module for you, nor will
    # it attach, detach, connect, or disconnect your resource. The init
    # script distributed with DRBD will do that for you. Make sure it is
    # started before attempting to start a DRBD-backed domU.

On Thu, Nov 11, 2010 at 2:19 PM, Vadym Chepkov vchep...@gmail.com wrote:

Hi,

I posted a less elaborate version of this question to the drbd mailing list but, unfortunately, didn't get a reply; maybe the audience of this list has more experience.

I am trying to make xen live migration work reliably, but haven't been successful so far. Here is the problem.

In a cluster configuration I have two types of resources: file systems on drbd, with explicit drbd resource configuration, and Xen resources with implicit configuration, using the drbd-xen block device helper. For the former everything works great, but the latter doesn't work quite as well.

In order for the helper script to work, the drbd module has to be loaded and the underlying resources up. So I have to start the init.d/drbd script. I can't make it an lsb cluster resource, because stop would be disastrous for the file system resources. Enabling it in the startup sequence breaks /usr/lib/drbd/crm-unfence-peer.sh, because the cluster stack is not completely up by the time the drbd script finishes, and there is no way to configure only the specific resources that need to be initialized.

Also, I can't find a way to fence the Xen resource. I tried fence-peer /usr/lib/drbd/crm-fence-peer.sh -i xen_vsvn, where xen_svn is the name of the Xen primitive, but it doesn't work, so there is a danger of starting the Xen VM on an out-of-date node. Then there is no way of monitoring the underlying drbd resources either.

I thought of adding the underlying drbd resource explicitly in the cluster, but I can't figure out what the configuration would be for a resource that can be master on both nodes, but if it is master on just one, that's fine too. allow-two-primaries has to be allowed for live migration, and at the time of migration the resources are primary on both nodes, but when migration finishes, it's primary/slave again. But if I configure the drbd resource in the cluster with meta master-max=2 master-node-max=1, the cluster insists on having them both primary all the time.

Hope I didn't bore you to death and there is an elegant solution for this conundrum :)

Thank you,
Vadym
Re: [Pacemaker] drbd-xen and fencing
On Nov 15, 2010, at 3:45 AM, Lars Ellenberg wrote: On Mon, Nov 15, 2010 at 08:08:40AM +0100, Andrew Beekhof wrote: Don't use init.d/drbd, use the ocf script that comes with the drbd packages Then he loses xen live migration, I think. Because Pacemaker cannot migrate dependent resources, and because xen apparently still thinks it needs both nodes Primary for a short amount of time. And something needs to bring underlying storage into Connected state. It seems current drbd RA can't help with this problem On Thu, Nov 11, 2010 at 2:19 PM, Vadym Chepkov vchep...@gmail.com wrote: Hi, I posted a less elaborate version of this question to drbd mail-list, but, unfortunately, didn't get a reply, maybe audience of this list has more experience. I am trying to make xen live migration to work reliably, but wasn't successful so far. Here is the problem. In a cluster configuration I have two type of resources - file systems on drbd, with explicit drbd resources configuration and Xen resources with implicit, using drbd-xen block device helper. For the former everything works great, but the latter doesn't work quite well. In order for helper script to work, drbd module has to be loaded and underlying resources up. So I have to start init.d/drbd script. I can't make it an lsb cluster resource, because stop will be disastrous for file system resources. Enable it in startup sequence breaks /usr/lib/drbd/crm-unfence-peer.sh, because cluster stack is not completely up by the time drbd script finishes, and there is no way to configure only specific resources that need to be initialized. Also, I can't find a way fence Xen resource. I tried fence-peer /usr/lib/drbd/crm-fence-peer.sh -i xen_vsvn, where xen_svn is the name of Xen primitive, but it doesn't work, Can you be more specific? What did you try, what did you expect, and how do you determine it did not work? 
Whenever a drbd resource gets disconnected, the fencing script usually defines a constraint, for example:

    location drbd-fence-by-handler-ms_drbd_ldap ms_drbd_ldap \
        rule $id=drbd-fence-by-handler-rule-ms_drbd_ldap $role=Master -inf: #uname ne xen-11

This doesn't happen if I define the handler as fence-peer /usr/lib/drbd/crm-fence-peer.sh or fence-peer /usr/lib/drbd/crm-fence-peer.sh -i xen_vsvn, where xen_svn is the Xen resource with the drbd helper. I assume that is because crm-fence-peer.sh expects a drbd resource to be present in the cluster configuration, which is not the case here. I guess that to solve this issue correctly a drbd clone is necessary, one which brings the resources to Connected state and then won't freak out if both nodes are Secondary (when the Xen VM is shut down) or both nodes are Primary (when migration is in progress).

Vadym

so there is a danger of starting the Xen VM on an out-of-date node. There is also no way of monitoring the underlying drbd resources.

I thought of adding the underlying drbd resource explicitly to the cluster, but I can't figure out what the configuration would be for a resource that can be master on both nodes, but is also fine as master on just one. allow-two-primaries has to be enabled for live migration, and during migration the resource is primary on both nodes, but when migration finishes it is primary/slave again. But if I configure the drbd resource in the cluster with meta master-max=2 master-node-max=1, the cluster insists on having both primary all the time.
Hope I didn't bore you to death and there is an elegant solution for this conundrum :)

Thank you,
Vadym

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
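For readers following along, the setup Vadym describes (the DRBD device managed explicitly as a master/slave resource with two masters allowed) might look roughly like the crm sketch below. The names drbd_vsvn and ms_drbd_vsvn, the DRBD resource name vsvn, the monitor intervals, and the clone-max, clone-node-max and notify settings are assumptions chosen only for illustration; master-max=2 and master-node-max=1 are the values quoted in the thread. This shows the configuration under discussion, not a fix: as reported above, with these settings the cluster keeps both nodes promoted all the time rather than only while a migration is in progress.

```
# Hypothetical names throughout; only master-max/master-node-max come from the thread.
primitive drbd_vsvn ocf:linbit:drbd \
    params drbd_resource=vsvn \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave
# master-max=2 lets both nodes be Primary, which live migration needs briefly,
# but Pacemaker will then promote both instances and keep them promoted.
ms ms_drbd_vsvn drbd_vsvn \
    meta master-max=2 master-node-max=1 \
         clone-max=2 clone-node-max=1 notify=true
```

A definition like this only makes sense together with allow-two-primaries on the DRBD side, as the thread already notes.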
Re: [Pacemaker] AP9606 fencing device
On 17 November 2010 04:15, Devin Reade g...@gno.org wrote:

--On Wednesday, October 27, 2010 09:47:14 AM +0200 Pavlos Parissis pavlos.paris...@gmail.com wrote:

I have an APC AP9606 PDU and I am trying to find a stonith agent which works with that PDU.

I know that this is an old thread, but I'll reply anyway. I have one cluster that uses an old APC AP9606 for which I've not been able to obtain a flash update. In particular, it is:

    hardware revision: J13
    APP version 2.2.0
    AOS version 3.0.3

It is running just fine (see caveat below) with the following configuration, and I can attest that it has properly stonith'd nodes many times.

    primitive msw stonith:apcmastersnmp \
        operations $id=msw-operations \
        op monitor interval=15 timeout=15 start-delay=15 \
        params ipaddr=IPADDR port=161 community=COMMUNITY
    clone msw-clone msw \
        meta clone-max=2 target-role=started

(yeah, that monitor interval is probably a little quick ...)

That particular cluster is getting long in the tooth:

    pacemaker-1.0.5-4.6.x86_64
    openais-0.80.5-15.1.x86_64

The caveat is that this PDU used to work with the default implementation, however at some point someone updated the OIDs in apcmastersnmp to match newer firmware.
Therefore, I had to reverse patch that RA:

===
--- apcmastersnmp.c.orig 2009-09-26 16:12:27.0 -0600
+++ apcmastersnmp.c 2009-09-28 16:46:17.0 -0600
@@ -137,12 +137,12 @@
 #define OUTLET_NO_CMD_PEND 2

 /* oids */
-#define OID_IDENT .1.3.6.1.4.1.318.1.1.12.1.5.0
-#define OID_NUM_OUTLETS .1.3.6.1.4.1.318.1.1.12.1.8.0
-#define OID_OUTLET_NAMES .1.3.6.1.4.1.318.1.1.12.3.4.1.1.2.%i
-#define OID_OUTLET_STATE .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.%i
-#define OID_OUTLET_COMMAND_PENDING .1.3.6.1.4.1.318.1.1.12.3.5.1.1.5.%i
-#define OID_OUTLET_REBOOT_DURATION .1.3.6.1.4.1.318.1.1.12.3.4.1.1.6.%i
+#define OID_IDENT .1.3.6.1.4.1.318.1.1.4.1.4.0
+#define OID_NUM_OUTLETS .1.3.6.1.4.1.318.1.1.4.4.1.0
+#define OID_OUTLET_NAMES .1.3.6.1.4.1.318.1.1.4.5.2.1.3.%i
+#define OID_OUTLET_STATE .1.3.6.1.4.1.318.1.1.4.4.2.1.3.%i
+#define OID_OUTLET_COMMAND_PENDING .1.3.6.1.4.1.318.1.1.4.4.2.1.2.%i
+#define OID_OUTLET_REBOOT_DURATION .1.3.6.1.4.1.318.1.1.4.5.2.1.5.%i

 /* snmpset -c private -v1 172.16.0.32:161
===

I faced the same problem, and because I didn't want to modify the code of the apcmastersnmp RA, I used the rackpdu RA, where I could set the OIDs in the parameters. This RA worked perfectly until the PDU died! I suggest using the rackpdu RA, because if you upgrade your cluster software your modification will be gone.

Cheers,
Pavlos
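As a rough sketch of the rackpdu alternative mentioned above: the idea is to pass the older OIDs as resource parameters instead of patching apcmastersnmp.c. Everything here is an assumption to verify against the agent's own metadata on your system: the parameter names (hostlist, pduip, community, oid, names_oid) are what I believe external/rackpdu accepts, the resource name pdu is hypothetical, and the OID values are simply the "+" lines of the patch above with the trailing outlet index (.%i) dropped, on the assumption that the agent appends the outlet number itself. HOSTLIST, IPADDR and COMMUNITY are placeholders, as in the apcmastersnmp example.

```
# Sketch only; check parameter names and OIDs against your PDU and agent version.
primitive pdu stonith:external/rackpdu \
    params hostlist=HOSTLIST pduip=IPADDR community=COMMUNITY \
           oid=.1.3.6.1.4.1.318.1.1.4.4.2.1.3 \
           names_oid=.1.3.6.1.4.1.318.1.1.4.5.2.1.3 \
    op monitor interval=60 timeout=30
```

Because the OIDs live in the cluster configuration rather than in the agent's source, they survive a package upgrade, which is the point Pavlos makes.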