Re: [Pacemaker] CRM property mysql_replication
17.07.2015 10:46, Dejan Muhamedagic wrote: [...] The attribute is not listed in the PE meta-data and is therefore considered an error. You can make crmsh less strict like this: # crm option set core.check_mode relaxed It will still print the error, but your changes are going to be committed. Given that it is common practice for some RAs to stuff their attributes into cluster properties, this should be handled better, probably by not running the PE meta-data check for property elements other than cib-bootstrap-options. +1. I have been running a patched version of crmsh (with the meta-data checks disabled) for many years.
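For illustration, a minimal sketch of the workaround quoted above. The property name is hypothetical (any attribute unknown to the PE meta-data triggers the same complaint), and the exact subcommand spelling and error text depend on the crmsh version:

  # crm configure property mysql_replication_info="site-a"   # rejected by the meta-data check
  # crm option set core.check_mode relaxed
  # crm configure property mysql_replication_info="site-a"   # still prints the error, but the change is committed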
Re: [Pacemaker] Filesystem resource killing innocent processes on stop
19.05.2015 11:46, Dejan Muhamedagic wrote: On Mon, May 18, 2015 at 07:34:38PM +0300, Vladislav Bogdanov wrote: 18.05.2015 18:57, Nikola Ciprich wrote: Hi Vladislav, Isn't that a bind-mount? Nope, but your question led me to a possible culprit.. it's a cephfs mount; when I try it on some local filesystem, I don't see this weird fuser behaviour.. so maybe fuser does not work correctly on cephfs? Yep, for bind-mounts fuser shows/kills both the processes which use the bound tree and those using the original filesystem. This is how the fs is mounted: 10.0.0.1,10.0.0.2,10.0.0.3:/ on /home/cluster/virt type ceph (name=admin,key=client.admin) I should probably ask on the ceph mailing list.. There are alternative ways to determine mountpoint usage, btw. One of them is lsof; I use it for bind-mounts. The Filesystem RA supports bind mounts. Is there a problem then with it using fuser? Definitely (but it may be kernel/fuser version specific). Thanks, Dejan n.
Re: [Pacemaker] Filesystem resource killing innocent processes on stop
19.05.2015 11:44, Dejan Muhamedagic wrote: On Mon, May 18, 2015 at 05:14:14PM +0200, Nikola Ciprich wrote: Hi Dejan, The list below seems too extensive. Which version of resource-agents do you run? $ grep 'Build version:' /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs yes, it's definitely wrong.. here's the info you've requested: # Build version: 5434e9646462d2c3c8f7aad2609d0ef1875839c7 rpm version: resource-agents-3.9.5-12.el6_6.5.x86_64 I can already see the problem, this version simply uses fuser -m $MOUNTPOINT which seems to return pretty wrong results: [root@denovav1b ~]# fuser -m /home/cluster/virt/ /home/cluster/virt/: 1m 3295m 3314m 4817m 4846m 4847m 4890m 4891m 4916m 4944m 4952m 4999m 5007m 5037m 5069m 5137m 5162m 5164m 5166m 5168m 5170m 5172m 5575m 8055m 9604m 9605m 10984m 11186m 11370m 11813m 11871m 11887m 11946m 12020m 12026m 12027m 12028m 12029m 12030m 12031m 14218m 15294m 15374m 15396m 15399m 17479m 17693m 17694m 20705m 20718m 20948m 20982m 23902m 24572m 24580m 26300m 29790m 29792m 30785m (notice even process # 1!) while lsof returns: lsof | grep cluster.*virt qemu-syst 8055 root 21r REG0,0 232783872 1099511634304 /home/cluster/virt/images/debian-7.8.0-amd64-netinst.iso which seems much saner to me.. Indeed. Is fuser broken or is there some kernel side confusion? As far as was able to investigate, that comes from the fact that fuser uses device field which is the same for source and bind mount (yes, that is centos6). Did you also try: lsof /home/cluster/virt/ Anyway, it would be good to bring this up with the centos people. Thanks, Dejan BR nik here's example of the log: Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 3606 1 0 Feb12 ?Ss0:01 /sbin/udevd -d Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4249 1 0 Feb12 ttyS2Ss+0:00 agetty ttyS2 115200 vt100 Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4271 4395 0 21:58 ?Ss 0:00 sshd: root@pts/12 Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4273 1 0 21:58 ?Rs 0:00 [bash] Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4395 1 0 Feb24 ?Ss 0:03 /usr/sbin/sshd Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4677 1 0 Feb12 ?Ss 0:00 /sbin/portreserve Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4690 1 0 Feb12 ?S 0:00 supervising syslog-ng Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4691 1 0 Feb12 ?Ss 0:46 syslog-ng -p /var/run/syslog-ng.pid Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: rpc 4746 1 0 Feb12 ?Ss 0:05 rpcbind Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: rpcuser 4764 1 0 Feb12 ?Ss 0:00 rpc.statd Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4797 1 0 Feb12 ?Ss 0:00 rpc.idmapd Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4803 12028 0 21:59 ?S 0:00 /bin/sh /usr/lib/ocf/resource.d/heartbeat/Filesystem stop while unmounting /home/cluster/virt directory.. what is quite curious, is, that last killed process seems to be Filesystem resource itself.. Hmm, that's quite strange. That implies that the RA script itself had /home/cluster/virt as its WD. before I dig deeper into this, did anyone else noticed this problem? Is this some known (and possibly already issue)? Never heard of this. 
Thanks, Dejan. Thanks a lot in advance, nik.
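For reference, a sketch of the difference between the two ways of finding mountpoint users discussed in this thread (the mountpoint is taken from the thread; on a bind mount, and apparently on this cephfs mount, the fuser variant matches far more than the processes actually using the tree):

  # fuser -m /home/cluster/virt     # matches by device, so the source filesystem and bind mounts are conflated
  # lsof /home/cluster/virt         # matches by path; lists only processes with open files or a CWD under the mountpoint
  # lsof -t /home/cluster/virt | xargs -r kill -TERM    # one way a stop handler could signal only those processes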
Re: [Pacemaker] Filesystem resource killing innocent processes on stop
18.05.2015 18:57, Nikola Ciprich wrote: Hi Vladislav, Isn't that a bind-mount? Nope, but your question led me to a possible culprit.. it's a cephfs mount; when I try it on some local filesystem, I don't see this weird fuser behaviour.. so maybe fuser does not work correctly on cephfs? Yep, for bind-mounts fuser shows/kills both the processes which use the bound tree and those using the original filesystem. This is how the fs is mounted: 10.0.0.1,10.0.0.2,10.0.0.3:/ on /home/cluster/virt type ceph (name=admin,key=client.admin) I should probably ask on the ceph mailing list.. There are alternative ways to determine mountpoint usage, btw. One of them is lsof; I use it for bind-mounts. n.
Re: [Pacemaker] Filesystem resource killing innocent processes on stop
18.05.2015 13:20, Nikola Ciprich wrote: Hi, I noticed a very annoying bug (or so I think): the Filesystem OCF resource in resource-agents-3.9.5 on RHEL/CentOS 6 seems to be killing completely unrelated processes on shutdown, although they're not using anything on the mounted filesystem... Isn't that a bind-mount? Unfortunately, one of the processes very often killed is sshd :-( Here's an example of the log:
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 3606 1 0 Feb12 ? Ss 0:01 /sbin/udevd -d
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4249 1 0 Feb12 ttyS2 Ss+ 0:00 agetty ttyS2 115200 vt100
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4271 4395 0 21:58 ? Ss 0:00 sshd: root@pts/12
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4273 1 0 21:58 ? Rs 0:00 [bash]
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4395 1 0 Feb24 ? Ss 0:03 /usr/sbin/sshd
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4677 1 0 Feb12 ? Ss 0:00 /sbin/portreserve
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4690 1 0 Feb12 ? S 0:00 supervising syslog-ng
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4691 1 0 Feb12 ? Ss 0:46 syslog-ng -p /var/run/syslog-ng.pid
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: rpc 4746 1 0 Feb12 ? Ss 0:05 rpcbind
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: rpcuser 4764 1 0 Feb12 ? Ss 0:00 rpc.statd
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4797 1 0 Feb12 ? Ss 0:00 rpc.idmapd
Filesystem(virt-fs)[4803]: 2015/05/17_21:59:48 INFO: sending signal TERM to: root 4803 12028 0 21:59 ? S 0:00 /bin/sh /usr/lib/ocf/resource.d/heartbeat/Filesystem stop
while unmounting the /home/cluster/virt directory.. What is quite curious is that the last killed process seems to be the Filesystem resource itself.. Before I dig deeper into this, has anyone else noticed this problem? Is this a known (and possibly already fixed) issue? Thanks a lot in advance, nik
Re: [Pacemaker] Unique clone instance is stopped too early on move
17.04.2015 00:48, Andrew Beekhof wrote: On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found a little bit strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh syntax)? === ... clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker 0: cl-broker cl-broker-vips ... === After I put one node to standby and then back to online, I see the following transition (relevant excerpt): === * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: cl-ctdb_start_0 * Resource action: ctdbstart on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdbmonitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Resource action: broker-vips:1 start on c-pa-1 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 === What could be a reason to stop unique clone instance so early for move? Do not take it as definitive answer, but cl-broker-vips cannot run unless both other resources are started. So if you compute closure of all required transitions it looks rather logical. Having cl-broker-vips started while broker is still stopped would violate constraint. Problem is that broker-vips:1 is stopped on one (source) node unnecessarily early. It looks to be moving from c-pa-0 to c-pa-1 It might be unnecessarily early, but it is what you asked for... we have to unwind the resource stack before we can build it up. Yes, I understand that it is valid, but could its stop be delayed until cluster is in the state when all dependencies are satisfied to start it on another node (like migration?)? No, because we have to unwind the resource stack before we can build it up. Doing anything else would be one of those things that is trivial for a human to identify but rather complex for a computer. I believe there is also an issue with migration of clone instances. I modified pe-input to allow migration of cl-broker-vips (and also set inf score for broker-vips-after-broker and make cl-broker-vips interleaved). 
Relevant part is: clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker inf: cl-broker cl-broker-vips After that (part of) transition is: * Resource action: broker-vips:1 migrate_to on c-pa-0 * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 migrate_from on c-pa-1 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: all_stopped * Pseudo action: cl-ctdb_start_0 * Resource action: ctdbstart on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdbmonitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Pseudo action: broker-vips:1_start_0 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 But, I would say that at least from a human logic PoV the above breaks ordering rule broker-vips-after-broker (cl-broker-vips finished migrating and thus runs on c-pa-1 before cl-broker started there). Technically broker-vips:1_start_0 goes at the right position, but actually resource is started in migrate_to
Re: [Pacemaker] 'stop' operation passes outdated set of instance attributes to RA
23.02.2015 21:50, David Vossel wrote: - Original Message - On 14 Feb 2015, at 1:10 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I believe it is a bug that the 'stop' operation uses the set of instance attributes from the original 'start' op, not the one the successful 'reload' had. The corresponding pe-input has the correct set of attributes, and the pre-stop 'notify' op uses the updated set of attributes too. This is easily reproducible with the 3.9.6 resource agents and trace_ra. pacemaker is c529898. Should I provide more information? Yes please. I suspect the lrmd needs to update its parameter cache for the reload operation. David? This falls on the crmd, I believe. I haven't tested it, but something like this should fix it, I bet:
diff --git a/crmd/lrm.c b/crmd/lrm.c
index ead2e05..45641d2 100644
--- a/crmd/lrm.c
+++ b/crmd/lrm.c
@@ -186,6 +186,7 @@ update_history_cache(lrm_state_t * lrm_state, lrmd_rsc_info_t * rsc, lrmd_event_
     if (op->params &&
         (safe_str_eq(CRMD_ACTION_START, op->op_type) ||
+         safe_str_eq("reload", op->op_type) ||
          safe_str_eq(CRMD_ACTION_STATUS, op->op_type))) {

         if (entry->stop_params) {
This definitely fixes the issue, thank you!
Re: [Pacemaker] One more globally-unique clone question
24.02.2015 01:58, Andrew Beekhof wrote: On 21 Jan 2015, at 5:08 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.01.2015 03:51, Andrew Beekhof wrote: On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce problem with early stop of globally-unique clone instances during move to another node I found one more interesting problem. Due to the different order of resources in the CIB and extensive use of constraints between other resources (odd number of resources cluster-wide) two CLUSTERIP instances are always allocated to the same node in the new testing cluster. Ah, so this is why broker-vips:1 was moving. That are two different 2-node clusters with different order of resources. In the first one broker-vips go after even number of resources, and one instance wants to return to a mother-node after it is brought back online, thus broker-vips:1 is moving. In the second one, broker-vips go after odd number of resources (actually three more resources are allocated to one node due to constraints) and both boker-vips go to another node. What would be the best/preferred way to make them run on different nodes by default? By default they will. I'm assuming its the constraints that are preventing this. I only see that they are allocated similar to any other resources. Are they allocated in stages though? Ie. Was there a point at which the mother-node was available but constraints prevented broker-vips:1 running there? There are three pe-inputs for the node start. First one starts fence device for the other node, dlm+clvm+gfs and drbd on the online-back node. Second one tries to start/promote/move everything else until it is interrupted (by the drbd RA?). Third one finishes that attempt. I've lost all context on this and I don't seem to be able to reconstruct it :) Which part of the above is the problem? ;) In this thread the point is: * all resources have the same default priority * there are several triples of resources which are grouped by order/colocation constraints. Let's call them triples. * There is globally-unique cluster-ip clone with clone-max=2 clone-node-max=2 stickiness=0, which is allocated after all triples (it goes after them in CIB). * If number of triples is odd, then in two-node cluster both cluster-ip instances are allocated to the same node. And yes, CTDB depends on GFS2 filesystem, so broker-vips:1 can't be allocated immediately due to constraints. It is allocated in the second pe-input. May be it is worth sending crm-report to you in order to not overload list by long listings and you have complete information? Getting them to auto-rebalance is the harder problem I see. Should it be possible to solve it without priority or utilization use? it meaning auto-rebalancing or your original issue? I meant auto-rebalancing. It should be something we handle internally. I've made a note of it. I see following options: * Raise priority of globally-unique clone so its instances are always allocated first of all. * Use utilization attributes (with high values for nodes and low values for cluster resources). * Anything else? If I configure virtual IPs one-by-one (without clone), I can add a colocation constraint with negative score between them. I do not see a way to scale that setup well though (5-10 IPs). So, what would be the best option to achieve the same with globally-unique cloned resource? 
Maybe there should be some internal preference/colocation not to place them together (like a default stickiness=1 for clones)? Or even allow a special negative colocation constraint with the same resource in both 'what' and 'with' (colocation col1 -1: clone clone)? Best, Vladislav
[Pacemaker] 'stop' operation passes outdated set of instance attributes to RA
Hi, I believe it is a bug that the 'stop' operation uses the set of instance attributes from the original 'start' op, not the one the successful 'reload' had. The corresponding pe-input has the correct set of attributes, and the pre-stop 'notify' op uses the updated set of attributes too. This is easily reproducible with the 3.9.6 resource agents and trace_ra. pacemaker is c529898. Should I provide more information? Best, Vladislav
[Pacemaker] Gracefully failing reload operation
Hi, is there a way for a resource agent to tell pacemaker that in some cases the reload operation is insufficient to apply a new resource definition and a restart is required? I tried returning OCF_ERR_GENERIC, but that prevents the resource from being started until failure-timeout lapses and the cluster is rechecked. Best, Vladislav
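For context, a sketch of the reload handler pattern being asked about, in OCF shell style. The helper names are hypothetical, and whether a non-success return can cleanly request a restart is exactly the open question here:

  myagent_reload() {
      if change_requires_restart; then
          # There is no dedicated "please restart me instead" return code for reload;
          # $OCF_ERR_GENERIC simply marks the resource as failed.
          return $OCF_ERR_GENERIC
      fi
      apply_new_parameters || return $OCF_ERR_GENERIC
      return $OCF_SUCCESS
  }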
[Pacemaker] Pre/post-action notifications for master-slave and clone resources
Hi, Playing with a two-week-old git master on a two-node cluster, I discovered that only a limited set of notify operations is performed for clone and master-slave instances when all of them are being started/stopped.
Clones (anonymous):
* post-start
* pre-stop
M/S:
* post-start
* post-promote
* pre-demote
* pre-stop
According to Pacemaker Explained there should be more:
* pre-start
* pre-promote
* post-demote
* post-stop
Some notifications (pre-stop for my clone and pre-demote for the ms) are repeated twice (due to transition aborts, or the fact that multiple instances are stopping/demoting?), but that has minor impact for me. I tested that by setting the stop-all-resources property to 'true' and 'false'. On the other hand, if I put one node with running instances into standby and then back online, I see all the missing notifications. Is it intended that the actions above are not performed when all instances are handled simultaneously? One more question about 'post' notifications: are they sent to the RA right after the corresponding main action finishes, or do they wait in the transition queue? In other words, is it possible to get a post-stop notification that the foreign instance is stopped while the stop action on the local instance is still running? Best, Vladislav
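For reference, notifications reach the RA as a 'notify' action whose type and operation arrive in meta environment variables, so a handler distinguishes the cases roughly like this (shell sketch; the OCF_RESKEY_CRM_meta_notify_* variables are the standard ones, the actions taken are placeholders):

  myagent_notify() {
      case "${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}" in
          pre-start) prepare_for_peer_start ;;
          post-stop) forget_stopped_peers ;;
          *) : ;;   # other combinations ignored in this sketch
      esac
      return $OCF_SUCCESS
  }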
Re: [Pacemaker] new release date for resource-agents release 3.9.6
Hi Dejan, if it is not too late, would it be possible to add the output of the environment to the resource trace file when tracing is enabled?
--- ocf-shellfuncs.orig 2015-01-26 15:50:34.435001364 +
+++ ocf-shellfuncs      2015-01-26 15:49:19.707001542 +
@@ -822,6 +822,7 @@
     fi
     PS4='+ `date +%T`: ${FUNCNAME[0]:+${FUNCNAME[0]}:}${LINENO}: '
     set -x
+    env=$( echo; printenv | sort )
 }
 ocf_stop_trace() {
     set +x
Best, Vladislav 23.01.2015 18:45, Dejan Muhamedagic wrote: Hello everybody, Someone warned us that three days is too short a period to test a release, so let's postpone the final release of resource-agents v3.9.6 to: Tuesday, Jan 27. Please do more testing in the meantime. The v3.9.6-rc1 packages are available for most popular platforms: http://download.opensuse.org/repositories/home:/dmuhamedagic:/branches:/network:/ha-clustering:/Stable RHEL-7 and Fedora 21 are unfortunately missing, due to some strange unresolvable dependency issue. Debian/Ubuntu people can use alien. Many thanks! The resource-agents crowd
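For context, a sketch of how per-operation tracing is usually switched on, which is where the environment dump proposed above would end up (the crmsh subcommand and the trace directory shown are the usual defaults, but both may differ between versions and distributions):

  # crm resource trace fs-drbd start   # sets trace_ra=1 on the start operation of resource fs-drbd (name hypothetical)
  # ls /var/lib/heartbeat/trace_ra/    # one trace file is written per traced RA invocation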
Re: [Pacemaker] [crmsh][Question] The order of resources is changed.
Hi, 21.01.2015 12:50, Kristoffer Grönlund wrote: Hello, renayama19661...@ybb.ne.jp writes: Hi All, We confirmed this crmsh behaviour with the following combination: * corosync-2.3.4 * pacemaker-1.1.12 * crmsh-2.1.0 With the new crmsh, does 'options sort-elements no' no longer work? Is there another option that does not change the order? I second this issue, as I was just hit by it - pacemaker handles resources in the order they appear in the CIB (http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_which_resource_is_preferred_to_be_chosen_to_get_assigned_first.html - the first runnable resource listed in the CIB gets allocated first). Thus it may unnecessarily allocate all globally-unique clone instances to just one node if the other node already has many more resources allocated due to constraints. Moreover, I would say that, due to how clone/ms resources are represented in the XML, it would be nice if crmsh did not put them after all primitive resources when sorting is disabled. I would suggest: ... primitive p1 ... primitive p2 ... clone cl-p2 p2 ... primitive p3 ... ... instead of the current ... primitive p1 ... primitive p2 ... primitive p3 ... clone cl-p2 p2 ... ... It would also be nice to have a way to change the existing resource order in 'crm configure edit/upload'. Best, Vladislav It is possible that there is a bug in crmsh; I will investigate. Could you file an issue for this problem at http://github.com/crmsh/crmsh/issues ? This would help me track the problem. Thank you!
Re: [Pacemaker] Unique clone instance is stopped too early on move
20.01.2015 02:44, Andrew Beekhof wrote: On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found a little bit strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh syntax)? === ... clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker 0: cl-broker cl-broker-vips ... === After I put one node to standby and then back to online, I see the following transition (relevant excerpt): === * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: cl-ctdb_start_0 * Resource action: ctdbstart on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdbmonitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Resource action: broker-vips:1 start on c-pa-1 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 === What could be a reason to stop unique clone instance so early for move? Do not take it as definitive answer, but cl-broker-vips cannot run unless both other resources are started. So if you compute closure of all required transitions it looks rather logical. Having cl-broker-vips started while broker is still stopped would violate constraint. Problem is that broker-vips:1 is stopped on one (source) node unnecessarily early. It looks to be moving from c-pa-0 to c-pa-1 It might be unnecessarily early, but it is what you asked for... we have to unwind the resource stack before we can build it up. Yes, I understand that it is valid, but could its stop be delayed until cluster is in the state when all dependencies are satisfied to start it on another node (like migration?)? No, because we have to unwind the resource stack before we can build it up. Doing anything else would be one of those things that is trivial for a human to identify but rather complex for a computer. I believe there is also an issue with migration of clone instances. I modified pe-input to allow migration of cl-broker-vips (and also set inf score for broker-vips-after-broker and make cl-broker-vips interleaved). 
Relevant part is: clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker inf: cl-broker cl-broker-vips After that (part of) transition is: * Resource action: broker-vips:1 migrate_to on c-pa-0 * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 migrate_from on c-pa-1 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: all_stopped * Pseudo action: cl-ctdb_start_0 * Resource action: ctdbstart on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdbmonitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Pseudo action: broker-vips:1_start_0 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 But, I would say that at least from a human logic PoV the above breaks ordering rule broker-vips-after-broker (cl-broker-vips finished migrating and thus runs on c-pa-1 before cl-broker started there). Technically broker-vips:1_start_0 goes at the right position, but actually resource is started in migrate_to/mifrate_from. I also went further and injected
Re: [Pacemaker] One more globally-unique clone question
21.01.2015 03:51, Andrew Beekhof wrote: On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce problem with early stop of globally-unique clone instances during move to another node I found one more interesting problem. Due to the different order of resources in the CIB and extensive use of constraints between other resources (odd number of resources cluster-wide) two CLUSTERIP instances are always allocated to the same node in the new testing cluster. Ah, so this is why broker-vips:1 was moving. That are two different 2-node clusters with different order of resources. In the first one broker-vips go after even number of resources, and one instance wants to return to a mother-node after it is brought back online, thus broker-vips:1 is moving. In the second one, broker-vips go after odd number of resources (actually three more resources are allocated to one node due to constraints) and both boker-vips go to another node. What would be the best/preferred way to make them run on different nodes by default? By default they will. I'm assuming its the constraints that are preventing this. I only see that they are allocated similar to any other resources. Are they allocated in stages though? Ie. Was there a point at which the mother-node was available but constraints prevented broker-vips:1 running there? There are three pe-inputs for the node start. First one starts fence device for the other node, dlm+clvm+gfs and drbd on the online-back node. Second one tries to start/promote/move everything else until it is interrupted (by the drbd RA?). Third one finishes that attempt. And yes, CTDB depends on GFS2 filesystem, so broker-vips:1 can't be allocated immediately due to constraints. It is allocated in the second pe-input. May be it is worth sending crm-report to you in order to not overload list by long listings and you have complete information? Getting them to auto-rebalance is the harder problem I see. Should it be possible to solve it without priority or utilization use? it meaning auto-rebalancing or your original issue? I meant auto-rebalancing. I see following options: * Raise priority of globally-unique clone so its instances are always allocated first of all. * Use utilization attributes (with high values for nodes and low values for cluster resources). * Anything else? If I configure virtual IPs one-by-one (without clone), I can add a colocation constraint with negative score between them. I do not see a way to scale that setup well though (5-10 IPs). So, what would be the best option to achieve the same with globally-unique cloned resource? May be there should be some internal preference/colocation not to place them together (like default stickiness=1 for clones)? Or even allow special negative colocation constraint and the same resource in both 'what' and 'with' (colocation col1 -1: clone clone)? 
Best, Vladislav
Re: [Pacemaker] One more globally-unique clone question
20.01.2015 02:47, Andrew Beekhof wrote: On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, Trying to reproduce the problem with the early stop of globally-unique clone instances during a move to another node, I found one more interesting problem. Due to the different order of resources in the CIB and extensive use of constraints between other resources (an odd number of resources cluster-wide), two CLUSTERIP instances are always allocated to the same node in the new testing cluster. Ah, so this is why broker-vips:1 was moving. Those are two different 2-node clusters with a different order of resources. In the first one broker-vips go after an even number of resources, and one instance wants to return to its mother-node after it is brought back online, thus broker-vips:1 is moving. In the second one, broker-vips go after an odd number of resources (actually three more resources are allocated to one node due to constraints) and both broker-vips go to the other node. What would be the best/preferred way to make them run on different nodes by default? By default they will. I'm assuming it's the constraints that are preventing this. I only see that they are allocated similarly to any other resources. Getting them to auto-rebalance is the harder problem I see. Should it be possible to solve it without priority or utilization use? I see the following options: * Raise the priority of the globally-unique clone so its instances are always allocated first of all. * Use utilization attributes (with high values for nodes and low values for cluster resources). * Anything else? If I configure virtual IPs one-by-one (without a clone), I can add a colocation constraint with a negative score between them. I do not see a way to scale that setup well though (5-10 IPs). So, what would be the best option to achieve the same with a globally-unique cloned resource? Maybe there should be some internal preference/colocation not to place them together (like a default stickiness=1 for clones)? Or even allow a special negative colocation constraint with the same resource in both 'what' and 'with' (colocation col1 -1: clone clone)? Best, Vladislav
[Pacemaker] One more globally-unique clone question
Hi all, Trying to reproduce the problem with the early stop of globally-unique clone instances during a move to another node, I found one more interesting problem. Due to the different order of resources in the CIB and extensive use of constraints between other resources (an odd number of resources cluster-wide), two CLUSTERIP instances are always allocated to the same node in the new testing cluster. What would be the best/preferred way to make them run on different nodes by default? I see the following options:
* Raise the priority of the globally-unique clone so its instances are always allocated first of all.
* Use utilization attributes (with high values for nodes and low values for cluster resources).
* Anything else? If I configure virtual IPs one-by-one (without a clone), I can add a colocation constraint with a negative score between them (see the sketch after this message). I do not see a way to scale that setup well though (5-10 IPs).
So, what would be the best option to achieve the same with a globally-unique cloned resource? Maybe there should be some internal preference/colocation not to place them together (like a default stickiness=1 for clones)? Or even allow a special negative colocation constraint with the same resource in both 'what' and 'with' (colocation col1 -1: clone clone)? Best, Vladislav
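For illustration, a crmsh sketch of the per-IP alternative mentioned in the message above: individual IPaddr2 primitives kept apart with a negative colocation score (addresses and scores are hypothetical; this is the pattern that does not scale well to many IPs, which is why a clone-level mechanism is being asked for):

  primitive vip1 ocf:heartbeat:IPaddr2 params ip=192.168.100.101
  primitive vip2 ocf:heartbeat:IPaddr2 params ip=192.168.100.102
  colocation vip1-not-with-vip2 -1000: vip1 vip2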
Re: [Pacemaker] Unique clone instance is stopped too early on move
16.01.2015 07:44, Andrew Beekhof wrote: On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found somewhat strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh syntax)? === ... clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker 0: cl-broker cl-broker-vips ... === After I put one node into standby and then back online, I see the following transition (relevant excerpt): === * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: cl-ctdb_start_0 * Resource action: ctdb start on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdb monitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Resource action: broker-vips:1 start on c-pa-1 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 === What could be the reason to stop a unique clone instance so early on a move? Do not take it as a definitive answer, but cl-broker-vips cannot run unless both other resources are started. So if you compute the closure of all required transitions it looks rather logical. Having cl-broker-vips started while broker is still stopped would violate the constraint. The problem is that broker-vips:1 is stopped on one (source) node unnecessarily early. It looks to be moving from c-pa-0 to c-pa-1. It might be unnecessarily early, but it is what you asked for... we have to unwind the resource stack before we can build it up. Yes, I understand that it is valid, but could its stop be delayed until the cluster is in a state where all dependencies are satisfied to start it on another node (like migration?)? Like: === * Pseudo action: cl-ctdb_start_0 * Resource action: ctdb start on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdb monitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Resource action: broker-vips:1 start on c-pa-1 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 === That would be a great optimization toward five nines... Best, Vladislav
Re: [Pacemaker] BUG: crm_mon prints status of clone instances being started as 'Started'
That fixes the issue, thanks. 12.01.2015 03:38, Andrew Beekhof wrote: I'll push this up soon:
diff --git a/lib/pengine/clone.c b/lib/pengine/clone.c
index 596f701..b83798a 100644
--- a/lib/pengine/clone.c
+++ b/lib/pengine/clone.c
@@ -438,6 +438,10 @@ clone_print(resource_t * rsc, const char *pre_text, long options, void *print_da
             /* Unique, unmanaged or failed clone */
             print_full = TRUE;

+        } else if (is_set(options, pe_print_pending) && native_pending_state(child_rsc) != NULL) {
+            /* In a pending state */
+            print_full = TRUE;
+
         } else if (child_rsc->fns->active(child_rsc, TRUE)) {
             /* Fully active anonymous clone */
             node_t *location = child_rsc->fns->location(child_rsc, NULL, TRUE);
On 10 Jan 2015, at 12:16 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, It seems that clone_print() in lib/pengine/clone.c doesn't respect the pending state of clone/ms resource instances in the cumulative output: short_print(list_text, child_text, rsc->variant == pe_master ? "Slaves" : "Started", options, print_data); In the by-node output crm_mon prints the correct 'Starting'. Best, Vladislav
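For context, the pe_print_pending flag checked in the patch above is only set when the status display is asked to show pending operations, so to actually see pending states in the cumulative output something along these lines is needed (the option spelling may vary between Pacemaker versions, and pending operations are typically only recorded when the record-pending operation option is enabled):

  # crm_mon --pending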
Re: [Pacemaker] Unique clone instance is stopped too early on move
13.01.2015 11:32, Andrei Borzenkov wrote: On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all. I found somewhat strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh syntax)? === ... clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker 0: cl-broker cl-broker-vips ... === After I put one node into standby and then back online, I see the following transition (relevant excerpt): === * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: cl-ctdb_start_0 * Resource action: ctdb start on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdb monitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Resource action: broker-vips:1 start on c-pa-1 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 === What could be the reason to stop a unique clone instance so early on a move? Do not take it as a definitive answer, but cl-broker-vips cannot run unless both other resources are started. So if you compute the closure of all required transitions it looks rather logical. Having cl-broker-vips started while broker is still stopped would violate the constraint. The problem is that broker-vips:1 is stopped on one (source) node unnecessarily early. The ctdb resource takes a very long time to start (almost a minute?), so broker-vips:1 is unavailable during all that time.
[Pacemaker] Unique clone instance is stopped too early on move
Hi Andrew, David, all. I found somewhat strange operation ordering during transition execution. Could you please look at the following partial configuration (crmsh syntax)? === ... clone cl-broker broker \ meta interleave=true target-role=Started clone cl-broker-vips broker-vips \ meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started clone cl-ctdb ctdb \ meta interleave=true target-role=Started colocation broker-vips-with-broker inf: cl-broker-vips cl-broker colocation broker-with-ctdb inf: cl-broker cl-ctdb order broker-after-ctdb inf: cl-ctdb cl-broker order broker-vips-after-broker 0: cl-broker cl-broker-vips ... === After I put one node into standby and then back online, I see the following transition (relevant excerpt): === * Pseudo action: cl-broker-vips_stop_0 * Resource action: broker-vips:1 stop on c-pa-0 * Pseudo action: cl-broker-vips_stopped_0 * Pseudo action: cl-ctdb_start_0 * Resource action: ctdb start on c-pa-1 * Pseudo action: cl-ctdb_running_0 * Pseudo action: cl-broker_start_0 * Resource action: ctdb monitor=1 on c-pa-1 * Resource action: broker start on c-pa-1 * Pseudo action: cl-broker_running_0 * Pseudo action: cl-broker-vips_start_0 * Resource action: broker monitor=1 on c-pa-1 * Resource action: broker-vips:1 start on c-pa-1 * Pseudo action: cl-broker-vips_running_0 * Resource action: broker-vips:1 monitor=3 on c-pa-1 === What could be the reason to stop a unique clone instance so early on a move? I tried different clone/order configurations, including cl-broker-vips:interleave=false, broker-vips-after-broker:score=inf and broker-vips-after-broker:symmetrical=false, but the picture is always the same: broker-vips:1 is stopped first of all. A complete crm_report is available if needed. Best, Vladislav
[Pacemaker] BUG: crm_mon prints status of clone instances being started as 'Started'
Hi all, It seems that clone_print() in lib/pengine/clone.c doesn't respect the pending state of clone/ms resource instances in the cumulative output: short_print(list_text, child_text, rsc->variant == pe_master ? "Slaves" : "Started", options, print_data); In the by-node output crm_mon prints the correct 'Starting'. Best, Vladislav
Re: [Pacemaker] A secondary DRBD is not brought online after crm resource online in Ubuntu 14.04
03.01.2015 20:35, Dmitry Koterov wrote: Hello. Ubuntu 14.04, corosync 2.3.3, pacemaker 1.1.10. The cluster consists of 2 nodes (node1 and node2), when I run crm node standby node2 and then, in a minute, crm node online node2, DRBD secondary on node2 does not start. Logs say that drbdadm -c /etc/drbd.conf check-resize vlv fails with an error message: No valid meta data found on the onlining node. And, surprisingly, after I run service drbd start on node2 manually, everything becomes fine. Maybe something is broken in /usr/lib/ocf/resource.d/linbit/drbd, why cannot it start DRBD? Or I am misconfigured somehow? Could you please give an advice what to do? Please see inline in drbdadm output. I have the following configuration (drbd + mount + postgresql, but postgresql is innocent here, so just ignore it): *root@node2:/var/log#* crm configure show node $id=1017525950 node2 attributes standby=off node $id=1760315215 node1 primitive drbd ocf:linbit:drbd \ params drbd_resource=vlv \ op start interval=0 timeout=240 \ op stop interval=0 timeout=120 primitive fs ocf:heartbeat:Filesystem \ params device=/dev/drbd0 directory=/var/lib/vlv.drbd/root options=noatime,nodiratime fstype=xfs \ op start interval=0 timeout=300 \ op stop interval=0 timeout=300 primitive postgresql lsb:postgresql \ op monitor interval=4 timeout=60 \ op start interval=0 timeout=60 \ op stop interval=0 timeout=60 group pgserver fs postgresql ms ms_drbd drbd \ meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true location cli-prefer-pgserver pgserver inf: node1 colocation col_pgserver inf: pgserver ms_drbd:Master order ord_pgserver inf: ms_drbd:promote pgserver:start property $id=cib-bootstrap-options dc-version=1.1.10-42f2063 cluster-infrastructure=corosync stonith-enabled=false no-quorum-policy=ignore last-lrm-refresh=1420304078 rsc_defaults $id=rsc-options \ resource-stickiness=100 The cluster and DRBD statuses on node2 look healthy: *root@node2:/var/log#* crm status ... Online: [ node1 node2 ] Master/Slave Set: ms_drbd [drbd] Masters: [ node1 ] Slaves: [ node2 ] Resource Group: pgserver fs(ocf::heartbeat:Filesystem):Started node1 postgresql(lsb:postgresql):Started node1 *root@node2:/var/log#* cat /proc/drbd version: 8.4.3 (api:1/proto:86-101) srcversion: F97798065516C94BE0F27DC 0: cs:Connected ro:Secondary/Primary ds:Diskless/UpToDate C r- that is your problem ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 Now I switch node2 to standby and verify that DRBD on it has really shot down: *root@node1:/etc/rc2.d#* crm node standby node2 *root@node2:/var/log#* cat /proc/drbd version: 8.4.3 (api:1/proto:86-101) srcversion: F97798065516C94BE0F27DC *root@node2:/var/log#* * * Then I switch node2 back online and see that DRBD has not been initialized and reattached again! *root@node2:/var/log#* syslog *root@node1:/etc#* crm node online node2 *root@node2:/var/log#* crm status ... 
Online: [ node1 node2 ] Master/Slave Set: ms_drbd [drbd] Masters: [ node1 ] Stopped: [ node2 ] Resource Group: pgserver fs(ocf::heartbeat:Filesystem):Started node1 postgresql(lsb:postgresql):Started node1 Failed actions: drbd_start_0 (node=node2, call=81, rc=1, status=complete, last-rc-change=Sat Jan 3 12:05:32 2015 , queued=1118ms, exec=0ms ): unknown error *root@node2:/var/log#* cat syslog | head -n 30 Jan 3 12:05:31 node2 crmd[918]: notice: do_state_transition: State transition S_IDLE - S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ] Jan 3 12:05:31 node2 cib[913]: notice: cib:diff: Diff: --- 0.29.3 Jan 3 12:05:31 node2 cib[913]: notice: cib:diff: Diff: +++ 0.30.1 027344551b46745123e4a52562e55974 Jan 3 12:05:31 node2 pengine[917]: notice: unpack_config: On loss of CCM Quorum: Ignore Jan 3 12:05:31 node2 pengine[917]: notice: LogActions: Start drbd:1#011(node2) Jan 3 12:05:31 node2 crmd[918]: notice: te_rsc_command: Initiating action 46: notify drbd_pre_notify_start_0 on node1 Jan 3 12:05:31 node2 pengine[917]: notice: process_pe_message: Calculated Transition 11: /var/lib/pacemaker/pengine/pe-input-11.bz2 Jan 3 12:05:32 node2 crmd[918]: notice: te_rsc_command: Initiating action 10: start drbd:1_start_0 on node2 (local) Jan 3 12:05:32 node2 drbd(drbd)[1931]: ERROR: vlv: Called drbdadm -c /etc/drbd.conf check-resize vlv Jan 3 12:05:32 node2 drbd(drbd)[1931]: ERROR: vlv: Exit code 255 Jan 3 12:05:32 node2 drbd(drbd)[1931]: ERROR: vlv: Command output: Jan 3 12:05:32 node2 drbd(drbd)[1931]: ERROR: vlv: Called drbdadm -c /etc/drbd.conf --peer node1 attach vlv Jan 3 12:05:32 node2 drbd(drbd)[1931]: ERROR: vlv: Exit code 255 Jan 3 12:05:32 node2 drbd(drbd)[1931]: ERROR: vlv: Command output: Jan 3 12:05:33 node2 drbd(drbd)[1931]: ERROR: vlv: Called drbdadm -c /etc/drbd.conf --peer node1
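For reference, a diagnostic sketch for the Diskless state visible above, run outside the RA (the resource name vlv is taken from the thread; these are standard drbd-utils commands, not a guaranteed fix, and drbdadm create-md would only be appropriate if the backing device really has no usable metadata):

  # drbdadm dstate vlv    # e.g. Diskless/UpToDate means the local backing disk is detached
  # drbdadm adjust vlv    # roughly what the init script does: (re)attach and (re)connect per the config
  # drbdadm attach vlv    # attach only the local lower-level device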
Re: [Pacemaker] Fencing of bare-metal remote nodes
25.11.2014 23:41, David Vossel wrote: - Original Message - Hi! is subj implemented? Trying echo c /proc/sysrq-trigger on remote nodes and no fencing occurs. Yes, fencing remote-nodes works. Are you certain your fencing devices can handle fencing the remote-node? Fencing a remote-node requires a cluster node to invoke the agent that actually performs the fencing action on the remote-node. David, a couple of questions. I see that in your fencing tests you just stop systemd unit. Shouldn't pacemaker_remoted somehow notify crmd that it is being shutdown? And shouldn't crmd stop all resources on that remote node before granting that shutdown? Also, from what I see now it would be natural to hide current implementation of remote node configuration under node/ syntax. Now remote nodes do have almost all features of normal nodes, including node attributes. What do you think about it? Best, Vladislav -- Vossel Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
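A hedged, illustrative sketch of what "can handle fencing the remote-node" means in practice: a device that a cluster node can run and that explicitly claims the remote node's name. The fence_ipmilan parameters and the node name below are placeholders, not taken from this thread:

    crm configure primitive fence-rnode001 stonith:fence_ipmilan \
        params ipaddr=192.0.2.10 login=admin passwd=secret \
               pcmk_host_list=rnode001 \
        op monitor interval=60s

If no device's host list (static pcmk_host_list/pcmk_host_map or a dynamically queried list) covers the remote node, the cluster has nothing it considers able to fence it with.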
Re: [Pacemaker] Avoid monitoring of resources on nodes
26.11.2014 14:21, Daniel Dehennin wrote: Daniel Dehennin daniel.dehen...@baby-gnu.org writes: I'll try find how to make the change directly in XML. Ok, looking at git history this feature seems only available on master branch and not yet released. I do not have that feature on my pacemaker version. Does it sounds normal, I have: - asymmetrical Opt-in cluster[1] - a group of resources with INFINITY location on a specific node And the nodes excluded are fenced because of many monitor errors about this resource. Nodes may be fenced because of resource _only_ if resource fails to stop. I can only guess what exactly happens: * cluster probes all resource on all nodes (to prevent that you need feature mentioned by David) * some of resource probes return something except not running * cluster tries to stop that resources * stop fails * node is fenced You need to locate what exactly resource returns error on probe and fix that agent (actually you do not use OCF agents but rather upstart jobs and LSB scripts). Above is for the case if all nodes have mysql job and both scripts installed. If pacemaker decides to fence because one of them is missing - that should be a bug. Regards. Footnotes: [1] http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Pacemaker_Explained/_asymmetrical_opt_in_clusters.html ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
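A small sketch of how to locate the failing probe, assuming the lsb:opennebula resource from the configuration quoted later in this thread; the key LSB detail is that a status action on a node where the service is stopped or absent must exit with 3 ("not running"), otherwise the probe is treated as a failure:

    crm_mon -1 --failcounts                        # which resource fails, and on which node?
    /etc/init.d/opennebula status ; echo "rc=$?"   # on an excluded node: expect rc=3,
                                                   # anything else makes the probe look like an error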
Re: [Pacemaker] Fencing of bare-metal remote nodes
26.11.2014 18:36, David Vossel wrote: - Original Message - 25.11.2014 23:41, David Vossel wrote: - Original Message - Hi! is subj implemented? Trying echo c /proc/sysrq-trigger on remote nodes and no fencing occurs. Yes, fencing remote-nodes works. Are you certain your fencing devices can handle fencing the remote-node? Fencing a remote-node requires a cluster node to invoke the agent that actually performs the fencing action on the remote-node. Yes, if I invoke fencing action manually ('crm node fence rnode' in crmsh syntax), node is fenced. So the issue seems to be related to the detection of a need fencing. Comments in related git commits are a little bit terse in this area. So could you please explain what exactly needs to happen on a remote node to initiate fencing? I tried so far: * kill pacemaker_remoted when no resources are running. systemd restated it and crmd reconnected after some time. * crash kernel when no resources are running * crash kernel during massive start of resources this last one should definitely cause fencing. What version of pacemaker are you using? I've made changes in this area recently. Can you provide a crm_report. It's c191bf3. crm_report is ready, but I still wait an approval from a customer to send it. -- David No fencing happened. In the last case that start actions 'hung' and were failed by timeout (it is rather long), node was not even listed as failed. My customer asked me to stop crashing nodes because one of them does not boot anymore (I like that modern UEFI hardware very much.), so it is hard for me to play more with that. Best, Vladislav -- Vossel Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
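A short sketch for checking the fencing path to a remote node by hand, using the commands already mentioned in this thread (the node name is illustrative):

    stonith_admin --list rnode001    # which devices claim they can fence rnode001?
    crm node fence rnode001          # crmsh wrapper; roughly equivalent to:
    stonith_admin --reboot rnode001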
Re: [Pacemaker] [Linux-HA] [ha-wg] [RFC] Organizing HA Summit 2015
25.11.2014 12:54, Lars Marowsky-Bree wrote:... OK, let's switch tracks a bit. What *topics* do we actually have? Can we fill two days? Where would we want to collect them? Just my 2c. - It would be interesting to get some bird-view information on what C APIs corosync and pacemaker currently provide to application developers (one immediate use-case is in-app monitoring of the cluster events). - One more (more developer-bounded) topic could be a resource degraded state support. From the user perspective it would be nice to have. One immediate example is iscsi connection to several portals. When some portals are not accessible, connection still may work, but in the degraded state. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Suicide fencing and watchdog questions
27.11.2014 03:43, Andrew Beekhof wrote: On 25 Nov 2014, at 10:37 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, Is there any information how watchdog integration is intended to work? What are currently-evaluated use-cases for that? It seems to be forcibly disabled id SBD is not detected... Are you referring to no-quorum-policy=suicide? That too. But main intention was to understand what value that feature can bring at all. I tried to enable it without SBD or no-quorum-policy=suicide and watchdog was not fired up. Then I looked at sources and realized that it is enabled only when SBD is detected, and is not actually managed by the cluster option. Also, is there any way to make node (in one-node cluster ;) ) to suicide if it detects fencing is required? Technically, that can be done with IPMI 'power cycle' or 'power reset' commands - but node (and thus the whole cluster) will not know about fencing is succeeded, because if it received the answer, then fencing failed. But node will be hard reboot and thus cleaned up otherwise. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] Fencing of bare-metal remote nodes
Hi! is subj implemented? Trying 'echo c > /proc/sysrq-trigger' on remote nodes and no fencing occurs. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] Suicide fencing and watchdog questions
Hi, Is there any information on how watchdog integration is intended to work? What are the currently-evaluated use-cases for it? It seems to be forcibly disabled if SBD is not detected... Also, is there any way to make a node (in a one-node cluster ;) ) commit suicide if it detects that fencing is required? Technically, that can be done with IPMI 'power cycle' or 'power reset' commands - but the node (and thus the whole cluster) will not know whether fencing succeeded, because if it received the answer, then fencing failed. The node will still be hard-rebooted and thus cleaned up regardless. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Avoid monitoring of resources on nodes
25.11.2014 23:36, David Vossel wrote: - Original Message - Daniel Dehennin daniel.dehen...@baby-gnu.org writes: Hello, Hello, I have a 4 nodes cluster and some resources are only installed on 2 of them. I set cluster asymmetry and infinity location: primitive Mysqld upstart:mysql \ op monitor interval=60 primitive OpenNebula-Sunstone-Sysv lsb:opennebula-sunstone \ op monitor interval=60 primitive OpenNebula-Sysv lsb:opennebula \ op monitor interval=60 group OpenNebula Mysqld OpenNebula-Sysv OpenNebula-Sunstone-Sysv \ meta target-role=Started location OpenNebula-runs-on-Frontend OpenNebula inf: one-frontend property $id=cib-bootstrap-options \ dc-version=1.1.10-42f2063 \ cluster-infrastructure=corosync \ symmetric-cluster=false \ stonith-enabled=true \ stonith-timeout=30 \ last-lrm-refresh=1416817941 \ no-quorum-policy=stop \ stop-all-resources=off But I have a lot of failing monitoring on other nodes of these resources because they are not installed on them. Is there a way to completely exclude the resources from nodes, even the monitoring? actually, this is possible now. I am unaware of any configuration tools (pcs or crmsh) that support this feature yet though. You might have to edit the cib xml manually. There's a new 'resource-discovery' option you can set on a location constraint that help prevent resources from ever being started or monitored on a node. crmsh git master supports that. One note is that pacemaker validation schema should be set to 'pacemaker-next'. Example: never start or monitor the resource FAKE1 on 18node2. rsc_location id=location-FAKE1-18node2 node=18node2 resource-discovery=never rsc=FAKE1 score=-INFINITY/ There are more examples in this regression test. https://github.com/ClusterLabs/pacemaker/blob/master/pengine/test10/resource-discovery.xml#L99 -- Vossel This cause troubles on my setup, as resources fails, my nodes are all fenced. Any hints? Regards. -- Daniel Dehennin Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
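As noted above, crmsh git master understands the same option; a rough sketch of the equivalent is below. The cibadmin call is one known way to switch the validation schema, and the crmsh line assumes the syntax of current git at the time (resource and node names taken from the example above):

    cibadmin --modify --xml-text '<cib validate-with="pacemaker-next"/>'
    crm configure location location-FAKE1-18node2 FAKE1 \
        resource-discovery=never -inf: 18node2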
Re: [Pacemaker] Globally-unique clone cleanup on remote nodes
20.11.2014 01:57, Andrew Beekhof пишет: On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster consisting of one cluster node and 3 remote bare-metal nodes. When I run crm_resource -C -r 'g_u_clone' (crmsh does that for me), it writes that it expects to receive 50 answers (although it writes 4 lines for per-instance requests, one for each node). Is that correct (I would expect to see 200 there)? At most it would be 151, not 200. How many times do you see this printed? printf(Cleaning up %s on %s\n, rsc-id, host_uname); 200. Ok, I didn't count, but this is printed once per instance per node, 50 instances, 4 nodes (one cluster node and 3 remote ones). Version is c191bf3. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Globally-unique clone cleanup on remote nodes
20.11.2014 09:25, Andrew Beekhof пишет: On 20 Nov 2014, at 5:12 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 20.11.2014 01:57, Andrew Beekhof пишет: On 19 Nov 2014, at 4:27 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster consisting of one cluster node and 3 remote bare-metal nodes. When I run crm_resource -C -r 'g_u_clone' (crmsh does that for me), it writes that it expects to receive 50 answers (although it writes 4 lines for per-instance requests, one for each node). Is that correct (I would expect to see 200 there)? At most it would be 151, not 200. How many times do you see this printed? printf(Cleaning up %s on %s\n, rsc-id, host_uname); 200. Ok, I didn't count, but this is printed once per instance per node, 50 instances, 4 nodes (one cluster node and 3 remote ones). ah, 201 then. But actually, we don't expect replies from pacemaker-remote nodes. Aha, got the idea, so what I see is correct, right? Btw, if I run the same command per remote node, no expectation line is printed at all. Thanks. Version is c191bf3. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
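A hedged sketch of the per-node cleanup mentioned above; the node-selection option is assumed to be --node/-N in this 1.1.x build (older builds used -H/--host-uname instead):

    # clean the clone's state on a single (remote) node instead of cluster-wide
    crm_resource --cleanup -r g_u_clone --node rnode001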
[Pacemaker] Globally-unique clone cleanup on remote nodes
Hi all, just an observation. I have a globally-unique clone with 50 instances in a cluster consisting of one cluster node and 3 remote bare-metal nodes. When I run crm_resource -C -r 'g_u_clone' (crmsh does that for me), it writes that it expects to receive 50 answers (although it writes 4 lines for per-instance requests, one for each node). Is that correct (I would expect to see 200 there)? Version is c191bf3. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] resource-discovery question
Hi David, all, I'm trying to get resource-discovery=never working with cd7c9ab, but I still get "Not installed" probe failures from nodes which do not have the corresponding resource agents installed. The only difference in my location constraints compared to what is committed in #589 is that they are rule-based (to match #kind). Is that supposed to work with the current master, or is it still TBD? My location constraints look like:

    <rsc_location id="vlan003-on-cluster-nodes" rsc="vlan003" resource-discovery="never">
      <rule score="-INFINITY" id="vlan003-on-cluster-nodes-rule">
        <expression attribute="#kind" operation="ne" value="cluster" id="vlan003-on-cluster-nodes-rule-expression"/>
      </rule>
    </rsc_location>

Do I miss something? Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] resource-discovery question
12.11.2014 22:04, Vladislav Bogdanov wrote: Hi David, all, I'm trying to get resource-discovery=never working with cd7c9ab, but still get Not installed probe failures from nodes which does not have corresponding resource agents installed. The only difference in my location constraints comparing to what is committed in #589 is that they are rule-based (to match #kind). Is that supposed to work with the current master or still TBD? Yep, after I modified constraint to a rule-less syntax, it works: rsc_location id=vlan003-on-cluster-nodes rsc=vlan003 score=-INFINITY node=rnode001 resource-discovery=never/ But I'd prefer to that killer feature to work with rules too :) Although resource-discovery=exclusive with score 0 for multiple nodes should probably also work for me, correct? I cannot test that on a cluster with one cluster node and one remote node. My location constraints look like: rsc_location id=vlan003-on-cluster-nodes rsc=vlan003 resource-discovery=never rule score=-INFINITY id=vlan003-on-cluster-nodes-rule expression attribute=#kind operation=ne value=cluster id=vlan003-on-cluster-nodes-rule-expression/ /rule /rsc_location Do I miss something? Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
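For reference, a sketch of the resource-discovery=exclusive variant asked about above, mirroring the rule-less XML that already works; with score 0 on every allowed node, discovery and placement are restricted to those nodes without preferring any of them (node names illustrative, only one is taken from the thread):

    <!-- node names are illustrative -->
    <rsc_location id="vlan003-excl-node01" rsc="vlan003" score="0" node="node01" resource-discovery="exclusive"/>
    <rsc_location id="vlan003-excl-rnode001" rsc="vlan003" score="0" node="rnode001" resource-discovery="exclusive"/>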
Re: [Pacemaker] resource-discovery question
12.11.2014 22:57, David Vossel wrote: - Original Message - 12.11.2014 22:04, Vladislav Bogdanov wrote: Hi David, all, I'm trying to get resource-discovery=never working with cd7c9ab, but still get Not installed probe failures from nodes which does not have corresponding resource agents installed. The only difference in my location constraints comparing to what is committed in #589 is that they are rule-based (to match #kind). Is that supposed to work with the current master or still TBD? Yep, after I modified constraint to a rule-less syntax, it works: ahh, good catch. I'll take a look! rsc_location id=vlan003-on-cluster-nodes rsc=vlan003 score=-INFINITY node=rnode001 resource-discovery=never/ But I'd prefer to that killer feature to work with rules too :) Although resource-discovery=exclusive with score 0 for multiple nodes should probably also work for me, correct? yep it should. I cannot test that on a cluster with one cluster node and one remote node. this feature should work the same with remote nodes and cluster nodes. I'll get a patch out for the rule issue. I'm also pushing out some documentation for the resource-discovery option. It seems like you've got a good handle on it already though :) Oh, I see new pull-request, thank you very much! One side question: Is default value for clone-max influenced by resource-discovery value(s)? My location constraints look like: rsc_location id=vlan003-on-cluster-nodes rsc=vlan003 resource-discovery=never rule score=-INFINITY id=vlan003-on-cluster-nodes-rule expression attribute=#kind operation=ne value=cluster id=vlan003-on-cluster-nodes-rule-expression/ /rule /rsc_location Do I miss something? Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5
11.11.2014 07:27, Sihan Goi wrote: Hi, DocumentRoot is still set to /var/www/html ls -al /var/www/html shows different things on the 2 nodes node01: total 28 drwxr-xr-x. 3 root root 4096 Nov 11 12:25 . drwxr-xr-x. 6 root root 4096 Jul 23 22:18 .. -rw-r--r--. 1 root root50 Oct 28 18:00 index.html drwx--. 2 root root 16384 Oct 28 17:59 lost+found node02 only has index.html, no lost+found, and it's a different version of the file. It look like apache is unable to stat its document root. Could you please show output of two commands: getenforce ls -dZ /var/www/html on both nodes when fs is mounted on one of them? If you see 'Enforcing', and the last part of the selinux context of a mounted fs root is not httpd_sys_content_t, then run 'restorecon -R /var/www/html' on that node. Status URL is enabled in both nodes. On Oct 30, 2014 11:14 AM, Andrew Beekhof and...@beekhof.net mailto:and...@beekhof.net wrote: On 29 Oct 2014, at 1:01 pm, Sihan Goi gois...@gmail.com mailto:gois...@gmail.com wrote: Hi, I've never used crm_report before. I just read the man file and generated a tarball from 1-2 hours before I reconfigured all the DRBD related resources. I've put the tarball here - https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0 Hope you can help figure out what I'm doing wrong. Thanks for the help! Oct 28 18:13:38 node02 Filesystem(WebFS)[29940]: INFO: Running start for /dev/drbd/by-res/wwwdata on /var/www/html Oct 28 18:13:39 node02 kernel: EXT4-fs (drbd1): mounted filesystem with ordered data mode. Opts: Oct 28 18:13:39 node02 crmd[9870]: notice: process_lrm_event: LRM operation WebFS_start_0 (call=164, rc=0, cib-update=298, confirmed=true) ok Oct 28 18:13:39 node02 crmd[9870]: notice: te_rsc_command: Initiating action 7: start WebSite_start_0 on node02 (local) Oct 28 18:13:39 node02 apache(WebSite)[30007]: ERROR: Syntax error on line 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory Is DocumentRoot still set to /var/www/html? If so, what happens if you run 'ls -al /var/www/html' in a shell? Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: apache not running Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up Did you enable the status url? http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_enable_the_apache_status_url.html ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org mailto:Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
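If the relabel turns out to be needed, a small sketch to make it persistent across remounts and failovers (standard SELinux tooling; semanage comes from policycoreutils-python on CentOS 6):

    semanage fcontext -a -t httpd_sys_content_t "/var/www/html(/.*)?"
    restorecon -Rv /var/www/html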
Re: [Pacemaker] #kind eq container matches bare-metal nodes
23.10.2014 22:39, David Vossel wrote: - Original Message - 21.10.2014 06:25, Vladislav Bogdanov wrote: 21.10.2014 05:15, Andrew Beekhof wrote: On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against cluster and container. Bare-metal remote nodes match container (they are remote), but strictly speaking they are not containers. Could/should that attribute be extended to the bare-metal use case? Unclear, the intent was 'nodes that aren't really cluster nodes'. Whats the usecase for wanting to tell them apart? (I can think of some, just want to hear yours) I want VM resources to be placed only on bare-metal remote nodes. -inf: #kind ne container looks a little bit strange. #kind ne remote would be more descriptive (having now them listed in CIB with 'remote' type). One more case (which is what I'd like to use in the mid-future) is a mixed remote-node environment, where VMs run on bare-metal remote nodes using storage from cluster nodes (f.e. sheepdog), and some of that VMs are whitebox containers themselves (they run services controlled by pacemaker via pacemaker_remoted). Having constraint '-inf: #kind ne container' is not enough to not try to run VMs inside of VMs - both bare-metal remote nodes and whitebox containers match 'container'. remember, you can't run remote-nodes nested within remote-nodes... so container nodes on baremetal remote-nodes won't work. Good to know, thanks. That imho should go into the documentation in bold red :) Is that a conceptual limitation or it is just not yet supported? You don't have to be careful about not messing this up or anything. You can mix container nodes and baremetal remote-nodes and everything should work fine. The policy engine will never allow you to place a container node on a baremetal remote-node though. -- David ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] #kind eq container matches bare-metal nodes
21.10.2014 06:25, Vladislav Bogdanov wrote: 21.10.2014 05:15, Andrew Beekhof wrote: On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against cluster and container. Bare-metal remote nodes match container (they are remote), but strictly speaking they are not containers. Could/should that attribute be extended to the bare-metal use case? Unclear, the intent was 'nodes that aren't really cluster nodes'. Whats the usecase for wanting to tell them apart? (I can think of some, just want to hear yours) I want VM resources to be placed only on bare-metal remote nodes. -inf: #kind ne container looks a little bit strange. #kind ne remote would be more descriptive (having now them listed in CIB with 'remote' type). One more case (which is what I'd like to use in the mid-future) is a mixed remote-node environment, where VMs run on bare-metal remote nodes using storage from cluster nodes (f.e. sheepdog), and some of that VMs are whitebox containers themselves (they run services controlled by pacemaker via pacemaker_remoted). Having constraint '-inf: #kind ne container' is not enough to not try to run VMs inside of VMs - both bare-metal remote nodes and whitebox containers match 'container'. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] #kind eq container matches bare-metal nodes
Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against cluster and container. Bare-metal remote nodes match container (they are remote), but strictly speaking they are not containers. Could/should that attribute be extended to the bare-metal use case? Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] #kind eq container matches bare-metal nodes
21.10.2014 05:15, Andrew Beekhof wrote: On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, It seems like #kind was introduced before bare-metal remote node support, and now it is matched against cluster and container. Bare-metal remote nodes match container (they are remote), but strictly speaking they are not containers. Could/should that attribute be extended to the bare-metal use case? Unclear, the intent was 'nodes that aren't really cluster nodes'. Whats the usecase for wanting to tell them apart? (I can think of some, just want to hear yours) I want VM resources to be placed only on bare-metal remote nodes. -inf: #kind ne container looks a little bit strange. #kind ne remote would be more descriptive (having now them listed in CIB with 'remote' type). ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
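For reference, the rule above in full crmsh location syntax; as discussed, '#kind ne container' only bans plain cluster nodes and cannot tell bare-metal remote nodes apart from whitebox containers (the resource name is illustrative):

    location vm1-only-on-remote-nodes vm1 \
        rule -inf: #kind ne container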
Re: [Pacemaker] Corosync and Pacemaker Hangs
: action_timer_callback: Timer popped (timeout=2, abort_level=100, complete=false) node01 crmd[952]:error: print_synapse: [Action 68]: In-flight rsc op drbd_pg_pre_notify_demote_0 on node02 (priority: 0, waiting: none) node01 crmd[952]: warning: cib_action_update: rsc_op 68: drbd_pg_pre_notify_demote_0 on node02 timed out node01 crmd[952]:error: cib_action_updated: Update 297 FAILED: Timer expired node01 crmd[952]:error: stonith_async_timeout_handler: Async call 2 timed out after 12ms node01 crmd[952]: notice: tengine_stonith_callback: Stonith operation 2/54:261:0:6978227d-ce2d-4dc6-955a-eb9313f112a5: Timer expired (-62) node01 crmd[952]: notice: tengine_stonith_callback: Stonith operation 2 for node02 failed (Timer expired): aborting transition. node01 crmd[952]: notice: run_graph: Transition 261 (Complete=6, Pending=0, Fired=0, Skipped=29, Incomplete=15, Source=/var/lib/pacemaker/pengine/pe-warn-0.bz2): Stopped node01 pengine[951]: notice: unpack_config: On loss of CCM Quorum: Ignore node01 pengine[951]: warning: pe_fence_node: Node node02 will be fenced because our peer process is no longer available node01 pengine[951]: warning: determine_online_status: Node node02 is unclean node01 pengine[951]: warning: stage6: Scheduling Node node02 for STONITH node01 pengine[951]: notice: LogActions: Movefs_pg#011(Started node02 - node01) node01 pengine[951]: notice: LogActions: Moveip_pg#011(Started node02 - node01) node01 pengine[951]: notice: LogActions: Movelsb_pg#011(Started node02 - node01) node01 pengine[951]: notice: LogActions: Demote drbd_pg:0#011(Master - Stopped node02) node01 pengine[951]: notice: LogActions: Promote drbd_pg:1#011(Slave - Master node01) node01 pengine[951]: notice: LogActions: Stopp_fence:0#011(node02) node01 crmd[952]: notice: te_fence_node: Executing reboot fencing operation (53) on node02 (timeout=6) node01 stonith-ng[948]: notice: handle_request: Client crmd.952.6d7ac808 wants to fence (reboot) 'node02' with device '(any)' node01 stonith-ng[948]: notice: initiate_remote_stonith_op: Initiating remote operation reboot for node02: a4fae8ce-3a6c-4fe5-a934-b5b83ae123cb (0) node01 crmd[952]: notice: te_rsc_command: Initiating action 67: notify drbd_pg_pre_notify_demote_0 on node02 node01 crmd[952]: notice: te_rsc_command: Initiating action 69: notify drbd_pg_pre_notify_demote_0 on node01 (local) node01 pengine[951]: warning: process_pe_message: Calculated Transition 262: /var/lib/pacemaker/pengine/pe-warn-1.bz2 node01 crmd[952]: notice: process_lrm_event: LRM operation drbd_pg_notify_0 (call=66, rc=0, cib-update=0, confirmed=true) ok Last updated: Mon Sep 15 01:15:59 2014 Last change: Sat Sep 13 15:23:45 2014 via cibadmin on node01 Stack: corosync Current DC: node01 (167936788) - partition with quorum Version: 1.1.10-42f2063 2 Nodes configured 7 Resources configured Node node02 (167936789): UNCLEAN (online) Online: [ node01 ] Resource Group: PGServer fs_pg (ocf::heartbeat:Filesystem):Started node02 ip_pg (ocf::heartbeat:IPaddr2): Started node02 lsb_pg (lsb:postgresql): Started node02 Master/Slave Set: ms_drbd_pg [drbd_pg] Masters: [ node02 ] Slaves: [ node01 ] Clone Set: cln_p_fence [p_fence] Started: [ node01 node02 ] Thank you, Norbert On Fri, Sep 12, 2014 at 12:06 PM, Vladislav Bogdanov bub...@hoster-ok.com mailto:bub...@hoster-ok.com wrote: 12.09.2014 05:00, Norbert Kiam Maclang wrote: Hi, After adding resource level fencing on drbd, I still ended up having problems with timeouts on drbd. Is there a recommended settings for this? 
I followed what is written in the drbd documentation - http://www.drbd.org/users-guide-emb/s-pacemaker-crm-drbd-backed-service.html , Another thing I can't understand is why during initial tests, even I reboot the vms several times, failover works. But after I soak it for a couple of hours (say for example 8 hours or more) and continue with the tests, it will not failover and experience split brain. I confirmed it though that everything is healthy before performing a reboot. Disk health and network is good, drbd is synced, time beetween servers is good. I recall I've seen something similar a year ago (near the time your pacemaker version is dated). I do not remember what was the exact problem cause, but I saw that drbd RA timeouts because it waits for something (fencing) in the kernel space to be done. drbd calls userspace scripts from within kernelspace, and you'll see them in the process list with the drbd kernel thread as a parent. I'd also upgrade your corosync configuration from member to nodelist syntax, specifying name parameter together with ring0_addr for nodes (that parameter
Re: [Pacemaker] Corosync and Pacemaker Hangs
11.09.2014 05:57, Norbert Kiam Maclang wrote: Is this something to do with quorum? But I already set You'd need to configure fencing at the drbd resources level. http://www.drbd.org/users-guide-emb/s-pacemaker-fencing.html#s-pacemaker-fencing-cib property no-quorum-policy=ignore \ expected-quorum-votes=1 Thanks in advance, Kiam On Thu, Sep 11, 2014 at 10:09 AM, Norbert Kiam Maclang norbert.kiam.macl...@gmail.com mailto:norbert.kiam.macl...@gmail.com wrote: Hi, Please help me understand what is causing the problem. I have a 2 node cluster running on vms using KVM. Each vm (I am using Ubuntu 14.04) runs on a separate hypervisor on separate machines. All are working well during testing (I restarted the vms alternately), but after a day when I kill the other node, I always end up corosync and pacemaker hangs on the surviving node. Date and time on the vms are in sync, I use unicast, tcpdump shows both nodes exchanges, confirmed that DRBD is healthy and crm_mon show good status before I kill the other node. Below are my configurations and versions I used: corosync 2.3.3-1ubuntu1 crmsh1.2.5+hg1034-1ubuntu3 drbd8-utils 2:8.4.4-1ubuntu1 libcorosync-common4 2.3.3-1ubuntu1 libcrmcluster4 1.1.10+git20130802-1ubuntu2 libcrmcommon31.1.10+git20130802-1ubuntu2 libcrmservice1 1.1.10+git20130802-1ubuntu2 pacemaker1.1.10+git20130802-1ubuntu2 pacemaker-cli-utils 1.1.10+git20130802-1ubuntu2 postgresql-9.3 9.3.5-0ubuntu0.14.04.1 # /etc/corosync/corosync: totem { version: 2 token: 3000 token_retransmits_before_loss_const: 10 join: 60 consensus: 3600 vsftype: none max_messages: 20 clear_node_high_bit: yes secauth: off threads: 0 rrp_mode: none interface { member { memberaddr: 10.2.136.56 } member { memberaddr: 10.2.136.57 } ringnumber: 0 bindnetaddr: 10.2.136.0 mcastport: 5405 } transport: udpu } amf { mode: disabled } quorum { provider: corosync_votequorum expected_votes: 1 } aisexec { user: root group: root } logging { fileline: off to_stderr: yes to_logfile: no to_syslog: yes syslog_facility: daemon debug: off timestamp: on logger_subsys { subsys: AMF debug: off tags: enter|leave|trace1|trace2|trace3|trace4|trace6 } } # /etc/corosync/service.d/pcmk: service { name: pacemaker ver: 1 } /etc/drbd.d/global_common.conf: global { usage-count no; } common { net { protocol C; } } # /etc/drbd.d/pg.res: resource pg { device /dev/drbd0; disk /dev/vdb; meta-disk internal; startup { wfc-timeout 15; degr-wfc-timeout 60; } disk { on-io-error detach; resync-rate 40M; } on node01 { address 10.2.136.56:7789 http://10.2.136.56:7789; } on node02 { address 10.2.136.57:7789 http://10.2.136.57:7789; } net { verify-alg md5; after-sb-0pri discard-zero-changes; after-sb-1pri discard-secondary; after-sb-2pri disconnect; } } # Pacemaker configuration: node $id=167938104 node01 node $id=167938105 node02 primitive drbd_pg ocf:linbit:drbd \ params drbd_resource=pg \ op monitor interval=29s role=Master \ op monitor interval=31s role=Slave primitive fs_pg ocf:heartbeat:Filesystem \ params device=/dev/drbd0 directory=/var/lib/postgresql/9.3/main fstype=ext4 primitive ip_pg ocf:heartbeat:IPaddr2 \ params ip=10.2.136.59 cidr_netmask=24 nic=eth0 primitive lsb_pg lsb:postgresql group PGServer fs_pg lsb_pg ip_pg ms ms_drbd_pg drbd_pg \ meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true colocation pg_on_drbd inf: PGServer ms_drbd_pg:Master order pg_after_drbd inf: ms_drbd_pg:promote PGServer:start property $id=cib-bootstrap-options \ dc-version=1.1.10-42f2063 \ cluster-infrastructure=corosync \ stonith-enabled=false \ 
no-quorum-policy=ignore rsc_defaults $id=rsc-options \ resource-stickiness=100 # Logs on node01 Sep 10 10:25:33 node01 crmd[1019]: notice: peer_update_callback: Our peer on the DC is dead
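A hedged corosync 2.x sketch of the two changes suggested elsewhere in this thread for a setup like the one above: a nodelist with explicit names instead of the member entries, and two_node instead of forcing expected_votes to 1. Addresses are taken from the posted config, node IDs are illustrative:

    nodelist {
        node {
            ring0_addr: 10.2.136.56
            name: node01
            nodeid: 1
        }
        node {
            ring0_addr: 10.2.136.57
            name: node02
            nodeid: 2
        }
    }
    quorum {
        provider: corosync_votequorum
        # two_node implies wait_for_all: both nodes must be seen once
        # before either is allowed to run on its own
        two_node: 1
    }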
Re: [Pacemaker] Corosync and Pacemaker Hangs
12.09.2014 05:00, Norbert Kiam Maclang wrote: Hi, After adding resource level fencing on drbd, I still ended up having problems with timeouts on drbd. Is there a recommended settings for this? I followed what is written in the drbd documentation - http://www.drbd.org/users-guide-emb/s-pacemaker-crm-drbd-backed-service.html , Another thing I can't understand is why during initial tests, even I reboot the vms several times, failover works. But after I soak it for a couple of hours (say for example 8 hours or more) and continue with the tests, it will not failover and experience split brain. I confirmed it though that everything is healthy before performing a reboot. Disk health and network is good, drbd is synced, time beetween servers is good. I recall I've seen something similar a year ago (near the time your pacemaker version is dated). I do not remember what was the exact problem cause, but I saw that drbd RA timeouts because it waits for something (fencing) in the kernel space to be done. drbd calls userspace scripts from within kernelspace, and you'll see them in the process list with the drbd kernel thread as a parent. I'd also upgrade your corosync configuration from member to nodelist syntax, specifying name parameter together with ring0_addr for nodes (that parameter is not referenced in corosync docs but should be somewhere in the Pacemaker Explained - it is used only by the pacemaker). Also there is trace_ra functionality support in both pacemaker and crmsh (cannot say if that is supported in versions you have though, probably yes) so you may want to play with that to get the exact picture from the resource agent. Anyways, upgrading to 1.1.12 and more recent crmsh is nice to have for you because you may be just hitting a long-ago solved and forgotten bug/issue. Concerning your expected-quorum-votes=1 You need to configure votequorum in corosync with two_node: 1 instead of that line. # Logs: node01 lrmd[1036]: warning: child_timeout_callback: drbd_pg_monitor_29000 process (PID 27744) timed out node01 lrmd[1036]: warning: operation_finished: drbd_pg_monitor_29000:27744 - timed out after 2ms node01 crmd[1039]:error: process_lrm_event: LRM operation drbd_pg_monitor_29000 (69) Timed Out (timeout=2ms) node01 crmd[1039]: warning: update_failcount: Updating failcount for drbd_pg on tyo1mqdb01p after failed monitor: rc=1 (update=value++, time=1410486352) Thanks, Kiam On Thu, Sep 11, 2014 at 6:58 PM, Norbert Kiam Maclang norbert.kiam.macl...@gmail.com mailto:norbert.kiam.macl...@gmail.com wrote: Thank you Vladislav. I have configured resource level fencing on drbd and removed wfc-timeout and defr-wfc-timeout (is this required?). My drbd configuration is now: resource pg { device /dev/drbd0; disk /dev/vdb; meta-disk internal; disk { fencing resource-only; on-io-error detach; resync-rate 40M; } handlers { fence-peer /usr/lib/drbd/crm-fence-peer.sh; after-resync-target /usr/lib/drbd/crm-unfence-peer.sh; split-brain /usr/lib/drbd/notify-split-brain.sh nkbm; } on node01 { address 10.2.136.52:7789 http://10.2.136.52:7789; } on node02 { address 10.2.136.55:7789 http://10.2.136.55:7789; } net { verify-alg md5; after-sb-0pri discard-zero-changes; after-sb-1pri discard-secondary; after-sb-2pri disconnect; } } Failover works on my initial test (restarting both nodes alternately - this always works). Will wait for a couple of hours after doing a failover test again (Which always fail on my previous setup). Thank you! 
Kiam On Thu, Sep 11, 2014 at 2:14 PM, Vladislav Bogdanov bub...@hoster-ok.com mailto:bub...@hoster-ok.com wrote: 11.09.2014 05:57, Norbert Kiam Maclang wrote: Is this something to do with quorum? But I already set You'd need to configure fencing at the drbd resources level. http://www.drbd.org/users-guide-emb/s-pacemaker-fencing.html#s-pacemaker-fencing-cib property no-quorum-policy=ignore \ expected-quorum-votes=1 Thanks in advance, Kiam On Thu, Sep 11, 2014 at 10:09 AM, Norbert Kiam Maclang norbert.kiam.macl...@gmail.com mailto:norbert.kiam.macl...@gmail.com mailto:norbert.kiam.macl...@gmail.com mailto:norbert.kiam.macl...@gmail.com wrote: Hi, Please help me understand what is causing the problem. I have a 2 node cluster running on vms using KVM. Each vm (I am using Ubuntu 14.04) runs on a separate hypervisor on separate machines. All are working well during testing (I restarted
Re: [Pacemaker] Configuration recommandations for (very?) large cluster
14.08.2014 10:35, Andrew Beekhof wrote: ... The load from the crmd is mostly from talking to the lrmd, which is dependant on resource placement rather than being (or not being) the DC. I've seen the different picture with 1024 unique clone instances. crmd's CPU load on DC is much higher for that case during probe/start/stop. Thats with the new CIB code? Good question. For some unknown reason it is 1.1.11-rc4 (a21e2a2). So, it is with the old one. I need to recheck with the newer one. Sorry for noise. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Configuration recommandations for (very?) large cluster
14.08.2014 05:24, Andrew Beekhof wrote: On 14 Aug 2014, at 12:05 am, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Aug 13, 2014 at 10:33:55AM +1000, Andrew Beekhof wrote: On 13 Aug 2014, at 2:02 am, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: On 12/08/14 07:52, Andrew Beekhof wrote: On 11 Aug 2014, at 10:10 pm, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: ... While I still had the ~450 resources, I also accidentally brought all 22 nodes back to life together (well, actually started the DC alone and then started the remaining 21 nodes together). As could be expected, the DC got quite busy (dispatching/executing the ~450*22 monitoring operations on all nodes). It took 40 minutes for the cluster to stabilize. But it did stabilize, with no timeout and not monitor operations failure! A few high CIB load detected / throttle down mode messages popped up but all went well. Cool. Thats about 0.12s per operation, not too bad. More importantly, I'm glad to hear that real-world clusters are seeing the same kind of improvements as those in the lab. It would be interesting to know how the 40 minutes compares to bringing one node online at a time. Q: Is there a way to favorize more powerful nodes for the DC (iow. push the DC election process in a preferred direction) ? Only by starting it first and ensuring it doesn't die (we prfioritize the node with the largest crmd process uptime). Uhm, there was a patch once for pacemaker-1.0. The latest version I found right now is below. Written by Klaus Wenninger, iirc. The new CIB code actually reduces the case for a patch like this - since all updates are performed on all hosts. So the workload from the CIB should be pretty much identical on all nodes. The load from the crmd is mostly from talking to the lrmd, which is dependant on resource placement rather than being (or not being) the DC. I've seen the different picture with 1024 unique clone instances. crmd's CPU load on DC is much higher for that case during probe/start/stop. About the only reason would be to make the pengine go faster - and I'm not completely convinced that is a sufficient justification. In one cluster I have heterogeneous HW - some nodes are Xeons, some are Atoms. I'd prefer to have a way to perform pe calculations on the former ones. The idea was to communicate via environment a HA_dc_prio value, with meanings: unset = use default of 1 = -1: Node does not become DC (does not vote) The 0 behaviour sounds dangerous. That would preclude a DC from being elected in a partition where all nodes had this value. == 0: Node may only become DC if no node with = 1 is available. It also will trigger an election whenever a node joins. (== 1: default) = 1: classic pacemaker behavior, but changed so positive prio will be checked first, and higher positive prio will win It may still apply (with white space changes) to current pacemaker 1.0. It will need some more adjustments for pacemaker 1.1, but a quick browse through the code suggests it won't be too much work. 
Lars --- crmd/election.c.orig 2011-11-28 16:24:54.345431668 +0100 +++ crmd/election.c 2011-11-28 16:39:18.008420543 +0100 @@ -33,6 +33,7 @@ GHashTable *voted = NULL; uint highest_born_on = -1; static int current_election_id = 1; +static int our_dc_prio = INT_MIN; /* INT_MIN/0/==0/0 not_set/not_voting/retrigger_election/default_behaviour_plus_prio */ static int crm_uptime(struct timeval *output) @@ -107,6 +108,20 @@ break; } +if (our_dc_prio == INT_MIN) { +char * dc_prio_str = getenv(HA_dc_prio); + +if (dc_prio_str == NULL) { +our_dc_prio = 1; +} else { +our_dc_prio = atoi(dc_prio_str); +} +} + +if (our_dc_prio 0) { +not_voting = TRUE; +} + if (not_voting == FALSE) { if (is_set(fsa_input_register, R_STARTING)) { not_voting = TRUE; @@ -123,6 +138,7 @@ current_election_id++; crm_xml_add(vote, F_CRM_ELECTION_OWNER, fsa_our_uuid); crm_xml_add_int(vote, F_CRM_ELECTION_ID, current_election_id); +crm_xml_add_int(vote, F_CRM_DC_PRIO, our_dc_prio); crm_uptime(age); crm_xml_add_int(vote, F_CRM_ELECTION_AGE_S, age.tv_sec); @@ -241,8 +258,9 @@ { struct timeval your_age; int age; int election_id = -1; +int your_dc_prio = 1; int log_level = LOG_INFO; gboolean use_born_on = FALSE; gboolean done = FALSE; gboolean we_loose = FALSE; @@ -273,6 +291,18 @@ your_version = crm_element_value(vote-msg, F_CRM_VERSION); election_owner = crm_element_value(vote-msg, F_CRM_ELECTION_OWNER); crm_element_value_int(vote-msg, F_CRM_ELECTION_ID, election_id); +crm_element_value_int(vote-msg, F_CRM_DC_PRIO, your_dc_prio); + +if (our_dc_prio == INT_MIN) {
Re: [Pacemaker] Signal hangup handling for pacemaker and corosync
25.07.2014 02:20, Andrew Beekhof wrote: ... On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am facing is that i am starting the pacemaker service remotely from a wrapper and thus pacemakerd dies when the wrapper exits.nohup solves the problem but then HUP cannot be used by pacemaker. Is this workaround ok ? I guess. How are you starting pacemaker? Usually its with some variant of 'service pacemaker start'. I am using 'service pacemaker start'. However this is being called from my script. So when the script exits pacemaker gets SIGHUP. Release testing starts clusters as: ssh -l root somenode -- service pacemaker start It could depend on what service is. It would either schedule systemd to run job (el7/fc18+), or just run init script itself (el6). In latter case, if process didn't detach from its controlling terminal when that terminal gone away, it will be sent a SIGHUP. Except we test rhel6 the same way... I understand. This issue is from sometimes happens on some systems folder. I recall I had problems ages ago with a daemon run from rc.local sometimes exists with HUP. 'sleep 1' after its launch was the easiest fix. How about: https://github.com/beekhof/pacemaker/commit/95175f5 That doesn't hurt, but could be just not enough, as pacemakerd does not daemonize itself, but is put into background by shell means. Thus, when you add signal handler, pacemakerd already runs some time in the background. If terminal (ssh session) disconnects before signal handler is installed, then process exits anyways. Just moving it earlier would seem the simplest option Yes, but it still remains racy. Strictly speaking, handler should be installed in the child before the parent process exits. And you cannot control this when shell does the fork. I guess this is one of the few times I've been grateful for systemd The main question for me is Why does it work at all on EL6? ;) Particularly, why it doesn't stop when the init sequence finishes... Probably because of ' /dev/null 21 ', but stdin is still not closed there (f.e. with ' 0- '). see http://stackoverflow.com/questions/3430330/best-way-to-make-a-shell-script-daemon, answers 2 and 3 are relevant for daemonization in shell. Or, that may be an init (upstart) boot-sequence implementation side-effect. Actually, daemonization code in C is not so hard to write properly. corosync_tty_detach() is a pretty good example. I'd suggest to add '-d' option and daemonize (double-fork or fork+setsid plus common daemonization cleanups) if it is set after signal handlers are installed but before main loop is run. Also, SIGTTIN and SIGTTOU could be added to ignore list for a daemon mode. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
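A hedged sketch of the wrapper-side workaround discussed in this thread: detach the init script from the calling terminal so the exiting ssh session cannot deliver SIGHUP to pacemakerd. Either variant below should do; setsid puts the command in its own session, nohup merely ignores SIGHUP, and the redirections (including closing stdin) keep ssh from holding the session open:

    # start pacemaker in its own session, with all standard fds away from the tty
    ssh -l root somenode -- "setsid /sbin/service pacemaker start >/dev/null 2>&1 0<&-"
    # or the nohup variant mentioned earlier in the thread
    ssh -l root somenode -- "nohup /sbin/service pacemaker start >/dev/null 2>&1 &"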
Re: [Pacemaker] Signal hangup handling for pacemaker and corosync
24.07.2014 03:39, Andrew Beekhof wrote: On 23 Jul 2014, at 2:46 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 23.07.2014 05:56, Andrew Beekhof wrote: On 21 Jul 2014, at 3:45 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014, at 2:50 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am facing is that i am starting the pacemaker service remotely from a wrapper and thus pacemakerd dies when the wrapper exits.nohup solves the problem but then HUP cannot be used by pacemaker. Is this workaround ok ? I guess. How are you starting pacemaker? Usually its with some variant of 'service pacemaker start'. I am using 'service pacemaker start'. However this is being called from my script. So when the script exits pacemaker gets SIGHUP. Release testing starts clusters as: ssh -l root somenode -- service pacemaker start It could depend on what service is. It would either schedule systemd to run job (el7/fc18+), or just run init script itself (el6). In latter case, if process didn't detach from its controlling terminal when that terminal gone away, it will be sent a SIGHUP. Except we test rhel6 the same way... I understand. This issue is from sometimes happens on some systems folder. I recall I had problems ages ago with a daemon run from rc.local sometimes exists with HUP. 'sleep 1' after its launch was the easiest fix. How about: https://github.com/beekhof/pacemaker/commit/95175f5 That doesn't hurt, but could be just not enough, as pacemakerd does not daemonize itself, but is put into background by shell means. Thus, when you add signal handler, pacemakerd already runs some time in the background. If terminal (ssh session) disconnects before signal handler is installed, then process exits anyways. Just moving it earlier would seem the simplest option Yes, but it still remains racy. Strictly speaking, handler should be installed in the child before the parent process exits. And you cannot control this when shell does the fork. I'd suggest to add '-d' option and daemonize (double-fork or fork+setsid plus common daemonization cleanups) if it is set after signal handlers are installed but before main loop is run. Also, SIGTTIN and SIGTTOU could be added to ignore list for a daemon mode. The meaning of those two aren't making sense to my brain today. I'd recommend adding HUP handler (f.e. ignore) or/and detach (setsid()) right before daemonizing. And I've never seen the behaviour you speak of. How is what you're doing different? I was checking out the current pacemaker code.setsid is called for each child process.However if we do this for main process to then it will also be detached from the terminal. Regards Arjun On Tue, Jul 15, 2014 at 3:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 7:13 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi Andrew AFAIK linux daemons don't terminate on SIGHUP. Read the man page, POSIX specifies that the default action is 'term' ie. 'terminate'. They typically reload configuration on receiving this signal.Eg- rsyslogd. I thought it was safe to make this assumption here as well. 
Not anywhere as it turns out Regards Arjun On Tue, Jul 15, 2014 at 2:15 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 6:19 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi all I am running pacemaker version 1.1.10-14.el6 on CentOS 6. On setting up cluster if I send SIGHUP to either pacemaker or corosync services , they die. Is this a bug ? What is the intension behind this behavior? Standard default I believe. Have you run 'man 7 signal' lately? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
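For reference, a sketch of the workarounds mentioned in this thread (nohup, or a new session via setsid) for starting pacemaker from a wrapper or over ssh without the exiting wrapper's SIGHUP reaching pacemakerd. Plain shell illustration, not an official recommendation:

# ignore HUP for the whole service invocation
nohup service pacemaker start </dev/null >/dev/null 2>&1 &

# or run it in a new session, so there is no controlling terminal at all
setsid service pacemaker start </dev/null >/dev/null 2>&1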
Re: [Pacemaker] Signal hangup handling for pacemaker and corosync
23.07.2014 05:56, Andrew Beekhof wrote: On 21 Jul 2014, at 3:45 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014, at 2:50 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am facing is that i am starting the pacemaker service remotely from a wrapper and thus pacemakerd dies when the wrapper exits.nohup solves the problem but then HUP cannot be used by pacemaker. Is this workaround ok ? I guess. How are you starting pacemaker? Usually its with some variant of 'service pacemaker start'. I am using 'service pacemaker start'. However this is being called from my script. So when the script exits pacemaker gets SIGHUP. Release testing starts clusters as: ssh -l root somenode -- service pacemaker start It could depend on what service is. It would either schedule systemd to run job (el7/fc18+), or just run init script itself (el6). In latter case, if process didn't detach from its controlling terminal when that terminal gone away, it will be sent a SIGHUP. Except we test rhel6 the same way... I understand. This issue is from sometimes happens on some systems folder. I recall I had problems ages ago with a daemon run from rc.local sometimes exists with HUP. 'sleep 1' after its launch was the easiest fix. How about: https://github.com/beekhof/pacemaker/commit/95175f5 That doesn't hurt, but could be just not enough, as pacemakerd does not daemonize itself, but is put into background by shell means. Thus, when you add signal handler, pacemakerd already runs some time in the background. If terminal (ssh session) disconnects before signal handler is installed, then process exits anyways. I'd suggest to add '-d' option and daemonize (double-fork or fork+setsid plus common daemonization cleanups) if it is set after signal handlers are installed but before main loop is run. Also, SIGTTIN and SIGTTOU could be added to ignore list for a daemon mode. I'd recommend adding HUP handler (f.e. ignore) or/and detach (setsid()) right before daemonizing. And I've never seen the behaviour you speak of. How is what you're doing different? I was checking out the current pacemaker code.setsid is called for each child process.However if we do this for main process to then it will also be detached from the terminal. Regards Arjun On Tue, Jul 15, 2014 at 3:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 7:13 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi Andrew AFAIK linux daemons don't terminate on SIGHUP. Read the man page, POSIX specifies that the default action is 'term' ie. 'terminate'. They typically reload configuration on receiving this signal.Eg- rsyslogd. I thought it was safe to make this assumption here as well. Not anywhere as it turns out Regards Arjun On Tue, Jul 15, 2014 at 2:15 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 6:19 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi all I am running pacemaker version 1.1.10-14.el6 on CentOS 6. On setting up cluster if I send SIGHUP to either pacemaker or corosync services , they die. Is this a bug ? What is the intension behind this behavior? Standard default I believe. Have you run 'man 7 signal' lately? 
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Managing big number of globally-unique clone instances
21.07.2014 13:37, Andrew Beekhof wrote: On 21 Jul 2014, at 3:09 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:21, Andrew Beekhof wrote: On 18 Jul 2014, at 5:16 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I have a task which seems to be easily solvable with the use of a globally-unique clone: start a huge number of specific virtual machines to provide load to a connection multiplexer. I decided to look at how pacemaker behaves in such a setup with the Dummy resource agent, and found that handling of every instance in an initial transition (probe+start) slows down as clone-max increases. Yep. For non-unique clones the number of probes needed is N, where N is the number of nodes. For unique clones, we must test every instance and node combination, or N*M, where M is clone-max. And that's just the running of the probes... just figuring out which nodes need to be probed is incredibly resource intensive (run crm_simulate and it will be painfully obvious). F.e. for 256 instances the transition took 225 seconds, ~0.88s per instance. After I added 768 more instances (set clone-max to 1024) How many nodes though? Two nodes run in VMs. Assuming 3, that's still only ~1s per operation (including the time taken to send the operation across the network twice and update the cib). together with increasing batch-limit to 512, the transition took almost an hour (3507 seconds), or ~4.57s per added instance. Even if I take into account that monitoring of already started instances consumes some resources, the last number seems rather big. I believe this ^ is the main point. If with N instances probe/start of _each_ instance takes X time slots, then with 4*N instances probe/start of _each_ instance takes ~5*X time slots. In an ideal world, I would expect it to remain constant. Unless you have 512 cores in the cluster, increasing the batch-limit in this way is certainly not going to give you the results you're looking for. Firing more tasks at a machine just ends up producing more context switches as the kernel tries to juggle the various tasks. More context switches == more CPU wasted == more time taken overall == completely consistent with your results. Thanks to oprofile, I was able to gain an 8-9% speedup with the following patch:
=
diff --git a/crmd/te_utils.c b/crmd/te_utils.c
index 2167370..c612718 100644
--- a/crmd/te_utils.c
+++ b/crmd/te_utils.c
@@ -374,8 +374,6 @@ te_graph_trigger(gpointer user_data)
     graph_rc = run_graph(transition_graph);
     transition_graph->batch_limit = limit;  /* Restore the configured value */
 
-    print_graph(LOG_DEBUG_3, transition_graph);
-
     if (graph_rc == transition_active) {
         crm_trace("Transition not yet complete");
         return TRUE;
diff --git a/crmd/tengine.c b/crmd/tengine.c
index 765628c..ec0e1d4 100644
--- a/crmd/tengine.c
+++ b/crmd/tengine.c
@@ -221,7 +221,6 @@ do_te_invoke(long long action,
     }
 
     trigger_graph();
-    print_graph(LOG_DEBUG_2, transition_graph);
 
     if (graph_data != input->xml) {
         free_xml(graph_data);
=
Results this time are measured only for a clean start op, after probes are done (add a stopped clone, wait for the probes to complete and then start the clone).
256 (vanilla):  09:51:50 - 09:53:17 =  1:27 =   87s = 0.33984375 s per instance
1024 (vanilla): 10:17:10 - 10:34:34 = 17:24 = 1044s = 1.01953125 s per instance
1024 (patched): 11:59:26 - 12:15:12 = 15:46 =  946s = 0.92382813 s per instance
So, still not perfect, but better. Unfortunately, my binaries are built with optimization, so I'm not able to get call graphs yet.
Also, as I run in VMs, no hardware support for oprofile is available, so results may be a bit inaccurate. Here is system-wide opreport's top for unpatched crmd with 1024 instances:
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples   %        image name                 app name                   symbol name
429963    41.3351  no-vmlinux                 no-vmlinux                 /no-vmlinux
129533    12.4528  libxml2.so.2.7.6           libxml2.so.2.7.6           /usr/lib64/libxml2.so.2.7.6
101326     9.7411  libc-2.12.so               libc-2.12.so               __strcmp_sse42
42524      4.0881  libtransitioner.so.2.0.1   libtransitioner.so.2.0.1   print_synapse
37062      3.5630  libc-2.12.so               libc-2.12.so               malloc_consolidate
23268      2.2369  libcrmcommon.so.3.2.0      libcrmcommon.so.3.2.0      find_entity
21416      2.0589  libc-2.12.so               libc-2.12.so               _int_malloc
18950      1.8218  libcrmcommon.so.3.2.0      libcrmcommon.so.3.2.0      crm_element_value
17482      1.6807  libfreebl3.so              libfreebl3.so              /lib64/libfreebl3.so
15350      1.4757  libc-2.12.so               libc-2.12.so               vfprintf
15016      1.4436  libqb.so.0.16.0            libqb.so.0.16.0            /usr
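As an aside, a rough way to reproduce the "run crm_simulate and it will be painfully obvious" observation from earlier in this thread is to time the policy engine against a saved copy of the CIB. Option names are from pacemaker 1.1; check crm_simulate --help on your build, and treat the file names as examples only:

# save the live CIB, then time a simulated transition against it
cibadmin --query > big-clone.xml
time crm_simulate --xml-file big-clone.xml --simulate \
    --save-graph transition.xml --save-dotfile transition.dot
# the opreport output above can then be read side by side with these timings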
Re: [Pacemaker] Signal hangup handling for pacemaker and corosync
21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am facing is that i am starting the pacemaker service remotely from a wrapper and thus pacemakerd dies when the wrapper exits.nohup solves the problem but then HUP cannot be used by pacemaker. Is this workaround ok ? I guess. How are you starting pacemaker? Usually its with some variant of 'service pacemaker start'. I am using 'service pacemaker start'. However this is being called from my script. So when the script exits pacemaker gets SIGHUP. Release testing starts clusters as: ssh -l root somenode -- service pacemaker start It could depend on what service is. It would either schedule systemd to run job (el7/fc18+), or just run init script itself (el6). In latter case, if process didn't detach from its controlling terminal when that terminal gone away, it will be sent a SIGHUP. I'd recommend adding HUP handler (f.e. ignore) or/and detach (setsid()) right before daemonizing. And I've never seen the behaviour you speak of. How is what you're doing different? I was checking out the current pacemaker code.setsid is called for each child process.However if we do this for main process to then it will also be detached from the terminal. Regards Arjun On Tue, Jul 15, 2014 at 3:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 7:13 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi Andrew AFAIK linux daemons don't terminate on SIGHUP. Read the man page, POSIX specifies that the default action is 'term' ie. 'terminate'. They typically reload configuration on receiving this signal.Eg- rsyslogd. I thought it was safe to make this assumption here as well. Not anywhere as it turns out Regards Arjun On Tue, Jul 15, 2014 at 2:15 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 6:19 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi all I am running pacemaker version 1.1.10-14.el6 on CentOS 6. On setting up cluster if I send SIGHUP to either pacemaker or corosync services , they die. Is this a bug ? What is the intension behind this behavior? Standard default I believe. Have you run 'man 7 signal' lately? 
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Signal hangup handling for pacemaker and corosync
21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014, at 2:50 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am facing is that i am starting the pacemaker service remotely from a wrapper and thus pacemakerd dies when the wrapper exits.nohup solves the problem but then HUP cannot be used by pacemaker. Is this workaround ok ? I guess. How are you starting pacemaker? Usually its with some variant of 'service pacemaker start'. I am using 'service pacemaker start'. However this is being called from my script. So when the script exits pacemaker gets SIGHUP. Release testing starts clusters as: ssh -l root somenode -- service pacemaker start It could depend on what service is. It would either schedule systemd to run job (el7/fc18+), or just run init script itself (el6). In latter case, if process didn't detach from its controlling terminal when that terminal gone away, it will be sent a SIGHUP. Except we test rhel6 the same way... I understand. This issue is from sometimes happens on some systems folder. I recall I had problems ages ago with a daemon run from rc.local sometimes exists with HUP. 'sleep 1' after its launch was the easiest fix. I'd recommend adding HUP handler (f.e. ignore) or/and detach (setsid()) right before daemonizing. And I've never seen the behaviour you speak of. How is what you're doing different? I was checking out the current pacemaker code.setsid is called for each child process.However if we do this for main process to then it will also be detached from the terminal. Regards Arjun On Tue, Jul 15, 2014 at 3:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 7:13 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi Andrew AFAIK linux daemons don't terminate on SIGHUP. Read the man page, POSIX specifies that the default action is 'term' ie. 'terminate'. They typically reload configuration on receiving this signal.Eg- rsyslogd. I thought it was safe to make this assumption here as well. Not anywhere as it turns out Regards Arjun On Tue, Jul 15, 2014 at 2:15 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 6:19 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi all I am running pacemaker version 1.1.10-14.el6 on CentOS 6. On setting up cluster if I send SIGHUP to either pacemaker or corosync services , they die. Is this a bug ? What is the intension behind this behavior? Standard default I believe. Have you run 'man 7 signal' lately? 
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] Managing big number of globally-unique clone instances
Hi Andrew, all, I have a task which seems to be easily solvable with the use of a globally-unique clone: start a huge number of specific virtual machines to provide load to a connection multiplexer. I decided to look at how pacemaker behaves in such a setup with the Dummy resource agent, and found that handling of every instance in an initial transition (probe+start) slows down as clone-max increases. F.e. for 256 instances the transition took 225 seconds, ~0.88s per instance. After I added 768 more instances (set clone-max to 1024) together with increasing batch-limit to 512, the transition took almost an hour (3507 seconds), or ~4.57s per added instance. Even if I take into account that monitoring of already started instances consumes some resources, the last number seems rather big. The main CPU consumer on the DC while the transition is running is crmd. Its memory footprint is around 85Mb, and the resulting CIB size together with the status section is around 2Mb. Could this use-case be optimized, in your opinion, with minimal effort? Could it be optimized with just configuration? Or may it be some trivial development task, f.e. replacing one GList with a GHashtable somewhere? Sure, I can look deeper and get any additional information, f.e. crmd profiling results, if it is hard to get an answer just off the top of your head. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
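For anyone who wants to reproduce this, the test setup described above boils down to something like the following crm configuration. This is a sketch; the Dummy agent, the clone-max/clone-node-max values and the batch-limit mirror the numbers in the mail and can be adjusted:

crm configure primitive dummy ocf:pacemaker:Dummy \
        op monitor interval=60s
crm configure clone dummy-clone dummy \
        meta globally-unique=true clone-max=1024 clone-node-max=1024
crm configure property batch-limit=512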
Re: [Pacemaker] Pacemaker Digest, Vol 80, Issue 32
13.07.2014 17:28, W Forum W wrote: Hi, The apache logs doesn't say a lot (LogLevel debug) [error] python_init: Python version mismatch, expected '2.7.2+', found '2.7.3'. [error] python_init: Python executable found '/usr/bin/python'. [error] python_init: Python path being used '/usr/lib/python2.7/:/usr/lib/python2.7/plat-linux2:/usr/lib/python2.7/lib-tk:/usr/lib/python2.7/lib-old:/usr/lib/python2.7/lib-dynload'. [notice] mod_python: Creating 8 session mutexes based on 150 max processes and 0 max threads. [notice] mod_python: using mutex_directory /tmp [notice] Apache/2.2.22 (Debian) PHP/5.4.4-14+deb7u12 mod_python/3.3.1 Python/2.7.3 mod_ssl/2.2.22 OpenSSL/1.0.1e configured -- resuming normal operations [notice] caught SIGTERM, shutting down The configuration should be ok, or do I miss something primitive p_ps_apache lsb:apache2 \ op monitor interval=60 timeout=30 \ op start interval=0 timeout=30 \ op start interval=0 timeout=30 You'd better use ocf:apache, because your LSB init script seems to return incorrect return code for status operation when apache is running. Also configure mod_status so it answers on the URL ocf:apache expects it to. Many thanks On 07/12/2014 12:00 PM, pacemaker-requ...@oss.clusterlabs.org wrote: Send Pacemaker mailing list submissions to pacemaker@oss.clusterlabs.org To subscribe or unsubscribe via the World Wide Web, visit http://oss.clusterlabs.org/mailman/listinfo/pacemaker or, via email, send a message with subject or body 'help' to pacemaker-requ...@oss.clusterlabs.org You can reach the person managing the list at pacemaker-ow...@oss.clusterlabs.org When replying, please edit your Subject line so it is more specific than Re: Contents of Pacemaker digest... Today's Topics: 1. Re: crm resourse (lsb:apache2) not starting (Michael Monette) 2. Re: crm resourse (lsb:apache2) not starting (W Forum W) 3. Re: crm resourse (lsb:apache2) not starting (Vladislav Bogdanov) -- Message: 1 Date: Fri, 11 Jul 2014 11:41:09 -0400 From: Michael Monette mmone...@2keys.ca To: wfor...@gmail.com, The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] crm resourse (lsb:apache2) not starting Message-ID: 2a776ccd-52a6-487d-bbc9-73e24a92b...@email.android.com Content-Type: text/plain; charset=utf-8 Is there a certificate passphrase when starting apache from command line? On July 11, 2014 11:38:11 AM EDT, W Forum W wfor...@gmail.com wrote: hi, we are using debian and selinux is default disabled in debian. we don't use it either is there no way to find what causes apache not to start? many thanks On 07/11/2014 01:36 AM, Andrew Beekhof wrote: On 10 Jul 2014, at 7:58 pm, W Forum W wfor...@gmail.com wrote: Hi thanks for the help. the status url is configured and working, also no error in apache log when I start the service manually any other ideas where to look?? selinux. if it starts from the command line but not in the cluster its very often selinux many thanks!! On 07/09/2014 12:53 AM, Andrew Beekhof wrote: On 8 Jul 2014, at 11:15 pm, W Forum W wfor...@gmail.com wrote: Hi, I have a two node cluster with a DRBD, heartbeat and pacemaker (on Debian Wheezy) The cluster is working fine. 2 DRBD resources, Shared IP, 2 File systems and a postgresql database start, stop, migrate, ... correctly. Now the problem is with the lsb:apache2 resource agent. 
When I try to start is (crm resource start p_ps_apache) immediately I got an error like p_ps_apache_monitor_6 (node=wegc203136, call=653, rc=7, status=complete): not running When I start Apache from the console (service apache2 start), it works fine I have checked if the Init Script LSB is compatible (see http://www.linux-ha.org/wiki/LSB_Resource_Agents ). All sequences tested are ok How can I found out why crm is not starting Apache? most likely the status url is not setup/configured. have you checked the apache logs? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- next part -- An HTML attachment was scrubbed... URL: http://oss.clusterlabs.org/pipermail/pacemaker/attachments/20140711/726c2f55/attachment-0001.html -- Message: 2 Date: Fri, 11 Jul 2014 17:44:52 +0200 From: W Forum W wfor...@gmail.com To: Michael Monette mmone...@2keys.ca, The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] crm resourse (lsb:apache2
Re: [Pacemaker] crm resourse (lsb:apache2) not starting
08.07.2014 16:15, W Forum W wrote: Hi, I have a two-node cluster with DRBD, heartbeat and pacemaker (on Debian Wheezy). The cluster is working fine: 2 DRBD resources, a shared IP, 2 file systems and a postgresql database start, stop, migrate, ... correctly. Now the problem is with the lsb:apache2 resource agent. When I try to start it (crm resource start p_ps_apache) I immediately get an error like p_ps_apache_monitor_6 (node=wegc203136, call=653, rc=7, status=complete): not running When I start Apache from the console (service apache2 start), it works fine. I have checked if the init script is LSB compatible (see http://www.linux-ha.org/wiki/LSB_Resource_Agents). All sequences tested are ok. How can I find out why crm is not starting Apache? Is it really not started, or is it just not configured enough to be successfully monitored, so the monitor op fails? What do your apache logs say? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
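A sketch of the switch to the OCF agent suggested in the earlier reply, with a status URL the agent can actually monitor. The paths and statusurl are illustrative for a Debian layout, not verified values:

crm configure primitive p_ps_apache ocf:heartbeat:apache \
        params configfile="/etc/apache2/apache2.conf" \
               statusurl="http://127.0.0.1/server-status" \
        op start interval=0 timeout=60s \
        op stop interval=0 timeout=60s \
        op monitor interval=60s timeout=30s

mod_status then has to answer on that URL, e.g. with an Apache 2.2 stanza like:

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Allow from 127.0.0.1
</Location>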
Re: [Pacemaker] 2-node active/active cluster serving virtual machines (KVM via libvirt)
30.06.2014 15:45, Tony Atkinson wrote: Hi all, I'd really appreciate a helping hand here I'm so close to getting what I need, but just seem to be falling short at the last hurdle. 2-node active/active cluster serving virtual machines (KVM via libvirt) Virtual machines need to be able to live-migrate between cluster nodes. Nodes have local storage only Node storage replicated over DRBD (dual primary) LVM on DRBD, volumes given to virtual machines Output from crm_mon below (full pacemaker defs at end) ... I think the issue is with the LVM volume group definition. How would I prevent the VMs rebooting when a node comes back online? Any help would be greatly appreciated. * node $id=168440321 vm-a \ attributes standby=off node $id=168440322 vm-b \ attributes standby=off primitive cluster-ip ocf:heartbeat:IPaddr2 \ params ip=192.168.123.200 cidr_netmask=16 broadcast=192.168.255.255 nic=br0 \ op monitor interval=10s primitive p_clvm ocf:lvm2:clvmd \ params daemon_timeout=30 \ meta target-role=Started primitive p_dlm ocf:pacemaker:controld \ operations $id=dlm \ op monitor interval=10 timeout=20 start-delay=0 \ params args=-q 0 primitive p_drbd_r0 ocf:linbit:drbd \ params drbd_resource=r0 \ op start interval=0 timeout=240 \ op stop interval=0 timeout=100 \ op monitor interval=29s role=Master \ op monitor interval=31s role=Slave primitive p_drbd_r1 ocf:linbit:drbd \ params drbd_resource=r1 \ op start interval=0 timeout=330 \ op stop interval=0 timeout=100 \ op monitor interval=59s role=Master timeout=30s \ op monitor interval=60s role=Slave timeout=30s \ meta target-role=Master primitive p_fs_r0 ocf:heartbeat:Filesystem \ params device=/dev/drbd0 directory=/replica fstype=gfs2 \ op start interval=0 timeout=60 \ op stop interval=0 timeout=60 \ op monitor interval=60 timeout=40 primitive p_lvm_vm ocf:heartbeat:LVM \ params volgrpname=vm \ op start interval=0 timeout=30s \ op stop interval=0 timeout=30s \ op monitor interval=30 timeout=100 depth=0 primitive vm_test1 ocf:heartbeat:VirtualDomain \ params config=/etc/libvirt/qemu/test1.xml hypervisor=qemu:///system migration_transport=ssh \ meta allow-migrate=true target-role=Started \ op start timeout=240s interval=0 \ op stop timeout=120s interval=0 \ op monitor timeout=30 interval=10 depth=0 \ utilization cpu=1 hv_memory=1024 primitive vm_test2 ocf:heartbeat:VirtualDomain \ params config=/etc/libvirt/qemu/test2.xml hypervisor=qemu:///system migration_transport=ssh \ meta allow-migrate=true target-role=Started \ op start timeout=240s interval=0 \ op stop timeout=120s interval=0 \ op monitor timeout=30 interval=10 depth=0 \ utilization cpu=1 hv_memory=1024 ms ms_drbd_r0 p_drbd_r0 \ meta master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true ms ms_drbd_r1 p_drbd_r1 \ meta master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true clone cl_clvm p_clvm \ meta interleave=true clone cl_dlm p_dlm \ meta interleave=true clone cl_fs_r0 p_fs_r0 \ meta interleave=true clone cl_lvm_vm p_lvm_vm \ meta interleave=true colocation co_fs_with_drbd inf: cl_fs_r0 ms_drbd_r0:Master order o_order_default Mandatory: cl_dlm ms_drbd_r0:promote cl_fs_r0 cl_clvm ms_drbd_r1:promote cl_lvm_vm:start vm_test1 order o_order_default2 Mandatory: cl_dlm ms_drbd_r0:promote cl_fs_r0 cl_clvm ms_drbd_r1:promote cl_lvm_vm:start vm_test2 You'd split these two into pieces and add colocation constraints. 
So that the whole constraints block looks like:
colocation co_clvm_with_dlm inf: cl_clvm cl_dlm
order o_clvm_after_dlm inf: cl_dlm:start cl_clvm:start
colocation co_fs_with_drbd inf: cl_fs_r0 ms_drbd_r0:Master
order o_fs_after_drbd inf: ms_drbd_r0:promote cl_fs_r0:start
colocation co_lvm_vm_with_clvm inf: cl_lvm_vm cl_clvm
order o_lvm_vm_after_clvm inf: cl_clvm:start cl_lvm_vm:start
colocation co_vm_test1_with_lvm_vm inf: vm_test1 cl_lvm_vm
order o_vm_test1_after_lvm_vm inf: cl_lvm_vm vm_test1
colocation co_vm_test2_with_lvm_vm inf: vm_test2 cl_lvm_vm
order o_vm_test2_after_lvm_vm inf: cl_lvm_vm vm_test2
This (forgive me if I mistyped somewhere) should prevent p_lvm_vm from being stopped when you don't want it to be.
property $id=cib-bootstrap-options \
        dc-version=1.1.10-42f2063 \
        cluster-infrastructure=corosync \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1404130649
rsc_defaults $id=rsc-options \
        resource-stickiness=100
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
Re: [Pacemaker] votequorum for 2 node cluster
11.06.2014 16:26, Kostiantyn Ponomarenko wrote: Hi guys, I am trying to deal somehow with a split-brain situation in a 2-node cluster using votequorum. Here is the quorum section in my corosync.conf: provider: corosync_votequorum expected_votes: 2 Just a side note, not an answer to your question: you'd add 'two_node: 1' here, as two-node clusters are very special in terms of quorum. wait_for_all: 1 last_man_standing: 1 auto_tie_breaker: 1 My question is about the behavior of the remaining node after I shut down the node with the lowest nodeid. My expectation is that after a last_man_standing_window this node should be back working. Or is that not a solution in the case of a two-node cluster? Thank you, Kostya ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
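For clarity, this is the quorum section from the mail with the suggested 'two_node: 1' added; just a sketch of where the option goes, see votequorum(5) for how these options interact (e.g. two_node already implies wait_for_all):

quorum {
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
    wait_for_all: 1
    last_man_standing: 1
    auto_tie_breaker: 1
}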
Re: [Pacemaker] location anti affinity rule not working
02.05.2014 03:47, ESWAR RAO wrote: Hi All, I am working on 3 node cluster with corosync+pacemaker on RHEL 6. Eventhough I applied anti-affinity the app is still trying to run on 3rd node. # rpm -qa|grep corosync corosync-1.4.1-17.el6_5.1.x86_64 corosynclib-1.4.1-17.el6_5.1.x86_64 # rpm -qa | grep -i pacemaker pacemaker-1.1.10-14.el6_5.3.x86_64 pacemaker-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cluster-libs-1.1.10-14.el6_5.3.x86_64 pacemaker-cli-1.1.10-14.el6_5.3.x86_64 # rpm -qa | grep -i crm crmsh-2.0+git5-1.1.x86_64 # crm configure primitive oc_app lsb::app meta migration-threshold=10 failure-timeout=300s op monitor interval=3s # crm configure clone oc_app_clone oc_app meta clone-max=2 globally-unique=false interleave=true # crm configure location nvp_prefer_node oc_app_clon -inf: common-test-redhat Typo in the configuration? oc_app_clon. Online: [ common-test-redhat test-redhat-1 test-redhat-2 ] Full list of resources: Clone Set: oc_app_clone [oc_app] oc_app(lsb:app):FAILED common-test-redhat (unmanaged) Started: [ test-redhat-1 ] Stopped: [ test-redhat-2 ] The same configuration worked correctly on ubuntu and it started oc_app on only 2 nodes and not on common-test-redhat node. after some time: - Clone Set: oc_app_clone [oc_app] Started: [ test-redhat-1 ] Stopped: [ common-test-redhat test-redhat-2 ] # crm configure show node common-test-redhat node test-redhat-1 node est-redhat-2 primitive oc_app lsb:app \ meta migration-threshold=10 failure-timeout=300s \ op monitor interval=3s clone oc_app_clone oc_app \ meta clone-max=2 globally-unique=false interleave=true location nvp_prefer_node oc_app_clone -inf: common-test-redhat property cib-bootstrap-options: \ dc-version=1.1.10-14.el6_5.3-368c726 \ cluster-infrastructure=classic openais (with plugin) \ expected-quorum-votes=3 \ stonith-enabled=false Can someone please help me on the configuration. Thanks Eswar ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
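Presumably the intended command was the one below, with the clone name spelled in full; showing the constraint afterwards confirms what was actually stored:

crm configure location nvp_prefer_node oc_app_clone -inf: common-test-redhat
crm configure show nvp_prefer_node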
Re: [Pacemaker] The next release
23.04.2014 13:29, Andrew Beekhof wrote: I'd like to get a release out in the next month or so, so expect a release candidate RealSoonNow(tm). The last item on my personal todo list is updating the ACL syntax to make the terms a little more generic (since it won't just be users anymore). If any other devs have high priority items, knowing about them sooner rather later would be helpful :-) I am also accepting harassment to review patches/features that have slipped off my radar. It would be cool if cl#5165 is resolved. It already 'works for me' but I do not have a chance to update patch attached to that bug to your suggestions. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?
12.03.2014 00:40, Andrew Beekhof wrote: On 11 Mar 2014, at 6:23 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following combinations. Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. stopped the node of vm01 compulsorily from the inside, after starting 14 nodes. virsh destroy vm01 was used for the stop. Then, in addition to the compulsorily stopped node, other nodes are separated from a cluster. The log of Retransmit List: is then outputted in large quantities from corosync. Probably best to poke the corosync guys about this. However, = .11 is known to cause significant CPU usage with that many nodes. I can easily imagine this staving corosync of resources and causing breakage. I would _highly_ recommend retesting with the current git master of pacemaker. I merged the new cib code last week which is faster by _two_ orders of magnitude and uses significantly less CPU. Andrew, current git master (ee094a2) almost works, the only issue is that crm_diff calculates incorrect diff digest. If I replace digest in diff by hands with what cib calculates as expected. it applies correctly. Otherwise - -206. More details? Hmmm... seems to be crmsh-specific, Cannot reproduce with pure-XML editing. Kristoffer, does http://hg.savannah.gnu.org/hgweb/crmsh/rev/c42d9361a310 address this? The problem seems to be caused by the fact that crmsh does not provide status section in both orig and new XMLs to crm_diff, and digest generation seems to rely on that, so crm_diff and cib daemon produce different digests. Attached are two sets of XML files, one (orig.xml, new.xml, patch.xml) are related to the full CIB operation (with status section included), another (orig-edited.xml, new-edited.xml, patch-edited.xml) have that section removed like crmsh does do. Resulting diffs differ only by digest, and that seems to be the exact issue. This should help. As long as crmsh isn't passing -c to crm_diff, then the digest will no longer be present. https://github.com/beekhof/pacemaker/commit/c8d443d Yep, that helped. Thank you! ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
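For reference, the two ways of feeding changes back to the cib that this thread revolves around, sketched with example file names; after the fix, crm_diff only adds a digest when -c/--cib is given:

# diff-and-patch, the path affected by the digest mismatch
crm_diff --original orig.xml --new new.xml > patch.xml
cibadmin --patch --xml-file patch.xml

# plain replace of the configuration section, the method crmsh can
# fall back to depending on the pacemaker version
cibadmin --replace --scope configuration --xml-file new.xml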
Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?
07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following combinations. Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. stopped the node of vm01 compulsorily from the inside, after starting 14 nodes. virsh destroy vm01 was used for the stop. Then, in addition to the compulsorily stopped node, other nodes are separated from a cluster. The log of Retransmit List: is then outputted in large quantities from corosync. Probably best to poke the corosync guys about this. However, = .11 is known to cause significant CPU usage with that many nodes. I can easily imagine this staving corosync of resources and causing breakage. I would _highly_ recommend retesting with the current git master of pacemaker. I merged the new cib code last week which is faster by _two_ orders of magnitude and uses significantly less CPU. Andrew, current git master (ee094a2) almost works, the only issue is that crm_diff calculates incorrect diff digest. If I replace digest in diff by hands with what cib calculates as expected. it applies correctly. Otherwise - -206. More details? Hmmm... seems to be crmsh-specific, Cannot reproduce with pure-XML editing. Kristoffer, does http://hg.savannah.gnu.org/hgweb/crmsh/rev/c42d9361a310 address this? The problem seems to be caused by the fact that crmsh does not provide status section in both orig and new XMLs to crm_diff, and digest generation seems to rely on that, so crm_diff and cib daemon produce different digests. Attached are two sets of XML files, one (orig.xml, new.xml, patch.xml) are related to the full CIB operation (with status section included), another (orig-edited.xml, new-edited.xml, patch-edited.xml) have that section removed like crmsh does do. Resulting diffs differ only by digest, and that seems to be the exact issue. 
cib epoch=4 num_updates=5 admin_epoch=0 validate-with=pacemaker-1.2 cib-last-written=Tue Mar 11 06:57:54 2014 update-origin=booter-0 update-client=crmd update-user=hacluster crm_feature_set=3.0.9 have-quorum=1 dc-uuid=1 configuration crm_config cluster_property_set id=cib-bootstrap-options nvpair id=cib-bootstrap-options-dc-version name=dc-version value=1.1.11-1.3.el6-b75a9bd/ nvpair id=cib-bootstrap-options-cluster-infrastructure name=cluster-infrastructure value=corosync/ nvpair name=symmetric-cluster value=true id=cib-bootstrap-options-symmetric-cluster/ /cluster_property_set /crm_config nodes node id=1 uname=booter-0/ node id=2 uname=booter-1/ /nodes resources/ constraints/ /configuration status node_state id=1 uname=booter-0 in_ccm=true crmd=online crm-debug-origin=do_state_transition join=member expected=member lrm id=1 lrm_resources/ /lrm transient_attributes id=1 instance_attributes id=status-1 nvpair id=status-1-shutdown name=shutdown value=0/ nvpair id=status-1-probe_complete name=probe_complete value=true/ /instance_attributes /transient_attributes /node_state /status /cib cib epoch=4 num_updates=5 admin_epoch=0 validate-with=pacemaker-1.2 cib-last-written=Tue Mar 11 06:57:54 2014 update-origin=booter-0 update-client=crmd update-user=hacluster crm_feature_set=3.0.9 have-quorum=1 dc-uuid=1 configuration crm_config cluster_property_set id=cib-bootstrap-options nvpair id=cib-bootstrap-options-dc-version name=dc-version value=1.1.11-1.3.el6-b75a9bd/ nvpair id=cib-bootstrap-options-cluster-infrastructure name=cluster-infrastructure value=corosync/ nvpair name=symmetric-cluster value=true id=cib-bootstrap-options-symmetric-cluster/ /cluster_property_set /crm_config nodes node id=1 uname=booter-0/ node id=2 uname=booter-1/ /nodes resources/ constraints/ /configuration /cib cib epoch=3 num_updates=5 admin_epoch=0 validate-with=pacemaker-1.2 cib-last-written=Tue Mar 11 06:57:54 2014 update-origin=booter-0 update-client=crmd update-user=hacluster crm_feature_set=3.0.9 have-quorum=1 dc-uuid=1 configuration crm_config cluster_property_set id=cib-bootstrap-options nvpair id=cib-bootstrap-options-dc-version name=dc-version value=1.1.11-1.3.el6-b75a9bd/ nvpair id=cib-bootstrap-options-cluster-infrastructure name=cluster-infrastructure value=corosync/ /cluster_property_set /crm_config nodes node id=1 uname=booter-0/ node id=2 uname=booter-1/ /nodes resources/ constraints/ /configuration status node_state id=1 uname=booter-0 in_ccm=true crmd=online crm-debug-origin=do_state_transition join=member expected
Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?
12.03.2014 00:37, Andrew Beekhof wrote: ... I'm somewhat confused at this point: if crmsh is using --replace, then why is it doing diff calculations? Or are replace operations only used for the load operation? It uses one of two methods, depending on the pacemaker version. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?
18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following combinations. Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. stopped the node of vm01 compulsorily from the inside, after starting 14 nodes. virsh destroy vm01 was used for the stop. Then, in addition to the compulsorily stopped node, other nodes are separated from a cluster. The log of Retransmit List: is then outputted in large quantities from corosync. Probably best to poke the corosync guys about this. However, = .11 is known to cause significant CPU usage with that many nodes. I can easily imagine this staving corosync of resources and causing breakage. I would _highly_ recommend retesting with the current git master of pacemaker. I merged the new cib code last week which is faster by _two_ orders of magnitude and uses significantly less CPU. Andrew, current git master (ee094a2) almost works, the only issue is that crm_diff calculates incorrect diff digest. If I replace digest in diff by hands with what cib calculates as expected. it applies correctly. Otherwise - -206. I'd be interested to hear your feedback. What is the reason which the node in which failure has not occurred carries out lost? Please advise, if there is a problem in a setup in something. I attached the report when the problem occurred. https://drive.google.com/file/d/0BwMFJItoO-fVMkFWWWlQQldsSFU/edit?usp=sharing Regards, Yusuke -- METRO SYSTEMS CO., LTD Yusuke Iida Mail: yusk.i...@gmail.com ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?
07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the performance of Pacemaker in the following combinations. Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. stopped the node of vm01 compulsorily from the inside, after starting 14 nodes. virsh destroy vm01 was used for the stop. Then, in addition to the compulsorily stopped node, other nodes are separated from a cluster. The log of Retransmit List: is then outputted in large quantities from corosync. Probably best to poke the corosync guys about this. However, = .11 is known to cause significant CPU usage with that many nodes. I can easily imagine this staving corosync of resources and causing breakage. I would _highly_ recommend retesting with the current git master of pacemaker. I merged the new cib code last week which is faster by _two_ orders of magnitude and uses significantly less CPU. Andrew, current git master (ee094a2) almost works, the only issue is that crm_diff calculates incorrect diff digest. If I replace digest in diff by hands with what cib calculates as expected. it applies correctly. Otherwise - -206. More details? Hmmm... seems to be crmsh-specific, Cannot reproduce with pure-XML editing. Kristoffer, does http://hg.savannah.gnu.org/hgweb/crmsh/rev/c42d9361a310 address this? I'd be interested to hear your feedback. What is the reason which the node in which failure has not occurred carries out lost? Please advise, if there is a problem in a setup in something. I attached the report when the problem occurred. https://drive.google.com/file/d/0BwMFJItoO-fVMkFWWWlQQldsSFU/edit?usp=sharing Regards, Yusuke -- METRO SYSTEMS CO., LTD Yusuke Iida Mail: yusk.i...@gmail.com ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Migrating resources on custom conditions
21.02.2014 13:45, Lars Marowsky-Bree wrote: On 2014-02-21T13:02:23, Vladislav Bogdanov bub...@hoster-ok.com wrote: It could be a nice feature to have a kind of general SLA concept (it could be very similar to the utilization one from the resource configuration perspective), so that resources try to move or live-migrate off nodes whose SLA attributes are below the configured threshold. Those SLA attributes should probably go into the status section (so as not to trigger transition aborts on attribute updates) and be managed both internally by pacemaker (an expansion of the recent node-load concept) and by resource agents (like Health-*, as noted by Frank). Pacemaker already has (almost?) all the pieces of code to do that ('rule' and 'score-attribute'), but to my taste it is still not general enough in contrast to the 'utilization' feature. The SystemHealth and node health features are clearly meant to achieve this; what's missing from your point of view? Only ease of configuration, I think (for the 'custom' node-health-strategy, which I think is most suitable for my use-cases). I would prefer to have the constraints not in location rules, but in the resource definition. I really like how utilization handling is implemented at the configuration/XML level, and I think it is worth having the same for SLA/health. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
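A sketch of the 'custom' node-health approach discussed above: a cluster property enables the strategy, something (a health RA or a script) maintains a '#health-*' transient node attribute, and a location rule with score-attribute turns that attribute into a score. Names and values are illustrative:

# enable the strategy (no automatic scoring; rules do the work)
crm configure property node-health-strategy=custom

# maintained by a health RA or a script, per node:
crm_attribute --node node1 --name '#health-smart' --update 100 --lifetime reboot

The per-resource rule is shown here in XML form; score-attribute maps the attribute value straight into the location score:

<rsc_location id="loc-web-health" rsc="web-server">
  <rule id="loc-web-health-rule" score-attribute="#health-smart">
    <expression id="loc-web-health-expr" attribute="#uname" operation="defined"/>
  </rule>
</rsc_location>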
Re: [Pacemaker] node1 fencing itself after node2 being fenced
Hi Fabio, 19.02.2014 12:32, Fabio M. Di Nitto wrote: On 2/19/2014 9:39 AM, Fabio M. Di Nitto wrote: On 2/18/2014 9:24 PM, Asgaroth wrote: Just a guess. Do you have startup fencing enabled in dlm-controld (I actually do not remember if it is applicable to cman's version, but it exists in dlm-4) or cman? If yes, then that may play its evil game, because imho it is not intended to use with pacemaker which has its own startup fencing policy (if you redirect fencing to pacemaker). I can't seem to find the option to enable/disable startup fencing in either dlm_controld or cman. 3 things: 1) add logging debug=on/ to cluster.conf 2) reproduce the issue 3) collect sosreports and/or crm_reports and send them over. if security/privacy is a concern, send them privately to me and Andrew. The thread has gone for a while with many different suggestions, but there is a lot of confusion around rhel6/7, agents/init and so on. Just to remove some of the misconceptions: clvmd in RHEL6 is only supported via init script. dlm config should not be changed to avoid fencing at startup. There are other timers involved on clvmd startup that are directly related to the storage. We might be hitting those ones. No logs, no help. At this point, in 6, pacemaker is not involved if not for fencing operations. Just to clarify this last sentence: At this stage of the boot process, in 6, pacemaker is not involved directly. If node3 is attempting to fence another node, then it would block on pacemaker, since fencing is proxy´d to pcmk. But node3 should have no reason to fence, only showing some underlying issues. Doesn't it check on startup (or on lockspace creation) that fencing is available? And, what happens with that when the whole cluster boots up? If I understand correctly, dlm should block all operations, including requests from clvmd and thus all lvm tools, until pacemaker is started locally. But clvmd init script runs {vg,lv}display after clvmd is started, and it should block, introducing the deadlock. Am I missing something? Fabio In RHEL7: clvmd/dlm will be/are managed _only_ by pacemaker as resources. This integration is only 99.9% done. We recently discovered 3 corner cases that needs fixing via the agents that David V. is pushing around (after all 7 is not released yet ;)) Fabio, could you please list that cases, as it is really valuable for some people to know them? And for some reason I'm not able to locate that patch David talks about, probably it is still not pushed to github. Best, Vladislav Cheers Fabio dlm_controld -h doesn’t list an option to enable/disable start up fencing. I had a quick read of the cman man page and I also don’t see any option mentioning startup fencing. Would you mind pointing me in the direction of the parameter to disable this in cman/dlm_controld please. 
PS: I am redirecting all fencing operations to pacemaker using the following directive: <method name="pcmk-redirect"> Thanks ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
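For readers not familiar with the cman side, the pcmk-redirect setup referred to above usually takes this shape in cluster.conf (node names are examples): every cluster node gets a single fence method that hands the request to pacemaker via fence_pcmk.

<clusternode name="node1" nodeid="1">
  <fence>
    <method name="pcmk-redirect">
      <device name="pcmk" port="node1"/>
    </method>
  </fence>
</clusternode>
...
<fencedevices>
  <fencedevice name="pcmk" agent="fence_pcmk"/>
</fencedevices>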
Re: [Pacemaker] Order of resources in a group and crm_diff
29.01.2014 08:44, Andrew Beekhof wrote: ... That's a known deficiency in the v1 diff format (and why we need costly digests to detect ordering changes). Happily, .12 will have a new and improved diff format that will handle this correctly. Does your recent cib-performance rewrite address this as well? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Manual resource reload
17.02.2014 04:11, Andrew Beekhof wrote: On 11 Feb 2014, at 2:49 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, cannot find anywhere (am I blind?), is it possible to manually inject 'reload' op for a given resource? Background for this is if some configuration files are edited, and resource-agent (or LSB script) supports 'reload' operation, then it would be nice to have a way to request that reload to be done without change of resource parameters. sounds like a reasonable feature request for crm_resource Filed as cl#5198. http://bugs.clusterlabs.org/show_bug.cgi?id=5198 Thanks, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?
18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measured the performance of Pacemaker with the following combination: Pacemaker-1.1.11.rc1 libqb-0.16.0 corosync-2.3.2 All nodes are KVM virtual machines. After starting 14 nodes, I forcibly stopped the vm01 node from the inside; virsh destroy vm01 was used for the stop. Then, in addition to the forcibly stopped node, other nodes are also separated from the cluster. Corosync then outputs the 'Retransmit List:' log message in large quantities. Probably best to poke the corosync guys about this. However, <= .11 is known to cause significant CPU usage with that many nodes. I can easily imagine this starving corosync of resources and causing breakage. I would _highly_ recommend retesting with the current git master of pacemaker. I merged the new cib code last week which is faster by _two_ orders of magnitude and uses significantly less CPU. Andrew, you mean your cib-performance branch, am I correct? Unfortunately it is not in .11 (sorry if I overlooked it there), and is not even in ClusterLabs/master yet; it seems to have been merged and then reverted in beekhof/master... I'd be interested to hear your feedback. What is the reason that a node on which no failure has occurred is considered lost? Please advise if there is a problem somewhere in the setup. I attached the report from when the problem occurred. https://drive.google.com/file/d/0BwMFJItoO-fVMkFWWWlQQldsSFU/edit?usp=sharing Regards, Yusuke -- METRO SYSTEMS CO., LTD Yusuke Iida Mail: yusk.i...@gmail.com ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] node1 fencing itself after node2 being fenced
18.02.2014 19:49, Asgaroth wrote: i sometimes have the same situation. sleep ~30 seconds between startup cman and clvmd helps a lot. Thanks for the tip, I just tried this (added a 'sleep 30' in the start section of the case statement in the cman script), but this did not resolve the issue for me; for some reason clvmd just refuses to start, and I don't see many debugging errors shooting up, so I cannot say for sure what clvmd is trying to do :( Just a guess. Do you have startup fencing enabled in dlm-controld (I actually do not remember if it is applicable to cman's version, but it exists in dlm-4) or cman? If yes, then that may play its evil game, because imho it is not intended to be used with pacemaker, which has its own startup fencing policy (if you redirect fencing to pacemaker). ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
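For reference, the workaround quoted above amounts to nothing more than a delay between the two services; a minimal way to reproduce it manually (assuming stock CentOS 6 service names) would be:

  service cman start
  sleep 30          # crude delay so fencing/dlm can settle before clvmd starts
  service clvmd start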
Re: [Pacemaker] node1 fencing itself after node2 being fenced
18.02.2014 23:01, David Vossel wrote: - Original Message - From: Vladislav Bogdanov bub...@hoster-ok.com To: pacemaker@oss.clusterlabs.org Sent: Tuesday, February 18, 2014 1:02:09 PM Subject: Re: [Pacemaker] node1 fencing itself after node2 being fenced 18.02.2014 19:49, Asgaroth wrote: i sometimes have the same situation. sleep ~30 seconds between startup cman and clvmd helps a lot. Thanks for the tip, I just tried this (added a 'sleep 30' in the start section of the case statement in the cman script), but this did not resolve the issue for me; for some reason clvmd just refuses to start, and I don't see many debugging errors shooting up, so I cannot say for sure what clvmd is trying to do :( I actually just made a patch related to this. If you are managing the dlm with pacemaker, you'll want to use this patch. It disables startup fencing in the dlm and has pacemaker perform the fencing instead. The agent checks the startup fencing condition, so you'll need that bit as well instead of just disabling startup fencing in the dlm. Asgaroth said earlier that dlm and clvmd start from init scripts now in that install, so RA magic would not help, but it is great to know about this feature. I'd try to disable startup fencing in cman/dlm and see if it helps. Something weird is happening there. I cannot simulate similar symptoms in my setups, but I run on corosync2, manage everything with pacemaker, and have startup fencing disabled in dlm. -- Vossel Just a guess. Do you have startup fencing enabled in dlm-controld (I actually do not remember if it is applicable to cman's version, but it exists in dlm-4) or cman? If yes, then that may play its evil game, because imho it is not intended to be used with pacemaker, which has its own startup fencing policy (if you redirect fencing to pacemaker). ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] node1 fencing itself after node2 being fenced
10.02.2014 14:46, Asgaroth wrote: Hi All, OK, here is my testing using cman/clvmd enabled on system startup and clvmd outside of pacemaker control. I still seem to be getting the clvmd hang/fail situation even when running outside of pacemaker control, I cannot see off-hand where the issue is occurring, but maybe it is related to what Vladislav was saying where clvmd hangs if it is not running on a cluster node that has cman running, however, I have both cman/clvmd enable to start at boot. Here is a little synopsis of what appears to be happening here: [1] Everything is fine here, both nodes up and running: # cman_tool nodes Node Sts Inc Joined Name 1 M444 2014-02-07 10:25:00 test01 2 M440 2014-02-07 10:25:00 test02 # dlm_tool ls dlm lockspaces name clvmd id0x4104eefa flags 0x changemember 2 joined 1 remove 0 failed 0 seq 1,1 members 1 2 [2] Here I “echo c /proc/sysrq-trigger” on node2 (test02), I can see crm_mon saying that node 2 is in unclean state and fencing kicks in (reboot node 2) # cman_tool nodes Node Sts Inc Joined Name 1 M440 2014-02-07 10:27:58 test01 2 X444 test02 # dlm_tool ls dlm lockspaces name clvmd id0x4104eefa flags 0x0004 kern_stop changemember 2 joined 1 remove 0 failed 0 seq 2,2 members 1 2 new changemember 1 joined 0 remove 1 failed 1 seq 3,3 new statuswait_messages 0 wait_condition 1 fencing new members 1 [3] So the above looks fine so far, to my untrained eye, dlm in kern_stop state while waiting on successful fence, and the node reboots and we have the following state: # cman_tool nodes Node Sts Inc Joined Name 1 M440 2014-02-07 10:27:58 test01 2 M456 2014-02-07 10:35:42 test02 # dlm_tool ls dlm lockspaces name clvmd id0x4104eefa flags 0x changemember 2 joined 1 remove 0 failed 0 seq 4,4 members 1 2 So it looks like dlm and cman seem to be working properly (again, I could be wrong, my untrained eye and all J) Yep, all above is correct. And yes, at the dlm layer everything seems to be perfect (did't look at the dump though, that is not needed for the 'ls' outputs you provided). However, if I try to run any lvm status/clvm status commands then they still just hang. Could this be related to clvmd doing a check when cman is up and running but clvmd has not started yet (As I understand from Vladislav’s previous email). Or do I have something fundamentally wrong with my fencing configuration. I cannot really recall if it hangs or returns error for that (I moved to corosync2 long ago). Anyways you probably want to run clvmd with debugging enabled. iirc you have two choices here, either you'd need to stop running instance first and then run it in the console with -f -d1, or run clvmd -C -d2 to ask all running instances to start debug logging to syslog. I prefer first one, because modern syslogs do rate-limiting. And, you'd need to run lvm commands with debugging enabled too. Alternatively (or in addition to the above) you may want to run hang-suspect under gdb (make sure you have relevant -debuginfo packages installed, one for lvm2 should be enough), this way you can obtain the backtrace of function calls which led to the hang, and more. You may need to tell gdb that it shouldn't stop on some signals sent to the binary being debugged if it stops immediately after you type 'run' or 'cont' (f.e 'handle PIPE nostop'). 
Once you have the daemon running under the debugger (do not forget to type 'set args -f -d1' at the gdb prompt before starting it), and once you notice the hang, you can press Ctrl-C and then type 'bt full' (you may need to do that for some/all threads; use the 'info threads' and 'thread N' commands to switch between them). With all that you can find out what exactly hangs, where, and probably even why. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
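A condensed, hedged transcript of the gdb procedure described above (the binary path and debuginfo package name are assumptions; stop the already-running clvmd first):

  # debuginfo-install lvm2        # or install the matching -debuginfo packages by hand
  # gdb /usr/sbin/clvmd
  (gdb) handle SIGPIPE nostop noprint
  (gdb) set args -f -d1
  (gdb) run
  ... wait for the hang, then press Ctrl-C ...
  (gdb) info threads
  (gdb) thread 2
  (gdb) bt full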
[Pacemaker] Manual resource reload
Hi, cannot find anywhere (am I blind?), is it possible to manually inject 'reload' op for a given resource? Background for this is if some configuration files are edited, and resource-agent (or LSB script) supports 'reload' operation, then it would be nice to have a way to request that reload to be done without change of resource parameters. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] node1 fencing itself after node2 being fenced
10.02.2014 18:54, Asgaroth wrote: -Original Message- From: Vladislav Bogdanov [mailto:bub...@hoster-ok.com] Sent: 10 February 2014 13:27 To: pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] node1 fencing itself after node2 being fenced I cannot really recall if it hangs or returns error for that (I moved to corosync2 long ago). Are you running corosync2 on RHEL7 beta? Are we able to run corosync2 on CentOS 6/RHEL 6? Nope, it's Centos6. In few words, It is probably safer for you to stay with cman, especially if you need GFS2. gfs_controld is not officially ported to corosync2 and is obsolete in EL7 because communication between gfs2 and dlm is moved to kernelspace there. Anyways you probably want to run clvmd with debugging enabled. iirc you have two choices here, either you'd need to stop running instance first and then run it in the console with -f -d1, or run clvmd -C -d2 to ask all running instances to start debug logging to syslog. I prefer first one, because modern syslogs do rate-limiting. And, you'd need to run lvm commands with debugging enabled too. Thanks for this tip, I have modified clvmd to run in debug mode (clvmd -T60 -d 2 -I cman) and I notice that on node2 reboot, I don't see any logs for clvmd actually attempting to start, so it appears there is something wrong here with clvmd. However, I did try to manually stop/start clvmd on node2 You need to fix that for sure. after a reboot and these were the error logs reported: Feb 10 12:37:08 test02 kernel: dlm: connecting to 1 sctp association 2 Feb 10 12:38:00 test02 kernel: dlm: Using SCTP for communications Feb 10 12:38:00 test02 clvmd[2118]: Unable to create DLM lockspace for CLVM: Address already in use Feb 10 12:38:00 test02 kernel: dlm: Can't bind to port 21064 addr number 1 Feb 10 12:38:00 test02 kernel: dlm: cannot start dlm lowcomms -98 Feb 10 12:39:37 test02 kernel: dlm: Using SCTP for communications Strange message, looks like something is bound to that port already. You may want to try dlm in tcp mode btw. Feb 10 12:39:37 test02 clvmd[2137]: Unable to create DLM lockspace for CLVM: Address already in use Feb 10 12:39:37 test02 kernel: dlm: Can't bind to port 21064 addr number 1 Feb 10 12:39:37 test02 kernel: dlm: cannot start dlm lowcomms -98 Feb 10 12:47:21 test02 clvmd[2159]: Unable to create DLM lockspace for CLVM: Address already in use Feb 10 12:47:21 test02 kernel: dlm: Using SCTP for communications Feb 10 12:47:21 test02 kernel: dlm: Can't bind to port 21064 addr number 1 Feb 10 12:47:21 test02 kernel: dlm: cannot start dlm lowcomms -98 Feb 10 12:48:14 test02 kernel: dlm: closing connection to node 2 Feb 10 12:48:14 test02 kernel: dlm: closing connection to node 1 So it appears that the issue is with clvmd attempting to communicated with, I presume, dlm. I tried to do some searching on this error and it appears there is a bug report, if I recall correctly, around 2004, which was fixed, so I cannot see why this error is cropping up. Some other strangeness is, that if I reboot the node a couple times, it may start up properly on 2nd node and then things appear to work properly, however, while node 2 is down the clvmd on node1 is still in a hung state even though dlm appears to think everything is good. Have you come across this issue before? Thanks for your assistance thus far, I appreciate it. 
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
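As a follow-up to the TCP suggestion above, a hedged sketch of what switching dlm away from SCTP usually looks like on a cman-based cluster; check dlm_controld(8) and cluster.conf(5) for the exact attribute supported by the installed version before relying on it:

  <cluster name="example" config_version="3">
    <dlm protocol="tcp"/>
    ...
  </cluster>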
Re: [Pacemaker] node1 fencing itself after node2 being fenced
07.02.2014 14:22, Asgaroth wrote: ... Thanks for the explanation, this is interesting for me as I need a volume manager in the cluster to manage the shared file systems in case I need to resize for some reason. I think I may be coming up against something similar now that I am testing cman outside of the cluster: even though I have cman/clvmd enabled outside pacemaker, the clvmd daemon still hangs even when the 2nd node has been rebooted due to a fence operation. When it (node 2) reboots, cman and clvmd start, I can see both nodes as members using cman_tool, but clvmd still seems to have an issue, it just hangs; I can't see off-hand if dlm still thinks pacemaker is in the fence operation (or if it has already returned true for a successful fence). I am still gathering logs and will post back to this thread once I have all my logs from yesterday and this morning. As I wrote (maybe it was not completely clear), there are two points where clustered LVM may block: dlm (kern_stop flag in 'dlm ls' output) and clvmd itself (not all cluster nodes run clvmd). Of course there could be additional bugs. I'd break fencing for your node1 and look at what dlm_tool shows there after node2 is fenced. 'dlm_tool ls' and 'dlm_tool dump' should provide enough information (but you'd probably need to dig into dlm_controld code to fully interpret the latter). Also, you may want to run clvmd in debugging mode. I don't suppose there is another volume manager available that would be cluster-aware that anyone is aware of? I'm not aware of any. Increasing the timeout for the LSB clvmd resource probably won't help you, because LVM operations blocked (because DLM waits for fencing) iirc never finish. You may want to search for a clvmd OCF resource-agent, it is available for SUSE I think. Although it is not perfect, it should work much better for you. I will have a look around for this clvmd ocf agent, and see what is involved in getting it to work on CentOS 6.5 if I don't have any success with the current recommendation for running it outside of pacemaker control. Generally, that alone won't help, because you'll still get timeouts on every LVM operation if some of the cman nodes do not run clvmd for any reason. I mean, if you manage VGs/LVs as cluster resources. But that removes one point of failure when combined with the newer stack. I know that the latest versions of the cluster-stack software (those which require corosync2 and its quorum implementation) work like a charm all together, and there was a REASON to write them (and use them in RHEL7). ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] node1 fencing itself after node2 being fenced
05.02.2014 20:10, Asgaroth wrote: On 05/02/2014 16:12, Digimer wrote: You say it's working now? If so, excellent. If you have any troubles though, please share your cluster.conf and 'pcs config show'. Hi Digimer, no, it's not working as I expect it to when I test a crash of node 2: clvmd goes into a failed state and then node1 gets shot in the head; other than that the config appears to work fine with the minimal testing I have done so far :) I have attached the cluster.conf and pcs config files to the email (with minimal obfuscation). Hi, I bet your problem comes from the LSB clvmd init script. Here is what it does:

===
...
clustered_vgs() {
    ${lvm_vgdisplay} 2>/dev/null | \
        awk 'BEGIN {RS="VG Name"} {if (/Clustered/) print $1;}'
}

clustered_active_lvs() {
    for i in $(clustered_vgs); do
        ${lvm_lvdisplay} $i 2>/dev/null | \
            awk 'BEGIN {RS="LV Name"} {if (/[^N^O^T] available/) print $1;}'
    done
}

rh_status() {
    status $DAEMON
}
...
case "$1" in
...
  status)
        rh_status
        rtrn=$?
        if [ $rtrn = 0 ]; then
            cvgs="$(clustered_vgs)"
            echo Clustered Volume Groups: ${cvgs:-"(none)"}
            clvs="$(clustered_active_lvs)"
            echo Active clustered Logical Volumes: ${clvs:-"(none)"}
        fi
...
esac
exit $rtrn
===

So, it not only checks the status of the daemon itself, but also tries to list volume groups. And this operation is blocked because fencing is still in progress, and the whole cLVM thing (as well as DLM itself and all other dependent services) is frozen. So your resource times out in the monitor operation, and then pacemaker asks it to stop (unless you have on-fail=fence). Anyway, there is a big chance that stop will fail too, and that leads again to fencing. cLVM is very fragile in my opinion (although newer versions running on the corosync2 stack seem to be much better). And it probably still doesn't work well when managed by pacemaker in CMAN-based clusters, because it blocks globally if any node in the whole cluster is online at the cman layer but doesn't run clvmd (I checked last time with .99). And that was the same for all stacks, until it was fixed for the corosync (only 2?) stack recently. The problem with that is that you cannot just stop pacemaker on one node (f.e. for maintenance), you should immediately stop cman as well (or run clvmd in a cman'ish way) - otherwise cLVM freezes on another node. This should be easily fixable in the clvmd code, but nobody cares. Increasing the timeout for the LSB clvmd resource probably won't help you, because LVM operations blocked (because DLM waits for fencing) iirc never finish. You may want to search for a clvmd OCF resource-agent, it is available for SUSE I think. Although it is not perfect, it should work much better for you. Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
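For completeness, a minimal hedged sketch (crmsh syntax) of the pacemaker-managed alternative mentioned above, assuming ocf:pacemaker:controld and a clvmd OCF agent such as SUSE's ocf:lvm2:clvmd are installed; agent names, operations and timeouts would need to be adapted to the actual packages:

  primitive dlm ocf:pacemaker:controld op monitor interval=60s
  primitive clvmd ocf:lvm2:clvmd op monitor interval=60s
  group dlm-clvmd dlm clvmd
  clone dlm-clvmd-clone dlm-clvmd meta interleave=true

With something like this, dlm and clvmd start in order on every node and pacemaker, not the init system, decides when it is safe to do so.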
[Pacemaker] Order of resources in a group and crm_diff
Hi all, Just discovered, that when I add resource to a middle of (running) group, it is added to the end. I mean, if I update following (crmsh syntax) group dhcp-server vip-10-5-200-244 dhcpd with group dhcp-server vip-10-5-200-244 vip-10-5-201-244 dhcpd with 'crm configure load update', actual definition becomes group dhcp-server vip-10-5-200-244 dhcpd vip-10-5-201-244 Also, strange enough, if I get XML CIB with cibadmin -Q, then edit order of primitives with text editor, crm_diff doesn't show any differences: cib-orig.xml: ... group id=dhcp-server primitive id=vip-10-5-200-244 class=ocf provider=heartbeat type=IPaddr2 instance_attributes id=vip-10-5-200-244-instance_attributes nvpair name=ip value=10.5.200.244 id=vip-10-5-200-244-instance_attributes-ip/ nvpair name=cidr_netmask value=32 id=vip-10-5-200-244-instance_attributes-cidr_netmask/ nvpair name=nic value=vlan1 id=vip-10-5-200-244-instance_attributes-nic/ /instance_attributes operations op name=start interval=0 timeout=20 id=vip-10-5-200-244-start-0/ op name=stop interval=0 timeout=20 id=vip-10-5-200-244-stop-0/ op name=monitor interval=30 id=vip-10-5-200-244-monitor-30/ /operations /primitive primitive id=dhcpd class=lsb type=dhcpd operations op name=monitor interval=10 timeout=15 id=dhcpd-monitor-10/ op name=start interval=0 timeout=90 id=dhcpd-start-0/ op name=stop interval=0 timeout=90 id=dhcpd-stop-0/ /operations meta_attributes id=dhcpd-meta_attributes nvpair id=dhcpd-meta_attributes-target-role name=target-role value=Started/ /meta_attributes /primitive primitive id=vip-10-5-201-244 class=ocf provider=heartbeat type=IPaddr2 instance_attributes id=vip-10-5-201-244-instance_attributes nvpair name=ip value=10.5.201.244 id=vip-10-5-201-244-instance_attributes-ip/ nvpair name=cidr_netmask value=24 id=vip-10-5-201-244-instance_attributes-cidr_netmask/ nvpair name=nic value=vlan201 id=vip-10-5-201-244-instance_attributes-nic/ /instance_attributes operations op name=start interval=0 timeout=20 id=vip-10-5-201-244-start-0/ op name=stop interval=0 timeout=20 id=vip-10-5-201-244-stop-0/ op name=monitor interval=30 id=vip-10-5-201-244-monitor-30/ /operations /primitive /group ... cib.xml: ... 
group id=dhcp-server primitive id=vip-10-5-200-244 class=ocf provider=heartbeat type=IPaddr2 instance_attributes id=vip-10-5-200-244-instance_attributes nvpair name=ip value=10.5.200.244 id=vip-10-5-200-244-instance_attributes-ip/ nvpair name=cidr_netmask value=32 id=vip-10-5-200-244-instance_attributes-cidr_netmask/ nvpair name=nic value=vlan1 id=vip-10-5-200-244-instance_attributes-nic/ /instance_attributes operations op name=start interval=0 timeout=20 id=vip-10-5-200-244-start-0/ op name=stop interval=0 timeout=20 id=vip-10-5-200-244-stop-0/ op name=monitor interval=30 id=vip-10-5-200-244-monitor-30/ /operations /primitive primitive id=vip-10-5-201-244 class=ocf provider=heartbeat type=IPaddr2 instance_attributes id=vip-10-5-201-244-instance_attributes nvpair name=ip value=10.5.201.244 id=vip-10-5-201-244-instance_attributes-ip/ nvpair name=cidr_netmask value=24 id=vip-10-5-201-244-instance_attributes-cidr_netmask/ nvpair name=nic value=vlan201 id=vip-10-5-201-244-instance_attributes-nic/ /instance_attributes operations op name=start interval=0 timeout=20 id=vip-10-5-201-244-start-0/ op name=stop interval=0 timeout=20 id=vip-10-5-201-244-stop-0/ op name=monitor interval=30 id=vip-10-5-201-244-monitor-30/ /operations /primitive primitive id=dhcpd class=lsb type=dhcpd operations op name=monitor interval=10 timeout=15 id=dhcpd-monitor-10/ op name=start interval=0 timeout=90 id=dhcpd-start-0/ op name=stop interval=0 timeout=90 id=dhcpd-stop-0/ /operations meta_attributes id=dhcpd-meta_attributes nvpair id=dhcpd-meta_attributes-target-role name=target-role value=Started/ /meta_attributes /primitive /group ... # crm_diff --original cib-orig.xml --new cib.xml shows nothing. And, 'cibadmin --replace --xml-file cib.xml' does nothing: Jan 28 11:01:21 booter-0 cib[2693]: notice: cib:diff: Diff: --- 0.427.2 Jan 28 11:01:21 booter-0 cib[2693]: notice: cib:diff: Diff: +++ 0.427.19 df366a02885285cc95529f402bfdac12 Jan 28 11:01:21 booter-0 cib[2693]: notice: cib:diff: -- nvpair
[Pacemaker] [PATCH] Downgrade probe log message for promoted ms resources
Hi, this is the only message I see in the logs of an otherwise static cluster (with rechecks enabled); it is probably a good idea to downgrade it to info.

diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c
index 97e114f..6dbcf19 100644
--- a/lib/pengine/unpack.c
+++ b/lib/pengine/unpack.c
@@ -2515,7 +2515,7 @@ determine_op_status(
         case PCMK_OCF_RUNNING_MASTER:
             if (is_probe) {
                 result = PCMK_LRM_OP_DONE;
-                crm_notice("Operation %s found resource %s active in master mode on %s",
+                pe_rsc_info(rsc, "Operation %s found resource %s active in master mode on %s",
                            task, rsc->id, node->details->uname);
             } else if (target_rc == rc) {

___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] again return code, now in crm_attribute
10.01.2014 08:00, Andrew Beekhof wrote: On 10 Jan 2014, at 3:51 pm, Andrey Groshev gre...@yandex.ru wrote: 10.01.2014, 03:28, Andrew Beekhof and...@beekhof.net: On 9 Jan 2014, at 4:44 pm, Andrey Groshev gre...@yandex.ru wrote: 09.01.2014, 02:39, Andrew Beekhof and...@beekhof.net: On 18 Dec 2013, at 11:55 pm, Andrey Groshev gre...@yandex.ru wrote: Hi, Andrew and ALL. I'm sorry, but I found an error again. :) Crux of the problem:

# crm_attribute --type crm_config --attr-name stonith-enabled --query; echo $?
scope=crm_config name=stonith-enabled value=true
0
# crm_attribute --type crm_config --attr-name stonith-enabled --update firstval ; echo $?
0
# crm_attribute --type crm_config --attr-name stonith-enabled --query; echo $?
scope=crm_config name=stonith-enabled value=firstval
0
# crm_attribute --type crm_config --attr-name stonith-enabled --update secondval --lifetime=reboot ; echo $?
0
# crm_attribute --type crm_config --attr-name stonith-enabled --query; echo $?
scope=crm_config name=stonith-enabled value=firstval
0
# crm_attribute --type crm_config --attr-name stonith-enabled --update thirdval --lifetime=forever ; echo $?
0
# crm_attribute --type crm_config --attr-name stonith-enabled --query; echo $?
scope=crm_config name=stonith-enabled value=firstval
0

I.e. if you specify the lifetime of an attribute, then the attribute is not updated. If it is impossible to set the lifetime of the attribute when it is being set, an error must be returned. Agreed. I'll reproduce and get back to you. Now I was able to review the code; the problem comes when both the --type and --lifetime options are used. One variant in the case statement is without a break;. Unfortunately, I did not have time to dive into the logic. Actually, the logic is correct. The command: # crm_attribute --type crm_config --attr-name stonith-enabled --update secondval --lifetime=reboot ; echo $? is invalid. You only get to specify --type OR --lifetime, not both. By specifying --lifetime, you're creating a node attribute, not a cluster property. With this I do not argue, but I think the exit code should then be NON-ZERO, i.e. it's an error! No, it's setting a value, just not where you thought (or where you're looking for it in the next command). It's the same as writing: crm_attribute --type crm_config --type status --attr-name stonith-enabled --update secondval; echo $? Only the last value for --type wins. From a usability PoV it would be better to return an error during argv parsing if there are conflicting arguments. And if possible, then the value should be established. In general, something is wrong. Unfortunately I have not yet looked deeper, because I'm struggling with STONITH :) P.S. Andrew! Belated congratulations on the new addition to your family. This is a fine time - now you will have toys which you did not have in your childhood.
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
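A hedged illustration of the distinction Andrew describes above: --type crm_config writes a cluster property, while --lifetime turns the call into a node-attribute update, so the two should not be combined (the attribute, value and node names below are placeholders):

  # cluster property in crm_config:
  crm_attribute --type crm_config --attr-name stonith-enabled --update true

  # node attribute instead; --lifetime selects its scope (forever vs. reboot/status section):
  crm_attribute --node node1 --attr-name my-attr --update somevalue --lifetime reboot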
Re: [Pacemaker] dc-version and cluster-infrastructure cluster options may be lost when editing
02.01.2014 12:53, Kristoffer Grönlund wrote: On Tue, 24 Dec 2013 14:41:20 +0300 Vladislav Bogdanov bub...@hoster-ok.com wrote: I would expect that only one option is modified, but crmsh intend to remove all others. May be it is possible to fix it by one-line crmsh patch? Kristoffer, Dejan? Best, Vladislav Hi, This was indeed a bug in crmsh. An updated version which fixes this is now available from network:ha-clustering:Stable, or in the 1.2.6 branch of the crmsh repository. Amazing! Thank you very much Kristoffer, will try that in next few days. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Unable to start cloned apache service on node 2
03.01.2014 07:46, Digimer wrote: Hi all, While trying to test to answer questions from my previous thread, I hit another problem. Since posting the first thread, I moved on in the Cluster from Scratch tutorial and got to the point where I was running Active/Active. Here I have a couple of problems. First up, the dlm service doesn't start with the cluster, but I can start it successfully manually. Second, and more annoying, I can't get the cloned apache service to start on both nodes: [root@an-c03n01 ~]# pcs config show --full Cluster Name: an-cluster-03 Corosync Nodes: an-c03n01.alteeve.ca an-c03n02.alteeve.ca Pacemaker Nodes: an-c03n01.alteeve.ca an-c03n02.alteeve.ca Resources: Master: WebDataClone Meta Attrs: master-node-max=1 clone-max=2 clone-node-max=1 notify=true master-max=2 Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=r0 Operations: monitor interval=60s (WebData-monitor-60s) Clone: dlm-clone Meta Attrs: clone-max=2 clone-node-max=1 Resource: dlm (class=ocf provider=pacemaker type=controld) Operations: monitor interval=60s (dlm-monitor-interval-60s) Clone: ClusterIP-clone Meta Attrs: globally-unique=true clone-max=2 clone-node-max=2 Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: ip=192.168.122.10 cidr_netmask=32 clusterip_hash=sourceip Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) Clone: WebFS-clone Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd0 directory=/var/www/html fstype=gfs2 Operations: monitor interval=60s (WebFS-monitor-interval-60s) Clone: WebSite-clone Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://127.0.0.1/server-status Operations: monitor interval=60s (WebSite-monitor-interval-60s) Stonith Devices: Resource: fence_n01_virsh (class=stonith type=fence_virsh) Attributes: pcmk_host_list=an-c03n01.alteeve.ca ipaddr=lemass login=root passwd_script=/root/lemass.pw delay=15 port=an-c03n01 Operations: monitor interval=60s (fence_n01_virsh-monitor-interval-60s) Resource: fence_n02_virsh (class=stonith type=fence_virsh) Attributes: pcmk_host_list=an-c03n02.alteeve.ca ipaddr=lemass login=root passwd_script=/root/lemass.pw port=an-c03n02 Operations: monitor interval=60s (fence_n02_virsh-monitor-interval-60s) Fencing Levels: Location Constraints: Resource: ClusterIP-clone Enabled on: an-c03n01.alteeve.ca (score:INFINITY) (role: Started) (id:cli-prefer-ClusterIP) ^^^ This one? 
Ordering Constraints: promote WebDataClone then start WebFS-clone (Mandatory) (id:order-WebDataClone-WebFS-mandatory) start WebFS-clone then start WebSite-clone (Mandatory) (id:order-WebFS-WebSite-mandatory) Colocation Constraints: WebFS-clone with WebDataClone (INFINITY) (with-rsc-role:Master) (id:colocation-WebFS-WebDataClone-INFINITY) WebSite-clone with ClusterIP-clone (INFINITY) (id:colocation-WebSite-ClusterIP-INFINITY) WebSite-clone with WebFS-clone (INFINITY) (id:colocation-WebSite-WebFS-INFINITY) Cluster Properties: cluster-infrastructure: corosync dc-version: 1.1.10-19.el7-368c726 last-lrm-refresh: 1388723732 no-quorum-policy: ignore stonith-enabled: true [root@an-c03n02 ~]# pcs status Cluster name: an-cluster-03 Last updated: Thu Jan 2 23:40:14 2014 Last change: Thu Jan 2 23:39:31 2014 via crm_resource on an-c03n01.alteeve.ca Stack: corosync Current DC: an-c03n01.alteeve.ca (1) - partition with quorum Version: 1.1.10-19.el7-368c726 2 Nodes configured 12 Resources configured Online: [ an-c03n01.alteeve.ca an-c03n02.alteeve.ca ] Full list of resources: fence_n01_virsh(stonith:fence_virsh):Started an-c03n01.alteeve.ca fence_n02_virsh(stonith:fence_virsh):Started an-c03n02.alteeve.ca Master/Slave Set: WebDataClone [WebData] Masters: [ an-c03n01.alteeve.ca an-c03n02.alteeve.ca ] Clone Set: dlm-clone [dlm] Started: [ an-c03n01.alteeve.ca an-c03n02.alteeve.ca ] Clone Set: ClusterIP-clone [ClusterIP] (unique) ClusterIP:0(ocf::heartbeat:IPaddr2):Started an-c03n01.alteeve.ca ClusterIP:1(ocf::heartbeat:IPaddr2):Started an-c03n01.alteeve.ca Clone Set: WebFS-clone [WebFS] Started: [ an-c03n01.alteeve.ca an-c03n02.alteeve.ca ] Clone Set: WebSite-clone [WebSite] Started: [ an-c03n01.alteeve.ca ] Stopped: [ an-c03n02.alteeve.ca ] PCSD Status: an-c03n01.alteeve.ca: an-c03n01.alteeve.ca: Online an-c03n02.alteeve.ca: an-c03n02.alteeve.ca: Online Daemon Status: corosync: active/disabled pacemaker: active/disabled pcsd: active/enabled [root@an-c03n01 ~]# ps aux | grep httpd root 19256 0.0 0.1 207188 3184 ?Ss 23:39 0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile
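The cli-prefer-ClusterIP constraint flagged above is the kind of constraint that 'pcs resource move/prefer' leaves behind, and it pins the clone to one node. A hedged way to drop it, using the constraint id from the config shown above (the exact command form may vary between pcs versions):

  pcs constraint remove cli-prefer-ClusterIP
  # newer pcs versions can also clear move/prefer constraints for a resource:
  pcs resource clear ClusterIP-clone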
Re: [Pacemaker] Odd issues with apache on RHEL 7 beta
27.12.2013 09:34, Digimer wrote: ... 3. I know I mentioned this on IRC before, but I thought I should mention it here again. In the pcs CfS, it shows to set:

  <Location /server-status>
      SetHandler server-status
      Order deny,allow
      Deny from all
      Allow from 127.0.0.1
  </Location>

But then in the resource setup, it says:

  pcs resource create WebSite ocf:heartbeat:apache \
      configfile=/etc/httpd/conf/httpd.conf \
      statusurl="http://localhost/server-status" op monitor interval=1min

This fails because apache will not respond to 'localhost', so you need to set statusurl="http://127.0.0.1/server-status" (or change the apache directive to 'Allow from localhost'). Just a side note on this. It may be caused by 'localhost' resolving by default to the IPv6 localhost address. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Odd issues with apache on RHEL 7 beta
27.12.2013 09:45, Digimer wrote: On 27/12/13 01:44 AM, Vladislav Bogdanov wrote: 27.12.2013 09:34, Digimer wrote: ... 3. I know I mentioned this on IRC before, but I thought I should mention it here again. In the pcs CfS, it shows to set: Location /server-status SetHandler server-status Order deny,allow Deny from all Allow from 127.0.0.1 /Location But then in the resource setup, it says: pcs resource create WebSite ocf:heartbeat:apache \ configfile=/etc/httpd/conf/httpd.conf \ statusurl=http://localhost/server-status; op monitor interval=1min This fails because apache will not respond to 'localhost', so you need to set 'statusurl=http://127.0.0.1/server-status; (or change the apache directive to 'Allow from localhost'). Just a side note on this. It may be caused by 'localhost' resolve default to IPv6 localhost address. In my case, this is not so. 'localhost' resolves to '127.0.0.1': [root@an-c03n01 ~]# gethostip -d localhost 127.0.0.1 Hmm... I still think that _could_ be the source (or part of it) of the issue. Below is run on a f18 system. $ host -a localhost. Trying localhost ;; -HEADER- opcode: QUERY, status: NOERROR, id: 28983 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0 ;; QUESTION SECTION: ;localhost. IN ANY ;; ANSWER SECTION: localhost. 0 IN A 127.0.0.1 localhost. 0 IN ::1 And (http_mon.sh uses wget by default): $ wget http://localhost/ --2013-12-27 09:55:29-- http://localhost/ Resolving localhost (localhost)... ::1, 127.0.0.1 Connecting to localhost (localhost)|::1|:80... failed: Connection refused. Connecting to localhost (localhost)|127.0.0.1|:80... failed: Connection refused. It first tries IPv6. curl (second http_mon.sh option) tries IPv6 first too: $ curl -v http://localhost/ * About to connect() to localhost port 80 (#0) * Trying ::1... * Connection refused * Trying 127.0.0.1... * Connection refused * couldn't connect to host * Closing connection #0 curl: (7) couldn't connect to host In your setup apache is probably listening on both localhost addresses, thus first connection attempt succeeds and it (apache) returns 403 Forbidden, preventing wget(curl) from trying IPv4 address. If apache listens only on 127.0.0.1 but not on ::1, then monitoring would succeed. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
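A hedged way to make the status check work regardless of whether the client tries ::1 or 127.0.0.1 first (assuming apache 2.2-style access directives as in the snippet above) is to allow both loopback addresses, or simply keep statusurl pointed at 127.0.0.1 as suggested earlier in the thread:

  <Location /server-status>
      SetHandler server-status
      Order deny,allow
      Deny from all
      Allow from 127.0.0.1 ::1
  </Location>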
Re: [Pacemaker] dc-version and cluster-infrastructure cluster options may be lost when editing
20.12.2013 14:15, Vladislav Bogdanov wrote: Hi all, Just discovered that it is possible to remove options in subject with admin action (at least with cibadmin --patch). This is notably annoying when updating CIB with 'crm configure load update' (which uses crm_diff to create patch) when updating other cluster options. While most (all?) other options are reset to defaults when they are missing in the input, those two in subject are silently removed. That is probably a bug. IMO such internal options should be preserved (filtered?) from external updates. I noticed this while debugging problem with one of my resource-agents which polls for cluster-infrastructure attribute during start - it refused to restart after cluster options update. Would be nice to have this fixed in 1.1.11 :) Relevant log lines are: Dec 20 10:54:21 booter-test-0 cib[3517]: notice: cib:diff: -- nvpair id=cib-bootstrap-options-dc-version name=dc-version value=1.1.10-3.3.el6-8db9f0b/ Dec 20 10:54:21 booter-test-0 cib[3517]: notice: cib:diff: -- nvpair id=cib-bootstrap-options-cluster-infrastructure name=cluster-infrastructure value=corosync/ Versions are: pacemaker - ClusterLabs/pacemaker/master 0d1ac18 + beekhof/pacemaker/master a269ba8 crmsh - 4f66cc190185 Actually, although I still think that pacemaker should not allow modification of those options, the source of my problem is that crmsh doesn't initialize the <cluster_property_set/> node with original values when it is doing a load update. Example (very long lines are stripped for readability): # cat u.crm property $id=cib-bootstrap-options \ batch-limit=30 # crm -d configure load update u.crm ... DEBUG: clitext: property $id=cib-bootstrap-options batch-limit=40 cluster-delay=60s cluster-recheck-interval=10m crmd-transition-delay=0s dc-deadtime=20s default-action-timeout=20s default-resource-stickiness=100 e...
DEBUG: id_store: saved cib-bootstrap-options-batch-limit DEBUG: id_store: saved cib-bootstrap-options-cluster-delay DEBUG: id_store: saved cib-bootstrap-options-cluster-recheck-interval DEBUG: id_store: saved cib-bootstrap-options-crmd-transition-delay DEBUG: id_store: saved cib-bootstrap-options-dc-deadtime DEBUG: id_store: saved cib-bootstrap-options-default-action-timeout DEBUG: id_store: saved cib-bootstrap-options-default-resource-stickiness DEBUG: id_store: saved cib-bootstrap-options-election-timeout DEBUG: id_store: saved cib-bootstrap-options-enable-acl DEBUG: id_store: saved cib-bootstrap-options-enable-startup-probes DEBUG: id_store: saved cib-bootstrap-options-enable-container-probes DEBUG: id_store: saved cib-bootstrap-options-is-managed-default DEBUG: id_store: saved cib-bootstrap-options-maintenance-mode DEBUG: id_store: saved cib-bootstrap-options-migration-limit DEBUG: id_store: saved cib-bootstrap-options-no-quorum-policy DEBUG: id_store: saved cib-bootstrap-options-node-health-green DEBUG: id_store: saved cib-bootstrap-options-node-health-red DEBUG: id_store: saved cib-bootstrap-options-node-health-strategy DEBUG: id_store: saved cib-bootstrap-options-node-health-yellow DEBUG: id_store: saved cib-bootstrap-options-pe-error-series-max DEBUG: id_store: saved cib-bootstrap-options-pe-input-series-max DEBUG: id_store: saved cib-bootstrap-options-pe-warn-series-max DEBUG: id_store: saved cib-bootstrap-options-placement-strategy DEBUG: id_store: saved cib-bootstrap-options-remove-after-stop DEBUG: id_store: saved cib-bootstrap-options-shutdown-escalation DEBUG: id_store: saved cib-bootstrap-options-start-failure-is-fatal DEBUG: id_store: saved cib-bootstrap-options-startup-fencing DEBUG: id_store: saved cib-bootstrap-options-stonith-action DEBUG: id_store: saved cib-bootstrap-options-stonith-enabled DEBUG: id_store: saved cib-bootstrap-options-stonith-timeout DEBUG: id_store: saved cib-bootstrap-options-stop-all-resources DEBUG: id_store: saved cib-bootstrap-options-stop-orphan-actions DEBUG: id_store: saved cib-bootstrap-options-stop-orphan-resources DEBUG: id_store: saved cib-bootstrap-options-symmetric-cluster DEBUG: id_store: saved cib-bootstrap-options-dc-version DEBUG: id_store: saved cib-bootstrap-options-cluster-infrastructure DEBUG: id_store: saved cib-bootstrap-options-last-lrm-refresh DEBUG: clitext: rsc_defaults $id=rsc_options allow-migrate=false failure-timeout=10m migration-threshold=INFINITY multiple-active=stop_start priority=0 DEBUG: id_store: saved rsc_options-allow-migrate DEBUG: id_store: saved rsc_options-failure-timeout DEBUG: id_store: saved rsc_options-migration-threshold DEBUG: id_store: saved rsc_options-multiple-active DEBUG: id_store: saved rsc_options-priority DEBUG: id_store: saved cib-bootstrap-options-batch-limit DEBUG: update CIB element: property:cib-bootstrap-options DEBUG: clitext: property $id=cib-bootstrap-options batch-limit=30 DEBUG: id_store: saved cib-bootstrap-options-batch-limit DEBUG: create configuration section rsc_defaults DEBUG: piping string to crm_verify
[Pacemaker] dc-version and cluster-infrastructure cluster options may be lost when editing
Hi all, Just discovered that it is possible to remove options in subject with admin action (at least with cibadmin --patch). This is notably annoying when updating CIB with 'crm configure load update' (which uses crm_diff to create patch) when updating other cluster options. While most (all?) other options are reset to defaults when they are missing in the input, those two in subject are silently removed. That is probably a bug. IMO such internal options should be preserved (filtered?) from external updates. I noticed this while debugging problem with one of my resource-agents which polls for cluster-infrastructure attribute during start - it refused to restart after cluster options update. Would be nice to have this fixed in 1.1.11 :) Relevant log lines are: Dec 20 10:54:21 booter-test-0 cib[3517]: notice: cib:diff: -- nvpair id=cib-bootstrap-options-dc-version name=dc-version value=1.1.10-3.3.el6-8db9f0b/ Dec 20 10:54:21 booter-test-0 cib[3517]: notice: cib:diff: -- nvpair id=cib-bootstrap-options-cluster-infrastructure name=cluster-infrastructure value=corosync/ Versions are: pacemaker - ClusterLabs/pacemaker/master 0d1ac18 + beekhof/pacemaker/master a269ba8 crmsh - 4f66cc190185 Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] crmsh: New syntax for location constraints, suggestions / comments
18.12.2013 23:21, Rainer Brestan wrote: Hi Lars, maybe a little off topic. What I really miss in crmsh is the possibility to specify resource parameters which are different on different nodes, so the parameter is node-dependent. In XML syntax this exists; Andrew gave me the hint as an answer to a discussion on how to deal with different node parameters. But, if specified in XML syntax, crmsh cannot interpret it any more and prints the resource in XML syntax. Therefore, this is not usable with crmsh, as there is no single-syntax display with crm configure show. +1 ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] crmsh: New syntax for location constraints, suggestions / comments
13.12.2013 15:39, Lars Marowsky-Bree wrote: On 2013-12-13T13:11:30, Rainer Brestan rainer.bres...@gmx.net wrote: Please do not merge colocation and order together in a way that only none or both is present. This was never the plan. The idea was to offer an additional construct that provides both properties, since *most of the time*, that's what users want. In the interest of clarity and brevity in the configuration, this would be quite useful. group? That they are also useful on their own remains unchallenged. I was the one who proposed that originally ;-) Regards, Lars ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Reg. clone and order attributes
07.12.2013 14:30, ESWAR RAO wrote: Hi Vladislav, Thanks for the response. I will follow your suggestions. I have configured my required configuration in a different manner: (1) crm configure : all the 3 primitives (2) crm configure colocation : for all the 3 primitives (3) crm configure order : for the primitives (4) Now I did clone for the 3 primitives Once you have a clone, then you should not refer primitive it is build on any more, but use that clone name in all constraints. Thus, you need to repeat your steps in order 1 4 2 3 and replace primitives' with clones in latter two. But I couldn't understand why pacemaker is giving errors with this type of configuration. Thanks Eswar On Sat, Dec 7, 2013 at 2:47 PM, Vladislav Bogdanov bub...@hoster-ok.com mailto:bub...@hoster-ok.com wrote: 06.12.2013 14:07, ESWAR RAO wrote: Hi Vladislav, I used the below advisory colocation but its not working. Do be frank, I'm not sure if it even possible to achieve such exotic behavior with just pacemaker in a non-fragile way, that was just a suggestion. But you may also play with CIB editing from within resource agent code, f.e. remove some node attributes your resources depend on (with location constraints) when threshold is reached, or something similar. On 3 node setup: I have configured all 3 resources in clone mode to start only on node1 and node2 with a fail-count of only 1. +++ + crm configure primitive res_dummy_1 lsb::dummy_1 meta allow-migrate=false migration-threshold=1 op monitor interval=5s + crm configure clone dummy_1_clone res_dummy_1 meta clone-max=2 globally-unique=false + crm configure location dummy_1_clone_prefer_node dummy_1_clone -inf: node-3 +++ advisory ordering: + crm configure order 1-BEFORE-2 0: dummy_1_clone dummy_2_clone + crm configure order 2-BEFORE-3 0: dummy_2_clone dummy_3_clone +++ advisory colocation: # crm configure colocation node-with-apps inf: dummy_1_clone dummy_2_clone dummy_3_clone +++ After I killed dummy_1 on node1 , i expected the pacemaker to kill dummy_2 and dummy_3 on node1 and not to disturb the apps on node2. But with above colocation rule, it stopped the apps on node1 but it restarted dummy_2 and dummy_3 on node2. With a score of 0: it didn't stop dummy_2 and dummy_3 on node1. With a score of 500: it stopped only dummy_2 and restarted dummy_2 on node2. Thanks Eswar On Fri, Dec 6, 2013 at 12:20 PM, ESWAR RAO eswar7...@gmail.com mailto:eswar7...@gmail.com mailto:eswar7...@gmail.com mailto:eswar7...@gmail.com wrote: Thanks Vladislav. I will work on that. Thanks Eswar On Fri, Dec 6, 2013 at 11:05 AM, Vladislav Bogdanov bub...@hoster-ok.com mailto:bub...@hoster-ok.com mailto:bub...@hoster-ok.com mailto:bub...@hoster-ok.com wrote: 06.12.2013 07:58, ESWAR RAO wrote: Hi All, Can someone help me with below configuration?? I have a 3 node HB setup (node1, node2, node3) which runs HB+pacemaker. I have 3 apps dummy1, dummy2 , dummy3 which needs to be run on only 2 nodes among the 3 nodes. By using the below configuration, I was able to run 3 resources on 2 nodes. # crm configure primitive res_dummy1 lsb::dummy1 meta allow-migrate=false migration-threshold=3 failure-timeout=30s op monitor interval=5s # crm configure location app_prefer_node res_dummy1 -inf: node3 First that comes to mind is that you should put above line below the next one and refer app_clone instead of res_dummy1. # crm configure clone app_clone res_dummy1 meta clone-max=2 globally-unique=false I have a dependency order like dummy2 should start after dummy1 and dummy3 should start only after dummy2. 
For now I am keeping a sleep in the script and starting the resources by using crm. Is there any clean way to have the dependency on the resources so that ordering is maintained while clone is run on bot the nodes?? I have tried with below config but couldn't succeed
Re: [Pacemaker] Pacemaker 1.1.10 and pacemaker-remote
06.12.2013 19:13, James Oakley wrote: On Thursday, December 5, 2013 9:55:47 PM Vladislav Bogdanov bub...@hoster-ok.com wrote: Does 'db0' resolve to a correct IP address? If not, then you probably want either fix that or use remote-addr option as well. I saw that you can ping/ssh that container, but it is not clear did you use 'db0' name for that or IP address. Yes, I access by name. I defined all of my private node addresses in /etc/hosts, which is shared throughout the cluster via csync2. No idea then, sorry. I grepped for 'remote' in the logs and found nothing. I have no idea if it's even trying. Is there something specific I can look for? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
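For anyone following along, a hedged crmsh-style sketch of the remote-addr meta attribute mentioned above; the resource agent, config path and IP are placeholders (for an LXC container the appropriate agent would be used instead of VirtualDomain):

  primitive db0-vm ocf:heartbeat:VirtualDomain \
      params config=/etc/libvirt/qemu/db0.xml \
      meta remote-node=db0 remote-addr=192.168.122.10

With remote-addr set, the cluster connects to the pacemaker_remote daemon at that address instead of relying on the remote-node name being resolvable.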
Re: [Pacemaker] Reg. clone and order attributes
06.12.2013 14:07, ESWAR RAO wrote: Hi Vladislav, I used the below advisory colocation but its not working. Do be frank, I'm not sure if it even possible to achieve such exotic behavior with just pacemaker in a non-fragile way, that was just a suggestion. But you may also play with CIB editing from within resource agent code, f.e. remove some node attributes your resources depend on (with location constraints) when threshold is reached, or something similar. On 3 node setup: I have configured all 3 resources in clone mode to start only on node1 and node2 with a fail-count of only 1. +++ + crm configure primitive res_dummy_1 lsb::dummy_1 meta allow-migrate=false migration-threshold=1 op monitor interval=5s + crm configure clone dummy_1_clone res_dummy_1 meta clone-max=2 globally-unique=false + crm configure location dummy_1_clone_prefer_node dummy_1_clone -inf: node-3 +++ advisory ordering: + crm configure order 1-BEFORE-2 0: dummy_1_clone dummy_2_clone + crm configure order 2-BEFORE-3 0: dummy_2_clone dummy_3_clone +++ advisory colocation: # crm configure colocation node-with-apps inf: dummy_1_clone dummy_2_clone dummy_3_clone +++ After I killed dummy_1 on node1 , i expected the pacemaker to kill dummy_2 and dummy_3 on node1 and not to disturb the apps on node2. But with above colocation rule, it stopped the apps on node1 but it restarted dummy_2 and dummy_3 on node2. With a score of 0: it didn't stop dummy_2 and dummy_3 on node1. With a score of 500: it stopped only dummy_2 and restarted dummy_2 on node2. Thanks Eswar On Fri, Dec 6, 2013 at 12:20 PM, ESWAR RAO eswar7...@gmail.com mailto:eswar7...@gmail.com wrote: Thanks Vladislav. I will work on that. Thanks Eswar On Fri, Dec 6, 2013 at 11:05 AM, Vladislav Bogdanov bub...@hoster-ok.com mailto:bub...@hoster-ok.com wrote: 06.12.2013 07:58, ESWAR RAO wrote: Hi All, Can someone help me with below configuration?? I have a 3 node HB setup (node1, node2, node3) which runs HB+pacemaker. I have 3 apps dummy1, dummy2 , dummy3 which needs to be run on only 2 nodes among the 3 nodes. By using the below configuration, I was able to run 3 resources on 2 nodes. # crm configure primitive res_dummy1 lsb::dummy1 meta allow-migrate=false migration-threshold=3 failure-timeout=30s op monitor interval=5s # crm configure location app_prefer_node res_dummy1 -inf: node3 First that comes to mind is that you should put above line below the next one and refer app_clone instead of res_dummy1. # crm configure clone app_clone res_dummy1 meta clone-max=2 globally-unique=false I have a dependency order like dummy2 should start after dummy1 and dummy3 should start only after dummy2. For now I am keeping a sleep in the script and starting the resources by using crm. Is there any clean way to have the dependency on the resources so that ordering is maintained while clone is run on bot the nodes?? I have tried with below config but couldn't succeed. # crm configure order dum1-BEFORE-dum2 0: res_dummy1 res_dummy2 The same is here. 
# crm configure order dum2-BEFORE-dum3 0: res_dummy2 res_dummy3 So, your example should look like: # crm configure primitive res_dummy1 lsb::dummy1 meta allow-migrate=false migration-threshold=3 failure-timeout=30s op monitor interval=5s # crm configure clone app_clone res_dummy1 meta clone-max=2 globally-unique=false # crm configure location app_prefer_node app_clone -inf: node3 # crm configure order dum1-BEFORE-dum2 0: app_clone res_dummy2 # crm configure order dum2-BEFORE-dum3 0: res_dummy2 res_dummy3 Instead of a group I used order, so that even if 1 app gets restarted the others will not be affected. Yep, advisory ordering is fine for that. Also, is there any way so that if 1 app fails more than migration-threshold times, we can stop all 3 resources on that node? Maybe advisory colocations can do something similar (I'm not sure)? http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_advisory_placement.html You should find the correct value for its score (positive or negative) though. crm_simulate is your friend for that. Thanks Eswar
Re: [Pacemaker] Pacemaker 1.1.10 and pacemaker-remote
06.12.2013 11:41, Lars Marowsky-Bree wrote: On 2013-12-06T08:55:47, Vladislav Bogdanov bub...@hoster-ok.com wrote: BTW, the pacemaker cib accepts any meta attributes (and that is a very convenient way for me to store some 'meta' information), while crmsh limits them to a pre-defined list. While that is probably fine for novices, it limits some advanced usage scenarios; that's why I disabled that check with a patch in my builds. That's intentional. You're polluting a namespace that doesn't belong to you. Any update that introduces a new feature could potentially collide with your key/value pairs. Agree. Also, the CIB really isn't performant enough to store meta data like this. Generally I agree. In my use-case I store bits of information needed to recreate the higher-layer configuration in the case of catastrophe (and to verify that the CIB configuration is correct after a cluster restart). I read them in the same function as the other attributes, after the whole CIB is received, so the performance impact is nearly zero for me. If by advanced you mean abusing cluster features, yes, then I agree, those are scenarios we want to limit ;-) Yep, abuse is the correct word :) Fortunately I know how to use a text editor, hg diff, patch and rpmbuild ;) ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
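If the goal is just to keep a few bits of higher-layer data next to the cluster configuration without reusing the meta-attribute namespace, plain node attributes under an explicitly private name avoid the collision Lars describes. A hedged sketch; the attribute name and value are made up for illustration:

# Store site-specific data under a clearly non-Pacemaker name
crm_attribute --type nodes --node node1 --name x-mysite-generation --update 42

# Read it back later, e.g. when rebuilding the higher-layer configuration
crm_attribute --type nodes --node node1 --name x-mysite-generation --query

Attributes like this sit in the nodes section of the CIB and are ignored by the policy engine unless referenced from rules, so a future Pacemaker meta attribute cannot collide with them; the CIB-replication cost argument still applies if the stored data gets large.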
Re: [Pacemaker] Reg. clone and order attributes
06.12.2013 07:58, ESWAR RAO wrote: Hi All, Can someone help me with the below configuration? I have a 3-node HB setup (node1, node2, node3) which runs HB+pacemaker. I have 3 apps, dummy1, dummy2, dummy3, which need to run on only 2 nodes among the 3 nodes. By using the below configuration, I was able to run the 3 resources on 2 nodes.
# crm configure primitive res_dummy1 lsb::dummy1 meta allow-migrate=false migration-threshold=3 failure-timeout=30s op monitor interval=5s
# crm configure location app_prefer_node res_dummy1 -inf: node3
First that comes to mind is that you should put the above line below the next one and refer to app_clone instead of res_dummy1.
# crm configure clone app_clone res_dummy1 meta clone-max=2 globally-unique=false
I have a dependency order like dummy2 should start after dummy1 and dummy3 should start only after dummy2. For now I am keeping a sleep in the script and starting the resources by using crm. Is there any clean way to have the dependency on the resources so that ordering is maintained while the clone runs on both nodes? I have tried the below config but couldn't succeed.
# crm configure order dum1-BEFORE-dum2 0: res_dummy1 res_dummy2
The same is here.
# crm configure order dum2-BEFORE-dum3 0: res_dummy2 res_dummy3
So, your example should look like:
# crm configure primitive res_dummy1 lsb::dummy1 meta allow-migrate=false migration-threshold=3 failure-timeout=30s op monitor interval=5s
# crm configure clone app_clone res_dummy1 meta clone-max=2 globally-unique=false
# crm configure location app_prefer_node app_clone -inf: node3
# crm configure order dum1-BEFORE-dum2 0: app_clone res_dummy2
# crm configure order dum2-BEFORE-dum3 0: res_dummy2 res_dummy3
Instead of a group I used order, so that even if one app gets restarted the others will not be affected. Yep, advisory ordering is fine for that. Also, is there any way so that if one app fails more than migration-threshold times, we can stop all 3 resources on that node? Maybe advisory colocations can do something similar (I'm not sure)? http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_advisory_placement.html You should find the correct value for its score (positive or negative) though. crm_simulate is your friend for that. Thanks Eswar ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
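crm_simulate, mentioned above, lets you test what a given colocation or ordering score would do without touching the running resources. A sketch using the names from this thread; the injected operation string assumes the 5s monitor shown above (the interval is given in milliseconds) and rc=7 means "not running", so the exact spelling should be checked against the local crm_simulate version:

# Show current placement and allocation scores from the live CIB
crm_simulate --live-check --show-scores

# Inject a failed monitor for res_dummy1 on node1 and simulate the
# transition the policy engine would run with the current constraints
crm_simulate --live-check --op-inject=res_dummy1_monitor_5000@node1=7 --simulate --show-scores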
Re: [Pacemaker] Pacemaker 1.1.10 and pacemaker-remote
05.12.2013 19:57, James Oakley wrote: I have a Pacemaker 1.1.10 cluster running on openSUSE 13.1 and I am trying to get pacemaker-remote working so I can manage resources in LXC containers. I have pacemaker_remoted running in the containers. However, I can't seem to get crm configured to talk to the daemons. The only documentation I found is the pcs-centric doc here: http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#idm272823453136 I'm using crm, and not pcs, but it plainly states that all I need to do is set the remote-node meta to the hostname on the LXC resource. Unfortunately, crm does not let me add it. I tried forcing it using crm_resource, but it's not working. It shows up now:
primitive lxc_db0 @lxc \
 params container=db0 config=/var/lib/lxc/db0/config \
 meta remote-node=db0
Does 'db0' resolve to a correct IP address? If not, then you probably want to either fix that or use the remote-addr option as well. I saw that you can ping/ssh that container, but it is not clear whether you used the 'db0' name for that or the IP address.
But verify doesn't like it:
crm(live)configure# verify
error: setup_container: Resource lxc_db0: Unknown resource container (db0)
error: setup_container: Resource lxc_db1: Unknown resource container (db1)
Errors found during check: config not valid
ERROR: lxc_db0: attribute remote-node does not exist
ERROR: lxc_db1: attribute remote-node does not exist
You can use the check_mode=relaxed crmsh userpref to downgrade that error to a warning until it is fixed. BTW, the pacemaker cib accepts any meta attributes (and that is a very convenient way for me to store some 'meta' information), while crmsh limits them to a pre-defined list. While that is probably fine for novices, it limits some advanced usage scenarios; that's why I disabled that check with a patch in my builds. It would be better to look for meta attributes 'similar' to known ones, e.g. target_role vs. target-role, and warn about them only.
What am I missing? Is this somehow disabled on non-pcs clusters, and if so, why is the pacemaker-remote package even in the openSUSE repositories and pcs not? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
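For reference, a crmsh rendering of what the pcs-centric document describes might look like the following. The '@lxc' in the post appears to be a local template, so ocf:heartbeat:lxc and the address are only assumptions, and remote-addr is needed only when the remote-node name does not resolve to the container's IP:

# Container resource; the guest runs pacemaker_remoted inside
crm configure primitive lxc_db0 ocf:heartbeat:lxc \
    params container=db0 config=/var/lib/lxc/db0/config \
    meta remote-node=db0 remote-addr=192.168.122.10

# Once the guest connects, it can host resources like any other node
crm configure primitive dummy_on_db0 ocf:heartbeat:Dummy
crm configure location dummy_on_db0_loc dummy_on_db0 inf: db0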
Re: [Pacemaker] Where the heck is Beekhof?
28.11.2013 04:04, Andrew Beekhof wrote: If you find yourself asking $subject at some point in the next couple of months, the answer is that I'm taking leave to look after our new son (Lawson Tiberius Beekhof) who was born on Tuesday. I will be dropping in occasionally to see how things are travelling and attempt to get 1.1.11 finalised, but don't expect much from me until February :-) For pcs-specific issues it would be worth CC'ing Chris Feist. See you in 2014! Congratulations to you and your family! Your wife is lucky to have such a husband who is able to drop everything to look after babies :) ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node
12.11.2013 09:56, Vladislav Bogdanov wrote: ... Ah, then in_ccm will be set to false only when corosync (2) is stopped on a node, not when pacemaker is stopped? Thus, the current drbd agent/fencing logic does not (well) support just stopping pacemaker in my use-case; the messaging layer should be stopped as well. Maybe it should also look at the shutdown attribute... Just for thread completeness: with the patch below and the latest pacemaker tip from the beekhof repository, the drbd fence handler returns almost immediately and the drbd resource is promoted without delays on another node after shutdown of the pacemaker instance which had it promoted.
--- a/scripts/crm-fence-peer.sh 2013-09-27 10:47:52.0 +
+++ b/scripts/crm-fence-peer.sh 2013-11-12 13:45:52.274674803 +
@@ -500,6 +500,21 @@ guess_if_pacemaker_will_fence()
 	[[ $crmd = "banned" ]] && will_fence=true
 	if [[ ${expected-down} = "down" && $in_ccm = "false" && $crmd != "online" ]]; then
 		: "pacemaker considers this as clean down"
+	elif [[ "$crmd/$join/$expected" = "offline/down/down" ]] ; then
+		# Check if pacemaker is simply shut down, but membership/quorum is possibly still established (corosync2/cman)
+		# 1.1.11 will set expected="down" on a clean shutdown too
+		# Look for the "shutdown" transient node attribute
+		local node_attributes=$(set +x; echo "$cib_xml" | awk "/<node_state [^\n]*uname=\"$DRBD_PEER\"/,/<\/instance_attributes>/" | grep -F -e "<nvpair ")
+		if [ -n "${node_attributes}" ] ; then
+			local shut_down=$(set +x; echo "$node_attributes" | awk '/ name="shutdown"/ {if (match($0, /value=\"([[:digit:]]+)\"/, values)) {print values[1]} }')
+			if [ -n "${shut_down}" ] ; then
+				: "pacemaker considers this as clean down"
+			else
+				will_fence=true
+			fi
+		else
+			will_fence=true
+		fi
 	elif [[ $in_ccm = "false" ]] || [[ $crmd != "online" ]]; then
 		will_fence=true
 	fi
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
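The "shutdown" value the patch greps for is an ordinary transient node attribute that the crmd records in the status section when a node shuts down cleanly, so it can also be checked by hand while reproducing the problem. The node name is illustrative, and the attribute only exists while the shutdown is recorded:

# Query the transient (status-section) shutdown attribute of the peer;
# when set it holds the epoch timestamp of the clean shutdown request
crm_attribute --node node2 --name shutdown --lifetime reboot --query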
Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node
11.11.2013 09:00, Vladislav Bogdanov wrote: ... Looking at the crm-fence-peer.sh script, it would determine the peer state as offline immediately if the node state (all of)
* doesn't contain the expected tag or has it set to down
* has the in_ccm tag set to false
* has the crmd tag set to anything except online
On the other hand, crmd sets expected = down only after fencing is complete (probably the same for in_ccm?). Shouldn't it do the same (or maybe just remove that tag) when a clean shutdown is about to complete? That would make sense. Are you using the plugin, cman or corosync 2?
This one works in all tests I was able to imagine, but I'm not sure it is completely safe to set expected=down for the old DC (in the test when drbd is promoted on the DC and it reboots).
From ddfccc8a40cfece5c29d61f44a4467954d5c5da8 Mon Sep 17 00:00:00 2001
From: Vladislav Bogdanov bub...@hoster-ok.com
Date: Mon, 11 Nov 2013 14:32:48 +
Subject: [PATCH] Update node values in cib on clean shutdown
---
 crmd/callbacks.c  | 6 +-
 crmd/membership.c | 2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/crmd/callbacks.c b/crmd/callbacks.c
index 3dae17b..9cfb973 100644
--- a/crmd/callbacks.c
+++ b/crmd/callbacks.c
@@ -162,6 +162,8 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
         } else if (safe_str_eq(node->uname, fsa_our_dc) && crm_is_peer_active(node) == FALSE) {
             /* Did the DC leave us? */
             crm_notice("Our peer on the DC (%s) is dead", fsa_our_dc);
+            /* FIXME: is it safe? */
+            crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
             register_fsa_input(C_CRMD_STATUS_CALLBACK, I_ELECTION, NULL);
         }
         break;
@@ -169,6 +171,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
     if (AM_I_DC) {
         xmlNode *update = NULL;
+        int flags = node_update_peer;
         gboolean alive = crm_is_peer_active(node);
         crm_action_t *down = match_down_event(0, node->uuid, NULL, appeared);
@@ -199,6 +202,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
             crm_update_peer_join(__FUNCTION__, node, crm_join_none);
             crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
+            flags |= node_update_cluster | node_update_join | node_update_expected;
             check_join_state(fsa_state, __FUNCTION__);
             update_graph(transition_graph, down);
@@ -221,7 +225,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
             crm_trace("Other %p", down);
         }
-        update = do_update_node_cib(node, node_update_peer, NULL, __FUNCTION__);
+        update = do_update_node_cib(node, flags, NULL, __FUNCTION__);
         fsa_cib_anon_update(XML_CIB_TAG_STATUS, update, cib_scope_local | cib_quorum_override | cib_can_create);
         free_xml(update);
diff --git a/crmd/membership.c b/crmd/membership.c
index be1863a..d68b3aa 100644
--- a/crmd/membership.c
+++ b/crmd/membership.c
@@ -152,7 +152,7 @@ do_update_node_cib(crm_node_t * node, int flags, xmlNode * parent, const char *s
     crm_xml_add(node_state, XML_ATTR_UNAME, node->uname);
     if (flags & node_update_cluster) {
-        if (safe_str_eq(node->state, CRM_NODE_ACTIVE)) {
+        if (crm_is_peer_active(node)) {
             value = XML_BOOLEAN_YES;
         } else if (node->state) {
             value = XML_BOOLEAN_NO;
--
1.7.1
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
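The fields the fence handler tests (in_ccm, crmd, expected, plus join, which the patch also updates) all live in the node_state element of the CIB status section, so the effect of a clean shutdown can be verified directly; the node name is illustrative:

# Dump the peer's node_state element; compare in_ccm, crmd, join and
# expected before and after 'service pacemaker stop' on that peer
cibadmin --query --xpath "//node_state[@uname='node2']"

# A quicker, higher-level view of node status and attributes
crm_mon -1 --show-node-attributes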
Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node
12.11.2013 03:05, Andrew Beekhof wrote: On 12 Nov 2013, at 10:29 am, Andrew Beekhof and...@beekhof.net wrote: On 12 Nov 2013, at 2:46 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: 11.11.2013 09:00, Vladislav Bogdanov wrote: ... Looking at the crm-fence-peer.sh script, it would determine the peer state as offline immediately if the node state (all of)
* doesn't contain the expected tag or has it set to down
* has the in_ccm tag set to false
* has the crmd tag set to anything except online
On the other hand, crmd sets expected = down only after fencing is complete (probably the same for in_ccm?). Shouldn't it do the same (or maybe just remove that tag) when a clean shutdown is about to complete? That would make sense. Are you using the plugin, cman or corosync 2?
This one works in all tests I was able to imagine, but I'm not sure it is completely safe to set expected=down for the old DC (in the test when drbd is promoted on the DC and it reboots).
From ddfccc8a40cfece5c29d61f44a4467954d5c5da8 Mon Sep 17 00:00:00 2001
From: Vladislav Bogdanov bub...@hoster-ok.com
Date: Mon, 11 Nov 2013 14:32:48 +
Subject: [PATCH] Update node values in cib on clean shutdown
---
 crmd/callbacks.c  | 6 +-
 crmd/membership.c | 2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/crmd/callbacks.c b/crmd/callbacks.c
index 3dae17b..9cfb973 100644
--- a/crmd/callbacks.c
+++ b/crmd/callbacks.c
@@ -162,6 +162,8 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
         } else if (safe_str_eq(node->uname, fsa_our_dc) && crm_is_peer_active(node) == FALSE) {
             /* Did the DC leave us? */
             crm_notice("Our peer on the DC (%s) is dead", fsa_our_dc);
+            /* FIXME: is it safe? */
Not at all safe. It will prevent fencing.
I actually tried to kill corosync, and the node has been fenced. Killing crmd on the DC resulted in its restart and a resource reprobe, and no fencing. I thought that is probably normal.
+            crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
             register_fsa_input(C_CRMD_STATUS_CALLBACK, I_ELECTION, NULL);
         }
         break;
@@ -169,6 +171,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
     if (AM_I_DC) {
         xmlNode *update = NULL;
+        int flags = node_update_peer;
         gboolean alive = crm_is_peer_active(node);
         crm_action_t *down = match_down_event(0, node->uuid, NULL, appeared);
@@ -199,6 +202,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
             crm_update_peer_join(__FUNCTION__, node, crm_join_none);
             crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
+            flags |= node_update_cluster | node_update_join | node_update_expected;
This does look ok though.
With the exception of 'node_update_cluster'. That didn't change here and shouldn't be touched until it really does leave the membership.
Ah, then in_ccm will be set to false only when corosync (2) is stopped on a node, not when pacemaker is stopped? Thus, the current drbd agent/fencing logic does not (well) support just stopping pacemaker in my use-case; the messaging layer should be stopped as well. Maybe it should also look at the shutdown attribute... Would it be sane?
I'm also thinking about a workaround: 'service pacemaker stop' first puts the node into standby, saves a flag somewhere so that the node is put back online automatically right after the next start, and waits until the pengine goes to an idle state. After that it does the actual stop.
             check_join_state(fsa_state, __FUNCTION__);
             update_graph(transition_graph, down);
@@ -221,7 +225,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
             crm_trace("Other %p", down);
         }
-        update = do_update_node_cib(node, node_update_peer, NULL, __FUNCTION__);
+        update = do_update_node_cib(node, flags, NULL, __FUNCTION__);
         fsa_cib_anon_update(XML_CIB_TAG_STATUS, update, cib_scope_local | cib_quorum_override | cib_can_create);
         free_xml(update);
diff --git a/crmd/membership.c b/crmd/membership.c
index be1863a..d68b3aa 100644
--- a/crmd/membership.c
+++ b/crmd/membership.c
@@ -152,7 +152,7 @@ do_update_node_cib(crm_node_t * node, int flags, xmlNode * parent, const char *s
     crm_xml_add(node_state, XML_ATTR_UNAME, node->uname);
     if (flags & node_update_cluster) {
-        if (safe_str_eq(node->state, CRM_NODE_ACTIVE)) {
+        if (crm_is_peer_active(node)) {
This is also wrong. XML_NODE_IN_CLUSTER is purely a record of whether the node is in the current corosync/cman/heartbeat membership.
             value = XML_BOOLEAN_YES;
         } else if (node->state) {
             value = XML_BOOLEAN_NO;
--
1.7.1
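A rough shell sketch of the standby-before-stop workaround mentioned above. It is only an illustration: the flag path is made up, the node name is assumed to match uname -n, and polling crmadmin for S_IDLE on the DC is one simplistic way to wait for the transition to settle:

#!/bin/sh
# Hypothetical wrapper around "service pacemaker stop"
ME=$(uname -n)
FLAG=/var/run/standby-before-stop      # made-up flag location

# Put the node in standby so resources move off while the peer still
# sees it as cleanly online
crm_standby -N "$ME" -v on && touch "$FLAG"

# Wait until the DC reports an idle transition
DC=$(crmadmin -D | awk '{print $NF}')
while ! crmadmin -S "$DC" 2>/dev/null | grep -q S_IDLE; do
    sleep 2
done

service pacemaker stop
# An init hook on the next boot would check $FLAG and run:
#   crm_standby -N "$ME" -v off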
Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node
12.11.2013 03:15, Andrew Beekhof wrote: Can you try with these two patches please?
+ Andrew Beekhof (4 seconds ago) fec946a: Fix: crmd: When the DC gracefully shuts down, record the new expected state into the cib (HEAD, master)
+ Andrew Beekhof (10 seconds ago) 740122a: Fix: crmd: When a peer expectedly shuts down, record the new join and expected states into the cib
Confirmed, they do the trick. Everything is set as expected. As I wrote earlier, drbd is still stuck until corosync is stopped too. That is probably a drbd issue, not pacemaker. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
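When checking whether drbd is the remaining culprit, it can help to look for the temporary constraint that crm-fence-peer.sh inserts while it considers the peer unsafe; the handler shipped with drbd 8.4 uses constraint ids starting with drbd-fence-by-handler, and crm-unfence-peer.sh removes them again after resync:

# A leftover fencing constraint blocks promotion on the surviving node
crm configure show | grep -A1 drbd-fence-by-handler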
Re: [Pacemaker] DRBD promotion timeout after pacemaker stop on other node
11.11.2013 02:30, Andrew Beekhof wrote: On 5 Nov 2013, at 2:22 am, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, David, all, Just found an interesting fact, I don't know whether it is a bug or not. When doing service pacemaker stop on a node which has a drbd resource promoted, that resource is not promoted on another node, and the promote operation times out. This is related to the drbd fence integration with pacemaker and to the insufficient default (recommended) promote timeout for the drbd resource. crm-fence-peer.sh places its constraint into the cib one second after the promote operation times out (the promote op has a 90s timeout, and crm-fence-peer.sh uses that value as a timeout, and fully utilizes it if it cannot say for sure that the peer node is in a sane state - online or cleanly offline). It seems like increasing the promote op timeout helps, but I'd expect that to complete almost immediately, instead of waiting an extra 90 seconds for nothing. Looking at the crm-fence-peer.sh script, it would determine the peer state as offline immediately if the node state (all of)
* doesn't contain the expected tag or has it set to down
* has the in_ccm tag set to false
* has the crmd tag set to anything except online
On the other hand, crmd sets expected = down only after fencing is complete (probably the same for in_ccm?). Shouldn't it do the same (or maybe just remove that tag) when a clean shutdown is about to complete? That would make sense. Are you using the plugin, cman or corosync 2? corosync2. Or maybe it is possible to provide some different hint to crm-fence-peer.sh? Another option (actually a hack) would be to delay shutdown between resource stop and process stop (so the drbd handler on the other node determines the peer is still online, and places the constraint immediately), but that is very fragile. pacemaker is a one-week-old merge of the clusterlabs and beekhof masters, drbd is 8.4.4. All runs on corosync2 (2.3.1) with libqb 0.16 on CentOS6. Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
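The stop-gap mentioned above (raising the promote timeout so crm-fence-peer.sh has headroom to decide) looks roughly like this in crmsh; the resource names and the 180s value are illustrative, not taken from the poster's configuration:

# DRBD master/slave resource with a promote timeout larger than the
# ~90 seconds the fence handler may spend deciding the peer's state
crm configure primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource=r0 \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave \
    op promote timeout=180s interval=0
crm configure ms ms_drbd_r0 p_drbd_r0 \
    meta master-max=1 clone-max=2 notify=true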