Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Dmitry Koterov <dmitry.kote...@gmail.com> writes:

> Oh, it seems I've found the solution! At least two mistakes were in my
> corosync.conf (BTW, the logs did not report any errors, so my conclusion
> is based on my experiments only).
>
> 1. nodelist.node MUST contain only IP addresses. No hostnames! They
>    simply do not work: "crm status" shows no nodes, and there are no
>    warnings in the logs about this.

You can add a name like this:

    nodelist {
        node {
            ring0_addr: public-ip-address-of-the-first-machine
            name: node1
        }
        node {
            ring0_addr: public-ip-address-of-the-second-machine
            name: node2
        }
    }

I used it on Ubuntu Trusty with udpu.

Regards.

--
Daniel Dehennin
Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
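[Editor's note: for reference, a minimal corosync.conf along the lines discussed in this thread might look as follows. The addresses are documentation placeholders (203.0.113.x), not values from the thread, and the quorum section is an assumption for a two-node setup.]

```conf
totem {
    version: 2
    # Unicast UDP: avoids multicast, which is usually unavailable
    # between public IP addresses.
    transport: udpu
}

nodelist {
    node {
        # Placeholder: public IP of the first machine
        ring0_addr: 203.0.113.10
        name: node1
    }
    node {
        # Placeholder: public IP of the second machine
        ring0_addr: 203.0.113.20
        name: node2
    }
}

quorum {
    provider: corosync_votequorum
    # Assumption: only needed for a two-node cluster
    two_node: 1
}
```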
Re: [Pacemaker] Avoid monitoring of resources on nodes
Andrew Beekhof <and...@beekhof.net> writes:

> What version of pacemaker is this? Some very old versions wanted the
> agent to be installed on all nodes.

It's 1.1.10+git20130802-1ubuntu2.1 on Trusty Tahr.

Regards.
Re: [Pacemaker] Pacemaker fencing and DLM/cLVM
Andrew Beekhof <and...@beekhof.net> writes:

> This was fixed a few months ago:
>
>   + David Vossel (9 months ago) 054fedf: Fix: stonith_api_time_helper now returns when the most recent fencing operation completed (origin/pr/444)
>   + Andrew Beekhof (9 months ago) d9921e5: Fix: Fencing: Pass the correct options when looking up the history by node name
>   + Andrew Beekhof (9 months ago) b0a8876: Log: Fencing: Send details of stonith_api_time() and stonith_api_kick() to syslog

It doesn't seem that Ubuntu has these patches. Thanks, I just opened a bug
report[1].

Footnotes:
[1] https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1397278
Re: [Pacemaker] Avoid monitoring of resources on nodes
Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:

> I'll try to find how to make the change directly in XML.

Ok, looking at the git history, this feature seems to be available only on
the master branch and not yet released; I do not have it in my pacemaker
version.

Does it sound normal that, with:

- an asymmetrical opt-in cluster[1]
- a group of resources with an INFINITY location on a specific node

the excluded nodes are fenced because of many monitor errors for this
resource?

Regards.

Footnotes:
[1] http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Pacemaker_Explained/_asymmetrical_opt_in_clusters.html
[Pacemaker] Avoid monitoring of resources on nodes
Hello,

I have a 4-node cluster and some resources are only installed on 2 of
them. I set the cluster asymmetric and an INFINITY location:

    primitive Mysqld upstart:mysql \
        op monitor interval=60
    primitive OpenNebula-Sunstone-Sysv lsb:opennebula-sunstone \
        op monitor interval=60
    primitive OpenNebula-Sysv lsb:opennebula \
        op monitor interval=60
    group OpenNebula Mysqld OpenNebula-Sysv OpenNebula-Sunstone-Sysv \
        meta target-role=Started
    location OpenNebula-runs-on-Frontend OpenNebula inf: one-frontend
    property $id=cib-bootstrap-options \
        dc-version=1.1.10-42f2063 \
        cluster-infrastructure=corosync \
        symmetric-cluster=false \
        stonith-enabled=true \
        stonith-timeout=30 \
        last-lrm-refresh=1416817941 \
        no-quorum-policy=stop \
        stop-all-resources=off

But I have a lot of failing monitor operations for these resources on the
other nodes, because they are not installed there. Is there a way to
completely exclude the resources from nodes, even for monitoring?

Regards.

Ubuntu Trusty Tahr (amd64):
- corosync 2.3.3-1ubuntu1
- pacemaker 1.1.10+git20130802-1ubuntu2.1
Re: [Pacemaker] Avoid monitoring of resources on nodes
Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:

> Hello,
>
> I have a 4 nodes cluster and some resources are only installed on 2 of
> them. I set cluster asymmetry and infinity location:
>
> [...]
>
> But I have a lot of failing monitoring on other nodes of these resources
> because they are not installed on them. Is there a way to completely
> exclude the resources from nodes, even the monitoring?

This causes trouble on my setup: as the resources fail, my nodes are all
fenced.

Any hints?

Regards.
Re: [Pacemaker] Pacemaker fencing and DLM/cLVM
Christine Caulfield <ccaul...@redhat.com> writes:

> It seems to me that fencing is failing for some reason, though I can't
> tell from the logs exactly why, so you might have to investigate your
> IPMI setup to see just what is happening (I'm no IPMI expert, sorry).

Thanks for looking, but IPMI stonith actually works; for all nodes I tested:

    stonith_admin --reboot <node>

and it works. The log files tell me this though:

    Nov 25 10:56:32 nebula3 dlm_controld[6465]: 1035 fence request 1084811079 pid 7358 nodedown time 1416909392 fence_all dlm_stonith
    Nov 25 10:56:32 nebula3 dlm_controld[6465]: 1035 fence result 1084811079 pid 7358 result 1 exit status
    Nov 25 10:56:32 nebula3 dlm_controld[6465]: 1035 fence status 1084811079 receive 1 from 1084811080 walltime 1416909392 local 1035
    Nov 25 10:56:32 nebula3 dlm_controld[6465]: 1035 fence request 1084811079 no actor

showing a status code of 1 from dlm_stonith, while the result should be 0
if fencing completed successfully. But 1084811080 is nebula3, and in its
logs I see:

    Nov 25 10:56:33 nebula3 stonith-ng[6232]: notice: can_fence_host_with_device: Stonith-nebula2-IPMILAN can fence nebula2: static-list
    [...]
    Nov 25 10:56:34 nebula3 stonith-ng[6232]: notice: log_operation: Operation 'reboot' [7359] (call 4 from crmd.5038) for host 'nebula2' with device 'Stonith-nebula2-IPMILAN' returned: 0 (OK)
    Nov 25 10:56:34 nebula3 stonith-ng[6232]:  error: crm_abort: crm_glib_handler: Forked child 7376 to record non-fatal assert at logging.c:63 : Source ID 20 was not found when attempting to remove it
    Nov 25 10:56:34 nebula3 stonith-ng[6232]:  error: crm_abort: crm_glib_handler: Forked child 7377 to record non-fatal assert at logging.c:63 : Source ID 21 was not found when attempting to remove it
    Nov 25 10:56:34 nebula3 stonith-ng[6232]: notice: remote_op_done: Operation reboot of nebula2 by nebula1 for crmd.5038@nebula1.34bed18c: OK
    Nov 25 10:56:34 nebula3 crmd[6236]: notice: tengine_stonith_notify: Peer nebula2 was terminated (reboot) by nebula1 for nebula1: OK (ref=34bed18c-c395-4de2-b323-e00208cac6c7) by client crmd.5038
    Nov 25 10:56:34 nebula3 crmd[6236]: notice: crm_update_peer_state: tengine_stonith_notify: Node nebula2[0] - state is now lost (was (null))

This tells me that stonith-ng managed to fence the node and notified its
success. How could the “returned: 0 (OK)” become “receive 1”? A logic
issue somewhere between stonith-ng and dlm_controld?

Thanks.
Re: [Pacemaker] Avoid monitoring of resources on nodes
David Vossel <dvos...@redhat.com> writes:

> Actually, this is possible now. I am unaware of any configuration tools
> (pcs or crmsh) that support this feature yet, though; you might have to
> edit the CIB XML manually.
>
> There's a new 'resource-discovery' option you can set on a location
> constraint that helps prevent resources from ever being started or
> monitored on a node.
>
> Example: never start or monitor the resource FAKE1 on 18node2.
>
>     <rsc_location id="location-FAKE1-18node2" node="18node2"
>                   resource-discovery="never" rsc="FAKE1" score="-INFINITY"/>
>
> There are more examples in this regression test:
> https://github.com/ClusterLabs/pacemaker/blob/master/pengine/test10/resource-discovery.xml#L99

Thanks a lot. I'll try to find how to make the change directly in XML.

Regards.
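[Editor's note: one way to inject such a constraint without a configuration tool is cibadmin. This is a sketch only; the id/node/rsc values simply reuse the FAKE1/18node2 names from the example above and must be adapted to the actual cluster.]

```shell
# Sketch: load the resource-discovery constraint straight into the CIB.
# Values are taken from the FAKE1/18node2 example; adapt to your cluster.
cibadmin --create -o constraints --xml-text \
  '<rsc_location id="location-FAKE1-18node2" node="18node2"
                 resource-discovery="never" rsc="FAKE1" score="-INFINITY"/>'
```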
Re: [Pacemaker] Avoid monitoring of resources on nodes
Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:

>> There's a new 'resource-discovery' option you can set on a location
>> constraint that helps prevent resources from ever being started or
>> monitored on a node.
>>
>> Example: never start or monitor the resource FAKE1 on 18node2.
>>
>>     <rsc_location id="location-FAKE1-18node2" node="18node2"
>>                   resource-discovery="never" rsc="FAKE1" score="-INFINITY"/>
>>
>> There are more examples in this regression test:
>> https://github.com/ClusterLabs/pacemaker/blob/master/pengine/test10/resource-discovery.xml#L99
>
> Thanks a lot. I'll try to find how to make the change directly in XML.

Ok, looking at the git history, this feature seems to be available only on
the master branch and not yet released.

Thanks.
[Pacemaker] Pacemaker fencing and DLM/cLVM
    ]: notice: tengine_stonith_notify: Peer nebula1 was terminated (reboot) by nebula2 for nebula3: OK (ref=50c93bed-e66f-48a5-bd2f-100a9e7ca7a1) by client crmd.6043
    Nov 24 09:51:13 nebula3 crmd[6043]: notice: te_rsc_command: Initiating action 22: start Stonith-nebula3-IPMILAN_start_0 on nebula2
    Nov 24 09:51:14 nebula3 crmd[6043]: notice: run_graph: Transition 5 (Complete=11, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-warn-2.bz2): Stopped
    Nov 24 09:51:14 nebula3 pengine[6042]: notice: process_pe_message: Calculated Transition 6: /var/lib/pacemaker/pengine/pe-input-2.bz2
    Nov 24 09:51:14 nebula3 crmd[6043]: notice: te_rsc_command: Initiating action 21: monitor Stonith-nebula3-IPMILAN_monitor_180 on nebula2
    Nov 24 09:51:15 nebula3 crmd[6043]: notice: run_graph: Transition 6 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-2.bz2): Complete
    Nov 24 09:51:15 nebula3 crmd[6043]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
    Nov 24 09:52:10 nebula3 dlm_controld[6263]: 566 datastores wait for fencing
    Nov 24 09:52:10 nebula3 dlm_controld[6263]: 566 clvmd wait for fencing
    Nov 24 09:55:10 nebula3 dlm_controld[6263]: 747 fence status 1084811078 receive -125 from 1084811079 walltime 1416819310 local 747

When the node is fenced I get “clvmd wait for fencing” and “datastores wait
for fencing” (datastores is my GFS2 volume).

Any idea of something I can check when this happens?

Regards.
Re: [Pacemaker] Pacemaker fencing and DLM/cLVM
Michael Schwartzkopff <m...@sys4.de> writes:

> Yes. You have to tell all the underlying infrastructure to use the
> fencing of pacemaker. I assume that you are working on a RH clone. See:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/ch08s02s03.html

Sorry, this is my fault. I'm using Ubuntu 14.04:

- corosync 2.3.3-1ubuntu1
- pacemaker 1.1.10+git20130802-1ubuntu2.1

I thought everything was integrated in such a configuration.

Regards.
Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up
Christine Caulfield <ccaul...@redhat.com> writes:

> [...] If it's only happening at startup it could be the switch/router
> learning the ports for the nodes and building its routing tables.
> Switching to udpu will then get rid of the message if it's annoying.

Switching to udpu makes it work correctly.

Thanks.
[Pacemaker] TOTEM Retransmit list in logs when a node gets up
Hello,

My cluster seems to work correctly, but when I start corosync and
pacemaker on one of the nodes[1] I start to see some TOTEM logs like this:

#+begin_src
Nov 13 14:00:10 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 46 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]: [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
#+end_src

I do not understand what happens, do you have any hints?

Regards.

Footnotes:
[1] the VM using two cards
    http://oss.clusterlabs.org/pipermail/pacemaker/2014-November/022962.html
Re: [Pacemaker] Fencing dependency between bare metal host and its VMs guest
Andrei Borzenkov <arvidj...@gmail.com> writes:

> [...]
>
>> Now I have one issue: when the bare metal host on which the VM is
>> running dies, the VM is lost and cannot be fenced. Is there a way to
>> make pacemaker ACK the fencing of the VM running on a host when the
>> host itself is fenced?
>
> Yes, you can define multiple stonith agents and priority between them.
> http://clusterlabs.org/wiki/Fencing_topology

Hello,

If I understand correctly, fencing topology is the way to have several
fencing devices for a node and to try them consecutively until one works.

In my configuration, I group each VM stonith agent with the corresponding
VM resource, to make them move together[1].

Here is my use case:

1. Resource ONE-Frontend-Group runs on nebula1
2. nebula1 is fenced
3. node one-frontend can not be fenced

Is there a way to say that the life of node one-frontend is tied to the
state of resource ONE-Frontend? In that case, when node nebula1 is fenced,
pacemaker would know that resource ONE-Frontend is not running any more,
so node one-frontend is OFFLINE and not UNCLEAN.

Regards.
Footnotes:
[1] http://oss.clusterlabs.org/pipermail/pacemaker/2014-October/022671.html

Attached configuration:

    node $id=1084811078 nebula1
    node $id=1084811079 nebula2
    node $id=1084811080 nebula3
    node $id=108488 quorum \
        attributes standby=on
    node $id=108489 one-frontend
    primitive ONE-Datastores ocf:heartbeat:Filesystem \
        params device=/dev/one-fs/datastores directory=/var/lib/one/datastores fstype=gfs2 \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=20 timeout=40
    primitive ONE-Frontend ocf:heartbeat:VirtualDomain \
        params config=/var/lib/one/datastores/one/one.xml \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        utilization cpu=1 hv_memory=1024
    primitive ONE-vg ocf:heartbeat:LVM \
        params volgrpname=one-fs \
        op start interval=0 timeout=30 \
        op stop interval=0 timeout=30 \
        op monitor interval=60 timeout=30
    primitive Quorum-Node ocf:heartbeat:VirtualDomain \
        params config=/var/lib/libvirt/qemu/pcmk/quorum.xml \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        utilization cpu=1 hv_memory=1024
    primitive Stonith-ONE-Frontend stonith:external/libvirt \
        params hostlist=one-frontend hypervisor_uri=qemu:///system pcmk_host_list=one-frontend pcmk_host_check=static-list \
        op monitor interval=30m
    primitive Stonith-Quorum-Node stonith:external/libvirt \
        params hostlist=quorum hypervisor_uri=qemu:///system pcmk_host_list=quorum pcmk_host_check=static-list \
        op monitor interval=30m
    primitive Stonith-nebula1-IPMILAN stonith:external/ipmi \
        params hostname=nebula1-ipmi ipaddr=XXX.XXX.XXX.XXX interface=lanplus userid=USER passwd=PASSWORD1 passwd_method=env priv=operator pcmk_host_list=nebula1 pcmk_host_check=static-list \
        op monitor interval=30m \
        meta target-role=Started
    primitive Stonith-nebula2-IPMILAN stonith:external/ipmi \
        params hostname=nebula2-ipmi ipaddr=YYY.YYY.YYY.YYY interface=lanplus userid=USER passwd=PASSWORD2 passwd_method=env priv=operator pcmk_host_list=nebula2 pcmk_host_check=static-list \
        op monitor interval=30m \
        meta target-role=Started
    primitive Stonith-nebula3-IPMILAN stonith:external/ipmi \
        params hostname=nebula3-ipmi ipaddr=ZZZ.ZZZ.ZZZ.ZZZ interface=lanplus userid=USER passwd=PASSWORD3 passwd_method=env priv=operator pcmk_host_list=nebula3 pcmk_host_check=static-list \
        op monitor interval=30m \
        meta target-role=Started
    primitive clvm ocf:lvm2:clvmd \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=60 timeout=90
    primitive dlm ocf:pacemaker:controld \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=60 timeout=60
    group ONE-Frontend-Group Stonith-ONE-Frontend ONE-Frontend \
        meta target-role=Started
    group ONE-Storage dlm clvm ONE-vg ONE-Datastores
    group Quorum-Node-Group Stonith-Quorum-Node Quorum-Node \
        meta target-role=Started
    clone ONE-Storage-Clone ONE-Storage \
        meta interleave=true target-role=Started
    location Nebula1-does-not-fence-itslef Stonith-nebula1-IPMILAN \
        rule $id=Nebula1-does-not-fence-itslef-rule 50: #uname eq nebula2 \
        rule $id=Nebula1-does-not-fence-itslef-rule-0 40: #uname eq nebula3
    location Nebula2-does-not-fence-itslef Stonith-nebula2-IPMILAN \
        rule $id=Nebula2-does-not-fence-itslef-rule 50: #uname eq nebula3 \
        rule $id=Nebula2-does-not-fence-itslef-rule-0 40: #uname eq nebula1
    location Nebula3-does
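[Editor's note: as a concrete illustration of the fencing-topology suggestion above, the crm shell accepts per-node fencing levels. The sketch below reuses the device names from the attached configuration; whether a single level per node is appropriate here is an assumption, not something stated in the thread.]

```conf
# Sketch (crm shell syntax): per-node fencing device levels.
# The VM is fenced through its libvirt device; the bare metal nodes
# through their IPMI devices. Device names taken from the attached
# configuration.
fencing_topology \
    one-frontend: Stonith-ONE-Frontend \
    nebula1: Stonith-nebula1-IPMILAN \
    nebula2: Stonith-nebula2-IPMILAN \
    nebula3: Stonith-nebula3-IPMILAN
```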
[Pacemaker] Losing corosync communication clusterwide
Hello,

I just had an issue on my pacemaker setup: my dlm/clvm/gfs2 stack was
blocked. The “dlm_tool ls” command told me “wait ringid”. The corosync-*
commands hang (like corosync-quorumtool), while the pacemaker “crm_mon”
displays nothing wrong.

I'm using Ubuntu Trusty Tahr:

- corosync 2.3.3-1ubuntu1
- pacemaker 1.1.10+git20130802-1ubuntu2.1

My cluster was manually rebooted. Any idea how to debug such a situation?

Regards.
Re: [Pacemaker] Losing corosync communication clusterwide
emmanuel segura <emi2f...@gmail.com> writes:

> I think you don't have fencing configured in your cluster.

I have fencing configured and working, modulo fencing VMs on a dead
host[1].

Regards.

Footnotes:
[1] http://oss.clusterlabs.org/pipermail/pacemaker/2014-November/022965.html
Re: [Pacemaker] Losing corosync communication clusterwide
Tomasz Kontusz <tomasz.kont...@gmail.com> writes:

> Hanging corosync sounds like libqb problems: trusty comes with 0.16,
> which likes to hang from time to time. Try building libqb 0.17.

Thanks, I'll look at this.

Is there a way to get back to a normal state without rebooting all
machines and interrupting services? I thought about a lightweight version
of something like:

1. stop pacemaker on all nodes without doing anything to the resources,
   so they all continue to run
2. stop corosync on all nodes
3. start corosync on all nodes
4. start pacemaker on all nodes; as the services are running, nothing
   needs to be done

I looked in the documentation but failed to find any kind of cluster
management best practices.

Regards.
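[Editor's note: the sequence sketched in the steps above roughly corresponds to pacemaker's maintenance-mode. This is an untested sketch, assuming an init-script based Trusty install and the crm shell; verify on a non-production cluster first.]

```shell
# 1. Tell pacemaker to stop managing resources (they keep running):
crm configure property maintenance-mode=true

# 2. Stop the cluster stack on every node:
service pacemaker stop
service corosync stop

# 3. Start it again on every node:
service corosync start
service pacemaker start

# 4. Once the cluster has re-probed resource state, resume management:
crm configure property maintenance-mode=false
```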
[Pacemaker] [SOLVED] Re: Multicast corosync packets and default route
Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:

> Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:
>
>> Hello,
>
> [...]
>
>> I only manage to have my VM as a corosync member like the others when
>> the default route is on the same interface as my multicast traffic.
>
> [...]

Using tcpdump, I found the difference between the single-card VM and the
multi-card VM. When using multiple cards, I need to force the IGMP
version, since my physical switches do not support IGMPv3. It looks like
the kernel uses IGMPv3 to register any local IP addresses to the
multicast group.

Single-card VM:

    No.  Time      Source           Destination  Protocol  Info
    2    0.000985  192.168.231.110  226.94.1.1   IGMPv2    Membership Report group 226.94.1.1

    Frame 2: 46 bytes on wire (368 bits), 46 bytes captured (368 bits)
    Ethernet II, Src: RealtekU_03:6d:2d (52:54:00:03:6d:2d), Dst: IPv4mcast_5e:01:01 (01:00:5e:5e:01:01)
    Internet Protocol Version 4, Src: 192.168.231.110 (192.168.231.110), Dst: 226.94.1.1 (226.94.1.1)
    Internet Group Management Protocol
        [IGMP Version: 2]
        Type: Membership Report (0x16)
        Max Resp Time: 0,0 sec (0x00)
        Header checksum: 0x06a0 [correct]
        Multicast Address: 226.94.1.1 (226.94.1.1)

Multi-card VM:

    No.  Time      Source           Destination  Protocol  Info
    2    0.004419  192.168.231.111  224.0.0.22   IGMPv3    Membership Report / Join group 226.94.1.1 for any sources

    Frame 2: 54 bytes on wire (432 bits), 54 bytes captured (432 bits)
    Ethernet II, Src: RealtekU_dc:b6:92 (52:54:00:dc:b6:92), Dst: IPv4mcast_16 (01:00:5e:00:00:16)
    Internet Protocol Version 4, Src: 192.168.231.111 (192.168.231.111), Dst: 224.0.0.22 (224.0.0.22)
    Internet Group Management Protocol
        [IGMP Version: 3]
        Type: Membership Report (0x22)
        Header checksum: 0xf69e [correct]
        Num Group Records: 1
        Group Record : 226.94.1.1  Change To Exclude Mode
            Record Type: Change To Exclude Mode (4)
            Aux Data Len: 0
            Num Src: 0
            Multicast Address: 226.94.1.1 (226.94.1.1)

So I force the IGMP version for all interfaces with the following:

    sysctl -w net.ipv4.conf.all.force_igmp_version=2

Now my dual-card VM is part of the ring:

    root@nebula3:~# corosync-quorumtool
    Quorum information
    ------------------
    Date:             Fri Nov  7 16:32:34 2014
    Quorum provider:  corosync_votequorum
    Nodes:            5
    Node ID:          1084811080
    Ring ID:          20624
    Quorate:          Yes

    Votequorum information
    ----------------------
    Expected votes:   5
    Highest expected: 5
    Total votes:      5
    Quorum:           3
    Flags:            Quorate WaitForAll LastManStanding

    Membership information
    ----------------------
        Nodeid      Votes Name
    1084811078          1 nebula1.eole.lan
    1084811079          1 nebula2.eole.lan
    1084811080          1 nebula3.eole.lan (local)
        108488          1 quorum.eole.lan
        108489          1 one-frontend.eole.lan
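[Editor's note: `sysctl -w` does not survive a reboot. One way to persist the override is a sysctl.d fragment; the file name below is a hypothetical choice, not from the thread.]

```shell
# Persist the IGMPv2 override (hypothetical file name):
echo "net.ipv4.conf.all.force_igmp_version = 2" \
    > /etc/sysctl.d/60-force-igmpv2.conf

# Apply it without rebooting:
sysctl -p /etc/sysctl.d/60-force-igmpv2.conf
```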
[Pacemaker] Fencing dependency between bare metal host and its VMs guest
Hello,

I finally managed to integrate my VM into corosync, and my dlm/clvm/GFS2
stack is running on it.

Now I have one issue: when the bare metal host on which the VM is running
dies, the VM is lost and cannot be fenced. Is there a way to make
pacemaker ACK the fencing of the VMs running on a host when the host
itself is fenced?

Regards.
[Pacemaker] Multicast corosync packets and default route
Hello,

I'm trying to set up pacemaker/corosync on Ubuntu Trusty to access a SAN
for use with OpenNebula[1]:

- pacemaker 1.1.10+git20130802-1ubuntu2.1
- corosync 2.3.3-1ubuntu1

I have a dedicated VLAN for cluster communications. Each bare metal node
has a dedicated interface eth0 on that VLAN; 3 other interfaces are used
as a bond0 integrated into an Open vSwitch as a VLAN trunk.

One VM has two interfaces on this Open vSwitch:

- one for cluster communication
- one to provide services, with the default route on it

My 3 bare metal nodes are OK, with pacemaker up and running
dlm/cLVM/GFS2, but my VM is always isolated. I set up a dedicated quorum
(standby=on) VM with a single interface plugged into the cluster
communication VLAN, and it works (corosync/pacemaker).

I ran ssmping to debug multicast communication and found that the VM can
only make unicast pings to the bare metal nodes. I finished by adding a
route for multicast:

    ip route add 224.0.0.0/4 dev eth1 src 192.168.1.111

But it does not work. I only manage to have my VM as a corosync member
like the others when the default route is on the same interface as my
multicast traffic.

I'm sure there is something I do not understand about corosync and
multicast communication, do you have any hints?

Regards.

Footnotes:
[1] http://opennebula.org/
Re: [Pacemaker] Fencing of movable VirtualDomains
Andrew Beekhof <and...@beekhof.net> writes:

> [...]
>
> Is the ipaddr for each device really the same? If so, why not use a
> single 'resource'?

No, sorry, the IP addr was not the same.

> Also, 1.1.7 wasn't as smart as 1.1.12 when it came to deciding which
> fencing device to use. Likely you'll get the behaviour you want with a
> version upgrade.

I'll do that this week.

Regards.
Re: [Pacemaker] Fencing of movable VirtualDomains
Andrew Beekhof and...@beekhof.net writes:

>> Maybe not, the collocation should be sufficient, but even without the
>> orders, fencing of unclean VMs is attempted with the other Stonith devices.
>
> Which other devices? The config you sent through didn't have any others.

Sorry, I sent it to the linux-cluster mailing list but not here; I attach it.

>> I'll switch to newer corosync/pacemaker and use pacemaker_remote if I
>> can manage dlm/cLVM/OCFS2 with it.
>
> No can do. All three services require corosync on the node.

OK, so the remote is useless in my case, but upgrading seems required[1] since the wheezy software stack looks too old.

Thanks.

Footnotes:
[1] http://article.gmane.org/gmane.linux.redhat.cluster/22963

--
Daniel Dehennin

[attached configuration]

node nebula1
node nebula2
node nebula3
node one
node quorum \
        attributes standby=on
primitive ONE-Frontend ocf:heartbeat:VirtualDomain \
        params config=/var/lib/one/datastores/one/one.xml \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        meta target-role=Stopped
primitive ONE-OCFS2-datastores ocf:heartbeat:Filesystem \
        params device=/dev/one-fs/datastores directory=/var/lib/one/datastores fstype=ocfs2 \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=20 timeout=40
primitive ONE-vg ocf:heartbeat:LVM \
        params volgrpname=one-fs \
        op start interval=0 timeout=30 \
        op stop interval=0 timeout=30 \
        op monitor interval=60 timeout=30
primitive Quorum-Node ocf:heartbeat:VirtualDomain \
        params config=/var/lib/libvirt/qemu/pcmk/quorum.xml \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        meta target-role=Started
primitive Stonith-ONE-Frontend stonith:external/libvirt \
        params hostlist=one hypervisor_uri=qemu:///system pcmk_host_list=one pcmk_host_check=static-list \
        op monitor interval=30m \
        meta target-role=Started
primitive Stonith-Quorum-Node stonith:external/libvirt \
        params hostlist=quorum hypervisor_uri=qemu:///system pcmk_host_list=quorum pcmk_host_check=static-list \
        op monitor interval=30m \
        meta target-role=Started
primitive Stonith-nebula1-IPMILAN stonith:external/ipmi \
        params hostname=nebula1-ipmi ipaddr=A.B.C.D interface=lanplus userid=user passwd=X passwd_method=env priv=operator pcmk_host_list=nebula1 pcmk_host_check=static-list priority=10 \
        op monitor interval=30m \
        meta target-role=Started
primitive Stonith-nebula2-IPMILAN stonith:external/ipmi \
        params hostname=nebula2-ipmi ipaddr=A.B.C.D interface=lanplus userid=user passwd=X passwd_method=env priv=operator pcmk_host_list=nebula2 pcmk_host_check=static-list priority=20 \
        op monitor interval=30m \
        meta target-role=Started
primitive Stonith-nebula3-IPMILAN stonith:external/ipmi \
        params hostname=nebula3-ipmi ipaddr=A.B.C.D interface=lanplus userid=user passwd=X passwd_method=env priv=operator pcmk_host_list=nebula3 pcmk_host_check=static-list priority=30 \
        op monitor interval=30m \
        meta target-role=Started
primitive clvm ocf:lvm2:clvm \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=90 \
        op monitor interval=60 timeout=90
primitive dlm ocf:pacemaker:controld \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=60 timeout=60
primitive o2cb ocf:pacemaker:o2cb \
        params stack=pcmk daemon_timeout=30 \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=60 timeout=60
group ONE-Storage dlm o2cb clvm ONE-vg ONE-OCFS2-datastores
clone ONE-Storage-Clone ONE-Storage \
        meta interleave=true target-role=Started
location Nebula1-does-not-fence-itslef Stonith-nebula1-IPMILAN \
        rule $id=Nebula1-does-not-fence-itslef-rule inf: #uname ne nebula1
location Nebula2-does-not-fence-itslef Stonith-nebula2-IPMILAN \
        rule $id=Nebula2-does-not-fence-itslef-rule inf: #uname ne nebula2
location Nebula3-does-not-fence-itslef Stonith-nebula3-IPMILAN \
        rule $id=Nebula3-does-not-fence-itslef-rule inf: #uname ne nebula3
location Nodes-with-ONE-Storage ONE-Storage-Clone \
        rule $id=Nodes-with-ONE-Storage-rule inf: #uname eq nebula1 or #uname eq nebula2 or #uname eq nebula3 or #uname eq one
location ONE-Fontend-fenced-by-hypervisor Stonith-ONE-Frontend \
        rule $id=ONE-Fontend-fenced-by-hypervisor-rule inf: #uname ne quorum or #uname ne one
location ONE-Frontend-run-on-hypervisor ONE-Frontend \
        rule $id=ONE-Frontend-run-on-hypervisor-rule 40: #uname eq nebula1 \
        rule $id=ONE-Frontend-run-on-hypervisor-rule-0 30: #uname eq nebula2 \
        rule $id=ONE-Frontend-run
Re: [Pacemaker] Fencing of movable VirtualDomains
Andrew Beekhof and...@beekhof.net writes:

>> It may be due to two “order”:
>>
>> #+begin_src
>> order ONE-Frontend-after-its-Stonith inf: Stonith-ONE-Frontend ONE-Frontend
>> order Quorum-Node-after-its-Stonith inf: Stonith-Quorum-Node Quorum-Node
>> #+end_src
>
> Probably. Any particular reason for them to exist?

Maybe not, the collocation should be sufficient, but even without the orders, fencing of unclean VMs is attempted with the other Stonith devices.

I'll switch to newer corosync/pacemaker and use pacemaker_remote if I can manage dlm/cLVM/OCFS2 with it.

Regards.

--
Daniel Dehennin
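Concretely, "the collocation should be sufficient" means dropping the two orders and keeping only the colocation constraints already present in the posted configuration, which tie each VM to the node where its stonith device runs without also serializing stonith start before VM start (crm syntax, constraint names taken from the thread):

#+begin_src
colocation Fence-ONE-Frontend-on-its-hypervisor inf: ONE-Frontend Stonith-ONE-Frontend
colocation Fence-Quorum-Node-on-its-hypervisor inf: Quorum-Node Stonith-Quorum-Node
#+end_src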
[Pacemaker] Fencing of movable VirtualDomains
Hello,

I'm setting up a 3-node OpenNebula[1] cluster on Debian Wheezy, using a SAN for shared storage and KVM as hypervisor. The OpenNebula frontend is a VM, for HA[2].

I had some quorum issues when the node running the frontend died, as the two other nodes lost quorum, so I added a pure quorum node in standby=on mode.

My physical hosts are fenced using stonith:external/ipmi, which works great: one stonith device per node, with an anti-location on itself.

I have more trouble fencing the VMs since they can move. I tried to define a stonith device per VM and colocate it with the VM itself, like this:

#+begin_src
primitive ONE-Frontend ocf:heartbeat:VirtualDomain \
        params config=/var/lib/one/datastores/one/one.xml \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        meta target-role=Stopped
primitive Quorum-Node ocf:heartbeat:VirtualDomain \
        params config=/var/lib/one/datastores/one/quorum.xml \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        meta target-role=Started is-managed=true
primitive Stonith-Quorum-Node stonith:external/libvirt \
        params hostlist=quorum hypervisor_uri=qemu:///system pcmk_host_list=quorum pcmk_host_check=static-list \
        op monitor interval=30m \
        meta target-role=Started
location ONE-Fontend-fenced-by-hypervisor Stonith-ONE-Frontend \
        rule $id=ONE-Fontend-fenced-by-hypervisor-rule inf: #uname ne quorum or #uname ne one
location ONE-Frontend-run-on-hypervisor ONE-Frontend \
        rule $id=ONE-Frontend-run-on-hypervisor-rule 20: #uname eq nebula1 \
        rule $id=ONE-Frontend-run-on-hypervisor-rule-0 30: #uname eq nebula2 \
        rule $id=ONE-Frontend-run-on-hypervisor-rule-1 40: #uname eq nebula3
location Quorum-Node-fenced-by-hypervisor Stonith-Quorum-Node \
        rule $id=Quorum-Node-fenced-by-hypervisor-rule inf: #uname ne quorum or #uname ne one
location Quorum-Node-run-on-hypervisor Quorum-Node \
        rule $id=Quorum-Node-run-on-hypervisor-rule 50: #uname eq nebula1 \
        rule $id=Quorum-Node-run-on-hypervisor-rule-0 40: #uname eq nebula2 \
        rule $id=Quorum-Node-run-on-hypervisor-rule-1 30: #uname eq nebula3
colocation Fence-ONE-Frontend-on-its-hypervisor inf: ONE-Frontend Stonith-ONE-Frontend
colocation Fence-Quorum-Node-on-its-hypervisor inf: Quorum-Node Stonith-Quorum-Node
property $id=cib-bootstrap-options \
        dc-version=1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff \
        cluster-infrastructure=openais \
        expected-quorum-votes=5 \
        stonith-enabled=true \
        last-lrm-refresh=1412242734 \
        stonith-timeout=30 \
        symmetric-cluster=false
#+end_src

But I can not start the Quorum-Node resource; I get the following in the logs:

#+begin_src
info: can_fence_host_with_device: Stonith-nebula2-IPMILAN can not fence quorum: static-list
#+end_src

All the examples I found describe a configuration where each VM stays on a single hypervisor, in which case libvirt is configured to listen on TCP and the “hypervisor_uri” points to it.

Does someone have ideas on configuring stonith:external/libvirt for movable VMs?

Regards.

Footnotes:
[1] http://opennebula.org/
[2] http://docs.opennebula.org/4.8/advanced_administration/high_availability/oneha.html

--
Daniel Dehennin
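For reference, the TCP-listening libvirtd setup that those single-hypervisor examples rely on looks roughly like this on Debian/Wheezy-era systems. This is a sketch with assumed values, not a tested configuration; disabling authentication (auth_tcp = "none") is only defensible on an isolated, trusted cluster VLAN:

#+begin_src
# /etc/libvirt/libvirtd.conf on each hypervisor (values are assumptions)
listen_tcp = 1
listen_addr = "192.168.1.101"   # this hypervisor's cluster-VLAN address
auth_tcp = "none"               # no authentication: isolated VLAN only

# /etc/default/libvirt-bin: add -l so libvirtd actually listens
libvirtd_opts="-d -l"
#+end_src

A stonith primitive would then point at a fixed host, e.g. hypervisor_uri=qemu+tcp://nebula1/system, which is exactly what breaks down when the guest can migrate between hypervisors, hence the question above.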
Re: [Pacemaker] Fencing of movable VirtualDomains
emmanuel segura emi2f...@gmail.com writes:

> for guest fencing you can use something like this
> http://www.daemonzone.net/e/3/; rather than having a full cluster stack
> in your guest, you can try to use pacemaker-remote for your virtual guest

I think it could be done for the pure quorum node, but my other node needs to access the cLVM and OCFS2 resources.

After some problems with blocking cLVM, even when the cluster was quorate, I saw that “Stonith-Quorum-Node” and “Stonith-ONE-Frontend” were started only when I asked to start the respective VirtualDomain. It may be due to two “order”:

#+begin_src
order ONE-Frontend-after-its-Stonith inf: Stonith-ONE-Frontend ONE-Frontend
order Quorum-Node-after-its-Stonith inf: Stonith-Quorum-Node Quorum-Node
#+end_src

Now, it seems I mostly have dragons in DLM/o2cb/cLVM in my VM :-/

Regards.

--
Daniel Dehennin