Hi folks, I have some follow-up questions about the corosync daemon's status after a cluster shutdown.
Basically, what should happen to corosync on a cluster node when pacemaker is shut down on that node? On my 5-node cluster, when I do a global shutdown, the pacemaker processes exit, but the corosync processes remain active.

Here's an example of where this led me into some trouble. My cluster is still configured to use "symmetric" resource distribution, and I don't have any location constraints in place, so pacemaker tries to evenly distribute resources across all Online nodes. With one cluster node (KVM host) powered off, I did the global cluster stop:

[root@zs90KP VD]# date;pcs cluster stop --all
Wed Sep 28 15:07:40 EDT 2016
zs93KLpcs1: Unable to connect to zs93KLpcs1 ([Errno 113] No route to host)
zs90kppcs1: Stopping Cluster (pacemaker)...
zs95KLpcs1: Stopping Cluster (pacemaker)...
zs95kjpcs1: Stopping Cluster (pacemaker)...
zs93kjpcs1: Stopping Cluster (pacemaker)...
Error: unable to stop all nodes
zs93KLpcs1: Unable to connect to zs93KLpcs1 ([Errno 113] No route to host)

Note: The "No route to host" messages are expected because that node / LPAR is powered down. (I don't show it here, but the corosync daemon is still running on the 4 active nodes; I do show it later.)

I then powered on the one zs93KLpcs1 LPAR, so in theory I should not have quorum when it comes up and activates pacemaker, which is enabled to autostart at boot time on all 5 cluster nodes. At this point, only 1 out of 5 nodes should be Online to the cluster, and therefore... no quorum.
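For reference, corosync_votequorum with default options (one vote per node, nothing like last_man_standing enabled) requires a strict majority of the expected votes. A minimal sketch of that arithmetic — quorum_needed is a hypothetical helper for illustration, not a corosync or pcs command:

```shell
# Majority quorum for an N-vote corosync_votequorum cluster with
# default options. Hypothetical helper, for illustration only.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

quorum_needed 5   # a 5-node cluster needs 3 votes for quorum
```

So with only zs93KLpcs1 up, 1 vote out of 5 should be well short of the 3 needed.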
I log in to zs93KLpcs1, and pcs status shows those 4 nodes as 'pending' Online, and "partition with quorum":

[root@zs93kl ~]# date;pcs status |less
Wed Sep 28 15:25:13 EDT 2016
Cluster name: test_cluster_2
Last updated: Wed Sep 28 15:25:13 2016  Last change: Mon Sep 26 16:15:08 2016 by root via crm_resource on zs95kjpcs1
Stack: corosync
Current DC: zs93KLpcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) - partition with quorum
106 nodes and 304 resources configured

Node zs90kppcs1: pending
Node zs93kjpcs1: pending
Node zs95KLpcs1: pending
Node zs95kjpcs1: pending
Online: [ zs93KLpcs1 ]

Full list of resources:
 zs95kjg109062_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109063_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 .
 .
 .

Here you can see that corosync is up on all 5 nodes:

[root@zs95kj VD]# date;for host in zs90kppcs1 zs95KLpcs1 zs95kjpcs1 zs93kjpcs1 zs93KLpcs1 ; do ssh $host "hostname;ps -ef |grep corosync |grep -v grep"; done
Wed Sep 28 15:22:21 EDT 2016
zs90KP
root 155374 1 0 Sep26 ? 00:10:17 corosync
zs95KL
root 22933 1 0 11:51 ? 00:00:54 corosync
zs95kj
root 19382 1 0 Sep26 ? 00:10:15 corosync
zs93kj
root 129102 1 0 Sep26 ? 00:12:10 corosync
zs93kl
root 21894 1 0 15:19 ? 00:00:00 corosync

But pacemaker is only running on the one online node:

[root@zs95kj VD]# date;for host in zs90kppcs1 zs95KLpcs1 zs95kjpcs1 zs93kjpcs1 zs93KLpcs1 ; do ssh $host "hostname;ps -ef |grep pacemakerd | grep -v grep"; done
Wed Sep 28 15:23:29 EDT 2016
zs90KP
zs95KL
zs95kj
zs93kj
zs93kl
root 23005 1 0 15:19 ? 00:00:00 /usr/sbin/pacemakerd -f
[root@zs95kj VD]#

This situation wreaks havoc on my VirtualDomain resources, as the majority of them are in FAILED or Stopped state, and to my surprise...
many of them show as Started:

[root@zs93kl VD]# date;pcs resource show |grep zs93KL
Wed Sep 28 15:55:29 EDT 2016
 zs95kjg109062_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109063_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109064_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109065_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109066_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109068_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109069_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109070_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109071_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109072_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109073_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109074_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109075_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109076_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109077_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109078_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109079_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109080_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109081_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109082_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109083_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109084_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109085_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109086_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109087_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109088_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109089_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109090_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109092_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109095_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109096_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109097_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109101_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109102_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg109104_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110063_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110065_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110066_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110067_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110068_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110069_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110070_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110071_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110072_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110073_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110074_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110075_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110076_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110079_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110080_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110081_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110082_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110084_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110086_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110087_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110088_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110089_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110103_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110104_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110093_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110094_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110095_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110097_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110099_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110100_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110101_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110102_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110098_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110105_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110106_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110107_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110108_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110109_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110110_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110111_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110112_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110113_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110114_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110115_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110116_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110117_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110118_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110119_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110120_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110121_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110122_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110123_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110124_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110125_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110126_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110128_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110129_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110130_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110131_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110132_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110133_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110134_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110135_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110137_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110138_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110139_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110140_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110141_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110142_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110143_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110144_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110145_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110146_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110148_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110149_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110150_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110152_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110154_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110155_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110156_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110159_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110160_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110161_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110164_res (ocf::heartbeat:VirtualDomain): Started zs93KLpcs1
 zs95kjg110165_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
 zs95kjg110166_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1

Pacemaker is attempting to activate all VirtualDomain resources on the one cluster node.

So, back to my original question: what should happen to corosync when I do a cluster stop? If it should be deactivated along with pacemaker, what would prevent that here?

Also, I have tried simulating a failed cluster node (to trigger a STONITH action) by killing the corosync daemon on one node, but all that does is respawn the daemon, causing a temporary / transient failure condition, and no fence takes place. Is there a way to kill corosync in such a way that it stays down? Is there a best practice for STONITH testing?

As usual, thanks in advance for your advice.

Scott Greenlese ... IBM KVM on System Z - Solutions Test, Poughkeepsie, N.Y.
INTERNET: [email protected]


From: Ken Gaillot <[email protected]>
To: [email protected]
Date: 09/09/2016 06:23 PM
Subject: Re: [ClusterLabs] Pacemaker quorum behavior

On 09/09/2016 04:27 AM, Klaus Wenninger wrote:
> On 09/08/2016 07:31 PM, Scott Greenlese wrote:
>>
>> Hi Klaus, thanks for your prompt and thoughtful feedback...
>>
>> Please see my answers nested below (sections entitled, "Scott's
>> Reply"). Thanks!
>>
>> - Scott
>>
>> Scott Greenlese ... IBM Solutions Test, Poughkeepsie, N.Y.
>> INTERNET: [email protected]
>> PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966
>>
>> From: Klaus Wenninger <[email protected]>
>> To: [email protected]
>> Date: 09/08/2016 10:59 AM
>> Subject: Re: [ClusterLabs] Pacemaker quorum behavior
>>
>> On 09/08/2016 03:55 PM, Scott Greenlese wrote:
>> >
>> > Hi all...
>> >
>> > I have a few very basic questions for the group.
>> >
>> > I have a 5 node (Linux on Z LPARs) pacemaker cluster with 100
>> > VirtualDomain pacemaker-remote nodes
>> > plus 100 "opaque" VirtualDomain resources. The cluster is configured
>> > to be 'symmetric' and I have no
>> > location constraints on the 200 VirtualDomain resources (other than to
>> > prevent the opaque guests
>> > from running on the pacemaker remote node resources). My quorum is set
>> > as:
>> >
>> > quorum {
>> >   provider: corosync_votequorum
>> > }
>> >
>> > As an experiment, I powered down one LPAR in the cluster, leaving 4
>> > powered up with the pcsd service up on the 4 survivors
>> > but corosync/pacemaker down (pcs cluster stop --all) on the 4
>> > survivors. I then started pacemaker/corosync on a single cluster
>> >
>>
>> "pcs cluster stop" shuts down pacemaker & corosync on my test-cluster but
>> did you check the status of the individual services?
>>
>> Scott's reply:
>>
>> No, I only assumed that pacemaker was down because I got this back on
>> my pcs status command from each cluster node:
>>
>> [root@zs95kj VD]# date;for host in zs93KLpcs1 zs95KLpcs1 zs95kjpcs1 zs93kjpcs1 ; do ssh $host pcs status; done
>> Wed Sep 7 15:49:27 EDT 2016
>> Error: cluster is not currently running on this node
>> Error: cluster is not currently running on this node
>> Error: cluster is not currently running on this node
>> Error: cluster is not currently running on this node

In my experience, this is sufficient to say that pacemaker and corosync aren't running.

>> What else should I check? The pcsd.service service was still up,
>> since I did not stop that anywhere. Should I have done
>> "ps -ef | grep -e pacemaker -e corosync" to check the state before
>> assuming it was really down?
>>
> Guess the answer from Poki should guide you well here ...
>
>> > node (pcs cluster start), and this resulted in the 200 VirtualDomain
>> > resources activating on the single node.
>> > This was not what I was expecting. I assumed that no resources would
>> > activate / start on any cluster nodes
>> > until 3 out of the 5 total cluster nodes had pacemaker/corosync running.

Your expectation is correct; I'm not sure what happened in this case. There are some obscure corosync options (e.g. last_man_standing, allow_downscale) that could theoretically lead to this, but I don't get the impression you're using anything unusual.
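For anyone checking their own configuration against the options mentioned above: these are votequorum settings in the quorum section of corosync.conf (see votequorum(5)). The fragment below is purely illustrative; the cluster in this thread reportedly uses only the bare provider line:

```
quorum {
    provider: corosync_votequorum

    # Illustrative only -- options like these change quorum behavior:
    # last_man_standing: 1   # recalculate expected_votes as nodes drop out
    # allow_downscale: 1     # permit expected_votes to shrink
    # wait_for_all: 1        # require all nodes once before granting quorum
    #                        # (this one makes startup stricter, not looser)
}
```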
>> > After starting pacemaker/corosync on the single host (zs95kjpcs1),
>> > this is what I see:
>> >
>> > [root@zs95kj VD]# date;pcs status |less
>> > Wed Sep 7 15:51:17 EDT 2016
>> > Cluster name: test_cluster_2
>> > Last updated: Wed Sep 7 15:51:18 2016  Last change: Wed Sep 7 15:30:12
>> > 2016 by hacluster via crmd on zs93kjpcs1
>> > Stack: corosync
>> > Current DC: zs95kjpcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) -
>> > partition with quorum
>> > 106 nodes and 304 resources configured
>> >
>> > Node zs93KLpcs1: pending
>> > Node zs93kjpcs1: pending
>> > Node zs95KLpcs1: pending
>> > Online: [ zs95kjpcs1 ]
>> > OFFLINE: [ zs90kppcs1 ]
>> >
>> > .
>> > .
>> > .
>> > PCSD Status:
>> > zs93kjpcs1: Online
>> > zs95kjpcs1: Online
>> > zs95KLpcs1: Online
>> > zs90kppcs1: Offline
>> > zs93KLpcs1: Online

FYI, the Online/Offline above refers only to pcsd, which doesn't have any effect on the cluster itself -- just the ability to run pcs commands.

>> > So, what exactly constitutes an "Online" vs. "Offline" cluster node
>> > w.r.t. quorum calculation? Seems like in my case, it's "pending" on
>> > 3 nodes, so where does that fall? And why "pending"? What does that mean?

"pending" means that the node has joined the corosync cluster (which allows it to contribute to quorum), but it has not yet completed the pacemaker join process (basically a handshake with the DC).

I think the corosync and pacemaker detail logs would be essential to figuring out what's going on. Check the logs on the "pending" nodes to see whether corosync somehow started up by this point, and check the logs on this node to see what the most recent references to the pending nodes were.

>> > Also, what exactly is the cluster's expected reaction to quorum loss?
>> > Cluster resources will be stopped or something else?
>> >
>> Depends on how you configure it using cluster property no-quorum-policy
>> (default: stop).
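To make the no-quorum-policy choices concrete, here is a small sketch of what each documented value (per Pacemaker Explained, 1.1) means for a partition that has lost quorum. action_without_quorum is a hypothetical helper for illustration only, not part of pcs or pacemaker:

```shell
# What a partition does with its resources under each no-quorum-policy
# value (Pacemaker 1.1). Hypothetical helper, for illustration only.
action_without_quorum() {
    case "$1" in
        ignore)  echo "continue resource management as if quorate" ;;
        freeze)  echo "keep already-running resources, start nothing new" ;;
        suicide) echo "fence all nodes in the affected partition" ;;
        stop)    echo "stop all resources in the affected partition" ;;
        *)       echo "unknown policy" ;;
    esac
}

action_without_quorum stop    # the default, and this cluster's setting
```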
>>
>> Scott's reply:
>>
>> This is how the policy is configured:
>>
>> [root@zs95kj VD]# date;pcs config |grep quorum
>> Thu Sep 8 13:18:33 EDT 2016
>> no-quorum-policy: stop
>>
>> What should I expect with the 'stop' setting?
>>
>> > Where can I find this documentation?
>> >
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/
>>
>> Scott's reply:
>>
>> OK, I'll keep looking thru this doc, but I don't easily find the
>> no-quorum-policy explained.
>>
> Well, the index leads you to:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-cluster-options.html
> where you find an exhaustive description of the option.
>
> In short:
> you are running the default, and that leads to all resources being
> stopped in a partition without quorum
>
>> Thanks..
>>
>> > Thanks!
>> >
>> > Scott Greenlese - IBM Solution Test Team.
>> >
>> > Scott Greenlese ... IBM Solutions Test, Poughkeepsie, N.Y.
>> > INTERNET: [email protected]
>> > PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966

_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
