[ClusterLabs] Opt-in cluster shows resources stopped where no nodes should be considered
Hello all,

While our cluster seems to be working just fine, I have noticed something in the crm_mon output that I don't quite understand and that is throwing off my monitoring a bit, as stopped resources could mean something is wrong. I was hoping somebody could help me understand what it means. It seems this might have something to do with the fact that I am using remote nodes, but I cannot wrap my head around it.

What I am seeing are 3 additional, unexpected lines in the crm_mon -1rR output listing my "p_pgcPgbouncer_test" resources as stopped, even though in my mind there should not be any more nodes to consider (opt-in cluster, see location rules). At the same time this is not happening to my p_pgsqln resources, as shown at the top of the crm_mon output. The important crm_mon -1rR output lines further below are marked with arrows -> <---.

Some background on the policy: We are running an asymmetric / opt-in cluster (property symmetric-cluster=false). The cluster's main purpose is to take care of a 3+-node replicating master / slave database running strictly on nodes pg1, pg2 and pg3, per location rule l_pgs_resources. We also have 2 remote nodes, pgalog1 & pgalog2, defined to control database connection pooler resources (p_pgcPgbouncer_test) to facilitate client connection rerouting, per location rule l_pgc_resources.

crm_mon -1rR output:

Last updated: Fri Mar 4 09:56:02 2016
Last change: Fri Mar 4 09:55:47 2016 by root via cibadmin on pg1
Stack: corosync
Current DC: pg1 (1) (version 1.1.14-70404b0) - partition with quorum
5 nodes and 29 resources configured

Online: [ pg1 (1) pg2 (2) pg3 (3) ]
RemoteOnline: [ pgalog1 pgalog2 ]

Full list of resources:

 Master/Slave Set: ms_pgsqln [p_pgsqln]
     p_pgsqln (ocf::heartbeat:pgsqln):Master pg3
     p_pgsqln (ocf::heartbeat:pgsqln):Started pg1
     p_pgsqln (ocf::heartbeat:pgsqln):Started pg2
  -> NO additional lines here <---
     Masters: [ pg3 ]
     Stopped: [ pg1 pg2 ]

[...]
 pgalog1(ocf::pacemaker:remote):Started pg1
 pgalog2(ocf::pacemaker:remote):Started pg3
 Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog1
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog2
  -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <---
  -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <---
  -> p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped <---
     Started: [ pgalog1 pgalog2 ]

Here are the most important parts of the configuration as shown in "crm configure show":

[...]
primitive pgalog1 ocf:pacemaker:remote \
        params server=pgalog1 port=3121 \
        meta target-role=Started
primitive pgalog2 ocf:pacemaker:remote \
        params server=pgalog2 port=3121 \
        meta target-role=Started
[...]
location l_pgc_resources { cl_pgcPgbouncer } resource-discovery=exclusive \
        rule #uname eq pgalog1 \
        rule #uname eq pgalog2
location l_pgs_resources { cl_pgsServices1 ms_pgsqln p_pgsBackupjob pgalog1 pgalog2 } resource-discovery=exclusive \
        rule #uname eq pg1 \
        rule #uname eq pg2 \
        rule #uname eq pg3
[...]
property cib-bootstrap-options: \
        symmetric-cluster=false \
        [...]

Regards,
Martin Schlegel

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
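On the monitoring side of the question above, one possible approach is to alert on the number of active instances per resource rather than on any "Stopped" line. The following is only a minimal sketch, assuming the text layout of the crm_mon -1rR output quoted in this message; the names count_states, shortfalls and SAMPLE are hypothetical and not part of Pacemaker, and the sample omits the "->" annotation arrows because they are not real crm_mon output.

```python
import re
from collections import Counter, defaultdict

# Matches per-instance lines as they appear in the quoted output, e.g.
#   "p_pgsqln (ocf::heartbeat:pgsqln):Master pg3"
#   "p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped"
# Assumption: other crm_mon versions may lay these lines out differently.
INSTANCE_RE = re.compile(r"^\s*(\S+?)\s*\(([^()\s]+)\):\s*(\w+)(?:\s+(\S+))?\s*$")

SAMPLE = """\
 Master/Slave Set: ms_pgsqln [p_pgsqln]
     p_pgsqln (ocf::heartbeat:pgsqln):Master pg3
     p_pgsqln (ocf::heartbeat:pgsqln):Started pg1
     p_pgsqln (ocf::heartbeat:pgsqln):Started pg2
     Masters: [ pg3 ]
 Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog1
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Started pgalog2
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped
     p_pgcPgbouncer_test(ocf::heartbeat:pgbouncer): Stopped
     Started: [ pgalog1 pgalog2 ]
"""


def count_states(crm_mon_text):
    """Count how often each state appears per resource name."""
    counts = defaultdict(Counter)
    for line in crm_mon_text.splitlines():
        match = INSTANCE_RE.match(line)
        if match:
            name, _agent, state, _node = match.groups()
            counts[name][state] += 1
    return counts


def shortfalls(crm_mon_text, expected_active):
    """Return {name: (active, wanted)} for resources with too few active instances.

    "Active" here means Started or Master; expected_active is supplied by
    the operator (hypothetical input, e.g. {"p_pgcPgbouncer_test": 2}).
    """
    counts = count_states(crm_mon_text)
    problems = {}
    for name, wanted in expected_active.items():
        active = counts[name]["Started"] + counts[name]["Master"]
        if active < wanted:
            problems[name] = (active, wanted)
    return problems


if __name__ == "__main__":
    print(shortfalls(SAMPLE, {"p_pgsqln": 3, "p_pgcPgbouncer_test": 2}))
```

With this kind of check, the three extra "Stopped" lines do not raise an alert as long as the expected number of instances is running; it does not explain why crm_mon prints them.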
Re: [ClusterLabs] Removing node from pacemaker.
I have tried it on my cluster: "crm node delete" just removes the node from the CIB without updating corosync.conf.

After a restart of the pacemaker service you will get something like this:
Online: [ node1 ]
OFFLINE: [ node2 ]

BTW, you will get the same state after "pacemaker restart" if you remove a node from corosync.conf and do not call "crm corosync reload".

On 03/04/2016 12:07 PM, Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Mar 03, 2016 at 03:20:56PM +0300, Andrei Maruha wrote:
> > Hi,
> > Usually I use the following steps to delete node from the cluster:
> > 1. #crm corosync del-node
> > 2. #crm_node -R node --force
> > 3. #crm corosync reload
>
> I'd expect all this to be wrapped in "crm node delete". Isn't that
> the case?
>
> Also, is "corosync reload" really required after node removal?
>
> Thanks,
>
> Dejan
>
> > Instead of steps 1 and 2 you can delete certain node from the
> > corosync config manually and run:
> > #corosync-cfgtool -R
> >
> > On 03/03/2016 02:44 PM, Somanath Jeeva wrote:
> > > Hi,
> > >
> > > I am trying to remove a node from the pacemaker/corosync cluster,
> > > using the command "crm_node -R dl360x4061 --force".
> > >
> > > Though this command removes the node from the cluster, it is
> > > appearing as offline after pacemaker/corosync restart in the nodes
> > > that are online.
> > >
> > > Is there any other command to completely delete the node from the
> > > pacemaker/corosync cluster.
> > >
> > > Pacemaker and Corosync Versions.
> > >
> > > PACEMAKER=1.1.10
> > >
> > > COROSYNC=1.4.1
> > >
> > > Regards
> > >
> > > Somanath Thilak J
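The three removal steps quoted above can be collected into a small dry-run helper. This is only a sketch: it prints the commands and does not execute anything. The helper names removal_commands and render are hypothetical, and passing the node name to "crm corosync del-node" is an assumption, since the quoted step 1 lists the command without an argument.

```python
import shlex


def removal_commands(node):
    """Return the quoted three-step removal sequence as argument lists."""
    return [
        # Step 1: remove the node from the corosync configuration.
        # Assumption: the node name is passed as the argument here.
        ["crm", "corosync", "del-node", node],
        # Step 2: remove the node from Pacemaker's view of the cluster.
        ["crm_node", "-R", node, "--force"],
        # Step 3: have corosync reload its configuration.
        ["crm", "corosync", "reload"],
    ]


def render(commands):
    """Turn argument lists into shell-safe command lines for review."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


if __name__ == "__main__":
    # Dry run only: print the commands so they can be checked before use.
    for line in render(removal_commands("dl360x4061")):
        print(line)
```

Keeping this as a dry run seems the safer choice, because the thread has not settled whether the reload step is required or whether these crm subcommands apply to the versions the original poster is running.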
Re: [ClusterLabs] Removing node from pacemaker.
Hi,

On Thu, Mar 03, 2016 at 03:20:56PM +0300, Andrei Maruha wrote:
> Hi,
> Usually I use the following steps to delete node from the cluster:
> 1. #crm corosync del-node
> 2. #crm_node -R node --force
> 3. #crm corosync reload

I'd expect all this to be wrapped in "crm node delete". Isn't that
the case?

Also, is "corosync reload" really required after node removal?

Thanks,

Dejan

> Instead of steps 1 and 2 you can delete certain node from the
> corosync config manually and run:
> #corosync-cfgtool -R
>
> On 03/03/2016 02:44 PM, Somanath Jeeva wrote:
> > Hi,
> >
> > I am trying to remove a node from the pacemaker/corosync cluster,
> > using the command "crm_node -R dl360x4061 --force".
> >
> > Though this command removes the node from the cluster, it is
> > appearing as offline after pacemaker/corosync restart in the nodes
> > that are online.
> >
> > Is there any other command to completely delete the node from the
> > pacemaker/corosync cluster.
> >
> > Pacemaker and Corosync Versions.
> >
> > PACEMAKER=1.1.10
> >
> > COROSYNC=1.4.1
> >
> > Regards
> >
> > Somanath Thilak J