Ken, I have another set of logs:
Sep 01 09:10:05 [1328] TPC-F9-26.phaedrus.sandvine.com crmd: info: do_lrm_rsc_op: Performing key=5:50864:0:86160921-abd7-4e14-94d4-f53cee278858 op=SVSDEHA_monitor_2000
SvsdeStateful(SVSDEHA)[6174]: 2017/09/01_09:10:06 ERROR: Resource is in failed state
Sep 01 09:10:06 [1328] TPC-F9-26.phaedrus.sandvine.com crmd: info: action_synced_wait: Managed SvsdeStateful_meta-data_0 process 6274 exited with rc=4
Sep 01 09:10:06 [1328] TPC-F9-26.phaedrus.sandvine.com crmd: error: generic_get_metadata: Failed to receive meta-data for ocf:pacemaker:SvsdeStateful
Sep 01 09:10:06 [1328] TPC-F9-26.phaedrus.sandvine.com crmd: error: build_operation_update: No metadata for ocf::pacemaker:SvsdeStateful
Sep 01 09:10:06 [1328] TPC-F9-26.phaedrus.sandvine.com crmd: info: process_lrm_event: Result of monitor operation for SVSDEHA on TPC-F9-26.phaedrus.sandvine.com: 0 (ok) | call=939 key=SVSDEHA_monitor_2000 confirmed=false cib-update=476
Sep 01 09:10:06 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/476)
Sep 01 09:10:06 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: Diff: --- 0.37.4054 2
Sep 01 09:10:06 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: Diff: +++ 0.37.4055 (null)
Sep 01 09:10:06 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: + /cib: @num_updates=4055
Sep 01 09:10:06 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SVSDEHA']: <lrm_rsc_op id="SVSDEHA_monitor_2000" operation_key="SVSDEHA_monitor_2000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="5:50864:0:86160921-abd7-4e14-94d4-f53cee278858" transition-magic="0:0;5:50864:0:86160921-abd7-4e14-94d4-f53cee278858" on_node="TPC-F9-26.phaedrus.sandvi
Sep 01 09:10:06 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=TPC-F9-26.phaedrus.sandvine.com/crmd/476, version=0.37.4055)
*Sep 01 09:10:12 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_process_ping: Reporting our current digest to TPC-E9-23.phaedrus.sandvine.com: 74bbb7e9f35fabfdb624300891e32018 for 0.37.4055 (0x7f5719954560 0)
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: Diff: --- 0.37.4055 2*
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: Diff: +++ 0.37.4056 (null)
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: + /cib: @num_updates=4056
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: ++ /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SVSDEHA']: <lrm_rsc_op id="SVSDEHA_last_failure_0" operation_key="SVSDEHA_monitor_1000" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="7:50662:8:86160921-abd7-4e14-94d4-f53cee278858" transition-magic="2:1;7:50662:8:86160921-abd7-4e14-94d4-f53cee278858" on_node="TPC-E9-23.phaedrus.sand
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=TPC-E9-23.phaedrus.sandvine.com/crmd/53508, version=0.37.4056)
Sep 01 09:15:33 [1327] TPC-F9-26.phaedrus.sandvine.com attrd: info: attrd_peer_update: Setting fail-count-SVSDEHA[TPC-E9-23.phaedrus.sandvine.com]: (null) -> 1 from TPC-E9-23.phaedrus.sandvine.com
Sep 01 09:15:33 [1327] TPC-F9-26.phaedrus.sandvine.com attrd: info: attrd_peer_update: Setting last-failure-SVSDEHA[TPC-E9-23.phaedrus.sandvine.com]: (null) -> 1504271733 from TPC-E9-23.phaedrus.sandvine.com
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: Diff: --- 0.37.4056 2
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: Diff: +++ 0.37.4057 (null)
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: + /cib: @num_updates=4057
Sep 01 09:15:33 [1325] TPC-F9-26.phaedrus.sandvine.com cib: info: cib_perform_op: ++ /cib/status/node_state[@id='2']/transient_attributes[@id='2']/instance_attributes[@id='status-2']: <nvpair id="status-2-fail-count-SVSDEHA" name="fail-count-SVSDEHA" value="1"/>

My suspicion is around the highlighted parts of the logs above: after 09:10:12 the next log entry is at 09:15:33. During that window the other node failed several times, but the resource was never migrated here.

I have yet to test sbd fencing with the patch Klaus shared. I am on CentOS:

# cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)

(A few command sketches for the checks discussed in this thread are appended at the end of this mail, below the quoted messages.)

Regards,
Abhay

On Sat, 2 Sep 2017 at 15:23 Klaus Wenninger <[email protected]> wrote:

> On 09/01/2017 11:45 PM, Ken Gaillot wrote:
> > On Fri, 2017-09-01 at 15:06 +0530, Abhay B wrote:
> >> Are you sure the monitor stopped? Pacemaker only logs recurring monitors when the status changes. Any successful monitors after this wouldn't be logged.
> >>
> >> Yes. Since there were no logs which said "RecurringOp: Start recurring monitor" on the node after it had failed.
> >> Also there were no logs for any actions pertaining to
> >> The problem was that even though the one node was failing, the resources were never moved to the other node (the node on which I suspect monitoring had stopped).
> >>
> >> There are a lot of resource action failures, so I'm not sure where the issue is, but I'm guessing it has to do with migration-threshold=1 -- once a resource has failed once on a node, it won't be allowed back on that node until the failure is cleaned up. Of course you also have failure-timeout=1s, which should clean it up immediately, so I'm not sure.
> >>
> >> migration-threshold=1
> >> failure-timeout=1s
> >> cluster-recheck-interval=2s
> >>
> >> first, set "two_node: 1" in corosync.conf and let no-quorum-policy default in pacemaker
> >>
> >> This is already configured.
> >> # cat /etc/corosync/corosync.conf
> >> totem {
> >>     version: 2
> >>     secauth: off
> >>     cluster_name: SVSDEHA
> >>     transport: udpu
> >>     token: 5000
> >> }
> >>
> >> nodelist {
> >>     node {
> >>         ring0_addr: 2.0.0.10
> >>         nodeid: 1
> >>     }
> >>     node {
> >>         ring0_addr: 2.0.0.11
> >>         nodeid: 2
> >>     }
> >> }
> >>
> >> quorum {
> >>     provider: corosync_votequorum
> >>     two_node: 1
> >> }
> >>
> >> logging {
> >>     to_logfile: yes
> >>     logfile: /var/log/cluster/corosync.log
> >>     to_syslog: yes
> >> }
> >>
> >> let no-quorum-policy default in pacemaker; then, get stonith configured, tested, and enabled
> >>
> >> By not configuring no-quorum-policy, would it ignore quorum for a 2 node cluster?
> > With two_node, corosync always provides quorum to pacemaker, so pacemaker doesn't see any quorum loss.
> > The only significant difference from ignoring quorum is that corosync won't form a cluster from a cold start unless both nodes can reach each other (a safety feature).
> >
> >> For my use case I don't need stonith enabled. My intention is to have a highly available system all the time.
> > Stonith is the only way to recover from certain types of failure, such as the "split brain" scenario, and a resource that fails to stop.
> >
> > If your nodes are physical machines with hardware watchdogs, you can set up sbd for fencing without needing any extra equipment.
> Small caveat here:
> If I get it right you have a 2-node-setup. In this case the watchdog-only sbd-setup would not be usable as it relies on 'real' quorum. In 2-node-setups sbd needs at least a single shared disk. For the sbd-single-disk-setup working with 2-node you need the patch from https://github.com/ClusterLabs/sbd/pull/23 in place. (Saw you mentioning RHEL documentation - RHEL-7.4 has it in since GA)
>
> Regards,
> Klaus
>
> >> I will test my RA again as suggested with no-quorum-policy=default.
> >>
> >> One more doubt.
> >> Why do we see this is 'pcs property' ?
> >> last-lrm-refresh: 1504090367
> >>
> >> Never seen this on a healthy cluster.
> >> From RHEL documentation:
> >> last-lrm-refresh
> >>     Last refresh of the Local Resource Manager, given in units of seconds since epoca. Used for diagnostic purposes; not user-configurable.
> >>
> >> Doesn't explain much.
> > Whenever a cluster property changes, the cluster rechecks the current state to see if anything needs to be done. last-lrm-refresh is just a dummy property that the cluster uses to trigger that. It's set in certain rare circumstances when a resource cleanup is done. You should see a line in your logs like "Triggering a refresh after ... deleted ... from the LRM". That might give some idea of why.
> >
> >> Also. does avg. CPU load impact resource monitoring ?
> >>
> >> Regards,
> >> Abhay
> > Well, it could cause the monitor to take so long that it times out. The only direct effect of load on pacemaker is that the cluster might lower the number of agent actions that it can execute simultaneously.
> >
> >> On Thu, 31 Aug 2017 at 20:11 Ken Gaillot <[email protected]> wrote:
> >>
> >> On Thu, 2017-08-31 at 06:41 +0000, Abhay B wrote:
> >> > Hi,
> >> >
> >> > I have a 2 node HA cluster configured on CentOS 7 with pcs command.
> >> >
> >> > Below are the properties of the cluster :
> >> >
> >> > # pcs property
> >> > Cluster Properties:
> >> >  cluster-infrastructure: corosync
> >> >  cluster-name: SVSDEHA
> >> >  cluster-recheck-interval: 2s
> >> >  dc-deadtime: 5
> >> >  dc-version: 1.1.15-11.el7_3.5-e174ec8
> >> >  have-watchdog: false
> >> >  last-lrm-refresh: 1504090367
> >> >  no-quorum-policy: ignore
> >> >  start-failure-is-fatal: false
> >> >  stonith-enabled: false
> >> >
> >> > PFA the cib.
> >> > Also attached is the corosync.log around the time the below issue happened.
> >> >
> >> > After around 10 hrs and multiple failures, pacemaker stops monitoring resource on one of the nodes in the cluster.
> >> >
> >> > So even though the resource on other node fails, it is never migrated to the node on which the resource is not monitored.
> >> >
> >> > Wanted to know what could have triggered this and how to avoid getting into such scenarios.
> >> > I am going through the logs and couldn't find why this happened.
> >> >
> >> > After this log the monitoring stopped.
> >> >
> >> > Aug 29 11:01:44 [16500] TPC-D12-10-002.phaedrus.sandvine.com crmd: info: process_lrm_event: Result of monitor operation for SVSDEHA on TPC-D12-10-002.phaedrus.sandvine.com: 0 (ok) | call=538 key=SVSDEHA_monitor_2000 confirmed=false cib-update=50013
> >>
> >> Are you sure the monitor stopped? Pacemaker only logs recurring monitors when the status changes. Any successful monitors after this wouldn't be logged.
> >>
> >> > Below log says the resource is leaving the cluster.
> >> > Aug 29 11:01:44 [16499] TPC-D12-10-002.phaedrus.sandvine.com pengine: info: LogActions: Leave SVSDEHA:0 (Slave TPC-D12-10-002.phaedrus.sandvine.com)
> >>
> >> This means that the cluster will leave the resource where it is (i.e. it doesn't need a start, stop, move, demote, promote, etc.).
> >>
> >> > Let me know if anything more is needed.
> >> >
> >> > Regards,
> >> > Abhay
> >> >
> >> > PS: 'pcs resource cleanup' brought the cluster back into good state.
> >>
> >> There are a lot of resource action failures, so I'm not sure where the issue is, but I'm guessing it has to do with migration-threshold=1 -- once a resource has failed once on a node, it won't be allowed back on that node until the failure is cleaned up. Of course you also have failure-timeout=1s, which should clean it up immediately, so I'm not sure.
> >>
> >> My gut feeling is that you're trying to do too many things at once. I'd start over from scratch and proceed more slowly: first, set "two_node: 1" in corosync.conf and let no-quorum-policy default in pacemaker; then, get stonith configured, tested, and enabled; then, test your resource agent manually on the command line to make sure it conforms to the expected return values (http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#ap-ocf); then add your resource to the cluster without migration-threshold or failure-timeout, and work out any issues with frequent failures; then finally set migration-threshold and failure-timeout to reflect how you want recovery to proceed.
> >> --
> >> Ken Gaillot <[email protected]>
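Following Ken's suggestion to exercise the resource agent by hand, this is roughly what I plan to run. Per the OCF spec the meta-data action must print the agent's XML description and exit 0, while the action_synced_wait line in my logs shows it exiting with rc=4, so that is the first thing to check. The agent path below is an assumption based on the ocf:pacemaker:SvsdeStateful name (adjust to wherever the RA really lives), and any OCF_RESKEY_<param> variables the agent needs would have to be exported as well:

    # export OCF_ROOT=/usr/lib/ocf
    # /usr/lib/ocf/resource.d/pacemaker/SvsdeStateful meta-data; echo "rc=$?"
    # /usr/lib/ocf/resource.d/pacemaker/SvsdeStateful monitor; echo "rc=$?"

meta-data should return 0 with the XML on stdout; for monitor, 0 (OCF_SUCCESS) means running, 7 (OCF_NOT_RUNNING) means cleanly stopped, 8 (OCF_RUNNING_MASTER) is expected on the promoted node, and anything else is treated as a failure. ocf-tester from the resource-agents package can also drive the full set of actions (ocf-tester -n SVSDEHA /usr/lib/ocf/resource.d/pacemaker/SvsdeStateful).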
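To see whether migration-threshold=1 is what blocks the move, I will also look at the recorded failures directly (resource and node names as in the logs above):

    # crm_mon -1f
    # pcs resource failcount show SVSDEHA
    # pcs resource failcount reset SVSDEHA

crm_mon -f shows the per-node fail counts and last failures, failcount show reads the same fail-count-SVSDEHA attribute that attrd is setting in the log above, and reset (or 'pcs resource cleanup SVSDEHA', which is what recovered the cluster last time) clears it.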
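Once the agent itself behaves, my reading of Ken's last step is to move recovery tuning onto the resource meta attributes instead of the 1-second values I have now; the numbers below are only placeholders for whatever policy we settle on:

    # pcs resource meta SVSDEHA migration-threshold=3 failure-timeout=60s

and probably let cluster-recheck-interval go back to its default rather than 2s.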
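For sbd, this is my rough understanding of the shared-disk setup Klaus describes for a 2-node cluster; the device path is just a placeholder for a small LUN visible to both nodes, and it needs the patched sbd build from the pull request above:

    # sbd -d /dev/disk/by-id/<shared-lun> create
    # sbd -d /dev/disk/by-id/<shared-lun> list

then on both nodes set SBD_DEVICE="/dev/disk/by-id/<shared-lun>" (and the watchdog device) in /etc/sysconfig/sbd, enable the sbd service so the cluster brings it up, and only then turn fencing back on:

    # systemctl enable sbd
    # pcs property set stonith-enabled=true

I will double-check the exact stonith wiring against the RHEL 7.4 documentation Klaus mentioned before enabling it.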
_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
