Re: [ClusterLabs] Problems with master/slave failovers
I would tend to agree with you on this matter, Andrei. To me it makes more sense for Pacemaker to prioritise maintaining a master over restarting a failed resource. If master scores are set in a sensible manner, the promoted master is immediately given a high score, so the other instance coming back online later does not cause a second failover. It becomes much harder to maintain the desired master scores when the outcome depends on how long the failed resource takes to restart: it then becomes a matter of timing whether Pacemaker fails the resource over or simply restarts and re-promotes it on the failed node. I'm fairly sure that's why I've been seeing the behaviour I reported.

Regards,
Harvey

From: Users on behalf of Andrei Borzenkov
Sent: Wednesday, 3 July 2019 8:59 p.m.
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: EXTERNAL: Re: [ClusterLabs] Problems with master/slave failovers

On Wed, Jul 3, 2019 at 12:59 AM Ken Gaillot wrote:
>
> On Mon, 2019-07-01 at 23:30 +, Harvey Shepherd wrote:
> > > The "transition summary" is just a resource-by-resource list, not
> > > the order things will be done. The "executing cluster transition"
> > > section is the order things are being done.
> >
> > Thanks Ken. I think that's where the problem is originating. If you
> > look at the "executing cluster transition" section, it's actually
> > restarting the failed king instance BEFORE promoting the remaining
> > in-service slave. When the failed resource comes back online, that
> > adjusts the master scores, resulting in the transition being aborted.
> > Both nodes then end up having the same master score for the king
> > resource, and Pacemaker decides to re-promote the original master. I
> > would have expected Pacemaker's priority to be to ensure that there
> > was a master available first, then to restart the failed instance in
> > slave mode. Is there a way to configure it to do that?
>
> No, that's intentional behavior. Starts are done before promotes so
> that promotion scores are in their final state before ultimately
> choosing the master.

There are applications that take tens of minutes to start, while failover is near-instantaneous. Enforcing a slave restart before promoting means an extended period of service unavailability. At the very least this should be configurable.

> Otherwise, you'd end up in the same final
> situation, but the master would fail over first then fail back.

Now, really - while of course a resource agent is free to throw dice to decide master scores, in real life, in all cases I am familiar with, the master score is decided by the underlying application state. If an agent comes up and sees another instance running as master, it is highly unlikely that the agent will voluntarily force the master away. And if that happens, I'd say the agent is buggy and it is not Pacemaker's job to work around it.

> It's up to the agent to set master scores in whatever fashion it
> considers ideal.

Except Pacemaker makes a started resource a prerequisite for it. In real life it may not even be possible to start the former master before re-configuring it.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
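Andrei's point that master scores are normally derived from application state can be illustrated with a minimal sketch. The function names, state labels, and score values below are hypothetical, not from any agent in this thread; a real promotable OCF agent pushes its score with the `crm_master` helper (a wrapper around `crm_attribute` shipped with Pacemaker), which is referenced but never executed here so the fragment stands alone.

```shell
# Sketch (hypothetical names/values) of how a promotable OCF agent might
# derive its promotion score from application state. A freshly restarted
# instance reports a low score, so it cannot out-bid the instance already
# running as master and force a fail-back.

score_for_state() {
    case "$1" in
        master)       echo 100 ;;   # current master: highest preference
        slave_synced) echo 50  ;;   # caught up, promotable if master dies
        slave_stale)  echo 5   ;;   # just restarted, not yet caught up
        *)            echo 0   ;;   # unknown state: never promote
    esac
}

# In the agent's monitor action, the score would be pushed to the cluster:
push_master_score() {
    crm_master -l reboot -v "$(score_for_state "$1")"
}
```

With scores shaped like this, a rejoining instance advertises 5 against the promoted instance's 100, so Pacemaker has no reason to move the master back, regardless of how quickly the failed instance restarts.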
Re: [ClusterLabs] Strange monitor return code log for LSB resource
Thanks Jan for taking the time to provide such a detailed response. I was able to resolve an issue, which caused these logs to disappear. The problem was that the LSB script was named "logging" and the daemon it controlled was also called "logging". The init script uses "start-stop-daemon --name" to start and stop the logging daemon, but "start-stop-daemon --stop --name logging" was terminating both the init script and the daemon, hence an error code was returned for the stop operation. I have no idea why crm_resource logs were produced, or why the logs seemed to indicate that an LSB resource was being treated as OCF (expected return code of 7 - not running), but the logs were no longer generated once I renamed the init script; I've not seen them since. There were definitely no external monitoring tools installed or in use (it's a very isolated system with very few resources installed), although I was using crm_monitor at the time. Does that run crm_resource under the hood?

Regards,
Harvey

From: Harvey Shepherd
Sent: Wednesday, 26 June 2019 9:26 a.m.
To: users@clusterlabs.org
Subject: Re: EXTERNAL: Re: [ClusterLabs] Strange monitor return code log for LSB resource

Thanks for your reply Andrei. There is no external monitoring software running. The logs that I posted are from the Pacemaker log with debug enabled. The Linux hostname of the node is "ctr_qemu", but the Pacemaker node name is "primary". It is actually the same node.
I've just run the same test again, and interestingly this time the source of the monitor logs seems to be shared between "crm_resource" and "pacemaker-schedulerd":

Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (recurring_action_timer) debug: Scheduling another invocation of logging_status_1
Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (operation_finished) debug: logging_status_1:1181 - exited with rc=0
Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (operation_finished) debug: logging_status_1:1181:stdout [ Remote syslog service is running ]
Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (log_finished) debug: finished - rsc:logging action:monitor call_id:127 pid:1181 exit-code:0 exec-time:0ms queue-time:0ms
Jun 25 21:18:35 ctr_qemu crm_resource[1268] (determine_op_status) debug: logging_monitor_1 on primary returned 'not running' (7) instead of the expected value: 'ok' (0)
Jun 25 21:18:35 ctr_qemu crm_resource[1268] (unpack_rsc_op_failure) warning: Processing failed monitor of logging on primary: not running | rc=7
Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (determine_op_status) debug: logging_monitor_1 on primary returned 'not running' (7) instead of the expected value: 'ok' (0)
Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (unpack_rsc_op_failure) warning: Processing failed monitor of logging on primary: not running | rc=7
Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (common_print) info: logging (lsb:logging): Started primary
Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (common_apply_stickiness) debug: Resource logging: preferring current location (node=primary, weight=1)
Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (native_assign_node) debug: Assigning primary to logging
Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (LogActions) info: Leave logging (Started primary)

Regards,
Harvey

From: Users on behalf of Andrei Borzenkov
Sent: Wednesday, 26 June 2019 4:47 a.m.
To: users@clusterlabs.org
Subject: EXTERNAL: Re: [ClusterLabs] Strange monitor return code log for LSB resource

25.06.2019 16:53, Harvey Shepherd wrote:
> Hi All,
>
> I have a 2 node cluster running under Pacemaker 2.0.2, with around 20
> resources configured, the majority of which are LSB resources, but there are
> also a few OCF ones. One of the LSB resources is controlled via an init
> script called "logging" and runs only on the master node. The CIB
> configuration for it is as follows:
>
> There is a global setting which sets the default timeout:
>
> All of the other LSB resources are configured in the same way, and seem to
> work correctly, but for some reason I see the following recurring logs for
> the logging resource:
>
> Jun 25 13:16:22 ctr_qemu pacemaker-execd [1234] (recurring_action_timer) debug: Scheduling another invocation of logging_status_1
> Jun 25 13:16:22 ctr_qemu pacemaker-execd [1234] (operation_finished) debug: logging_status_1:8423 - exited with rc=0
> Jun 25 13:16:22 ctr_qemu pacemaker-execd [1234] (operation_finished) debug: logging_status_1:8423:stdout [ Remote syslog service is running ]
> Jun 25 13:16:22
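The stop failure Harvey traced to start-stop-daemon in his reply above is worth a concrete illustration: when the init script and the daemon are both named "logging", matching by `--name` kills both processes. Matching by pidfile is unambiguous. The path and function below are hypothetical, a sketch under those assumptions rather than the actual init script:

```shell
# Hypothetical excerpt of an LSB init script for a daemon that shares its
# name with the script. Stopping via a pidfile avoids the ambiguity of
# "start-stop-daemon --stop --name logging", which matches both processes.

PIDFILE=/var/run/logging.pid   # assumed path

stop_by_pidfile() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1          # nothing recorded: report failure
    kill "$(cat "$pidfile")" 2>/dev/null   # signal only the recorded daemon
    rm -f "$pidfile"
}

# The start-stop-daemon equivalent would be along the lines of:
#   start-stop-daemon --stop --pidfile "$PIDFILE" --oknodo
```

Either form returns success from the stop action without taking the init script itself down mid-run, which is what was producing the error code Pacemaker saw.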
Re: [ClusterLabs] Strange monitor return code log for LSB resource
Harvey,

On 25/06/19 21:26 +, Harvey Shepherd wrote:
> Thanks for your reply Andrei. There is no external monitoring
> software running. The logs that I posted are from the pacemaker log
> with debug enabled.

That's exactly what's questionable here -- how can a crm_resource invocation be correlated with the standard operation of the Pacemaker cluster at all, if you claim you don't have a clue what else (such as external monitoring built upon crm_resource or, by extension, upon crm/pcs) could be triggering it? That's where Andrei was zeroing in.

Anyway, to spare you a possibly heavy-handed inspection of the circumstances in your cluster, I'd suggest you install systemtap and run this command as root:

stap /usr/share/systemtap/examples/process/pstrace_exec.stp /logging

where pstrace_exec.stp can also be obtained from https://sourceware.org/systemtap/examples/process/pstrace_exec.stp if your installation (systemtap-client package in my case) doesn't carry it.

While it runs, you'll likely observe occurrences where the "logging" script is run directly by pacemaker-execd, plus some process chains finishing with crm_resource; these should suggest why that is being run at all (where it likely should not be; see also the remark on re-entrancy below).

> The Linux hostname of the node is "ctr_qemu", but the Pacemaker name
> is "primary". It is actually the same node.
>
> I've just run the same test again and interestingly this time the
> source of the monitor logs seems to be shared between "crm_resource"
> and "pacemaker-schedulerd"
>
> Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (recurring_action_timer) debug: Scheduling another invocation of logging_status_1
> Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (operation_finished) debug: logging_status_1:1181 - exited with rc=0
> Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (operation_finished) debug: logging_status_1:1181:stdout [ Remote syslog service is running ]
> Jun 25 21:18:31 ctr_qemu pacemaker-execd [1234] (log_finished) debug: finished - rsc:logging action:monitor call_id:127 pid:1181 exit-code:0 exec-time:0ms queue-time:0ms
> Jun 25 21:18:35 ctr_qemu crm_resource[1268] (determine_op_status) debug: logging_monitor_1 on primary returned 'not running' (7) instead of the expected value: 'ok' (0)
> Jun 25 21:18:35 ctr_qemu crm_resource[1268] (unpack_rsc_op_failure) warning: Processing failed monitor of logging on primary: not running | rc=7
> Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (determine_op_status) debug: logging_monitor_1 on primary returned 'not running' (7) instead of the expected value: 'ok' (0)
> Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (unpack_rsc_op_failure) warning: Processing failed monitor of logging on primary: not running | rc=7
> Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (common_print) info: logging (lsb:logging): Started primary
> Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (common_apply_stickiness) debug: Resource logging: preferring current location (node=primary, weight=1)
> Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (native_assign_node) debug: Assigning primary to logging
> Jun 25 21:18:36 ctr_qemu pacemaker-schedulerd[1236] (LogActions) info: Leave logging (Started primary)

another thing, actually:

> 25.06.2019 16:53, Harvey Shepherd wrote:
>> Pacemaker is
>> reporting failed resource actions, but fail-count is not
>> incremented for the resource:
>>
>> Migration Summary:
>> * Node primary:
>> * Node secondary:
>>
>> Failed Resource Actions:
>> * logging_monitor_1 on primary 'not running' (7): call=119,
>>   status=complete, exitreason='',
>>   last-rc-change='Tue Jun 25 13:13:12 2019', queued=0ms, exec=0ms

One has to wonder whether this can only be caused by something that _exclusively_ pacemaker-execd observed (though that could be a result of interference from a crm_resource-initiated operation ... with the --force-check parameter, or what else?); sadly, the log didn't contain the referenced time of "last-rc-change".

Finally, speaking of interferences, this topic was brought up recently, regarding re-entrancy in particular:

https://lists.clusterlabs.org/pipermail/users/2019-February/025454.html

Apparently, axioms of agent/resource execution that hold with Pacemaker alone won't hold when there are multiple actors, as seems to be your case (Pacemaker vs. an isolated crm_resource-based trigger).

Hope this helps

--
Jan (Poki)
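For readers chasing the same symptom: the kind of out-of-band invocation Jan suspects can be reproduced deliberately. This is a fragment to be run against a live cluster, not a standalone script; the resource name is the one from this thread.

```
# Runs the resource's monitor/status action directly, outside the
# cluster's scheduled recurring monitor -- the sort of crm_resource-driven
# check whose execution can overlap with pacemaker-execd's own operations:
crm_resource --resource logging --force-check
```

If a wrapper or monitoring script issues this (or a crmsh/pcs equivalent) on a timer, its agent executions can interleave with the cluster's own, which is the re-entrancy hazard Jan's linked thread discusses, and would match the mixed crm_resource / pacemaker-schedulerd sources in the logs above.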