> Actually you would need the reduced stickiness just during the stop
> phase - right.

Oh, that is good to know.
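If I understood the idea correctly, the whole thing would look roughly like this (just a sketch - I am assuming here that pending stops show up as "* Stop" lines under "Transition Summary:" in the crm_simulate output, which may differ between versions):

crm configure rsc_defaults resource-stickiness=50
# wait until the computed transition no longer contains stop actions
while crm_simulate -Ls | sed -n '/^Transition Summary:/,$p' | grep -qw 'Stop'; do
    sleep 5   # a shorter poll cycle would put more load on pacemaker
done
crm configure rsc_defaults resource-stickiness=150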
While I can reduce the waiting time to just the "stop" commands finishing, I don't think this is worth it, because it doesn't fully address my problem.

Does that mean that the reality is cruel, and there is no way to tell Pacemaker "here are these two commands, execute them sequentially"? It is all about usability for the end user.

As a last resort I was thinking about not providing this "do a fail-back" one-shot button to the user, but instead providing a "fail-back ON/OFF" switch-button with some kind of indicator, "resources are placed optimally".
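That indicator could probably be built on the same crm_simulate trick (again only a sketch, with the same assumption about the output format as above - an empty transition summary would mean "placed optimally"):

# "resources are placed optimally" == the computed transition is empty
if crm_simulate -Ls | sed -n '/^Transition Summary:/,$p' | grep -q '^ \* '; then
    echo "resources are NOT placed optimally"
else
    echo "resources are placed optimally"
fi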
Anyway, maybe there are still some other ideas? I really want this "one shot fail-back" to be a rock-solid solution, and maybe I am missing something here =) Or maybe it can be a feature request =)

Thank you,
Kostia

On Wed, Nov 9, 2016 at 6:42 PM, Klaus Wenninger <[email protected]> wrote:

> On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> > When one problem seems to be solved, another one appears.
> > Now my script looks this way:
> >
> > crm --wait configure rsc_defaults resource-stickiness=50
> > crm configure rsc_defaults resource-stickiness=150
> >
> > While I am now sure that transitions caused by the first command
> > won't be aborted, I see another possible problem here.
> > With minimal load in the cluster it took 22 seconds for this script
> > to finish.
> > I see a weakness here.
> > If the node on which this script is called goes down for any reason,
> > then "resource-stickiness" is not set back to its original value,
> > which is very bad.
> >
> > So now I am thinking about how to solve this problem. I would
> > appreciate any thoughts about this.
> >
> > Is there a way to ask Pacemaker to execute these commands
> > sequentially, so there is no need to wait in the script?
> > If that is possible, then I think my concern from above goes away.
> >
> > Another thing that comes to my mind is to use time-based rules.
> > This way, when I need to do a manual fail-back, I simply set (or
> > update) a time-based rule from the script.
> > The rule would basically say: set "resource-stickiness" to 50 right
> > now, and expire in 10 min.
> > This looks good at first glance, but there is no reliable way to
> > pick a minimum sufficient time for it; at least none that I am
> > aware of.
> > And the thing is, it is important to me that "resource-stickiness"
> > is set back to its original value as soon as possible.
> >
> > Those are my thoughts. As I said, I appreciate any ideas here.
>
> I have never tried --wait with crmsh, but I would guess that the delay
> you are observing is really the time your resources take to stop and
> start somewhere else.
>
> Actually you would need the reduced stickiness just during the stop
> phase - right.
>
> So, as there is no command like "wait till all stops are done", you
> could still do the 'crm_simulate -Ls' and check that it doesn't want
> to stop anything anymore.
> That way you save the time the starts would take.
> Unfortunately you have to repeat that, and thus put additional load
> on pacemaker, possibly slowing things down if your poll cycle is too
> short.
>
> > Thank you,
> > Kostia
> >
> > On Tue, Nov 8, 2016 at 10:19 PM, Dejan Muhamedagic
> > <[email protected]> wrote:
> >
> > On Tue, Nov 08, 2016 at 12:54:10PM +0100, Klaus Wenninger wrote:
> > > On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote:
> > > > Hi,
> > > >
> > > > I need a way to do a manual fail-back on demand.
> > > > To be clear, I don't want it to be ON/OFF; I want it to be more
> > > > like "one shot".
> > > > So far I have found that the most reasonable way to do it is to
> > > > set "resource-stickiness" to a different value, and then set it
> > > > back to what it was.
> > > > To do that I created a simple script with two lines:
> > > >
> > > > crm configure rsc_defaults resource-stickiness=50
> > > > crm configure rsc_defaults resource-stickiness=150
> > > >
> > > > There are no timeouts before setting the original value back.
> > > > If I call this script, I get what I want - Pacemaker moves
> > > > resources to their preferred locations, and "resource-stickiness"
> > > > is set back to its original value.
> > > >
> > > > Although it works, I still have a few concerns about this
> > > > approach.
> > > > Will I get the same behavior under a big load with delays on
> > > > systems in the cluster (which is truly possible and a normal
> > > > case in my environment)?
> > > > How does Pacemaker treat a fast change of this parameter?
> > > > I am worried that if "resource-stickiness" is set back to its
> > > > original value too fast, then no fail-back will happen. Is that
> > > > possible, or shouldn't I worry about it?
> > >
> > > AFAIK the pengine is interrupted while calculating a more
> > > complicated transition if the situation has changed, and a
> > > transition that is just being executed is aborted if the input
> > > from the pengine has changed.
> > > So I would definitely worry!
> > > What you could do is issue 'crm_simulate -Ls' in between and grep
> > > for an empty transition.
> > > There might be more elegant ways, but that should be safe.
> >
> > crmsh has an option (-w) to wait for the PE to settle after
> > committing configuration changes.
> >
> > Thanks,
> >
> > Dejan
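P.S. For completeness, the time-based rule variant from my previous mail would probably look something like this in raw XML (an untested sketch - the ids, the dates, and the "score" values are made up by me, and I am not sure an explicit score is even needed for this set to win over the normal rsc_defaults set):

# add an expiring meta_attributes set to rsc_defaults;
# once the window has passed the rule no longer matches, so the lowered
# stickiness stops applying even if the node running the script died
cibadmin --create --scope rsc_defaults --xml-text \
  '<meta_attributes id="failback-stickiness" score="1">
     <rule id="failback-window" score="INFINITY">
       <date_expression id="failback-window-expr" operation="in_range"
         start="2016-11-09T18:00:00" end="2016-11-09T18:10:00"/>
     </rule>
     <nvpair id="failback-stickiness-nv" name="resource-stickiness" value="50"/>
   </meta_attributes>'

The nice part would be that nothing has to be rolled back by hand: the set simply stops matching after the end date.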
