Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-09 Thread Kostiantyn Ponomarenko
>> Actually you would need the reduced stickiness just during the stop
>> phase, right?
Oh, that is good to know.

While I could shorten the wait by waiting only for the "stop" actions to
finish, I don't think it is worth it, because it doesn't fully address my
problem.

Does that mean that reality is cruel, and there is no way to tell
Pacemaker: here are these two commands, execute them sequentially?

It is all about usability for the end user.
As a last resort, I was thinking about not giving the user this one-shot
"do a fail-back" button, and instead providing a "fail-back ON/OFF" switch
with some kind of "resources are placed optimally" indicator.

Anyway, maybe there are still some other ideas?
I really want this "one-shot fail-back" to be a rock-solid solution, and
maybe I am missing something here =)
Or maybe it can be a feature request =)


Thank you,
Kostia

On Wed, Nov 9, 2016 at 6:42 PM, Klaus Wenninger wrote:

> On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> > When one problem seems to be solved, another one appears.
> > Now my script looks this way:
> >
> > crm --wait configure rsc_defaults resource-stickiness=50
> > crm configure rsc_defaults resource-stickiness=150
> >
> > While I am now sure that the transitions caused by the first command
> > won't be aborted, I see another possible problem here.
> > With minimal load in the cluster, it took 22 sec for this script to
> > finish.
> > I see a weakness here.
> > If the node on which this script is called goes down for any reason,
> > then "resource-stickiness" is not set back to its original value,
> > which is very bad.
> >
> > So now I am thinking about how to solve this problem. I would
> > appreciate any thoughts about this.
> >
> > Is there a way to ask Pacemaker to do these commands sequentially so
> > there is no need to wait in the script?
> > If that is possible, then I think my concern above goes away.
> >
> > Another thing that comes to mind is to use time-based rules.
> > This way, when I need to do a manual fail-back, I simply set (or
> > update) a time-based rule from the script.
> > The rule would basically say: set "resource-stickiness" to 50
> > right now and expire in 10 min.
> > This looks good at first glance, but there is no reliable way to
> > choose the minimum sufficient time for it, at least none that I am
> > aware of.
> > And it is important to me that "resource-stickiness" is set back to
> > its original value as soon as possible.
> >
> > Those are my thoughts. As I said, I appreciate any ideas here.
>
> I have never tried --wait with crmsh, but I would guess that the delay
> you are observing is really the time your resources take to stop and
> start somewhere else.
>
> Actually you would need the reduced stickiness just during the stop
> phase, right?
>
> So, as there is no command like "wait till all stops are done", you
> could still run 'crm_simulate -Ls' and check that it doesn't want to
> stop anything anymore.
> That way you save the time the starts would take.
> Unfortunately, you have to repeat that, which puts additional load on
> Pacemaker and may slow things down if your poll cycle is too short.

Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-09 Thread Klaus Wenninger
On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> When one problem seems to be solved, another one appears.
> Now my script looks this way:
>
> crm --wait configure rsc_defaults resource-stickiness=50
> crm configure rsc_defaults resource-stickiness=150
>
> While I am now sure that the transitions caused by the first command
> won't be aborted, I see another possible problem here.
> With minimal load in the cluster, it took 22 sec for this script to
> finish.
> I see a weakness here.
> If the node on which this script is called goes down for any reason,
> then "resource-stickiness" is not set back to its original value,
> which is very bad.
>
> So now I am thinking about how to solve this problem. I would
> appreciate any thoughts about this.
>
> Is there a way to ask Pacemaker to do these commands sequentially so
> there is no need to wait in the script?
> If that is possible, then I think my concern above goes away.
>
> Another thing that comes to mind is to use time-based rules.
> This way, when I need to do a manual fail-back, I simply set (or
> update) a time-based rule from the script.
> The rule would basically say: set "resource-stickiness" to 50
> right now and expire in 10 min.
> This looks good at first glance, but there is no reliable way to
> choose the minimum sufficient time for it, at least none that I am
> aware of.
> And it is important to me that "resource-stickiness" is set back to
> its original value as soon as possible.
>
> Those are my thoughts. As I said, I appreciate any ideas here.

I have never tried --wait with crmsh, but I would guess that the delay
you are observing is really the time your resources take to stop and
start somewhere else.

Actually you would need the reduced stickiness just during the stop
phase, right?

So, as there is no command like "wait till all stops are done", you
could still run 'crm_simulate -Ls' and check that it doesn't want to
stop anything anymore.
That way you save the time the starts would take.
Unfortunately, you have to repeat that, which puts additional load on
Pacemaker and may slow things down if your poll cycle is too short.
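
A rough sketch of such a "stops only" check, assuming that 'crm_simulate
-Ls' prints a "Transition Summary:" header followed by one line per
pending action of the form " * Stop ...", " * Move ..." and so on. The
function names, the list of action words, the retry count and the
interval are made up for illustration; please check them against the
output of your Pacemaker version:

```shell
#!/bin/sh
# Count pending actions that involve stopping a resource.
# Assumption: actions are listed after "Transition Summary:", one per line,
# as " * <Action> <resource> ...". The action words are a guess.
count_pending_stops() {
    awk '
        /^Transition Summary:/ { in_summary = 1; next }
        in_summary && /^[ \t]*\*[ \t]*(Stop|Move|Migrate|Restart)[ \t]/ { n++ }
        END { print n + 0 }
    '
}

# Poll until no stop-like action is pending any more.
# $1 = maximum number of polls, $2 = seconds between polls (both hypothetical).
wait_for_stops_done() {
    tries=${1:-60}
    interval=${2:-2}
    while [ "$tries" -gt 0 ]; do
        # Give up with an error if crm_simulate itself fails.
        out=$(crm_simulate -Ls) || return 2
        if [ "$(printf '%s\n' "$out" | count_pending_stops)" -eq 0 ]; then
            return 0
        fi
        sleep "$interval"
        tries=$((tries - 1))
    done
    return 1
}
```

A longer interval keeps the extra load on Pacemaker down, at the price of
restoring the stickiness a little later.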


Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-09 Thread Kostiantyn Ponomarenko
When one problem seems to be solved, another one appears.
Now my script looks this way:

crm --wait configure rsc_defaults resource-stickiness=50
crm configure rsc_defaults resource-stickiness=150

While I am now sure that the transitions caused by the first command won't
be aborted, I see another possible problem here.
With minimal load in the cluster, it took 22 sec for this script to
finish.
I see a weakness here.
If the node on which this script is called goes down for any reason, then
"resource-stickiness" is not set back to its original value, which is very
bad.

So now I am thinking about how to solve this problem. I would appreciate
any thoughts about this.

Is there a way to ask Pacemaker to do these commands sequentially so there
is no need to wait in the script?
If that is possible, then I think my concern above goes away.

Another thing that comes to mind is to use time-based rules.
This way, when I need to do a manual fail-back, I simply set (or update) a
time-based rule from the script.
The rule would basically say: set "resource-stickiness" to 50 right now
and expire in 10 min.
This looks good at first glance, but there is no reliable way to choose
the minimum sufficient time for it, at least none that I am aware of.
And it is important to me that "resource-stickiness" is set back to its
original value as soon as possible.
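
To make the idea concrete, here is a sketch of what such a rule could
look like, assuming Pacemaker's rule syntax for rsc_defaults (a
meta_attributes block containing a rule with a date_expression). The
function name, the ids, the score and the end time are made up for
illustration, and the block would still have to be loaded into the
rsc_defaults section of the CIB (for example with cibadmin). As far as I
know, time-based rules are only re-evaluated at "cluster-recheck-interval",
so the expiry may take effect later than the given time:

```shell
#!/bin/sh
# Print a time-limited rsc_defaults block (hypothetical ids and score).
# $1 = end of the fail-back window, e.g. 2016-11-09T18:40:00
# $2 = temporary resource-stickiness value (defaults to 50)
failback_rule_xml() {
    end=$1
    low=${2:-50}
    cat <<EOF
<meta_attributes id="failback-window" score="2">
  <rule id="failback-window-rule" score="INFINITY">
    <date_expression id="failback-window-until" operation="lt" end="$end"/>
  </rule>
  <nvpair id="failback-window-stickiness" name="resource-stickiness" value="$low"/>
</meta_attributes>
EOF
}
```

For example, 'failback_rule_xml 2016-11-09T18:40:00 50' prints a block
that applies a stickiness of 50 until 18:40. It would need a higher score
than the block holding the normal value of 150 so that it takes
precedence while the rule is true.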

Those are my thoughts. As I said, I appreciate any ideas here.


Thank you,
Kostia

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-08 Thread Dejan Muhamedagic
On Tue, Nov 08, 2016 at 12:54:10PM +0100, Klaus Wenninger wrote:
> On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote:
> > Hi,
> >
> > I need a way to do a manual fail-back on demand.
> > To be clear, I don't want it to be ON/OFF; I want it to be more like
> > "one shot".
> > So far, the most reasonable way I have found is to set
> > "resource-stickiness" to a different value, and then set it back to
> > what it was.
> > To do that I created a simple script with two lines:
> >
> > crm configure rsc_defaults resource-stickiness=50
> > crm configure rsc_defaults resource-stickiness=150
> >
> > There are no timeouts before setting the original value back.
> > If I call this script, I get what I want - Pacemaker moves resources
> > to their preferred locations, and "resource stickiness" is set back to
> > its original value. 
> >
> > Although it works, I still have a few concerns about this approach.
> > Will I get the same behavior under heavy load, with delays on the
> > systems in the cluster (which is entirely possible and a normal case
> > in my environment)?
> > How does Pacemaker treat a fast change of this parameter?
> > I am worried that if "resource-stickiness" is set back to its original
> > value too fast, then no fail-back will happen. Is that possible, or
> > shouldn't I worry about it?
> 
> AFAIK pengine is interrupted while calculating a more complicated
> transition if the situation has changed, and a transition that is
> already being executed is aborted if the input from pengine changes.
> So I would definitely worry!
> What you could do is issue 'crm_simulate -Ls' in between and grep for
> an empty transition.
> There might be more elegant ways, but that should be safe.

crmsh has an option (-w) to wait for the PE to settle after
committing configuration changes.
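
A minimal sketch of how the two-line script could use it. The function
name and the CRM variable are made up for illustration (CRM only exists so
that the script can be dry-run with "echo" instead of the real crm):

```shell
#!/bin/sh
# Command used to talk to the cluster; override for a dry run, e.g. CRM=echo.
CRM=${CRM:-crm}

# Lower the stickiness, wait for the PE to settle, then restore it.
# $1 = temporary stickiness, $2 = original stickiness
one_shot_failback() {
    low=$1
    orig=$2
    # --wait (-w) makes crm return only after the resulting transition is done.
    "$CRM" --wait configure rsc_defaults resource-stickiness="$low"
    rc=$?
    # Restore the original value even if the first command failed.
    "$CRM" configure rsc_defaults resource-stickiness="$orig" || rc=$?
    return "$rc"
}
```

Called as 'one_shot_failback 50 150'. Note that this still leaves the
stickiness lowered if the node running the script dies between the two
commands.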

Thanks,

Dejan


Re: [ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-08 Thread Klaus Wenninger
On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote:
> Hi,
>
> I need a way to do a manual fail-back on demand.
> To be clear, I don't want it to be ON/OFF; I want it to be more like
> "one shot".
> So far, the most reasonable way I have found is to set
> "resource-stickiness" to a different value, and then set it back to
> what it was.
> To do that I created a simple script with two lines:
>
> crm configure rsc_defaults resource-stickiness=50
> crm configure rsc_defaults resource-stickiness=150
>
> There are no timeouts before setting the original value back.
> If I call this script, I get what I want - Pacemaker moves resources
> to their preferred locations, and "resource stickiness" is set back to
> its original value. 
>
> Although it works, I still have a few concerns about this approach.
> Will I get the same behavior under heavy load, with delays on the
> systems in the cluster (which is entirely possible and a normal case
> in my environment)?
> How does Pacemaker treat a fast change of this parameter?
> I am worried that if "resource-stickiness" is set back to its original
> value too fast, then no fail-back will happen. Is that possible, or
> shouldn't I worry about it?

AFAIK pengine is interrupted while calculating a more complicated
transition if the situation has changed, and a transition that is already
being executed is aborted if the input from pengine changes.
So I would definitely worry!
What you could do is issue 'crm_simulate -Ls' in between and grep for
an empty transition.
There might be more elegant ways, but that should be safe.
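
Roughly like this, as an untested sketch. It assumes that 'crm_simulate
-Ls' prints a "Transition Summary:" header followed by one line per
pending action starting with " * ", so an empty transition means no such
lines. The function names, the retry count and the interval are made up
for illustration:

```shell
#!/bin/sh
# Count the actions listed after the "Transition Summary:" header.
# Assumption: each pending action is on its own line starting with " * ".
count_pending_actions() {
    awk '
        /^Transition Summary:/ { in_summary = 1; next }
        in_summary && /^[ \t]*\*/ { n++ }
        END { print n + 0 }
    '
}

# Poll until the transition is empty.
# $1 = maximum number of polls, $2 = seconds between polls (both hypothetical).
wait_for_empty_transition() {
    tries=${1:-60}
    interval=${2:-2}
    while [ "$tries" -gt 0 ]; do
        # Do not mistake a failed crm_simulate call for an empty transition.
        out=$(crm_simulate -Ls) || return 2
        if [ "$(printf '%s\n' "$out" | count_pending_actions)" -eq 0 ]; then
            return 0
        fi
        sleep "$interval"
        tries=$((tries - 1))
    done
    return 1
}
```

A call like 'wait_for_empty_transition 60 2' would then go between the two
'crm configure' lines of the script.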



[ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

2016-11-08 Thread Kostiantyn Ponomarenko
Hi,

I need a way to do a manual fail-back on demand.
To be clear, I don't want it to be ON/OFF; I want it to be more like "one
shot".
So far, the most reasonable way I have found is to set
"resource-stickiness" to a different value, and then set it back to what
it was.
To do that I created a simple script with two lines:

crm configure rsc_defaults resource-stickiness=50
crm configure rsc_defaults resource-stickiness=150

There are no timeouts before setting the original value back.
If I call this script, I get what I want - Pacemaker moves resources to
their preferred locations, and "resource stickiness" is set back to its
original value.

Although it works, I still have a few concerns about this approach.
Will I get the same behavior under heavy load, with delays on the systems
in the cluster (which is entirely possible and a normal case in my
environment)?
How does Pacemaker treat a fast change of this parameter?
I am worried that if "resource-stickiness" is set back to its original
value too fast, then no fail-back will happen. Is that possible, or
shouldn't I worry about it?

Thank you,
Kostia