Re: [ClusterLabs] Antw: Salvaging aborted resource migration

2018-09-27 Thread Ken Gaillot
On Thu, 2018-09-27 at 18:00 +0200, Ferenc Wágner wrote:
> Ken Gaillot  writes:
> 
> > On Thu, 2018-09-27 at 09:36 +0200, Ulrich Windl wrote:
> > 
> > > Obviously you violated the most important cluster rule that is
> > > "be patient".  Maybe the next important is "Don't change the
> > > configuration while the cluster is not in IDLE state" ;-)
> > 
> > Agreed -- although even idle, removing a ban can result in a
> > migration back (if something like stickiness doesn't prevent it).
> 
> I've got no problem with that in general.  However, I can't guarantee
> that every configuration change happens in idle state, certain
> operations (mostly resource additions) are done by several
> administrators without synchronization, and of course asynchronous
> cluster events can also happen any time.  So I have to ask: what are
> the consequences of breaking this "impossible" rule?

It's not truly a rule, just a "better safe than sorry" approach. In
general the cluster is very forgiving of frequent config changes
from any node. The only non-obvious consequence is that if the cluster
is still making changes based on the previous config, it will wait
for any action already in progress to complete, then abandon the rest
of that transition and recalculate based on the new config, which
might reverse actions just taken.
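(An aside from the editor, not part of Ken's answer: one way to keep several admins' edits from repeatedly aborting transitions is to batch them through a shadow CIB and commit once, so the scheduler sees a single configuration change. A rough sketch; the shadow name "batch1" is arbitrary, and the exact crm_shadow flags should be checked against your Pacemaker version:)

```shell
# Stage several CIB edits in a shadow copy, commit them atomically.
crm_shadow --create batch1 --batch   # copy the live CIB; --batch avoids a subshell
export CIB_shadow=batch1             # CIB tools now edit the shadow, not the cluster
crm_resource --clear -r vm-conv-4    # example edit; affects only the shadow
# ... further crm_resource/cibadmin edits against the shadow ...
crm_shadow --commit batch1 --batch   # push all changes to the live CIB at once
unset CIB_shadow
```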

> > There's currently no way to tell pacemaker that an operation (i.e.
> > migrate_from) is a no-op and can be ignored. If a migration is only
> > partially completed, it has to be considered a failure and
> > reverted.
> 
> OK.  Are there other complex operations which can "partially
> complete" if a transition is aborted by some event?

I believe migration is the only one in that category. Perhaps a restart
could be considered similar, as it involves a separate stop and start,
but a completed stop doesn't have to be reversed in that case, so that
wouldn't cause any similar issues.

> Now let's suppose a pull migration scenario: migrate_to does nothing,
> but in this tiny window a configuration change aborts the transition.
> The resources would go through a full recovery (stop+start), right?

Yes

> Now let's suppose migrate_from gets scheduled and starts performing
> the migration.  Before it finishes, a configuration change aborts the
> transition.  The cluster waits for the outstanding operation to
> finish, doesn't it?  And if it finishes successfully, is the
> migration considered complete, requiring no recovery?

Correct. If an agent has actually been executed, the cluster will wait
for that operation to complete or timeout before recalculating.

(As an aside, that can cause problems of a different sort: if an
operation in progress has a very long timeout and takes that whole
time, it can delay recovery of other resources that newly fail, even if
their recovery would not depend on the outcome of that operation.
That's a complicated problem to solve because that last clause is not
obvious to a computer program without simulating all possible results,
and even then, it can't be sure that the operation won't do something
like change a node attribute that might affect other resources.)
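(Editor's note, not from the thread: one practical mitigation for that delay is to keep migration operation timeouts as tight as the workload allows, which bounds how long an in-flight migration can stall unrelated recovery. Per-operation timeouts live in the CIB, roughly like this; the ids, provider, and timeout values below are illustrative only:)

```xml
<primitive id="vm-conv-4" class="ocf" provider="custom" type="VMigrate">
  <operations>
    <!-- size timeouts to the real worst-case migration, no larger -->
    <op id="vm-conv-4-migrate_to" name="migrate_to" interval="0" timeout="120s"/>
    <op id="vm-conv-4-migrate_from" name="migrate_from" interval="0" timeout="120s"/>
  </operations>
</primitive>
```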

> > I'm not sure why the reload was scheduled; I suspect it's a bug due
> > to a restart being needed but no parameters having changed. There
> > should be special handling for a partial migration to make the stop
> > required.
> 
> Probably CLBZ#5309 again...  You debugged a pe-input file for me with
> a similar issue almost exactly a year ago (thread subject "Pacemaker
> resource parameter reload confusion").  Time to upgrade this cluster,
> I guess.
-- 
Ken Gaillot 
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Salvaging aborted resource migration

2018-09-27 Thread Ferenc Wágner
Ken Gaillot  writes:

> On Thu, 2018-09-27 at 09:36 +0200, Ulrich Windl wrote:
> 
>> Obviously you violated the most important cluster rule that is "be
>> patient".  Maybe the next important is "Don't change the
>> configuration while the cluster is not in IDLE state" ;-)
>
> Agreed -- although even idle, removing a ban can result in a migration
> back (if something like stickiness doesn't prevent it).

I've got no problem with that in general.  However, I can't guarantee
that every configuration change happens in idle state, certain
operations (mostly resource additions) are done by several
administrators without synchronization, and of course asynchronous
cluster events can also happen any time.  So I have to ask: what are the
consequences of breaking this "impossible" rule?

> There's currently no way to tell pacemaker that an operation (i.e.
> migrate_from) is a no-op and can be ignored. If a migration is only
> partially completed, it has to be considered a failure and reverted.

OK.  Are there other complex operations which can "partially complete"
if a transition is aborted by some event?

Now let's suppose a pull migration scenario: migrate_to does nothing,
but in this tiny window a configuration change aborts the transition.
The resources would go through a full recovery (stop+start), right?
Now let's suppose migrate_from gets scheduled and starts performing the
migration.  Before it finishes, a configuration change aborts the
transition.  The cluster waits for the outstanding operation to finish,
doesn't it?  And if it finishes successfully, is the migration
considered complete, requiring no recovery?

> I'm not sure why the reload was scheduled; I suspect it's a bug due to
> a restart being needed but no parameters having changed. There should
> be special handling for a partial migration to make the stop required.

Probably CLBZ#5309 again...  You debugged a pe-input file for me with a
similar issue almost exactly a year ago (thread subject "Pacemaker
resource parameter reload confusion").  Time to upgrade this cluster, I
guess.
-- 
Thanks,
Feri


Re: [ClusterLabs] Antw: Salvaging aborted resource migration

2018-09-27 Thread Ken Gaillot
On Thu, 2018-09-27 at 09:36 +0200, Ulrich Windl wrote:
> Hi!
> 
> Obviously you violated the most important cluster rule that is "be
> patient".  Maybe the next important is "Don't change the
> configuration while the cluster is not in IDLE state" ;-)

Agreed -- although even idle, removing a ban can result in a migration
back (if something like stickiness doesn't prevent it).
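(Editor's note: for anyone wanting a concrete hedge against that surprise, giving the resource some stickiness before clearing the ban keeps it where it is. A sketch; the value 100 is arbitrary:)

```shell
# Set resource-stickiness (a meta attribute) so that clearing a ban
# does not migrate the resource back to its former node.
crm_resource --resource vm-conv-4 --set-parameter resource-stickiness \
             --meta --parameter-value 100
crm_resource --clear -r vm-conv-4   # ban removal no longer triggers a move-back
```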

There's currently no way to tell pacemaker that an operation (i.e.
migrate_from) is a no-op and can be ignored. If a migration is only
partially completed, it has to be considered a failure and reverted.

I'm not sure why the reload was scheduled; I suspect it's a bug due to
a restart being needed but no parameters having changed. There should
be special handling for a partial migration to make the stop required.
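(Editor's note: the "everything in migrate_from" shape Ferenc proposes later in the thread would look roughly like this in an OCF agent's action dispatcher. This is a sketch only; pull_vm_image and start_vm are hypothetical helpers, and a real agent needs metadata, validation, and real lifecycle actions:)

```shell
#!/bin/sh
# Pull-style migration sketch: migrate_to is a deliberate no-op and all
# work happens in migrate_from, shrinking the window in which an aborted
# transition leaves a half-finished migration on the source node.
OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3

vm_agent() {
    case "$1" in
        migrate_to)
            # Push side: nothing to do; the target node does the pulling.
            return $OCF_SUCCESS ;;
        migrate_from)
            # Pull side: fetch state from the source, then start locally,
            # e.g.: pull_vm_image "$OCF_RESKEY_name" && start_vm "$OCF_RESKEY_name"
            return $OCF_SUCCESS ;;
        start|stop|monitor)
            # Normal lifecycle actions would go here.
            return $OCF_SUCCESS ;;
        *)
            return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

vm_agent "${1:-monitor}"
```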

> I feel these are issues that should be fixed, but the above rules
> make your life easier while these issues still exist.
> 
> Regards,
> Ulrich
> 
> > > > Ferenc Wágner  wrote on 27.09.2018 at 08:37
> > > > in message <87tvmb5ttw@lant.ki.iif.hu>:
> > Hi,
> > 
> > The current behavior of cancelled migration with Pacemaker 1.1.16
> > with a resource implementing push migration:
> > 
> > # /usr/sbin/crm_resource --ban -r vm-conv-4
> > 
> > vhbl03 crmd[10017]:   notice: State transition S_IDLE ->
> > S_POLICY_ENGINE
> > vhbl03 pengine[10016]:   notice: Migrate vm-conv-4#011(Started
> > vhbl07 -> vhbl04)
> > vhbl03 crmd[10017]:   notice: Initiating migrate_to operation
> > vm-conv-4_migrate_to_0 on vhbl07
> > vhbl03 pengine[10016]:   notice: Calculated transition 4633, saving
> > inputs in /var/lib/pacemaker/pengine/pe-input-1069.bz2
> > [...]
> > 
> > At this point, with the migration still ongoing, I wanted to get
> > rid of the constraint:
> > 
> > # /usr/sbin/crm_resource --clear -r vm-conv-4
> > 
> > vhbl03 crmd[10017]:   notice: Transition aborted by deletion of
> > rsc_location[@id='cli-ban-vm-conv-4-on-vhbl07']: Configuration change
> > vhbl07 crmd[10233]:   notice: Result of migrate_to operation for
> > vm-conv-4 on vhbl07: 0 (ok)
> > vhbl03 crmd[10017]:   notice: Transition 4633 (Complete=6, Pending=0,
> > Fired=0, Skipped=1, Incomplete=6,
> > Source=/var/lib/pacemaker/pengine/pe-input-1069.bz2): Stopped
> > vhbl03 pengine[10016]:   notice: Resource vm-conv-4 can no longer
> > migrate to vhbl04. Stopping on vhbl07 too
> > vhbl03 pengine[10016]:   notice: Reload  vm-conv-4#011(Started vhbl07)
> > vhbl03 pengine[10016]:   notice: Calculated transition 4634, saving
> > inputs in /var/lib/pacemaker/pengine/pe-input-1070.bz2
> > vhbl03 crmd[10017]:   notice: Initiating stop operation
> > vm-conv-4_stop_0 on vhbl07
> > vhbl03 crmd[10017]:   notice: Initiating stop operation
> > vm-conv-4_stop_0 on vhbl04
> > vhbl03 crmd[10017]:   notice: Initiating reload operation
> > vm-conv-4_reload_0 on vhbl04
> > 
> > This recovery was entirely unnecessary, as the resource
> > successfully migrated to vhbl04 (the migrate_from operation does
> > nothing).  Pacemaker does not know this, but is there a way to
> > educate it?  I think in this special case it is possible to
> > redesign the agent making migrate_to a no-op and doing everything
> > in migrate_from, which would significantly reduce the window
> > between the start points of the two "halves", but I'm not sure
> > that would help in the end: Pacemaker could still decide to do an
> > unnecessary stop+start recovery.  Would it?  I failed to find any
> > documentation on recovery from aborted migration transitions.  I
> > don't expect on-fail (for migrate_* ops, not me) to apply here,
> > does it?
> > 
> > Side question: why initiate a reload in any case, like above?
> > 
> > Even more side question: could you please consider using space
> > instead of TAB in syslog messages?  (Actually, I wouldn't mind
> > getting rid of them altogether in any output.)
> > -- 
> > Thanks,
> > Feri
-- 
Ken Gaillot 