Hi!

Obviously you violated the most important cluster rule, which is "be patient".
The next most important one is probably "Don't change the configuration while
the cluster is not in the IDLE state" ;-)


I feel these are issues that should be fixed, but the above rules make your
life easier while these issues still exist.
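
To make the second rule a bit more concrete, here is a rough sketch of how one
could wait for the DC to return to S_IDLE before changing the configuration
again. This is only an illustration, not something from the Pacemaker docs:
the helper names (is_idle, wait_until_idle) and the variables STATUS_CMD and
POLL_INTERVAL are made up for this example, and I am assuming that the command
you put into STATUS_CMD (for example "crmadmin -S" pointed at the DC) prints
the crmd state name somewhere in its output.

```shell
#!/bin/sh
# Sketch only: wait until the DC's crmd reports S_IDLE before touching the
# configuration again. STATUS_CMD is an assumption; set it to whatever prints
# the controller state on your cluster, e.g. "crmadmin -S <DC node name>".

# Succeeds if the status text on stdin mentions the S_IDLE state.
is_idle() {
    grep -q 'S_IDLE'
}

# wait_until_idle [tries]: poll STATUS_CMD every POLL_INTERVAL seconds
# (default 5) and return 0 once the cluster is idle, 1 if we give up.
wait_until_idle() {
    tries=${1:-60}
    while [ "$tries" -gt 0 ]; do
        if $STATUS_CMD 2>/dev/null | is_idle; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-5}"
    done
    return 1
}

# Intended use (not executed here):
#   STATUS_CMD="crmadmin -S vhbl03"
#   crm_resource --ban -r vm-conv-4
#   wait_until_idle && crm_resource --clear -r vm-conv-4
```

Note that polling like this is racy: right after the ban the DC may still
report S_IDLE for a moment before the new transition starts, so a short pause
before the first check (or watching the logs) is still advisable.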

Regards,
Ulrich

>>> Ferenc Wágner <wagner.fer...@kifu.gov.hu> wrote on 27.09.2018 at 08:37 in
message <87tvmb5ttw....@lant.ki.iif.hu>:
> Hi,
> 
> The current behavior of cancelled migration with Pacemaker 1.1.16 with a
> resource implementing push migration:
> 
> # /usr/sbin/crm_resource --ban -r vm-conv-4
> 
> vhbl03 crmd[10017]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
> vhbl03 pengine[10016]:   notice: Migrate vm-conv-4#011(Started vhbl07 -> vhbl04)
> vhbl03 crmd[10017]:   notice: Initiating migrate_to operation vm-conv-4_migrate_to_0 on vhbl07
> vhbl03 pengine[10016]:   notice: Calculated transition 4633, saving inputs in /var/lib/pacemaker/pengine/pe-input-1069.bz2
> [...]
> 
> At this point, with the migration still ongoing, I wanted to get rid of
> the constraint:
> 
> # /usr/sbin/crm_resource --clear -r vm-conv-4
> 
> vhbl03 crmd[10017]:   notice: Transition aborted by deletion of rsc_location[@id='cli-ban-vm-conv-4-on-vhbl07']: Configuration change
> vhbl07 crmd[10233]:   notice: Result of migrate_to operation for vm-conv-4 on vhbl07: 0 (ok)
> vhbl03 crmd[10017]:   notice: Transition 4633 (Complete=6, Pending=0, Fired=0, Skipped=1, Incomplete=6, Source=/var/lib/pacemaker/pengine/pe-input-1069.bz2): Stopped
> vhbl03 pengine[10016]:   notice: Resource vm-conv-4 can no longer migrate to vhbl04. Stopping on vhbl07 too
> vhbl03 pengine[10016]:   notice: Reload  vm-conv-4#011(Started vhbl07)
> vhbl03 pengine[10016]:   notice: Calculated transition 4634, saving inputs in /var/lib/pacemaker/pengine/pe-input-1070.bz2
> vhbl03 crmd[10017]:   notice: Initiating stop operation vm-conv-4_stop_0 on vhbl07
> vhbl03 crmd[10017]:   notice: Initiating stop operation vm-conv-4_stop_0 on vhbl04
> vhbl03 crmd[10017]:   notice: Initiating reload operation vm-conv-4_reload_0 on vhbl04
> 
> This recovery was entirely unnecessary, as the resource successfully
> migrated to vhbl04 (the migrate_from operation does nothing).  Pacemaker
> does not know this, but is there a way to educate it?  I think in this
> special case it is possible to redesign the agent making migrate_to a
> no-op and doing everything in migrate_from, which would significantly
> reduce the window between the start points of the two "halves", but I'm
> not sure that would help in the end: Pacemaker could still decide to do
> an unnecessary stop+start recovery.  Would it?  I failed to find any
> documentation on recovery from aborted migration transitions.  I don't
> expect on-fail (for migrate_* ops, not me) to apply here, does it?
> 
> Side question: why initiate a reload in any case, like above?
> 
> Even more side question: could you please consider using space instead
> of TAB in syslog messages?  (Actually, I wouldn't mind getting rid of
> them altogether in any output.)
> -- 
> Thanks,
> Feri
> _______________________________________________
> Users mailing list: Users@clusterlabs.org 
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 



