On Tue, 2020-08-18 at 16:47 +0200, Lentes, Bernd wrote:
>
> ----- On Aug 17, 2020, at 5:09 PM, kgaillot [email protected] wrote:
>
> > > I checked all relevant pe-files in this time period.
> > > This is what i found out (i just write the important entries):
> > >
> > > Executing cluster transition:
> > >  * Resource action: vm_nextcloud stop on ha-idg-2
> > > Revised cluster status:
> > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped
> > >
> > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-3118 -G transition-4516.xml -D transition-4516.dot
> > > Current cluster status:
> > > Node ha-idg-1 (1084777482): standby
> > > Online: [ ha-idg-2 ]
> > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <============== vm_nextcloud is stopped
> > > Transition Summary:
> > >  * Shutdown ha-idg-1
> > > Executing cluster transition:
> > >  * Resource action: vm_nextcloud stop on ha-idg-1 <==== why stop ? It is already stopped
> >
> > I'm not sure, I'd have to see the pe input.
>
> You find it here:
> https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29

This appears to be a scheduler bug.

The scheduler considers a migration to be "dangling" if it has a record
of a failed migrate_to on the source node, but no migrate_from on the
target node (and no migrate_from or start on the source node, which
would indicate a later full restart or reverse migration).

In this case, any migrate_from on the target has since been superseded
by a failed start and a successful stop, so there is no longer a record
of it. Therefore the migration is considered dangling, which requires a
full stop on the source node.

However, in this case we already have a successful stop on the source
node after the failed migrate_to, and I believe that should be
sufficient to consider it no longer dangling.

> > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <======= vm_nextcloud is stopped
> > > Transition Summary:
> > >  * Fence (Off) ha-idg-1 'resource actions are unrunnable'
> > > Executing cluster transition:
> > >  * Fencing ha-idg-1 (Off)
> > >  * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It is already stopped ?
> > > Revised cluster status:
> > > Node ha-idg-1 (1084777482): OFFLINE (standby)
> > > Online: [ ha-idg-2 ]
> > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped
> > >
> > > I don't understand why the cluster tries to stop a resource which is
> > > already stopped.
>
> Bernd
> Helmholtz Zentrum München

--
Ken Gaillot <[email protected]>
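
For illustration only, the dangling-migration rule described in the reply above can be sketched as a small Python model. This is not Pacemaker's scheduler code. The `Op` record, the `is_dangling` function and the `stop_clears` switch are hypothetical names made up for this sketch, and the example history (including the assumption that ha-idg-1 is the migration source and ha-idg-2 the target) is a guess at the situation described, not data taken from the pe input.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    """One recorded resource operation (hypothetical, simplified record)."""
    node: str
    task: str   # "migrate_to", "migrate_from", "start" or "stop"
    ok: bool    # True if the operation succeeded


def is_dangling(history: List[Op], source: str, target: str,
                stop_clears: bool = False) -> bool:
    """Return True if the migration source -> target looks "dangling".

    history holds the operation records that still exist, oldest first.
    stop_clears=True models the suggested behaviour: a successful stop on
    the source after the failed migrate_to ends the dangling state.
    """
    # Find the most recent failed migrate_to on the source node.
    failed_at = None
    for i, op in enumerate(history):
        if op.node == source and op.task == "migrate_to" and not op.ok:
            failed_at = i
    if failed_at is None:
        return False

    # A recorded migrate_from on the target means it is not dangling.
    if any(op.node == target and op.task == "migrate_from" for op in history):
        return False

    later = history[failed_at + 1:]

    # A later migrate_from or start on the source indicates a full
    # restart or a reverse migration.
    if any(op.node == source and op.task in ("migrate_from", "start")
           for op in later):
        return False

    # Suggested extra condition: a later successful stop on the source.
    if stop_clears and any(op.node == source and op.task == "stop" and op.ok
                           for op in later):
        return False

    return True


# Assumed history: the migrate_from record on the target is gone because
# it was superseded by a failed start and a successful stop there.
history = [
    Op("ha-idg-1", "migrate_to", ok=False),
    Op("ha-idg-1", "stop", ok=True),
    Op("ha-idg-2", "start", ok=False),
    Op("ha-idg-2", "stop", ok=True),
]

print(is_dangling(history, "ha-idg-1", "ha-idg-2"))                    # True
print(is_dangling(history, "ha-idg-1", "ha-idg-2", stop_clears=True))  # False
```

With the rule as described, the model reports the migration as dangling, which matches the extra stop scheduled on ha-idg-1. With the additional stop check enabled, it no longer does.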
