> On 7 Jul 2015, at 9:45 pm, [email protected] wrote:
>
> Andrei Borzenkov <[email protected]> wrote on 07.07.2015 10:03:26:
>
> > From: Andrei Borzenkov <[email protected]>
> > To: Cluster Labs - All topics related to open-source clustering welcomed <[email protected]>
> > Date: 07.07.2015 10:04
> > Subject: Re: [ClusterLabs] clear pending fence operation
> >
> > On Tue, Jul 7, 2015 at 10:41 AM, <[email protected]> wrote:
> > > hi,
> > >
> > > is there any way to clear/remove a pending stonith operation on a cluster node?
> > >
> > > after some internal testing i got the following status:
> > >
> > > Jul 4 12:18:02 XXX crmd[1673]: notice: te_fence_node: Executing reboot fencing operation (179) on XXX (timeout=60000)
> > > Jul 4 12:18:02 XXX stonith-ng[1668]: notice: handle_request: Client crmd.1673.1867d504 wants to fence (reboot) 'XXX' with device '(any)'
> > > Jul 4 12:18:02 XXX stonith-ng[1668]: notice: initiate_remote_stonith_op: Initiating remote operation reboot for XXX: 3453b93d-a13a-4513-b05b-b79ad85ff992 (0)
> > > Jul 4 12:18:03 XXX stonith-ng[1668]: error: remote_op_done: Operation reboot of XXX by <no-one> for [email protected]: Generic Pacemaker error
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: tengine_stonith_callback: Stonith operation 2/179:23875:0:134436dd-4df8-44a2-bf4a-ec6276883edd: Generic Pacemaker error (-201)
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: tengine_stonith_callback: Stonith operation 2 for XXX failed (Generic Pacemaker error): aborting transition.
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: abort_transition_graph: Transition aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: tengine_stonith_notify: Peer XXX was not terminated (reboot) by <anyone> for XXX: Generic Pacemaker error (ref=3453b93d-a13a-4513-b05b-b79ad85ff992) by client crmd.1673
> > >
> > > so, node XXX is still online, and i want to get the cluster back to a stable state
> >
> > If you are using a sufficiently recent pacemaker, you can use "stonith_admin --confirm"; be sure to actually stop all resources on the victim node in this case.
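For the archives, a minimal sketch of what that looks like, assuming XXX is the name of the node with the stuck fence operation. Run it from a surviving node, and only once you are certain XXX (or at least everything it was running) is really down:

    # Tell the cluster that node XXX is known to be safely down;
    # this acknowledges the pending fence operation as if it had succeeded.
    stonith_admin --confirm XXX

    # Verify the pending operation is gone and the cluster settles.
    stonith_admin --history XXX
    crm_mon -1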
> ok, i use v1.1.12.
> so in my case there were 7 VirtualDomain RGs online, do i have to migrate them to their scheduled destination location,

nope

> or is this not relevant?

Correct. Confirming the node is down implies the VMs are stopped, and we'll take care of starting them somewhere for you.

> warning: stage6: Scheduling Node XXX for STONITH
> notice: LogActions: Move    vma    (Started XXX -> WWW)
> notice: LogActions: Move    vmb    (Started XXX -> WWW)
> notice: LogActions: Move    vmc    (Started XXX -> YYY)
> notice: LogActions: Move    vmd    (Started XXX -> YYY)
> notice: LogActions: Move    vme    (Started XXX -> ZZZ)
> notice: LogActions: Move    vmf    (Started XXX -> ZZZ)
> notice: LogActions: Move    fmg    (Started XXX -> YYY)
>
> thank you!

> > On older pacemaker, the crmsh command "crm node clearstate" does the same.
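For completeness, the older-pacemaker route mentioned above would look like this with crmsh, with the same caveat that the node must genuinely be down first:

    # crmsh: declare node XXX's state clean, which clears the pending
    # fence; its resources will be recovered on the surviving nodes.
    crm node clearstate XXX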
_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org