> On 7 Jul 2015, at 9:45 pm, [email protected] wrote:
>
> Andrei Borzenkov <[email protected]> wrote on 07.07.2015 10:03:26:
>
> > From: Andrei Borzenkov <[email protected]>
> > To: Cluster Labs - All topics related to open-source clustering welcomed <[email protected]>
> > Date: 07.07.2015 10:04
> > Subject: Re: [ClusterLabs] clear pending fence operation
> >
> > On Tue, Jul 7, 2015 at 10:41 AM, <[email protected]> wrote:
> > > hi,
> > >
> > > is there any way to clear/remove a pending stonith operation on a cluster node?
> > >
> > > after some internal testing i got the following status:
> > >
> > > Jul 4 12:18:02 XXX crmd[1673]: notice: te_fence_node: Executing reboot fencing operation (179) on XXX (timeout=60000)
> > > Jul 4 12:18:02 XXX stonith-ng[1668]: notice: handle_request: Client crmd.1673.1867d504 wants to fence (reboot) 'XXX' with device '(any)'
> > > Jul 4 12:18:02 XXX stonith-ng[1668]: notice: initiate_remote_stonith_op: Initiating remote operation reboot for XXX: 3453b93d-a13a-4513-b05b-b79ad85ff992 (0)
> > > Jul 4 12:18:03 XXX stonith-ng[1668]: error: remote_op_done: Operation reboot of XXX by <no-one> for [email protected]: Generic Pacemaker error
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: tengine_stonith_callback: Stonith operation 2/179:23875:0:134436dd-4df8-44a2-bf4a-ec6276883edd: Generic Pacemaker error (-201)
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: tengine_stonith_callback: Stonith operation 2 for XXX failed (Generic Pacemaker error): aborting transition.
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: abort_transition_graph: Transition aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
> > > Jul 4 12:18:03 XXX crmd[1673]: notice: tengine_stonith_notify: Peer XXX was not terminated (reboot) by <anyone> for XXX: Generic Pacemaker error (ref=3453b93d-a13a-4513-b05b-b79ad85ff992) by client crmd.1673
> > >
> > > so, node XXX is still online, and i want to get the cluster back to a stable state
> >
> > If you are using a sufficiently recent pacemaker, you can use "stonith_admin --confirm"; be sure to actually stop all resources on the victim node in this case.
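For the archives, a minimal sketch of what that looks like, assuming XXX is the name of the node with the stuck fence operation. Run it from a surviving node, and only once you are certain XXX (or at least everything it was running) is really down:

    # Tell the cluster that node XXX is known to be safely down;
    # this acknowledges the pending fence operation as if it had succeeded.
    stonith_admin --confirm XXX

    # Verify the pending operation is gone and the cluster settles.
    stonith_admin --history XXX
    crm_mon -1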
> ok, i use v1.1.12.
> so in my case there were 7 VirtualDomain RGs online, do i have to migrate them to their scheduled destination location,

nope

> or is this not relevant?

Correct. Confirming the node is down implies the VMs are stopped, and we'll take care of starting them somewhere for you.

> warning: stage6: Scheduling Node XXX for STONITH
> notice: LogActions: Move    vma    (Started XXX -> WWW)
> notice: LogActions: Move    vmb    (Started XXX -> WWW)
> notice: LogActions: Move    vmc    (Started XXX -> YYY)
> notice: LogActions: Move    vmd    (Started XXX -> YYY)
> notice: LogActions: Move    vme    (Started XXX -> ZZZ)
> notice: LogActions: Move    vmf    (Started XXX -> ZZZ)
> notice: LogActions: Move    fmg    (Started XXX -> YYY)
>
> thank you!

> > On older pacemaker, the crmsh command "crm node clearstate" does the same.
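For completeness, the older-pacemaker route mentioned above would look like this with crmsh, with the same caveat that the node must genuinely be down first:

    # crmsh: declare node XXX's state clean, which clears the pending
    # fence; its resources will be recovered on the surviving nodes.
    crm node clearstate XXX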
_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org