Hi Ryan,

Thank you for the response. Does this mean that, as of now, there is no way
to notify the administrator when fencing fails?

Let me give more information about my cluster:

I have a set of nodes in the cluster, with only an IP resource being
protected. I have two levels of fencing: the first is BladeCenter fencing and
the second is manual fencing.

At times, if a machine is already down (either from a power failure or because
it was turned off abruptly), BladeCenter fencing times out and manual fencing
takes over. At this point, the administrator is expected to run
fence_ack_manual.

Clearly this is not desirable, as the services stay down until the
administrator runs fence_ack_manual.

What is the recommended method to deal with a BladeCenter fencing failure in
this situation? Do I have to add another level of fencing (between BladeCenter
and manual) that can fence automatically, without requiring manual
intervention?
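
To make the question concrete, here is a rough sketch of the layout I have in
mind for one node. It is only an assumption on my part: "BackupFencing", the
choice of fence_ipmilan as its agent, and the address and credentials shown
are placeholders, and I have not confirmed that my blades can be fenced this
way.

```xml
<!-- Sketch only: "BackupFencing" is a hypothetical second automatic device -->
<clusternode name="blade2" nodeid="2" votes="1">
  <fence>
    <method name="1">
      <device blade="2" name="BladeCenterFencing"/>
    </method>
    <!-- Tried only if method 1 fails; still needs no operator -->
    <method name="2">
      <device name="BackupFencing"/>
    </method>
    <!-- Last resort: waits for fence_ack_manual -->
    <method name="3">
      <device name="ManualFencing" nodename="blade2"/>
    </method>
  </fence>
</clusternode>

<!-- Placeholder device definition; address and credentials are made up -->
<fencedevice agent="fence_ipmilan" ipaddr="blade2-ipmi.example.com"
             login="ADMIN" name="BackupFencing" passwd="SECRET"/>
```

The idea is that manual fencing would then be reached only when both
automatic methods have failed.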


Thanks

On Mon, Feb 28, 2011 at 9:44 PM, Ryan O'Hara <roh...@redhat.com> wrote:

> On Mon, Feb 28, 2011 at 12:43:10PM +0530, Parvez Shaikh wrote:
> > Hi all,
> >
> > I have a question related to fence agents and SNMP alarms.
> >
> > Fence Agent can fail to fence the failed node for various reason; e.g.
> > with my bladecenter fencing agent, I sometimes get message saying
> > bladecenter fencing failed because of timeout or fence device IP
> > address/user credentials are incorrect.
> >
> > In such a situation is it possible to generate SNMP trap?
>
> This feature will be in RHEL6.1. There is a new project called
> 'foghorn' that creates SNMPv2 traps from dbus signals.
>
> git://git.fedorahosted.org/foghorn.git
>
> In RHEL6.1 (and the latest upstream release), certain cluster
> components will emit dbus signals when certain events occur. This
> includes fencing. So when a node is fenced a dbus signal is generated
> by fenced. The foghorn service catches this signal and generates an
> SNMPv2 trap.
>
> Note that foghorn runs as an AgentX subagent, so snmpd must be running
> as the master agentx.
>
> Ryan
>
> > My cluster config file looks like below and in my case if bladecenter
> > fencing fails, manual fencing kicks in and requires user to do
> > fence_ack_manual, for this user must at least be notified via SNMP (or
> > any other mechanism?) to intervene -
> >
> >   <clusternodes>
> >     <clusternode name="blade2" nodeid="2" votes="1">
> >       <fence>
> >         <method name="1">
> >           <device blade="2" name="BladeCenterFencing"/>
> >         </method>
> >         <method name="2">
> >           <device name="ManualFencing" nodename="blade2"/>
> >         </method>
> >       </fence>
> >     </clusternode>
> >     <clusternode name="blade1" nodeid="1" votes="1">
> >       <fence>
> >         <method name="1">
> >           <device blade="1" name="BladeCenterFencing"/>
> >         </method>
> >         <method name="2">
> >           <device name="ManualFencing" nodename="blade1"/>
> >         </method>
> >       </fence>
> >     </clusternode>
> >   </clusternodes>
> >   <cman expected_votes="1" two_node="1"/>
> >   <fencedevices>
> >     <fencedevice agent="fence_bladecenter" ipaddr="blade-mm.com"
> > login="USERID" name="BladeCenterFencing" passwd="PASSW0RD"/>
> >     <fencedevice agent="fence_manual" name="ManualFencing"/>
> >   </fencedevices>
> >
> > Thanks,
> > Parvez
>
> > --
> > Linux-cluster mailing list
> > Linux-cluster@redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster