Nikita,

> ...
>  - what about configure monitor operation of IP in cib.xml - sth. like this:
>    <resources>
>       <primitive id="IPaddr_194_37_40_42" class="ocf" provider="heartbeat" type="IPaddr">
>          <meta_attributes id="primitive-IPaddr_194_37_40_42meta"/>
>           <operations>
>             <op name="monitor" interval="60s" id="IPaddr_194_37_40_42_mon" timeout="60s"/>
>           </operations>
> 
> - it works for me very well ;-)

As far as I can see, the "monitor" function of "IPaddr" basically
pings the IP address of the interface. Unfortunately, at least
under RedHat/CentOS, if you physically pull the plug on an Ethernet
cable, the interface will still ping successfully on its own
address, even though the link is in fact down - so the ping is
not telling you that everything is working as it should. I do not
think that "IPaddr2" does any better.

Mia,

> ...
> why not just using ethtool or other mii tools to detect the link failure in
> IPaddr2 script?

Just looking at the link status will not tell you if something else
is wrong with your connectivity to the network and the other cluster
nodes - so you need to use something like the "ping" resource as
suggested by Lars.

However, IMHO, something should be monitoring the local link status,
as it is a very quick and cheap way to find out the health of your
connection, rather than relying on pings all the time. Check the
link status very often, and send pings only on every Nth check that
finds the link up - the link status is probably a pretty good
indicator that you have connectivity.
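
To make the idea concrete, here is a rough sketch of such a check loop, not an existing resource agent. It assumes Linux, where /sys/class/net/<iface>/carrier reads "1" when the link is up, and an iputils-style ping that accepts -c and -W. The names link_is_up, ping_ok and LinkPingMonitor are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import Callable


def link_is_up(iface: str, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the kernel reports carrier on iface (Linux sysfs)."""
    try:
        return (Path(sysfs_root) / iface / "carrier").read_text().strip() == "1"
    except OSError:
        # Interface missing, or administratively down (the read then fails).
        return False


def ping_ok(host: str) -> bool:
    """Send one ICMP echo with a 1 s timeout (iputils-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class LinkPingMonitor:
    """Cheap link check on every poll, expensive ping on every Nth link-up poll."""

    def __init__(
        self,
        link_check: Callable[[], bool],
        ping_check: Callable[[], bool],
        ping_every: int = 10,
    ) -> None:
        self._link_check = link_check
        self._ping_check = ping_check
        self._ping_every = ping_every
        self._up_polls = 0  # consecutive polls that found the link up

    def poll(self) -> str:
        if not self._link_check():
            self._up_polls = 0  # start counting again once the link returns
            return "link-down"
        self._up_polls += 1
        if self._up_polls % self._ping_every == 0:
            return "ping-ok" if self._ping_check() else "ping-failed"
        return "link-up"
```

One would build it with something like LinkPingMonitor(lambda: link_is_up("eth0"), lambda: ping_ok(gateway)) and call poll() from a short timer; the interface name and the ping target are placeholders. The two checks are passed in as callables so the scheduling can be tried out without touching the network.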

[One thing I dislike about the "ocf:pacemaker:ping" resource is that
 it just sets an attribute and never actually stops or starts when it
 has failed to ping something. This means that when looking at crm_mon
 you may see that an IP resource has been moved to another node, but
 it is not obvious that it has moved because the link is down: ping
 is still happily 'running' (yes, yes, there are other things which
 can tell you what happened). I understand why things are like this,
 but it is a pity that it is not just a little bit more obvious from
 the monitor what is going on ... my solution is to have an extra
 resource that will stop/start depending on the value of the attribute
 set by ping - is there a better way?]
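
For illustration only, the decision such an extra resource has to make in its monitor action could look roughly like the sketch below. This is not the actual agent described above. The function name monitor_exit_code and the threshold parameter are hypothetical, and fetching the attribute value (for example by querying the attribute that ping sets, "pingd" by default as far as I know) is left to the caller. The two exit codes are the standard OCF values.

```python
from typing import Optional

# Standard OCF resource agent exit codes.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def monitor_exit_code(attr_value: Optional[str], threshold: int = 1) -> int:
    """Map the attribute value set by ping to a monitor exit code.

    attr_value is the raw string read from the node attribute, or None if
    the attribute is not set. threshold is a hypothetical minimum score.
    """
    try:
        score = int(attr_value) if attr_value is not None else 0
    except ValueError:
        # Empty or unparsable value: treat it as no connectivity.
        score = 0
    return OCF_SUCCESS if score >= threshold else OCF_NOT_RUNNING
```

With this, a value of "0" or a missing attribute makes the resource report itself as not running, so the loss of connectivity shows up directly in crm_mon. Whether "not running" or a generic error is the better code to return here is a judgment call.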

Max
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
