Nikita,

> ...
> - what about configure monitor operation of IP in cib.xml - sth. like this:
> <resources>
> <primitive id="IPaddr_194_37_40_42" class="ocf" provider="heartbeat"
> type="IPaddr">
> <meta_attributes id="primitive-IPaddr_194_37_40_42meta"/>
> <operations>
> <op name="monitor" interval="60s" id="IPaddr_194_37_40_42_mon"
> timeout="60s"/>
> </operations>
>
> - it works for me very well ;-)

As far as I can see, the "monitor" function of "IPaddr" basically pings the IP address of the interface ... unfortunately, at least under RedHat/CentOS, if you physically pull the plug on an ethernet interface then the interface will still answer pings on its own address, even though the link is in fact down - so the ping is not telling you that everything is working as it should. I do not think that "IPaddr2" does any better.

Mia,

> ...
> why not just using ethtool or other mii tools to detect the link failure in
> IPaddr2 script?

Just looking at the link status will not tell you if something else is wrong with your connectivity to the network and the other cluster nodes - so you need to use something like the "ping" resource as suggested by Lars.

However, IMHO, something should be monitoring the local link status, as it is a very quick and cheap way to find out the health of your connection, rather than relying on pings all the time. Monitor the link status very often, and do a ping on every Nth check that finds the link up - the link status is probably a pretty good indicator that you have connectivity.

[One thing I dislike about the "ocf:pacemaker:ping" resource is that it just sets an attribute and never actually stops or starts when it has failed to ping something - this means that when looking at crm_mon you may see that an IP resource has been moved to another node, but it is not obvious that it has moved because the link is down, since ping is still happily 'running' (yes, yes, there are other things which can tell you what happened). I understand why things are like this, but it is a pity that from the monitor it is not just a little bit more obvious what is going on ... my solution is to have an extra resource that will stop/start depending on the value of the attribute set by ping - is there a better way?]

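To make the "check the link often, ping only every Nth time" idea a bit more concrete, here is a rough Python sketch. It is not taken from any existing resource agent: the names LinkThenPingMonitor, link_is_up and ping_ok are made up for illustration, reading /sys/class/net/<iface>/carrier is just one way to get the link state on Linux (ethtool or mii-tool would be alternatives), and the ping flags assume the Linux iputils version of ping.

```python
from pathlib import Path
import subprocess


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier on iface (Linux sysfs assumed)."""
    try:
        return (Path(sysfs_root) / iface / "carrier").read_text().strip() == "1"
    except OSError:
        # No such interface, or the interface is administratively down
        # (reading 'carrier' fails in that case): treat both as "no link".
        return False


def ping_ok(host, timeout_s=2):
    """Send one ICMP echo with the system ping command (iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class LinkThenPingMonitor:
    """Hypothetical helper: cheap link check on every call, ping every Nth call."""

    def __init__(self, iface, host, ping_every=10,
                 link_check=link_is_up, ping_check=ping_ok):
        self.iface = iface
        self.host = host
        self.ping_every = ping_every
        self.link_check = link_check
        self.ping_check = ping_check
        self.link_up_count = 0  # consecutive checks that found the link up

    def check(self):
        """Return 'link-down', 'ping-failed' or 'ok'."""
        if not self.link_check(self.iface):
            self.link_up_count = 0
            return "link-down"
        self.link_up_count += 1
        # Only pay for a ping on every Nth successful link check.
        if self.link_up_count % self.ping_every == 0:
            return "ok" if self.ping_check(self.host) else "ping-failed"
        return "ok"
```

Something like LinkThenPingMonitor("eth0", "192.0.2.1", ping_every=10).check() would then be called from a frequent monitor operation (interface name and address are placeholders). Returning distinct "link-down" and "ping-failed" results is deliberate, so that whatever calls it can show why a resource was moved.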
Max
