Thanks Marian,

that did the trick for my setup. The only problem is an error in your
script, where:

if [ "$iface" == "yes" ]; 
should have been 
if [ "$link_status" == "yes" ];
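
A minimal sketch of the corrected check, assuming ethtool prints a line of
the form "Link detected: yes". The function names parse_link_status and
link_is_up are made up for illustration and are not part of the original
script. Matching on "Link detected" rather than just "Link" is also an
assumption on my side: it avoids picking up other lines that start with
"Link" on ethtool versions that print them.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original link-state script.

# Read ethtool-style output on stdin and print the value of the
# "Link detected:" line (normally "yes" or "no").
parse_link_status() {
    awk '/Link detected/ {print $3}'
}

# Return 0 if the given interface reports link, non-zero otherwise.
# The comparison is made on the parsed status, not on the interface name.
link_is_up() {
    local iface="$1" link_status
    link_status=$(ethtool "$iface" 2>/dev/null | parse_link_status)
    [ "$link_status" = "yes" ]
}
```

Inside the status loop this could be used as `if link_is_up "$iface"; then`
in place of the string comparison.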

Now my monitored service fails as soon as a cable is unplugged, which is
just perfect, but HA only tries to restart it on the same node instead
of failing it over to the other node. How can I make sure that a service is
restarted up to, e.g., 3 times and then failed over if the restarts were not
successful? Do I have to set up stickiness or some constraint, and if so, how?

Thanks in advance.
Kasper Andersen


Marian Marinov-2 wrote:
> 
> On Friday 21 November 2008 03:12:56 Marian Marinov wrote:
>> On Wednesday 19 November 2008 16:49:55 KAD_USER wrote:
>> > Hi,
>> >
>> > Running SLES10.2 we have the following setup:
>> >
>> > Node1:
>> > - eth0 and eth1 are bonded into bond0, eth3 is used for heartbeat.
>> > - Linux-HA has been set up with one resource, which uses bond0 to provide
>> > an additional virtual IP address.
>> >
>> > Node2:
>> > - eth0 and eth1 are bonded into bond0, eth3 is used for heartbeat.
>> > - Linux-HA has been set up and all resources are inherited from Node1.
>> >
>> > Issue:
>> > When someone pulls the ethernet cable on eth0 the node continues to work,
>> > but when someone pulls both eth0 and eth1 and no data can leave or enter
>> > the system, one would expect the resource which uses bond0 to fail and
>> > trigger a failover of resources, as a default installation of a
>> > Windows Cluster would do!
>> >
>> > Is there a way to monitor resources like link status on ethernet cards
>> > and then perform a failover once it is down?
>>
>> Hi,
>>
>> I don't know if there is a script or resource agent ready for that, but
>> here are my 2 bits of code that are a simple LSB script that can help you
>> monitor this resource.
>>
>> I hope you will like the script. You should be able to configure this
>> script as a standard LSB resource in cib.xml.
>>
>> #!/bin/bash
>> #
>> # link-state
>> #
>> # chkconfig: - 26 74
>> # description: Network Interfaces link state monitoring script
>>
>> ### BEGIN INIT INFO
>> # Provides:             link-state
>> # Required-Start:
>> # Required-Stop:
>> # Default-Start:
>> # Default-Stop:
>> # Short-Description:    provides monitoring and reconfiguration
>> # Description:          provides monitoring and reconfiguration
>> #                       for various network interfaces with easy
>> #                       configuration, portable on different
>> #                       distributions
>> ### END INIT INFO
>>
>> # set secure PATH
>> PATH="/bin:/usr/bin:/sbin:/usr/sbin"
>>
>>
>> interfaces='eth0 eth1'
>> ip[0]='10.0.0.1'
>> ip[1]='10.0.0.5'
>> netmask[0]='255.255.255.252'
>> netmask[1]='255.255.255.252'
>>
>> function ifstatus() {
>>         for iface in $interfaces; do
>>                 link_status=$(ethtool $iface|awk '/Link/{print $3}')
>>                 echo -n "Checking link status on interface $iface "
>>                 if [ "$iface" == "yes" ]; then
>>                         echo -ne "\\033[60G[\\033[0;32m  OK  \\033[0;39m]\r\n"
>>                 else
>>                         echo -ne "\\033[60G[\\033[1;38mFAILED\\033[0;39m]\r\n"
>>                         exit 1
>>                 fi
>>         done
>> }
>> function ifstop() {
>>         for iface in $interfaces; do
>>                 echo -n "Stopping interface $iface "
>>                 ifconfig $iface down
>>                 if [ "$?" == 0 ]; then
>>                         echo -ne "\\033[60G[\\033[0;32m  OK  \\033[0;39m]\r\n"
>>                 else
>>                         echo -ne "\\033[60G[\\033[1;38mFAILED\\033[0;39m]\r\n"
>>                         exit 1
>>                 fi
>>         done
>> }
>> function ifstart() {
>>         count=0
>>         for iface in $interfaces; do
>>                 echo -n "Starting interface $iface "
>>                 ifconfig $iface ${ip[$count]} netmask ${netmask[$count]}
>>         if [ "$?" == 0 ]; then
>>             echo -ne "\\033[60G[\\033[0;32m  OK  \\033[0;39m]\r\n"
>>         else
>>             echo -ne "\\033[60G[\\033[1;38mFAILED\\033[0;39m]\r\n"
>>             exit 1
>>         fi
>>         done
>> }
>>
>> case "$1" in
>>         start)
>>                 ifstart
>>         ;;
>>         stop)
>>                 ifstop
>>         ;;
>>         restart)
>>                 ifstop
>>                 ifstart
>>         ;;
>>         status)
>>                 ifstatus
>>         ;;
>>         *)
>>                 echo "Usage: $0 start|stop|restart|status"
>>                 exit 1
>> esac
>> exit 0
> 
> Oops, I forgot some things :)
> Here is a patch for my script:
> 
> --- link-state.old      2008-11-21 03:26:53.000000000 +0200
> +++ link-state  2008-11-21 03:26:56.000000000 +0200
> @@ -23,21 +23,21 @@ PATH="/bin:/usr/bin:/sbin:/usr/sbin"
> 
> 
>  interfaces='eth0 eth1'
> -ip[0]='10.0.0.1'
> -ip[1]='10.0.0.5'
> -netmask[0]='255.255.255.252'
> -netmask[1]='255.255.255.252'
> +ip=(10.0.0.1 10.0.0.5)
> +netmask=(255.255.255.252 255.255.255.252)
> 
>  function ifstatus() {
> +       count=0
>         for iface in $interfaces; do
>                 link_status=$(ethtool $iface|awk '/Link/{print $3}')
> -               echo -n "Checking link status on interface $iface "
> +               echo -n "Checking link status on interface $iface(${ip[$count]}) "
>                 if [ "$iface" == "yes" ]; then
>                         echo -ne "\\033[60G[\\033[0;32m  OK  \\033[0;39m]\r\n"
>                 else
>                         echo -ne "\\033[60G[\\033[1;38mFAILED\\033[0;39m]\r\n"
>                         exit 1
>                 fi
> +               let count++
>         done
>  }
>  function ifstop() {
> @@ -63,6 +63,7 @@ function ifstart() {
>              echo -ne "\\033[60G[\\033[1;38mFAILED\\033[0;39m]\r\n"
>              exit 1
>          fi
> +               let count++
>         done
>  }
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Network-interface-monitoring-and-failover-once-failed-tp20581371p20677754.html
Sent from the Linux-HA mailing list archive at Nabble.com.
