Hi!
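I've attached a quick-and-dirty patch for check_snmp_linkstatus. It reads the
admin status alongside the link status: admin up/link down stays CRITICAL,
admin down/link up becomes a WARNING, and admin down/link down is reported as OK.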
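
For reference, the decision the patch makes can be sketched as a small standalone Perl snippet. This is only an illustration, not code from the plugin: link_verdict and the IF_UP/IF_DOWN constants are hypothetical names, and it assumes the usual IF-MIB encoding (1 = up, 2 = down) for both ifAdminStatus and ifOperStatus.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: IF-MIB encoding for ifAdminStatus/ifOperStatus, 1 = up, 2 = down.
use constant { IF_UP => 1, IF_DOWN => 2 };

# Hypothetical helper (not part of check_snmp_linkstatus): map the admin and
# link state to a Nagios return code (0 OK, 1 WARNING, 2 CRITICAL) and a message.
sub link_verdict {
    my ( $admstate, $linkstate ) = @_;

    return ( 2, 'link is down but admin state is up' )
        if $admstate == IF_UP && $linkstate == IF_DOWN;
    return ( 1, 'link is up but admin state is down' )
        if $admstate == IF_DOWN && $linkstate == IF_UP;
    return ( 0, 'admin state and link state are both down' )
        if $admstate == IF_DOWN && $linkstate == IF_DOWN;

    # Any other combination is left to the throughput threshold checks.
    return ( 0, 'interface is up' );
}

# Print the verdict for the four up/down combinations.
my @status = qw(OK WARNING CRITICAL);
for my $pair ( [ 1, 1 ], [ 1, 2 ], [ 2, 1 ], [ 2, 2 ] ) {
    my ( $code, $msg ) = link_verdict(@$pair);
    print "admin=$pair->[0] link=$pair->[1]: $status[$code] - $msg\n";
}
```

In the real plugin the up/up case still goes through the warning and critical throughput thresholds, which the sketch leaves out.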
--
kind regards, Henry
On Tue, 2010-01-12 at 13:21 +0000, Ton Voon wrote:
>
> On 12 Jan 2010, at 12:38, [email protected] wrote:
>
> >
> > -------- Original Message --------
> > > Date: Tue, 12 Jan 2010 11:53:46 +0000
> > > From: Ton Voon <[email protected]>
> > > To: Opsview Users <[email protected]>
> > > Subject: Re: [opsview-users] Host Interfaces: critical if admin
> > > down and link down?
> >
> > >
> > > On 11 Jan 2010, at 17:56, [email protected] wrote:
> > > >
> > > > If an interface's link status is down and the admin status is
> > > > also
> > > > down, the Opsview status is critical. But I'd consider the
> > > > down/down
> > > > situation an intended one, so the Opsview status should be OK.
> > >
> > > I see what you mean, but why are you monitoring it? Should it be
> > > a
> > > warning or unknown instead so you can disable monitoring of it?
> > >
> > > At the moment, you have to "tick" that you want that interface
> > > monitored. This is because we need the extra services created at
> > > Nagios to store the performance data and provide the alerting.
> >
> > That's the point: If the interfaces that are 'admin down'/'link
> > down' turn OK in Opsview, you'd be able to "tick" all interfaces
> > when adding a host (switch/router) and won't have to "tick" it if a
> > port gets configured 'admin enabled' and connected. Also if a host
> > is removed, the port gets configured 'admin down' and cable is
> > disconnected, it wouldn't be necessary to reconfigure the host and
> > "untick" the interface monitoring.
> >
> > To put it in a short sentence: I'd like the Opsview status of a
> > monitored port to be non-OK *only* if the 'link status' and 'admin
> > status' differ.
> >
>
>
> Fair enough - I can see that is easier to maintain. I've
> raised https://secure.opsera.com/jira/browse/OPS-947 to cover this.
>
>
> Ton
>
>
> _______________________________________________
> Opsview-users mailing list
> [email protected]
> http://lists.opsview.org/lists/listinfo/opsview-users
--- check_snmp_linkstatus.ori 2009-12-18 13:41:08.000000000 +0100
+++ check_snmp_linkstatus 2010-01-12 18:54:48.000000000 +0100
@@ -120,6 +120,7 @@ my $throughput_out_friendly = 0;
my $warning_pct = 0;
my $critical_pct = 0;
my $linkstate;
+my $admstate;
my $user_specified_ifname;
my $user_specified_index;
my $verbose = 0;
@@ -832,6 +833,7 @@ if ( $s->error ) {
my $name = $s->var_bind_list()->{"$oid_interfaces_base.2.$ifindex"};
$linkstate = $s->var_bind_list()->{"$oid_interfaces_base.8.$ifindex"};
$link_speed = $s->var_bind_list()->{"$oid_interfaces_base.5.$ifindex"};
+$admstate = get_oid_value("$oid_adminstatus.$ifindex");
# This should only get called when the initial cache data holds NULL for ifAlias
if ( !defined $ifAlias ) {
@@ -913,11 +915,11 @@ if ( $name eq $user_specified_ifname ) {
# Update the DB
set_out_octets( $ifindex, $curval_bigint->copy() );
}
- else {
- print "CRITICAL - $interface_display_name is down", $/;
- db_disconnect();
- exit 2;
- }
+ #else {
+ # print "CRITICAL - $interface_display_name is down", $/;
+ # db_disconnect();
+ # exit 2;
+ #}
}
elsif ( defined $user_specified_index && ( $name ne $user_specified_ifname ) ) {
print "WARNING - Interface name $user_specified_ifname expected at index $user_specified_index, but got $interface_display_name!\n";
@@ -946,8 +948,8 @@ if ($interface_vanished) {
db_disconnect();
# So what's the verdict?
-# If the interface is down, this is always critical
-if ( $linkstate == 2 ) {
+if ( $admstate == 1 && $linkstate == 2 ) {
+ # If link is down, while interface is configured admin up, it is critical
$retval = 2;
$retmsg = "Interface $interface_display_name is DOWN!";
@@ -957,6 +959,28 @@ if ( $linkstate == 2 ) {
$throughput_in = $throughput_in_pct = 0;
$throughput_out = $throughput_out_pct = 0;
}
+elsif ( $admstate == 2 && $linkstate == 1 ) {
+# If link is up but admin state is down, the interface is misconfigured: this is a warning
+ $retval = 1;
+ $retmsg = "Interface $interface_display_name is UP, but admin state is DOWN!";
+
+ # Even though we may have had bytes sent/received since the last check
+ # if the state has changed from up to down, we'll still report a
+ # throughput of 0 as this is a more sensible thing to display.
+ $throughput_in = $throughput_in_pct = 0;
+ $throughput_out = $throughput_out_pct = 0;
+}
+elsif ( $admstate == 2 && $linkstate == 2 ) {
+# If link is down and admin state is down, everything is fine, interface is OK
+ $retval = 0;
+ $retmsg = "Interface $interface_display_name: admin state is down and link state is down!";
+
+ # Even though we may have had bytes sent/received since the last check
+ # if the state has changed from up to down, we'll still report a
+ # throughput of 0 as this is a more sensible thing to display.
+ $throughput_in = $throughput_in_pct = 0;
+ $throughput_out = $throughput_out_pct = 0;
+}
else {
# The interface is up, so what about the thresholds?
@@ -1002,6 +1026,9 @@ else {
$retmsg = "$interface_display_name throughput (in/out) $throughput_in_friendly/$throughput_out_friendly, $throughput_in_pct%/$throughput_out_pct% has exceeded warning threshold!";
$retval = 1;
}
+ else {
+ $retmsg="$interface_display_name is up, throughput (in/out) $throughput_in_friendly/$throughput_out_friendly, $throughput_in_pct%/$throughput_out_pct%";
+ }
}
}
@@ -1016,9 +1043,9 @@ if ( $warning || $critical ) {
$perfdata .= ";$warning;$critical";
}
-# Show appropriate message (we don't have a warning state)
+# Show appropriate message
if ( $retval == 0 ) {
- print "OK - $interface_display_name is up, throughput (in/out) $throughput_in_friendly/$throughput_out_friendly, $throughput_in_pct%/$throughput_out_pct%|$perfdata\n";
+ print "OK - $retmsg|$perfdata\n";
}
elsif ( $retval == 1 ) {
print "WARNING - $retmsg|$perfdata\n";