from:"Matthias Flacke"

Re: [Nagios-users] problem state with check_multi

2011-07-04 Thread Matthias Flacke


Hi Stanislas,

the RC UNKNOWN and the message '[timeout encountered after 10s]' do not
come from check_icmp, but from the check_multi child check timeout, which
also triggers after 10s. :)

To find out which timeout is actually taking effect, specify a different
timeout either for check_icmp or for the check_multi child checks.
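
One way to tell the two timeouts apart is sketched below. It assumes that
-t is the timeout option (in seconds) of both check_icmp and check_multi;
please verify this against each plugin's --help output. The file name
pix-timeout.cmd and the values 5 and 20 are arbitrary examples, and the
state lines from the original pix.cmd are left out for brevity.

```shell
#!/bin/sh
# Sketch: give check_icmp a shorter timeout (5s) than the check_multi
# child timeout, so that check_icmp can report its own result first.
# Assumption: "-t <seconds>" is the timeout option of check_icmp.
cat > pix-timeout.cmd <<'EOF'
command [ pix1 ] = check_icmp -H 1.2.3.4 -w 3000.0,80% -c 5000.0,100% -p 1 -t 5
command [ pix2 ] = check_icmp -H 1.2.3.5 -w 3000.0,80% -c 5000.0,100% -p 1 -t 5
command [ pix3 ] = check_icmp -H 1.2.3.6 -w 3000.0,80% -c 5000.0,100% -p 1 -t 5
EOF

# Then call check_multi with a child timeout above the check_icmp one
# (assumption: "-t" sets the per-child timeout of check_multi), e.g.:
#   /usr/lib/nagios/plugins/check_multi -f pix-timeout.cmd -r 5 -t 20
echo "wrote pix-timeout.cmd with $(grep -c 'check_icmp' pix-timeout.cmd) child checks"
```

If the '[timeout encountered after ...]' text changes with the check_multi
value but not with the check_icmp value, it is the child check timeout
that is firing.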

Cheers,
-Matthias

On 7/4/11 5:13 PM, LEVEAU Stanislas wrote:
> Hi,
> 
> I created a file with command and state : pix.cmd
> 
> command [ pix1 ] = check_icmp -H 1.2.3.4 -w 3000.0,80% -c 5000.0,100% -p 1
> command [ pix2 ] = check_icmp -H 1.2.3.5 -w 3000.0,80% -c 5000.0,100% -p 1
> command [ pix3 ] = check_icmp -H 1.2.3.6 -w 3000.0,80% -c 5000.0,100% -p 1
> 
> state [ OK ] = pix1 == OK && pix2 == OK && pix3 == OK
> state [ CRITICAL ] = pix1 == CRITICAL || pix2 == CRITICAL || pix3 ==
> CRITICAL
> 
> 
> When i run a command with the plugin check_multi the result is UNKNOWN,
> instead of CRITICAL.
> 
> # /usr/lib/nagios/plugins/check_multi -f pix.cmd -r 5
> 
> UNKNOWN - 3 plugins checked, 2 unknown (pix1, pix2), 1 ok [please don't
> run plugins as root!]
> [ 1] pix1 OK - 10.50.184.245: rta 74,980ms, lost 0%
> [ 2] pix2 CRITICAL - 217.108.39.122: rta nan, lost 100% [timeout
> encountered after 10s]
> [ 3] pix3 CRITICAL - 217.108.39.122: rta nan, lost 100% [timeout
> encountered after 10s]
> 
> 
> Have you any idea?
> regards
> Stan
> 
> -- 
> *Stanislas LEVEAU**
> 
> *Rectorat de Caen
> 168, rue Caponière
> B.P. 6184
> 14061 CAEN Cedex
>   Service Informatique de l'Académie de Caen
> Département Systèmes & Réseaux
> 
> stanislas.lev...@ac-caen.fr
>  Tel : 02.31.30.17.86
> 
> 
> 
> 
> 
> --
> All of the data generated in your IT infrastructure is seriously valuable.
> Why? It contains a definitive record of application performance, security 
> threats, fraudulent activity, and more. Splunk takes this data and makes 
> sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-d2d-c2
> 
> 
> 
> ___
> Nagios-users mailing list
> Nagios-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting 
> any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null



Re: [Nagios-users] Difference between check_multi output on host and output received by chech_by_ssh

2010-10-27 Thread Matthias . Flacke


Hi Tim,

> However when I run it from the nagios server I don't see the output from
> our sendmail mailqueue check, which is the last check listed in our
> /usr/local/nagios/checks/systemhealth.cmd file:

The report option -r 11 (1+2+8) is not a good idea, because it leaves out
-r 4, which shows errors.
So I would recommend -r 15 (1+2+4+8) in your case.
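
Since the -r value is a sum of flag values, whether a given flag is
included can be checked with a bitwise AND. The helper name has_flag is
made up for this illustration and is not part of check_multi.

```shell
#!/bin/sh
# The -r value is a sum of powers of two; a flag is active if its bit is set.
has_flag() {  # usage: has_flag <report-value> <flag-bit>
    [ $(( $1 & $2 )) -ne 0 ]
}

for value in 11 15; do
    if has_flag "$value" 4; then
        echo "-r $value includes flag 4"
    else
        echo "-r $value lacks flag 4"
    fi
done
```

This prints that -r 11 lacks flag 4 while -r 15 includes it.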

What is the result if you run check_mailq directly, without check_multi?
$ ./checks/check_by_ssh -H 134.171.46.232 -C "/path/to/check_mailq -w 2 -c 4"

Cheers,
-Matthias


Re: [Nagios-users] Alleviating Nagios i/o contention problem

2010-09-27 Thread Matthias . Flacke

> - The ramdisk idea is also interesting.   I'm curious though, about why one
> would want to rsync it back to the local disk periodically.  It's just a
> run-time status file, right?  Unless I misread the docs, it goes away when
> Nagios is shut down.  What would having a local disk copy of status.dat
> benefit me?  Also, nagios.log isn't written to that often in our case (we
> don't log passive check results, for example).  I'm not sure I'd see the
> benefit for us in putting that on ramdisk.  Although... we do have Splunk
> watch that file so that would be some additional read overhead I guess.

This is a misunderstanding ;). Only nagios.log needs to be saved for
statistics, history etc.; status.dat and the checkresults files do not.
status.dat will be recreated soon, and losing some checkresults is mostly a
matter of the retry interval.

nagios.log - as you said - depends on the traffic there. Max, as far as I
remember, syncs it regularly. We normally have fewer than 5 messages per
minute, so there is no reason to put it on a ramdisk.

-Matthias


Re: [Nagios-users] Alleviating Nagios i/o contention problem

2010-09-25 Thread Matthias Flacke

On 9/25/10 2:30 PM, Frost, Mark {PBC} wrote:
> Greetings, listers,
>
> We've got an on-going issue with i/o contention.  There's the obvious
> problem that we've got a whole lot of things all writing to the same
> partition.  In this case, there's just one big chunk of RAID 5 disk on a
> single controller so I don't believe that making more partitions is
> going to help.
>
> On this same partition we have:
>
> 1) Nagios 3.2.1 running as the central/reporting server for a couple of
> other Nagios nodes that are sending check results via NSCA.
> Approximately 6-7K checks.
>
> 2) pnp4nagios 0.6.2 (with rrd 1.4.2) writing graph data.
>
> There's a 2nd server configured identically to the first that's acting
> as a "hot spare" so it also receives check data from the 2 distributed
> nodes and writes its own copy of the graph data locally as well.
>
> At the moment I'm concerned about the graphdata, but because I can only
> see i/o utilization as an aggregate, I can't tell what is the worst
> component on that filesystem -- status.dat updates?  graph data?  writes
> to the var/spool directory?  We also look at continued growth so this is
> only going to get worse.
>
> These systems are quite lightly loaded from a CPU (2 dual-core CPUs) and
> memory (4GB) perspective, but the i/o to the nagios filesystem is
> queuing now.
>
> We're about to order new hardware for these servers and I want to make a
> reasonable choice.  I'd like to make some reasonable changes without
> requiring too exotic of a setup.  I believe these servers are currently
> Dell 2950s and they're all running Suse Linux 10.3 SP2.
>
> My first thought was to potentially move the graphs to a NAS share which
> would shift that i/o to the network.  I don't know how that would work
> though and it would ultimately be an experiment.
>
> What experiences do people out there have handling this kind of i/o and
> what have you done to ease it?

You didn't say how many of your checks create perfdata - but I assume
that most of your disk I/O is related to RRD updates.
rrdcached (see http://docs.pnp4nagios.org/pnp-0.6/rrdcached for PNP
integration) is a good way to collect multiple RRD updates and write
them to the RRD files in bursts.

status.dat and the checkresults directory are always good candidates to
be stored on a ramdisk, especially since they're volatile data. As a
side note: status.dat on ramdisk is a pure boost for the CGIs :).
I know people who also store nagios.log on a ramdisk and regularly
save it via rsync onto a hard disk.
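
A rough sketch of such a ramdisk setup follows. Every path and the size
are assumptions (a source install under /usr/local/nagios, a hypothetical
mount point /usr/local/nagios/var/ramdisk, 64 MB). The script only prints
candidate configuration lines; it mounts and changes nothing. status_file
and check_result_path are the nagios.cfg directives for the two volatile
locations.

```shell
#!/bin/sh
# Prints (does not apply) example settings for keeping volatile Nagios
# data on a tmpfs ramdisk. All paths and the size are assumptions.
RAMDISK=/usr/local/nagios/var/ramdisk   # hypothetical mount point
SIZE=64m                                # hypothetical size

print_ramdisk_config() {
    echo "# /etc/fstab: tmpfs mount for the volatile files"
    echo "tmpfs $RAMDISK tmpfs size=$SIZE 0 0"
    echo "# nagios.cfg: point the volatile files at the ramdisk"
    echo "status_file=$RAMDISK/status.dat"
    echo "check_result_path=$RAMDISK/checkresults"
    echo "# crontab: only needed if nagios.log lives on the ramdisk too"
    echo "*/5 * * * * rsync -a $RAMDISK/nagios.log /usr/local/nagios/var/nagios.log.saved"
}

print_ramdisk_config
```

Keep in mind that a tmpfs is empty after a reboot, so the checkresults
directory has to be recreated before Nagios starts.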

My own systems with ~4000 checks and ~20,000 performance-relevant data
sets went down from 30% to less than 2% I/O wait with rrdcached and
ramdisk use.

Cheers,
-Matthias


Re: [Nagios-users] Override interval_length for specific service only

2010-01-07 Thread Matthias . Flacke


> I was wondering if there was any way with nagios 3.2 to override the global 
> interval_length
> for a specific service? I have one service that I would like to check every 
> 30 seconds, but 
> interval_length in 60. I could change the interval_length definition, of 
> course, but then I 
> would have to go back through all my definitions and change the interval 
> values I used there. 
> This may not be too bad, as I make heavy use of templating, but it would be 
> preferable to 
> simply override this for the one service.
 
In Nagios 3 you can specify fractional numbers for check_interval.

So with the standard interval_length of 60 seconds, a check_interval of 0.5
configures a 30-second interval for your particular check.
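
The arithmetic behind this is simply interval_length multiplied by
check_interval. The helper interval_seconds below is a made-up name for
illustration; awk is used because POSIX shell arithmetic is integer-only.

```shell
#!/bin/sh
# Effective check period in seconds = interval_length * check_interval.
interval_seconds() {  # usage: interval_seconds <interval_length> <check_interval>
    awk -v len="$1" -v ival="$2" 'BEGIN { print len * ival }'
}

interval_seconds 60 0.5   # the 30-second case from above
interval_seconds 60 5     # a regular 5-minute check, 300 seconds
```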

-Matthias


Re: [Nagios-users] Nagios disfunctional , perhaps due to time change ?

2009-10-26 Thread Matthias Flacke



Ton Voon wrote:
> Hi!
> 
> On 26 Oct 2009, at 10:18, Mattias Ryrlén wrote:
> 
>> This can be solved with issue the following error.
>>
>> service nagios stop; now=$(date +%s); sed -i
>> "s/^next_check=.*/next_check=$now/" /usr/local/nagios/var/status.sav;
>> service nagios start
>>
>> this will only work if:
>> 1. you can issue command 'service' :)
>> 2. you have it installed to /usr/local/nagios
> 
> We've also seen this bug.
> 
> We've provided another way of fixing this. You can find more details
> here: http://opsview-blog.opsera.com/dotorg/2009/10/nagios-scheduling-bug.html

Hi,

this change is causing the wrong reschedule:
http://nagios.cvs.sourceforge.net/viewvc/nagios/nagios/base/utils.c?r1=1.236&r2=1.237&view=patch

It was applied on 15 January 2009 and introduced some
statements

t->tm_isdst=-1;

into base/utils.c.

Affected are all Nagios versions >= 3.1.0.

I could reproduce it here: if the patch above is backed out, the
scheduling behaves correctly.

-Matthias


Re: [Nagios-users] problem with check_yum

2009-04-15 Thread Matthias Flacke



Seth Simmons wrote:
> I think I found the problem.  When I ran check_yum locally I forgot it was 
> running as root which explains why it worked.  Running locally as nagios 
> caused the error.  Apparently it isn't the plugin.  If, as nagios, I run yum 
> check-update it tells me I need to be root to access RHN repositories.  Seems 
> CentOS doesn't have that requirement with yum, which explains why it works on 
> all of those systems.
> 
> Running check_yum as nagios locally with sudo (nagios ALL=NOPASSWD: 
> /usr/local/nagios/libexec/check_yum) for that one binary works.
> 
> Problem now is, using the plugin with sudo through nrpe.  When I try to run 
> through nrpe it returns "unable to read output".
> 
> This is the command:
> command[check_yum_rhn]=/usr/bin/sudo /usr/local/nagios/libexec/check_yum
> 
> been looking around and some have had the same problem; often fixed by adding 
> the complete path to sudo, but it isn't working here.

It seems that your /etc/sudoers contains a line

'Defaults requiretty'

If you comment it out, connections without a real tty are also allowed
to run sudo commands.
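
If you would rather not drop requiretty for everybody, sudoers also
accepts per-user Defaults entries. The sketch below assumes that NRPE
runs as the user nagios (as in the sudo rule quoted above). It writes the
candidate lines to a scratch file for review and does not touch
/etc/sudoers; any real change should go through visudo.

```shell
#!/bin/sh
# Writes a candidate sudoers fragment to a scratch file for review.
# Assumption: the NRPE daemon runs as user "nagios".
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
# keep requiretty for everyone else, lift it for nagios only
Defaults        requiretty
Defaults:nagios !requiretty
nagios ALL=NOPASSWD: /usr/local/nagios/libexec/check_yum
EOF

# Syntax-check the fragment if visudo is available on this machine.
if command -v visudo >/dev/null 2>&1; then
    visudo -cf "$fragment" || echo "syntax check failed"
fi
echo "review $fragment before merging it with visudo"
```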

-Matthias



Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Matthias Flacke



Jarrod Moore wrote:
> On Fri, Mar 27, 2009 at 5:43 PM, Matthias Flacke  
> wrote:
>> Jarrod Moore wrote:
>>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>>> Jarrod Moore wrote:
>>>>> Hello everyone,
>>>>>
>>>>> I have a couple of related questions regarding service dependencies in
>>>>> Nagios and their limitations. I have two service checks (let's call
>>>>> them A and B) and service A depends on service B to function
>>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>>> both services A and B are put into the critical state in Nagios. I've
>>>>> tried using service dependencies in Nagios to represent this behaviour
>>>>> but have yet to be successful. I can only get it to suppress
>>>>> notifications of service A if both services go down.
>>>>>
>>>> This is expected behaviour. If A is truly dependant on B, then A will
>>>> turn into a non-ok state of its own volition rather than as a result
>>>> of any dependency magic. Dependencies are designed as a means of
>>>> suppressing notifications. Otherwise, you would *always* get a
>>>> notification for B first, and a minute or so later from A (actually,
>>>> without the dependency you could get from A first).
>>>>
>>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>>> would be logical that if a service depends on another service and the
>>>>> service depended on dies then all services depending on it would fail
>>>>> their checks as well, but there;s probably some scenario where it
>>>>> doesn't work so well. I've had a look through the mailing list
>>>>> archives and found someone had asked a similar question to the
>>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>>> answer, so I thought I might ask whether solutions to this type of
>>>>> problem had been developed since then.
>>>>>
>>>> They haven't. You're using dependencies the wrong way, really. If
>>>> A is truly dependent on B and doesn't go into a non-ok state after
>>>> B has crashed, then your check isn't doing what it's supposed to do,
>>>> or you've misunderstood the relationship somehow.
>>>>
>>>> If you were to explain what the two services actually are, it would
>>>> be easier to point you to a solution that works.
>>>>
>>>> --
>>>> Andreas Ericsson   andreas.erics...@op5.se
>>>> OP5 AB www.op5.se
>>>> Tel: +46 8-230225  Fax: +46 8-230231
>>>>
>>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>>> terror, I think we should give some serious thought to declaring war
>>>> on peace.
>>>>
>>> Well basically I have a map (similar to Google Maps) embedded in a
>>> website, which hits a URL to retrieve maps. So I have one check using
>>> check_http to check that the website itself is up and another check on
>>> that URL to make sure that the map service is available. Now if the
>>> map service goes down, the website is still up but the maps won't
>>> appear, which means the website's functionality is significantly
>>> affected. However, it is still up and viewable so doing a check on the
>>> website URL still passes.
>>>
>>> Now of course I could just write a script or something to check both
>>> URLs and set that as the check command. There is a problem for me with
>>> this approach, however, because I have some other instances where a
>>> web service depends on other web services. When I want to use these
>>> services in websites, I'd then have to write a check for each script,
>>> each containing every service in the chain that is needed to display
>>> the website correctly. This way of doing things just seems a bit
>>> repetitive to me, especially when I have a check for these web
>>> services already.
>> You can give check_multi a try (http://my-plugin.de/check_multi).
>>
>> It allows to combine multiple checks on plugin level and has a
>> builtin state logic to evaluate the results of these checks.
>> You can reuse the command files by implementing macros.
>>
>> If I understood your setup correctly the whole resul

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-27 Thread Matthias Flacke


Jarrod Moore wrote:
> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>> Jarrod Moore wrote:
>>> Hello everyone,
>>>
>>> I have a couple of related questions regarding service dependencies in
>>> Nagios and their limitations. I have two service checks (let's call
>>> them A and B) and service A depends on service B to function
>>> correctly. I want to set Nagios up so that if service B crashes then
>>> both services A and B are put into the critical state in Nagios. I've
>>> tried using service dependencies in Nagios to represent this behaviour
>>> but have yet to be successful. I can only get it to suppress
>>> notifications of service A if both services go down.
>>>
>> This is expected behaviour. If A is truly dependant on B, then A will
>> turn into a non-ok state of its own volition rather than as a result
>> of any dependency magic. Dependencies are designed as a means of
>> suppressing notifications. Otherwise, you would *always* get a
>> notification for B first, and a minute or so later from A (actually,
>> without the dependency you could get from A first).
>>
>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>> would be logical that if a service depends on another service and the
>>> service depended on dies then all services depending on it would fail
>>> their checks as well, but there;s probably some scenario where it
>>> doesn't work so well. I've had a look through the mailing list
>>> archives and found someone had asked a similar question to the
>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>> answer, so I thought I might ask whether solutions to this type of
>>> problem had been developed since then.
>>>
>> They haven't. You're using dependencies the wrong way, really. If
>> A is truly dependent on B and doesn't go into a non-ok state after
>> B has crashed, then your check isn't doing what it's supposed to do,
>> or you've misunderstood the relationship somehow.
>>
>> If you were to explain what the two services actually are, it would
>> be easier to point you to a solution that works.
>>
>> --
>> Andreas Ericsson   andreas.erics...@op5.se
>> OP5 AB www.op5.se
>> Tel: +46 8-230225  Fax: +46 8-230231
>>
>> Considering the successes of the wars on alcohol, poverty, drugs and
>> terror, I think we should give some serious thought to declaring war
>> on peace.
>>
> 
> Well basically I have a map (similar to Google Maps) embedded in a
> website, which hits a URL to retrieve maps. So I have one check using
> check_http to check that the website itself is up and another check on
> that URL to make sure that the map service is available. Now if the
> map service goes down, the website is still up but the maps won't
> appear, which means the website's functionality is significantly
> affected. However, it is still up and viewable so doing a check on the
> website URL still passes.
> 
> Now of course I could just write a script or something to check both
> URLs and set that as the check command. There is a problem for me with
> this approach, however, because I have some other instances where a
> web service depends on other web services. When I want to use these
> services in websites, I'd then have to write a check for each script,
> each containing every service in the chain that is needed to display
> the website correctly. This way of doing things just seems a bit
> repetitive to me, especially when I have a check for these web
> services already.

You can give check_multi a try (http://my-plugin.de/check_multi).

It lets you combine multiple checks at plugin level and has
built-in state logic to evaluate the results of these checks.
You can reuse the command files by using macros.

If I understood your setup correctly the whole result should return
CRITICAL if either the main website or the map are not accessible.
This is the standard behaviour of check_multi and could be
implemented like this:

# foo.cmd
# call: check_multi -f  -s URLWEB= -s URLMAP=
command [ website ] = check_http ... -u $URLWEB$ ...
command [ map ] = check_http ... -u $URLMAP$ ...

These two statements alone should already work the way you would expect
from a plain check_http, only combined. If one of the child checks
fails, the whole construct returns WARNING or CRITICAL.

If you need a more sophisticated RC determination, you can define
it in Perl syntax like this:
state [ WARNING ] = website != OK || $website$=~/some evil output/
state [ CRITICAL ] = website >= WARNING && map != OK
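
As an illustration of the standard behaviour without any state lines, the
combination of child results can be pictured with the small shell
function below. It is not check_multi code. It assumes the usual plugin
return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and a precedence of
CRITICAL over WARNING over UNKNOWN over OK; that precedence is an
assumption about the defaults and is worth checking against the
check_multi documentation. combine_states is a made-up name.

```shell
#!/bin/sh
# Illustration only, not check_multi code.
# Assumed precedence: CRITICAL(2) > WARNING(1) > UNKNOWN(3) > OK(0).
combine_states() {  # usage: combine_states <rc> [<rc> ...]
    result=0
    for rc in "$@"; do
        if [ "$rc" -eq 2 ]; then
            result=2
        elif [ "$rc" -eq 1 ] && [ "$result" -ne 2 ]; then
            result=1
        elif [ "$rc" -eq 3 ] && [ "$result" -eq 0 ]; then
            result=3
        fi
    done
    echo "$result"
}

combine_states 0 0   # website OK, map OK
combine_states 0 2   # website OK, map CRITICAL
```

The first call prints 0 and the second prints 2, so a failing map check
alone is enough to turn the combined result CRITICAL.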

Cheers,
-Matthias


Re: [Nagios-users] Nagios configuration for SAN monitoring

2009-02-23 Thread Matthias Flacke


Jim Avery wrote:
> The book I used to recommend (Nagios by Wolfgang Barth
> published by No Starch Press) gives an excellent introduction but
> unless it's been updated it won't include a lot of useful stuff new in
> version 3.

Just as a side note: it is up to date. The 2nd edition was released in
October 2008 and fully covers Nagios 3 topics, now on 720 pages. ;)

-Matthias


Re: [Nagios-users] check_ping and dual ip addresses

2009-01-26 Thread Matthias Flacke


Christian Schneemann wrote:
> On Monday January 26 2009 09:39:56 am Kevin Zellar wrote:
>> how to use check_ping if there are 2 ip addresses per host?

[snip some definitions for check_ping]

> I hope there is an easier way to do this. 
> Maybe with check_multi?

Although you can do it with check_multi, it's much easier with Andreas' 
check_icmp:

$ check_icmp -m 1  

The -m switch determines how many IPs have to be reachable at minimum to 
let the whole host check succeed.

-Matthias


Re: [Nagios-users] Hysteresis anyone?

2008-11-05 Thread Matthias Flacke

Maybe a check_multi solution is something for you
(http://my-plugin.de/check_multi):

- 8< --
# hysteresis.cmd
# call: check_multi -f hysteresis.cmd \
#       -s LASTSERVICESTATEID=$LASTSERVICESTATEID$ \
#       -s UPPER= -s LOWER=
#
# 1. get temperature value
command [ temperature ] = check_snmp ...

# 2. evaluate states
state [ CRITICAL ] = \
  $temperature$ >= $UPPER$ || \
( $temperature$ >= $LOWER$ && $LASTSERVICESTATEID$ != OK )
- 8< --

I haven't tested it, but it should work that way ;-)

The trick is the state evaluation, which allows arbitrary Perl expressions.
Nagios macros or extra parameters can be passed via -s/--set.
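
To make the rule easier to follow, here is the same hysteresis logic as a
plain shell function, independent of check_multi. The function name is
made up for this sketch, temperatures are integers, the last state is a
Nagios state ID (0 = OK), and the bounds 24 and 26 are taken from the
question quoted below.

```shell
#!/bin/sh
# CRITICAL at or above the upper bound, or while still at or above the
# lower bound after a non-OK state; OK otherwise.
hysteresis_state() {  # usage: hysteresis_state <temp> <lower> <upper> <last_state_id>
    temp=$1 lower=$2 upper=$3 last=$4
    if [ "$temp" -ge "$upper" ]; then
        echo CRITICAL
    elif [ "$temp" -ge "$lower" ] && [ "$last" -ne 0 ]; then
        echo CRITICAL
    else
        echo OK
    fi
}

hysteresis_state 25 24 26 0   # inside the band, coming from OK
hysteresis_state 25 24 26 2   # inside the band, was already CRITICAL
hysteresis_state 23 24 26 2   # dropped below the lower bound
```

The three calls print OK, CRITICAL and OK: inside the band between the
two bounds the result depends only on the previous state.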

HTH - Matthias

Simon Kainz wrote:
> I'm currently monitoring several room temperature and humidity meters
> and was wondering if anyone already has implementet some kind of
> hysteresis in Nagios ?
> 
> My scenario is the following: I want an critical state when temp rises
> above 26 degrees. As long as it doesnt drop below, say 24 deg, the state
> should stay critial. The temp usually floats around 26 degrees (my upper
> bound) which would lead to lots of notifications. But i only want the ok
> notification after the temp drops below my lower bound (24 deg).
> Everything inbetween shoud not trigger any warnings.
> 
> 
> I've arguing to implement this behaviour directly in my temp check
> plugins but was wondering if there is another more general approach to
> this. Maybe some event handler magic ?
> 
> 
> Hope i got this clear...
> 
> 
> TIA, Simon


Re: [Nagios-users] Checking multiple different procs in one service check

2008-07-05 Thread Matthias Flacke


Hi Hari,

please have a look at http://my-plugin.de/check_multi.

It wraps arbitrary calls of plugins, commands, or whatever you need, and is
not limited to process checks.

It is a more generic approach, while the basic idea is very similar ;-)

-Matthias


Hari Sekhon wrote:
> Hi,
> 
>I have a need to test a collection of procs and their arguments in 
> one service, I was going to post to the list to ask for suggestions, but 
> then I thought to check NagiosExchange and found a couple of plugins, 
> but nothing that suited my need, specifically they were orientated to 
> testing if a process name is running, and the number of processes of a 
> given name running, but they did not allow me to specify arguments that 
> those processes must be running.
> 
> I found this to be severely limiting when checking on collections of 
> interdependent scripts (I don't want to just check how many "bash"s or 
> "python"s are running...)
> 
> So I've quickly written my own plugin in Bash to do this and I've posted 
> it to NagiosExchange in case anyone else has this same requirement.
> 
> The script is basically a wrapper around the standard check_procs 
> allowing you to test several detailed services in one service check 
> which gives a lot more intelligence that the traditional check if a 
> single process is running, especially if you have any stack of 
> programs/scripts/operations that all need to be running but you don't 
> want to monitor them individually or you have too many of them to 
> practically monitor individually.
> 
> It currently takes a really nice and simple config file so it's easy to 
> have a stack of specific things to test.
> 
> Feedback is welcome and improvements will be made as needed/requested. I 
> was considering adding support for process states and process metrics 
> other than just the number of processes, but I don't (yet) use these 
> options in check_procs and so I decided to keep the config file very 
> very simple and clean, rather than bloat it with options nobody is 
> using. I'll revise the plugin as I get requests from people doing 
> specific things.
> 
> 
> You can find the plugin here:
> 
> http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F2649.html;d=1
> 
> So there you go, a question answered and a new plugin that will 
> hopefully be generally useful for specific multi process checking.
> 
> -h
> 

-- 
http://my-plugin.de/check_multi


Re: [Nagios-users] check_users: list the users?

2008-04-17 Thread Matthias Flacke

Jay R. Ashworth wrote:
 > I'm setting check_users (via nrpe) to warn at 1 and CRIT at 3 users --
 > I'm putting this on servers that, typically, have no humans logged in
 > at all.
 >
 > Since that's true, I'd like to have the list of users from who -q
 > returned as part of the status message.
 >
 > a) Is this a not uncommon usage, and when I patch the program, should I
 > therefore post and or submit a patch to do this?
 >
 > b) How long can that status message become without breaking 1) nagios
 > and 2) nrpe?
 >
 > Anyone have opinions, answers, or pointers?

...as a check_multi snippet (uses Nagios3 multiline feature):

$ cat users.cmd
command [ count ] = check_users -w 1 -c 3
command [ who_q ] = who -q | head -1

$ ./check_multi -n users -f users.cmd -r 1
users CRITICAL - 2 plugins checked, 1 critical (users), 1 ok
[ 1] users USERS CRITICAL - 3 users currently logged in
[ 2] who_q mflacke mflacke mflacke

Greetz,
Matthias Flacke

-- 
http://my-plugin.de/check_multi


Re: [Nagios-users] Cluster services dependencies

2008-03-13 Thread Matthias Flacke


Hi,

Riccardo Cupardo wrote:
> Ty for your reply Marc,
> 
> I configured check_service_cluster and it is working very well, but it requires 
> two stand-alone Mssql-Check services. Because this is an MSSQL cluster, the 
> normal scenario is: one started and one stopped.
> So I get a critical report on the stand-alone service that checks the 
> stand-by MSSQL server...
> 
> Any tips?


Have a look at check_multi 
(http://www.my-plugin.de/wiki/projects/check_multi/start), especially the 
state evaluation 
(http://www.my-plugin.de/wiki/projects/check_multi/process_views). It also 
includes a cluster monitoring example using _one_ Nagios service.

Greetz,
Matthias

> Ty in advance.
> 
> Marc Powell ha scritto:
>> On Mar 13, 2008, at 6:50 AM, Riccardo Cupardo wrote:
>>
>>   
>>> Hi all,
>>>
>>> I have a problem: I need to check an MSSQL service in cluster  
>>> mode...
>>>
>>> As you know, if the service is running on hostA it is stopped on hostB  
>>> and vice versa...
>>>
>>> Is there a way to handle this with dependencies?
>>> 
>>
>> Does check_cluster[2] work for your needs?
>>
>> --
>> Marc
>>
>>
>>   
> 
> 
> -- 
> Riccardo Cupardo
> 
> Area Network
> mailto:[EMAIL PROTECTED]
> Tel:  +39 095 37 83 111
> Fax: +39 095 37 83 444
>  
> __ T. NET 
> Sede T. net Italia S.r.l.: Viale Africa, 84 - 95129 Catania - Italy
> Tel: +39 095 37 83 111 - Fax: +39 095 37 83 444
> P. I.V.A.: 03979950874
> www.tnet.it   www.lavocevola.it
> __T. NET Telecommunication Company
>  
>  
> ***
> Le informazioni in questa e-mail sono confidenziali e
> riservate esclusivamente al destinatario del messaggio.
> Information in this email is confidential and intended
> solely for the addressee; it may be legally privileged.
> 
> 
> 
> 

-- 
http://my-plugin.de/check_multi


Re: [Nagios-users] State Stalking and notifications

2008-02-20 Thread Matthias Flacke

[EMAIL PROTECTED] wrote:
>  > I had thought about writing a custom check for each line
>  > of output that this command generates, but that seems needlessly
>  > painful.
[...]
>  > I'm guessing the answer here is "Nagios can't do that", but I thought
>  > I'd ask anyway.
> 
> Technically Nagios can't do that. At least not from the vantage point 
> you have described. We are much more granular in our monitoring for 
> exactly the scenario you have described. At this point we don't combine 
> multiple pieces into a single service check unless a department manager 
> specifically requests a full overview in a single check. We monitor each 
> piece with its own service check so we have complete control over who 
> get notified for what, when they get notified, how often they get 
> notified, and so on. I would say that Nagios can do what you want but 
> that it is up to you to make your checks more granular.

If you need both granularity and flexibility, plus an overall evaluation 
of your individual results, have a look at check_multi 
(http://www.my-plugin.de/check_multi).

It combines both concepts - granular results and process views - just as you 
define them.

-Matthias


Re: [Nagios-users] Nagios 3.0 rc1 urlize nsca embeded links fail

2008-01-21 Thread Matthias Flacke

[EMAIL PROTECTED] wrote:
> I just upgraded from 2.X line to 3.0 rc1, and I notice that the Status
> Information field no longer supports embedded URLs..
> Previously I could send a passive check result via nsca like:
> Some alarm text <a href="http://somelink.com"> click me </a>
> Now this no longer renders as an embedded link, but instead the literal
> string.
> Is this a bug or is there a way to get the embedded link back?

There's a new option in cgi.cfg which controls this:

escape_html_tags=0

should recover the old behaviour.

-Matthias


Re: [Nagios-users] check_cluster for hosts and warning state not yellow.

2008-01-07 Thread Matthias Flacke

js wrote:
> I'm using Nagios 3.0RC1.
> I'm using the latest check_cluster plugin in order to implement cluster
> logic into Nagios.
> I'm monitoring clustered hosts and NOT services. 
> The whole thing seems to work correct but when the clusterhost is in a
> warning state, it is still displayed in GREEN, while that should be YELLOW
> not? 
> When the cluster check is critical, Nagios displays the host status RED,
> which is OK. 
> 
> Here's the verbose output of the plugin. 
> 
> check_cluster - Warning: start=1 end=1: Critical: start=2 end=2 CLUSTER
> WARNING: Host Cluster: 1 up, 1 down, 0 unreachable 

Instead of check_cluster you can try check_multi 
(http://my-plugin.de/check_multi) with the following command file:

# cluster.cmd with host checks
command [ host1 ]  = check_icmp -H host1
command [ host2 ]  = check_icmp -H host2
state [ WARNING ]  = host1 != OK || host2 != OK
state [ CRITICAL ] = host1 != OK && host2 != OK
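
To make the two state rules easier to follow, here is a small Python sketch of how they combine. It is not check_multi's own code; it assumes that the most severe matching rule wins and that the result is OK when no rule matches. The function name `cluster_state` is invented for this illustration.

```python
# Severity order assumed for this sketch: the most severe matching rule wins.
SEVERITY = ["OK", "WARNING", "CRITICAL"]


def cluster_state(host1: str, host2: str) -> str:
    """Evaluate the two state rules from cluster.cmd for a pair of host results."""
    rules = {
        "WARNING": host1 != "OK" or host2 != "OK",    # at least one host failed
        "CRITICAL": host1 != "OK" and host2 != "OK",  # both hosts failed
    }
    result = "OK"  # assumed default when no rule matches
    for state in SEVERITY:
        if rules.get(state, False):
            result = state
    return result


for pair in [("OK", "OK"), ("OK", "CRITICAL"), ("CRITICAL", "CRITICAL")]:
    print(pair, "->", cluster_state(*pair))
```

With one host down the sketch yields WARNING, and only with both hosts down does it yield CRITICAL, which is the cluster behaviour the command file is meant to express.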

Greetz,
Matthias


Re: [Nagios-users] best solution for configuration changes

2008-01-03 Thread Matthias Flacke

I'd like to suggest a completely different approach - again with the plugin 
check_multi (http://my-plugin.de/check_multi), which I introduced yesterday in 
another context. ;-)

Take the following scenario: you have a parent check_multi service for each 
region (user/admin/organisational) you want to serve, and you configure it 
within Nagios yourself - this is the fixed frame.

Now the flexible part - the plugin configuration. Multiple child plugins are 
configured in a plain ASCII file like the following:

# group1.cmd
command [ proc1 ] = check_procs -C abc ...
command [ disk1 ] = check_disk -w ...
command [ load  ] = check_load ...
...

This command file is easy to understand, to change and to test (on the command 
line). Nothing has to be changed in Nagios, and there is no need to run any 
further 'nagios -v' check. So you can delegate this job to your users.
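
As a rough idea of how such a file could be sanity-checked before users hand it over, here is a minimal Python sketch. It only understands `command [ name ] = ...` lines, comments and blank lines; it is not check_multi's parser, and real command files may contain other statements (such as `state` lines) that this sketch would reject. The name `parse_command_file` and the check_load thresholds are assumptions made for the example.

```python
import re

# Matches lines such as: command [ proc1 ] = check_procs -C abc
COMMAND_LINE = re.compile(r"^\s*command\s*\[\s*(\w+)\s*\]\s*=\s*(.+?)\s*$")


def parse_command_file(text: str) -> dict[str, str]:
    """Return {child name: command line}; raise ValueError on unrecognised lines."""
    commands = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = COMMAND_LINE.match(line)
        if match is None:
            raise ValueError(f"line {number}: not a command definition: {line!r}")
        name, command = match.groups()
        if name in commands:
            raise ValueError(f"line {number}: duplicate child name {name!r}")
        commands[name] = command
    return commands


example = """\
# group1.cmd (arguments are placeholders invented for this sketch)
command [ proc1 ] = check_procs -C abc
command [ load  ] = check_load -w 5,4,3 -c 10,8,6
"""
print(parse_command_file(example))
```

The definitive test remains running check_multi itself with `-f group1.cmd` on the command line; the sketch just catches obvious typos earlier.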

If the users get anything wrong, they only get an error message (which is 
shown in the output of the Nagios parent plugin) and at worst a CRITICAL state 
for their parent service. But Nagios itself remains sane and intact.
These check_multi configs can be maintained in SVN, and security is based on 
standard Unix access control.

The only disadvantage: all child checks have only one parent. This means only 
one notification, one escalation logic, and so on. If you want to split this 
up, add a second service.
Another point: it's more or less Nagios 3 stuff because it makes extensive use 
of multiline output.

Greetz,
Matthias

Brian Loe wrote:
> I was thinking of some convoluted solution for users to configure 
> "their" configuration files and them diffed or uniqed to the original 
> with that being saved and then the newly updated config file to be 
> copied over the old one and nagios reloaded, etc., etc.. BUT, I figure 
> there has to be a better way. What I have for host config files are:
> 
> windows.cfg
> net.cfg
> resources.cfg
> unix.cfg
> security.cfg
> 
> Each of those host config files has a group of people associated with 
> them - and I'll need to break out the services.cfg file the same way.
> 
> I haven't added a user to the system either - and I'm not even sure what 
> the best way of doing that is!
> 
> The end result of what ever I do allows, for instance, people in the 
> security group to modify /etc/nagios/security_hosts.cfg and 
> /etc/nagios/security_svcs.cfg, and the changes would be saved somewhere 
> (repository or log file or whatever), and then nagios service would be 
> reloaded.
> 
> Suggestions? Solutions? It'd be great if this could be done via a/the 
> web interface but I haven't found one that works that well yet - and I 
> still am not sure how to provide users with a login/pw that they can 
> change upon the next log on (obviously I'm not an expert linux admin)!

-- 
http://my-plugin.de/check_multi


Re: [Nagios-users] Propagating Service Changes

2008-01-02 Thread Matthias . Flacke


Maybe you can already solve your issue at the plugin level:

- If you want to evaluate the state of existing Nagios services, take a look 
at check_cluster.

- If you want to cover more sophisticated setups and do a fine-grained 
evaluation of the results of your child checks, you can also use 
check_multi (http://www.my-plugin.de/wiki/projects/check_multi/start), which 
is mainly written for Nagios 3, but also works with Nagios 2.
The state evaluation topic is covered on the 'Process Views' page 
(http://www.my-plugin.de/wiki/projects/check_multi/process_views).

HTH - Matthias Flacke
 
- original message 

Subject: [Nagios-users] Propagating Service Changes
Sent: Wed, 02 Jan 2008
From: Mohr James<[EMAIL PROTECTED]>

> Hi All!
> 
> First off my apologies. Apparently I forget to stop the subscription
> while I was on vacation, so I would imagine that most everyone got a
> vacation notice from me. Sorry!
> 
> We have a situation (using Nagios 2.5) we need to monitor/report
> services "conceptually". That is, we have several services that are
> monitored and only when all of these are not accessible do we report
> that the main service is not accessible. I couldn't find any information
> on propagating services, so I was thinking about using an event handler
> that would run whenever one of the low-level services changes state. The
> handler would then check the other services (perhaps using the
> Nagios-Object perl module) and if all of the other services are
> critical, the handler would use send_nsca to send a message to the
> master service. However, if there is some mechanism already built in
> (even Nagios 3.0) that would save me the work of re-inventing the wheel.
> 
> I would be grateful for any input or ideas.
> 
> Regards,
> 
> Jim Mohr
> 
> 
> 

--- original message end 



