Thank you, Allan - yes, this is so obvious after you spelled it out for me.
Somehow I was thinking in the absolute numbers of open files instead of % of 
total.

From: Allan Clark [mailto:all...@chickenandporn.com]
Sent: Friday, June 08, 2012 2:09 PM
To: Nagios Users List
Subject: Re: [Nagios-users] monitor number of open files on linux

On Fri, Jun 8, 2012 at 1:53 PM, Parkman, Mikhail 
<mikhail_park...@cable.comcast.com> wrote:
Thanks - I decided to go with check_open_files.pl
http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Linux/check-open-files/details

I didn't find help_me/read_me info for this plugin.
After I installed it on the target box into /usr/local/nagios/libexec and just 
executed it, I got:
----------
[root@target_host libexec]# ./check_open_files.pl
Usage:  -w <warn> -c <crit> [-t <timeout>] [-v version] [-h help]
[root@target_host libexec]#
======
That told me that I should run it at least with "-w some_value1 -c some_value2"
Then I tried to run it with different -w -c values and I am not clear why I am 
getting different threshold values (bold, red) :
===============
[root@target_host libexec]# ./check_open_files.pl -w 500 -c 10000
OK: open files (4590) is below threshold (16194515/323890300)|open_files=4590;16194515;323890300
[root@target_host libexec]# ./check_open_files.pl -w 1000 -c 10000
OK: open files (4590) is below threshold (32389030/323890300)|open_files=4590;32389030;323890300
[root@target_host libexec]# ./check_open_files.pl -w 10 -c 100
OK: open files (4590) is below threshold (323890/3238903)|open_files=4590;323890;3238903
===============
Why do I get in response 2 threshold values and why are they different each 
time I enter another number of warning and critical limits?

In general terms, as with other plugins:

1) you're getting "OK" because 4590 is less than the thresholds you've set; had 
it exceeded 323890 (in the -w 10 example) then you'd get WARNING, and if it 
exceeded the other, a CRITICAL response.  The actual thresholds are returned 
because they are based on a calculation; when the value is below them but the 
user thinks it shouldn't be, the Nagios/Icinga screen shows the reference 
values alongside the status message.
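
To make point 1 concrete, here is a rough sketch of the comparison in Python. 
This is not the plugin's actual code (the plugin is Perl and I have not read 
its source); the function name classify and the use of a strict "greater than" 
comparison are my assumptions.

```python
def classify(open_files, warn_limit, crit_limit):
    """Return a Nagios-style state name for an open-file count.

    Assumption: a limit is breached only when the count is strictly
    greater than it; the real plugin may use >= instead.
    """
    if open_files > crit_limit:
        return "CRITICAL"
    if open_files > warn_limit:
        return "WARNING"
    return "OK"


# Values taken from the "-w 10 -c 100" run quoted above.
print(classify(4590, 323890, 3238903))
```

With the numbers from the -w 10 -c 100 run, 4590 is below both limits, so the 
sketch reports OK, the same as the plugin did.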

2) your question as to why the numbers change might be more complex than I'm 
reading, but it's clearly taking % of total system files as a threshold:

-w 500 --> 500% of (cat /proc/sys/fs/file-max) ==> 16194515
-c 10000 --> 10000% of (cat /proc/sys/fs/file-max) ==> 323890300
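
The arithmetic can be checked with a short Python sketch. The file-max value 
of 3238903 is inferred from your output (it is the limit printed for -c 100), 
and the helper names and the truncating integer division are my assumptions, 
not something taken from the plugin source.

```python
def read_file_max(path="/proc/sys/fs/file-max"):
    """Read the system-wide open-file limit (Linux only, hypothetical helper)."""
    with open(path) as handle:
        return int(handle.read().strip())


def percent_to_limit(percent, file_max):
    """Convert a -w/-c percentage into an absolute open-file count.

    Assumption: the result is truncated to a whole number.
    """
    return file_max * percent // 100


# Inferred from the quoted runs, not read from a live system.
FILE_MAX = 3238903

for percent in (10, 100, 500, 1000, 10000):
    print(percent, percent_to_limit(percent, FILE_MAX))
```

Each value printed by the loop matches one of the limits in the three runs 
quoted above, which is why I read the -w and -c arguments as percentages.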

Have I misread your question(s)?

I would suggest you set your thresholds to alarm on percentages; I'm not sure 
50% and 80% are good numbers, but "-w 50 -c 80" would achieve those.

Allan
--
all...@chickenandporn.com  "金鱼" 
http://linkedin.com/in/goldfish