Thanks, Karl. This was my suspicion, as one of the SQL Servers did show
signs of high CPU utilization. The other SQL Server didn't seem to have
high CPU usage, but it could be that we are logging in to the server just
after the CPU has settled down.
From: Yost, Karl [mailto:[EMAIL PROTECTED]
Sent: Friday, February 22, 2008 3:19 PM
To: Ed Anderson; [email protected]
Subject: RE: [Nagios-users] Intermittent issues with check_nt and/or
NSClient
I tend to get these when the Windows box is at high CPU utilization.
Other times, when it wasn't the CPU, just cycling the service has fixed
it. 98% of the time I have found it to be CPU.
Thanks,
Karl Yost
IQOR
[EMAIL PROTECTED]
Vice President, Technology
Phone: 614.284.3985
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ed
Anderson
Sent: Friday, February 22, 2008 5:48 PM
To: [email protected]
Subject: [Nagios-users] Intermittent issues with check_nt and/or
NSClient
I'm hoping that someone has run into this before. I'm having
intermittent issues with check_nt, which produces any of the following
error messages:
Could not fetch information from server
or
Connection reset by peer
or
No data was received from host
This only happens with 2 remote servers (so far), both of which are
Windows 2k3 EE / SQL 2k5 database servers. The symptoms last for about
13-15 minutes, enough time for the notifications to be sent out, and then
the checks snap back to OK status. I'm using NSClient on all of my Windows
servers, and the check_nt plugin to watch things such as disk
utilization, services, CPU, etc. Here are some samples of the check_nt
commands I'm using:
$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -l $ARG1$
$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CPULOAD -l $ARG1$
$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v USEDDISKSPACE -l $ARG1$ -w $ARG2$ -c $ARG3$
Something else to note: these Nagios service checks have been in place
for quite some time, and they have only recently started acting up, I'd
say within the last couple of weeks.
Any ideas? Is this as simple as increasing the timeout for check_nt, or
does this mean I have some other underlying issues I need to address?
I'm thinking 10 seconds should be plenty of time to run the service
checks as the servers are all within the same location.
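On the timeout question, here is a minimal sketch of what a longer plugin timeout could look like as a Nagios command definition. It assumes the installed check_nt accepts the standard -t option (timeout in seconds). The command name and the 30-second value are hypothetical, and the service_check_timeout setting in nagios.cfg would need to stay above whatever value is used here.

```
# Hypothetical command definition: command_name and the 30 s timeout are illustrative only.
define command{
        command_name    check_nt_cpuload_t30
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -t 30 -v CPULOAD -l $ARG1$
        }
```

A longer timeout would probably only help when the agent is slow to answer. It is unlikely to change anything when the connection is reset outright.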
Nagios version = 2.10
check_nt version = v1590
NSClient version = ?? not sure how to find it.
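For the NSClient version, check_nt has a CLIENTVERSION variable that asks the agent to report its own version. A sketch in the same style as the sample commands, assuming the agent answers on port 1248 as it does for the other checks:

```
$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CLIENTVERSION
```

When running this by hand from a shell, $USER1$ and $HOSTADDRESS$ would be replaced with the plugin directory and the server's address.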
Edward Anderson
Information Services| BI Administrator/Report Developer
[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> | cell: 971.221.7523
| office: 503.624.1951 Ext. 4121
NWEA
Partnering to help all kids learn
5885 SW Meadows Road, Suite 200
Lake Oswego, OR 97035-3256
Fax: 503.639.7873