[Nagios-users] check_by_ssh timeouts / how to work around?
Hi,

I have a couple of machines that produce a warning similar to this:

WARNING - check_by_ssh: Remote command '/home/nagios/nagios-plugs/check_disk' returned status 1

I believe this is caused by the check itself timing out, because when I try to log in it sometimes takes a minute or two just to get a prompt. The server responds to ping, so I'm generally not too concerned about it, and the checks usually clear up within 5 minutes, or as soon as the server gets whatever I/O hog out of the way.

Is anyone else experiencing this, and if so, how do you deal with it?

Thanks,
Charlie

- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100url=/
___
Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_by_ssh timeouts / how to work around?
I should also mention that I have these timeouts in place:

service_check_timeout=90
host_check_timeout=30
event_handler_timeout=30
notification_timeout=60
ocsp_timeout=5
perfdata_timeout=5

Charlie

On Oct 6, 2008, at 10:35 AM, Charlie Reddington wrote:
> I have a couple machines that spit out a warning similar to this:
> WARNING - check_by_ssh: Remote command '/home/nagios/nagios-plugs/check_disk' returned status 1
Re: [Nagios-users] check_by_ssh timeouts / how to work around?
Are you using an LDAP server and RSA keys?

-----Original Message-----
From: Charlie Reddington
Sent: Monday, October 06, 2008 11:35 AM
To: Nagios User list
Subject: [Nagios-users] check_by_ssh timeouts / how to work around?
Re: [Nagios-users] check_by_ssh timeouts / how to work around?
On Mon, October 6, 2008 11:37 am, Charlie Reddington wrote:
> I should also mention that I also have these timeouts in place...
>
> service_check_timeout=90
> host_check_timeout=30
> event_handler_timeout=30
> notification_timeout=60
> ocsp_timeout=5
> perfdata_timeout=5

The timeouts in nagios.cfg are how long the Nagios process waits before aborting a check. There are usually check-specific timeouts that you can add to the command definition. Run the check_* command manually and see what the syntax is (sometimes '-t xx').
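The reply above distinguishes two layers of timeout: the global limit in nagios.cfg and the plugin's own limit set in the command definition. As a rough illustration only, the Python sketch below parses the key=value settings quoted in this thread and checks that a plugin timeout would expire before Nagios aborts the check. The function names `parse_nagios_cfg` and `plugin_timeout_fits` are hypothetical helpers, not part of Nagios, and the rule "plugin timeout below service_check_timeout" is an assumption about what is desirable (the plugin reports its own timeout instead of being killed), not something stated in the thread.

```python
# Hypothetical helpers for illustration; not part of Nagios or its plugins.

def parse_nagios_cfg(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def plugin_timeout_fits(settings, plugin_timeout):
    """Return True if the plugin's own timeout (the value passed with -t)
    is shorter than the service_check_timeout enforced by Nagios."""
    return plugin_timeout < int(settings["service_check_timeout"])


# Values quoted in this thread.
NAGIOS_CFG = """
service_check_timeout=90
host_check_timeout=30
event_handler_timeout=30
notification_timeout=60
ocsp_timeout=5
perfdata_timeout=5
"""

settings = parse_nagios_cfg(NAGIOS_CFG)
print(plugin_timeout_fits(settings, 60))  # prints True
```

With the settings quoted here, a plugin timeout of 60 seconds fits inside the 90-second service check limit, while a value of 90 or more would not.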
Re: [Nagios-users] check_by_ssh timeouts / how to work around?
Sorry, forgot the mailing list.

I'm not using LDAP, but I am using DSA keys.

On Oct 6, 2008, at 10:58 AM, Matt Rivet wrote:
> Are you using a LDAP server and RSA keys?
Re: [Nagios-users] check_by_ssh timeouts / how to work around?
On Oct 6, 2008, at 11:03 AM, James wrote:
> The timeouts in nagios.cfg are how long the nagios process waits before
> aborting a check. There are usually check specific timeouts that you can
> add to the command definition. Run the check_* command manually and see
> what the syntax is (sometimes '-t xx').

I thought I had done that already and had put the --timeout option on check_by_ssh, but I guess not. I added the timeout, raising it from 30 to 60. We'll see how it goes.

Charlie
Re: [Nagios-users] check_by_ssh timeouts / how to work around?
> I believe this to be caused by the check itself is timing out. As when I
> try to login it will sometimes take up to a minute or two just to get a
> prompt.

As for setting the timeouts for that sort of thing, this is what I do. In my resource.cfg:

--
# check_by_ssh timeout
$USER4$=10
--

... and in my commands.cfg definitions:

---
# 'check_disk_remote' command definition
define command {
    command_name    check_disk_remote
    command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -t $USER4$ -C '$USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$'
}
---

I use the same $USER4$ definition for all of the check_by_ssh calls, so that it's easy to tune.

Have you looked into the reason for the long login delay, though? I think I'd start there. A 60 second wait for ssh to get you a shell indicates some sort of problem. Either the target machine is so resource starved that it can't negotiate the authentication and encryption, or you've got some other delay in there. The most likely culprit to my mind is DNS -- ssh itself, login and your shell on the target machine might all be trying to do a reverse DNS lookup on the source of the connection. If that's timing out, it could cause very long delays. There are lots of other potential problems, but I'd start looking there.
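One way to test the reverse DNS theory is to time a reverse lookup of the Nagios server's address from the target machine. The Python sketch below is only an illustration under assumptions: `time_reverse_lookup` is a hypothetical helper, the address 192.0.2.10 in the usage comment is a documentation placeholder rather than anything from this thread, and the lookup goes through the standard library's `socket.gethostbyaddr`, which may not follow exactly the same resolution path as sshd or login on a given system.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Time a reverse DNS lookup for ip.

    Returns (hostname_or_None, elapsed_seconds). The resolver argument is
    injectable so the function can be exercised without touching the network.
    """
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        name = None
    elapsed = time.monotonic() - start
    return name, elapsed


# Example usage on the target machine (placeholder address, an assumption):
# name, seconds = time_reverse_lookup("192.0.2.10")
# print(name, round(seconds, 2))
```

If the lookup takes many seconds or returns no name, that would be consistent with the DNS explanation, though it would not rule out the other causes mentioned above.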