Neither of Frank's suggestions solved it, but something strange did happen. First, there was no known_hosts file under nagios, only under /root. I copied this file over and tried the check again. Nothing.
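
For anyone following along, here is a minimal sketch of how one might check whether a given known_hosts file has an entry for the exact name or address the check uses. It is only an illustration: has_host_entry is a made-up helper name, and it assumes the file stores host names in plain text. If HashKnownHosts is enabled the names are stored hashed, this match will not see them, and `ssh-keygen -F <host> -f <file>` would be the tool to use instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSSH or Nagios): report whether a
# known_hosts file has a plain-text entry for one host name or address.
# Assumption: entries are NOT hashed; hashed entries will not be matched.
has_host_entry() {
    file=$1
    host=$2

    # No readable file at all is a different situation from "no entry".
    [ -r "$file" ] || { echo "missing"; return 0; }

    # The first field of each entry is a comma-separated list of
    # names/addresses; compare each one exactly against the wanted host.
    if awk -v h="$host" '
        NF == 0 || $1 ~ /^#/ { next }   # skip blank lines and comments
        {
            n = split($1, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == h) { found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "$file"; then
        echo "found"
    else
        echo "absent"
    fi
}
```

Calling it once with the host name and once with the IP address, against both the nagios user's file and root's file, would show which of the two files (if either) matches the string that $HOSTADDRESS$ expands to.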
I deleted the lines referring to the host both by name and by IP. I logged in again, and I went directly to the command line without being asked for a host key. I checked both known_hosts files again, and there were NO REFERENCES to the host machine, even though I was in the machine via ssh. Curiouser and curiouser.

In learning more about error logs, I found the following. The command line Nagios sends looks good. The only error it is getting is "Remote command execution failed". I checked the logs on the target machine, and they said "Error: PAM: Authentication failure for nagios from (fully_qualified_machine_name)".

Robert L. O'Donnell

-----Original Message-----

Add "-o StrictHostKeyChecking=no" to your check_by_ssh command line.

Or you can edit ~nagios/.ssh/known_hosts, remove the line for that host, then su to the nagios user and manually ssh to the host. Type "yes" when it asks you about the host key.

-f

On Tue, 20 Apr 2010, O'Donnell, Robert L wrote:

> Date: Tue, 20 Apr 2010 16:18:05 -0700
> From: "O'Donnell, Robert L" <robert.l.odonn...@intel.com>
>
> No joy in Mudville.
>
> I can ssh interactively as user nagios with both the IP address and the name
> of the target machine.
>
> Also, my .cfg file does define the host name and address:
>
> define host{
>     use linux-server
>     host_name (machine_name)
>     alias (machine_name)
>     address (machine_ip)
>     notes xxxxx
>     notes_url (internal_wiki_web_link)
> }
>
> I also tried changing the commands.cfg command_line from check_by_ssh -H
> $HOSTADDRESS$ ... to $HOSTNAME$, with no impact.
>
> Any help would be GREATLY appreciated, as I have been banging my head on this
> for over a week. That said, I know a LOT more about ssh and DSA security
> than I did last week.
>
> ================
>
> Hi O'Donnell!
>
> On Wed, 14 Apr 2010, O'Donnell, Robert L wrote:
>
>> Giorgio,
>>
>> Yes.
>>
>> ssh (remote_machine_ip) -l nagios -i /etc/nagios/.ssh/id_dsa
>>
>> logs me in with prompt "nagios@(remote_machine_name):~>" and
>> whoami returns nagios.
>
> I see that you're logging in by IP, but in the config snippet you
> included previously there was no IP defined in the host definition, so
> $HOSTADDRESS$ will expand to the same value as host_name. You'll need
> to log in using exactly the same address (in this case, the name) that
> Nagios will be using, so that the entry created in your known_hosts file
> matches what Nagios will be using.
>
> ==================
>
> If you try an ssh login from the Nagios box to the remote box, can you log in
> without being asked for a password?
>
> Ciao,
>
> Giorgio
>
> On 14 Apr 2010, at 23:15, "O'Donnell, Robert L"
> <robert.l.odonn...@intel.com> wrote:
>
> Running nagios 3.0.6 on OpenSUSE and trying to get info from a SUSE
> enterprise machine.
>
> check_by_ssh returns UNKNOWN and "Remote command execution failed: Host key
> verification failed"; check_ping works.
>
> I have played with every combo of options I could think of, but same results.
> This is my first setup with remote checks on a Linux machine (Windows
> machine checks working fine), so it could be a newb problem.
>
> (from commands.cfg)
> # check_ssh_dummy command definition
> define command{
>     command_name check_ssh_dummy
>     command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -l nagios -2 \
>         -i /etc/nagios/.ssh/id_dsa -C "/usr/lib/nagios/plugins/check_dummy 0"
> }
>
> I took out the desired response text to eliminate possible double-quote
> issues, but it made no difference.
>
> (from object config)
>
> define service{
>     use local-service
>     host_name (machine_name)
>     service_description PING
>     check_command check_ping!100.0,20%!500.0,60%
> }
>
> define service{
>     use local-service
>     host_name (machine_name)
>     service_description check_ssh_dummy!
> }
>
> From the command line, I run:
>
> /usr/lib/nagios/plugins/check_by_ssh -H (machine_ip) -l nagios -i
> /etc/nagios/.ssh/id_dsa -C "/usr/lib/nagios/plugins/check_dummy 0"
>
> It returns OK.
>
> I am having the same issue with check_ssh_disk. I do not have a MOTD for the
> shell, I have tried putting explicit paths in place of USER1, putting in the
> actual address for HOSTADDRESS, and many options on the
>
> Robert L. O'Donnell
> Advanced Equipment Engineering, IMO
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-users mailing list
> Nagios-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting
> any issue.
> ::: Messages without supporting info will risk being sent to /dev/null