On Wed, Sep 7, 2011 at 1:18 PM, Jonathan Gazeley <jonathan.gaze...@bristol.ac.uk> wrote:
> Hi list,
>
> I've used Nagios for a few years now, largely without any problems, but
> since I just rebuilt my Nagios server I'm having a problem.
>
> My nagios log file is full of entries like this, that recur every few
> seconds:
>
> Error: Unable to rename file '/var/log/nagios/spool/checkresults/checkf8zhrH' to '/var/log/nagios/spool/checkresults/c8M6TqA': No such file or directory
> Warning: Unable to move file '/var/log/nagios/spool/checkresults/checkf8zhrH' to check results queue.
> Error: Unable to rename file '/var/log/nagios/spool/checkresults/check3OnQ7y' to '/var/log/nagios/spool/checkresults/cKzmO7d': No such file or directory
> Warning: Unable to move file '/var/log/nagios/spool/checkresults/check3OnQ7y' to check results queue.
> Error: Unable to rename file '/var/log/nagios/spool/checkresults/checkbsjxap' to '/var/log/nagios/spool/checkresults/c6TEIkd': No such file or directory
> Warning: Unable to move file '/var/log/nagios/spool/checkresults/checkbsjxap' to check results queue.
> Error: Unable to rename file '/var/log/nagios/spool/checkresults/checkyHICiz' to '/var/log/nagios/spool/checkresults/c28Thaw': No such file or directory
> Warning: Unable to move file '/var/log/nagios/spool/checkresults/checkyHICiz' to check results queue.
> Error: Unable to rename file '/var/log/nagios/spool/checkresults/checknXxstZ' to '/var/log/nagios/spool/checkresults/cNhpsRH': No such file or directory
> Warning: Unable to move file '/var/log/nagios/spool/checkresults/checknXxstZ' to check results queue.
>
>
> I see from searching for the problem online that it can be caused by
> multiple running instances of nagios. When I do a "ps -ef | grep nagios"
> there are usually 4 processes - one that seems persistent (2337 in this
> case) and the other 3 that disappear and reappear with new pids. Killing
> the 3 "extra" processes makes them just reappear. Is this normal?
>
> [root@monitor ~]# ps -ef | grep \/usr\/sbin\/nagios
> nagios 2337 1 0 13:05 ? 00:00:02 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
> nagios 15453 1 0 13:12 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
> nagios 15621 1 0 13:12 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
> nagios 15707 1 0 13:12 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
> root 15744 6284 0 13:12 pts/0 00:00:00 grep /usr/sbin/nagios
>
It is usual to see multiple processes (at least on our systems anyway ;). I'm a little confused about your PPIDs though, e.g. should they not be owned by the original Nagios process?

~# ps -ef | grep nagios.cfg
root 22219 22001 0 16:00:50 pts/9 0:00 grep nagios.cfg
nagios 22192 9808 0 16:00:49 ? 0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 22207 9808 0 16:00:50 ? 0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 22213 9808 0 16:00:50 ? 0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 9808 19242 3 14:27:08 ? 5:57 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
nagios 22212 9808 0 16:00:50 ? 0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg

nag03 ~]# ps -ef | grep nagios.cfg
nagios 757 1 24 Aug15 ? 5-20:20:43 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 27004 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 28460 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 29513 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 29760 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 30516 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg

hth
--
ritchie

>
> This is a 64-bit CentOS 6.0 virtual machine. It was running SELinux but
> I disabled it for debugging in case it was causing problems.
>
> Permissions on ls -la /var/log/nagios/spool/checkresults/ and parents
> are traversable and writable by the nagios user.
>
> I also saw online that sometimes permissions on /dev/null can cause this
> problem, but in my case /dev/null is world-writable so I can't see a
> problem.
>
> I adjusted max_check_result_file_age to 0 in case my checkresult files
> were being deleted prematurely, but the problem persists.
>
> So, I have no idea what to look at next while troubleshooting this. Can
> anyone suggest a pointer?
>
> Many thanks,
> Jonathan
>
> _______________________________________________
> Nagios-users mailing list
> Nagios-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
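
For anyone comparing their own process list against the ones in this thread, the parent-PID check can be done mechanically. The following is only a rough sketch, not anything Nagios ships: the function names parse_ps_ef and split_by_parent and the two sample constants are made up for illustration, and it assumes the usual "UID PID PPID C STIME TTY TIME CMD" column order of `ps -ef`. It separates nagios processes whose parent is another nagios process from those whose parent is something else (PID 1 in both listings). More than one top-level process is a hint worth looking into rather than proof of a second daemon, since a child whose parent has exited is reparented to init.

```python
# Sample `ps -ef` output copied from this thread (wrapped lines rejoined).
ORIGINAL_POSTER_PS = """\
nagios 2337 1 0 13:05 ? 00:00:02 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 15453 1 0 13:12 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 15621 1 0 13:12 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 15707 1 0 13:12 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root 15744 6284 0 13:12 pts/0 00:00:00 grep /usr/sbin/nagios
"""

REPLY_PS = """\
nagios 757 1 24 Aug15 ? 5-20:20:43 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 27004 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 28460 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 29513 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 29760 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios 30516 757 0 16:02 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
"""


def parse_ps_ef(text, command_marker="nagios.cfg"):
    """Return (pid, ppid) pairs for `ps -ef` lines that mention command_marker."""
    procs = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed column order: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 8 or command_marker not in line:
            continue
        if "grep" in fields[7:]:
            continue  # skip the grep process that produced the listing
        try:
            pid, ppid = int(fields[1]), int(fields[2])
        except ValueError:
            continue  # header line or a layout we do not understand
        procs.append((pid, ppid))
    return procs


def split_by_parent(procs):
    """Split into (top_level, children); a child's parent is also in the list."""
    pids = {pid for pid, _ in procs}
    top_level = [(pid, ppid) for pid, ppid in procs if ppid not in pids]
    children = [(pid, ppid) for pid, ppid in procs if ppid in pids]
    return top_level, children


if __name__ == "__main__":
    for name, sample in (("original poster", ORIGINAL_POSTER_PS),
                         ("reply (nag03)", REPLY_PS)):
        top, kids = split_by_parent(parse_ps_ef(sample))
        print(name, "top-level:", len(top), "children:", len(kids))
```

On a live system the same functions could be fed the output of `ps -ef` captured with subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout.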