Re: [Nagios-users] Unable to move file to check results queue

2011-09-08 Thread mail
On Wed, Sep 7, 2011 at 1:18 PM, Jonathan Gazeley
jonathan.gaze...@bristol.ac.uk wrote:
 Hi list,

 I've used Nagios for a few years now, largely without any problems, but
 since I just rebuilt my Nagios server I'm having a problem.

 My nagios log file is full of entries like this, that recur every few
 seconds:

 Error: Unable to rename file
 '/var/log/nagios/spool/checkresults/checkf8zhrH' to
 '/var/log/nagios/spool/checkresults/c8M6TqA': No such file or directory
 Warning: Unable to move file
 '/var/log/nagios/spool/checkresults/checkf8zhrH' to check results queue.
 Error: Unable to rename file
 '/var/log/nagios/spool/checkresults/check3OnQ7y' to
 '/var/log/nagios/spool/checkresults/cKzmO7d': No such file or directory
 Warning: Unable to move file
 '/var/log/nagios/spool/checkresults/check3OnQ7y' to check results queue.
 Error: Unable to rename file
 '/var/log/nagios/spool/checkresults/checkbsjxap' to
 '/var/log/nagios/spool/checkresults/c6TEIkd': No such file or directory
 Warning: Unable to move file
 '/var/log/nagios/spool/checkresults/checkbsjxap' to check results queue.
 Error: Unable to rename file
 '/var/log/nagios/spool/checkresults/checkyHICiz' to
 '/var/log/nagios/spool/checkresults/c28Thaw': No such file or directory
 Warning: Unable to move file
 '/var/log/nagios/spool/checkresults/checkyHICiz' to check results queue.
 Error: Unable to rename file
 '/var/log/nagios/spool/checkresults/checknXxstZ' to
 '/var/log/nagios/spool/checkresults/cNhpsRH': No such file or directory
 Warning: Unable to move file
 '/var/log/nagios/spool/checkresults/checknXxstZ' to check results queue.


 I see from searching for the problem online that it can be caused by
 multiple running instances of nagios. When I do a ps -ef | grep nagios
 there are usually 4 processes - one that seems persistent (2337 in this
 case) and the other 3 that disappear and reappear with new pids. Killing
 the 3 extra processes makes them just reappear. Is this normal?

 [root@monitor ~]# ps -ef | grep \/usr\/sbin\/nagios
 nagios    2337     1  0 13:05 ?        00:00:02 /usr/sbin/nagios -d
 /etc/nagios/nagios.cfg
 nagios   15453     1  0 13:12 ?        00:00:00 /usr/sbin/nagios -d
 /etc/nagios/nagios.cfg
 nagios   15621     1  0 13:12 ?        00:00:00 /usr/sbin/nagios -d
 /etc/nagios/nagios.cfg
 nagios   15707     1  0 13:12 ?        00:00:00 /usr/sbin/nagios -d
 /etc/nagios/nagios.cfg
 root     15744  6284  0 13:12 pts/0    00:00:00 grep /usr/sbin/nagios


It is usual to see multiple processes (at least on our systems anyway ;)
A little confused about your PPIDs though: should they not be owned
by the original Nagios process?

~# ps -ef | grep nagios.cfg
    root 22219 22001   0 16:00:50 pts/9   0:00 grep nagios.cfg
  nagios 22192  9808   0 16:00:49 ?       0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
  nagios 22207  9808   0 16:00:50 ?       0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
  nagios 22213  9808   0 16:00:50 ?       0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
  nagios  9808 19242   3 14:27:08 ?       5:57 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
  nagios 22212  9808   0 16:00:50 ?       0:00 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg

nag03 ~]# ps -ef | grep nagios.cfg
nagios     757     1 24 Aug15 ?        5-20:20:43 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   27004   757  0 16:02 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   28460   757  0 16:02 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   29513   757  0 16:02 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   29760   757  0 16:02 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   30516   757  0 16:02 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
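
A rough way to compare the two cases is to count the nagios processes per
parent PID. In the listings above the short-lived processes hang off the
main daemon (9808 and 757), whereas in the original post all four show
PPID 1. The helper name count_by_ppid is made up for this sketch, and it
assumes "ps -ef" output where column 1 is the user and column 3 the PPID.

```shell
# Read "ps -ef"-style lines on stdin and print "PPID COUNT" for every
# parent that owns at least one process running as user "nagios".
# Assumes column 1 is the user name and column 3 is the parent PID.
count_by_ppid() {
  awk '$1 == "nagios" { n[$3]++ } END { for (p in n) print p, n[p] }' | sort -n
}
```

Something like ps -ef | grep '[n]agios.cfg' | count_by_ppid should then show
at a glance whether the workers belong to the daemon or to PID 1.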

hth
--
ritchie

 This is a 64-bit CentOS 6.0 virtual machine. It was running SELinux but
 I disabled it for debugging in case it was causing problems.

 According to ls -la, /var/log/nagios/spool/checkresults/ and its parents
 are traversable and writable by the nagios user.

 I also saw online that sometimes permissions on /dev/null can cause this
 problem, but in my case /dev/null is world-writable so I can't see a
 problem.

 I adjusted max_check_result_file_age to 0 in case my checkresult files
 were being deleted prematurely, but the problem persists.

 So, I have no idea what to look at next while troubleshooting this. Can
 anyone suggest a pointer?

 Many thanks,
 Jonathan


[Nagios-users] Unable to move file to check results queue

2011-09-07 Thread Jonathan Gazeley
Hi list,

I've used Nagios for a few years now, largely without any problems, but 
since I just rebuilt my Nagios server I'm having a problem.

My nagios log file is full of entries like this, that recur every few 
seconds:

Error: Unable to rename file 
'/var/log/nagios/spool/checkresults/checkf8zhrH' to 
'/var/log/nagios/spool/checkresults/c8M6TqA': No such file or directory
Warning: Unable to move file 
'/var/log/nagios/spool/checkresults/checkf8zhrH' to check results queue.
Error: Unable to rename file 
'/var/log/nagios/spool/checkresults/check3OnQ7y' to 
'/var/log/nagios/spool/checkresults/cKzmO7d': No such file or directory
Warning: Unable to move file 
'/var/log/nagios/spool/checkresults/check3OnQ7y' to check results queue.
Error: Unable to rename file 
'/var/log/nagios/spool/checkresults/checkbsjxap' to 
'/var/log/nagios/spool/checkresults/c6TEIkd': No such file or directory
Warning: Unable to move file 
'/var/log/nagios/spool/checkresults/checkbsjxap' to check results queue.
Error: Unable to rename file 
'/var/log/nagios/spool/checkresults/checkyHICiz' to 
'/var/log/nagios/spool/checkresults/c28Thaw': No such file or directory
Warning: Unable to move file 
'/var/log/nagios/spool/checkresults/checkyHICiz' to check results queue.
Error: Unable to rename file 
'/var/log/nagios/spool/checkresults/checknXxstZ' to 
'/var/log/nagios/spool/checkresults/cNhpsRH': No such file or directory
Warning: Unable to move file 
'/var/log/nagios/spool/checkresults/checknXxstZ' to check results queue.


I see from searching for the problem online that it can be caused by 
multiple running instances of nagios. When I do a ps -ef | grep nagios 
there are usually 4 processes - one that seems persistent (2337 in this 
case) and the other 3 that disappear and reappear with new pids. Killing 
the 3 extra processes makes them just reappear. Is this normal?

[root@monitor ~]# ps -ef | grep \/usr\/sbin\/nagios
nagios    2337     1  0 13:05 ?        00:00:02 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   15453     1  0 13:12 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   15621     1  0 13:12 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
nagios   15707     1  0 13:12 ?        00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg
root     15744  6284  0 13:12 pts/0    00:00:00 grep /usr/sbin/nagios


This is a 64-bit CentOS 6.0 virtual machine. It was running SELinux but 
I disabled it for debugging in case it was causing problems.

According to ls -la, /var/log/nagios/spool/checkresults/ and its parents
are traversable and writable by the nagios user.
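
For anyone wanting to repeat that permissions check mechanically, here is a
minimal sketch. The function name check_dir_access is invented for
illustration, and it only tests access for whichever user runs it, so it
would need to be run from a shell started as the nagios user to say
anything about Nagios itself.

```shell
# Walk from the target directory up to /, reporting the first component
# (starting from the deepest) that the current user cannot traverse, then
# check that the target directory itself is writable.
check_dir_access() {
  target=$1
  dir=$target
  while :; do
    if [ ! -x "$dir" ]; then
      echo "not traversable: $dir"
      return 1
    fi
    # Stop at the filesystem root (or "." if a relative path was given).
    case $dir in
      /|.) break ;;
    esac
    dir=$(dirname "$dir")
  done
  if [ ! -w "$target" ]; then
    echo "not writable: $target"
    return 1
  fi
  echo "ok: $target"
}
```

Calling check_dir_access /var/log/nagios/spool/checkresults prints a single
line naming either the offending directory or the target with "ok:".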

I also saw online that sometimes permissions on /dev/null can cause this 
problem, but in my case /dev/null is world-writable so I can't see a 
problem.

I adjusted max_check_result_file_age to 0 in case my checkresult files 
were being deleted prematurely, but the problem persists.
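
Since both paths in the rename errors sit in the same checkresults
directory, it may also be worth listing where the main config sends temp
files and check results alongside the age setting. This is only a sketch:
show_spool_settings is a made-up helper name, and it assumes the usual
name=value layout of nagios.cfg with check_result_path and temp_path as
the relevant directives.

```shell
# Print the check-result related directives from a Nagios main config
# read on stdin. Commented-out lines (starting with '#') are skipped
# because the pattern only matches a directive name at the start of a line.
show_spool_settings() {
  grep -E '^[[:space:]]*(check_result_path|temp_path|max_check_result_file_age)[[:space:]]*='
}
```

Usage would be along the lines of
show_spool_settings < /etc/nagios/nagios.cfg, which prints only the matching
lines exactly as they appear in the file.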

So, I have no idea what to look at next while troubleshooting this. Can 
anyone suggest a pointer?

Many thanks,
Jonathan

--
Using storage to extend the benefits of virtualization and iSCSI
Virtualization increases hardware utilization and delivers a new level of
agility. Learn what those decisions are and how to modernize your storage 
and backup environments for virtualization.
http://www.accelacomm.com/jaw/sfnl/114/51434361/
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null