From: Rick Mangus [mailto:[email protected]]
Sent: 29 January 2010 17:02
> Hello, all.
>
> Forgive me, I am new to the list, and have only begun working with nagios
> recently. I have searched this list and googled furiously with little
> result, so must cease my lurking and present my problem to you.
>
> I will begin with the problem: Sometime after midnight every night, my
> nagios server starts to have trouble processing service checks. I don't
> know the cause, and cannot find a solution. I can describe the symptoms in
> detail and hope we can diagnose it.
>
> The web interface shows the last service check came in at 02:28:34 (EST).
> I know that around 4:15 every morning, xinetd starts refusing connections
> to nsca due to high load (max_load is 18), and that eventually I will have
> 32000+ nsca connections using up all available PIDs leading to an
> inability to fork new processes, effectively killing the machine. While
> all this happens, the nagios.log appears to periodically stall, making no
> new entries for 15 minutes at a time, and then flush 15000 in the space of
> a single second. Also, it seems the checkresults directory is empty most
> of the time, but sometimes pops up to 2045 files (it's on a ramdisk with
> 2048 inodes) and not a single one gets deleted in a time period I have
> been patient enough to observe.
>
> The periods in which the nagios log is going nowhere are accompanied by
> nagios taking 100% of 2 CPUs. One thread appears to poll() approximately
> every 25 usecs, and another is inscrutable, with mprotect() the only
> strace-visible syscall. All the nsca processes have a blocking write()
> they are waiting on. When the log is showing new entries, there are still
> no updates made to the services, and it seems that that is what is filling
> up checkresults. I admit I have not checked to find the order of the log
> and checkresults processes, though I assumed they would operate in the
> opposite order of what this appears to show.
>
> I know this behavior has been ongoing for at least 1 month. I have
> disabled all cron jobs that I feared might be interfering. I will answer
> any and all questions to the best of my ability, and hope someone here can
> shed some light on the situation.

1. Do you run ndoutils (to write results to a MySQL database)? If so, which
version? I ask because I used to have a similar problem, which I eventually
tracked down to an interfering backup on the MySQL server that hosted the
database.
2. Do you run other services on the Nagios server which might interfere with
Nagios (e.g. backups which start sometime after midnight)?
3. Have you thought of upgrading to nagios 3.2.0, which is the latest stable
version?
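
If it helps to pin down what else is happening on the box when the stall
begins, a rough sketch like the one below could be left running overnight to
log the 1-minute load average, the number of nsca processes and the number of
files in the checkresults directory. This is only an illustration, not
something taken from Nagios itself: the function names are made up, the
process name "nsca" is an assumption, and the directory should be whatever
check_result_path points at in your nagios.cfg. The process count reads
/proc/<pid>/comm, so it assumes a reasonably recent Linux kernel.

```python
import os
import time


def count_entries(path):
    """Return the number of directory entries in path, or -1 if unreadable."""
    try:
        return sum(1 for _ in os.scandir(path))
    except OSError:
        return -1


def count_processes(name):
    """Count processes whose command name equals name (Linux /proc only).

    Returns -1 if /proc cannot be listed at all.
    """
    try:
        pids = [p for p in os.listdir("/proc") if p.isdigit()]
    except OSError:
        return -1
    total = 0
    for pid in pids:
        try:
            with open("/proc/%s/comm" % pid) as fh:
                if fh.read().strip() == name:
                    total += 1
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return total


def sample(checkresults_dir, proc_name="nsca"):
    """Build one log line with a timestamp, load and the two counts."""
    try:
        load1 = os.getloadavg()[0]
    except (OSError, AttributeError):
        load1 = -1.0
    return "%s load1=%.2f %s_procs=%d checkresults_files=%d" % (
        time.strftime("%Y-%m-%d %H:%M:%S"),
        load1,
        proc_name,
        count_processes(proc_name),
        count_entries(checkresults_dir),
    )


def run(checkresults_dir, interval=60, iterations=None):
    """Print a sample every interval seconds (forever if iterations is None)."""
    done = 0
    while iterations is None or done < iterations:
        print(sample(checkresults_dir), flush=True)
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval)
```

Calling run("/path/to/checkresults") with output redirected to a file on a
real disk (not the ramdisk) should show whether the load or the nsca count
starts climbing before or after checkresults fills up, which can then be
compared against the start times of backups or other nightly jobs.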
Jonathan Wheeler
e-Science Centre
Rutherford Appleton Laboratory
--
Scanned by iCritical.
------------------------------------------------------------------------------
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
_______________________________________________
Nagios-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting
any issue.
::: Messages without supporting info will risk being sent to /dev/null