What options does one have when the master Nagios server is getting overloaded?
I have half a dozen slaves doing the polling and submitting passive check results back via send_nsca. The master does no active polling, just event processing, notifications, and the web UI.

Under normal circumstances it works all right. But after a restart it can take up to half an hour for the master to catch up, and if there are a lot of events, the act of sending out notifications can cause it to fall behind.

I'm pre-caching my object file, I'm skipping circular dependency checks, and I've gotten a notification cycle down to 9 seconds. I tried modifying nagios to fork before notifications, but that failed pretty spectacularly, so during those 9 seconds 900 or so passive check submissions block until the notifications are done.

Are there any options for running a dual-master setup, or other ways to spread the load across multiple machines?

Has anyone patched nsca to submit check results into the checkresults directory, instead of via the nagios.cmd pipe? What kind of improvement can one expect from that?

Any other advice?

-- 
Mike Lindsey

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
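To make the checkresults question concrete, here is a rough sketch in Python of what writing a passive result straight into the spool directory might look like. It is not a patch to nsca. The field names, the 7-character "c" file name, and the companion ".ok" marker are my recollection of the Nagios 3.x spool format and should be checked against the source of the version in use. The function names are made up for illustration, and the spool directory is assumed to be whatever check_result_path points to in nagios.cfg.

```python
import os
import random
import string
import time


def format_service_result(host, service, return_code, output, now=None):
    """Render one passive service result (layout assumed from Nagios 3.x)."""
    now = time.time() if now is None else now
    stamp = "%d.%06d" % (int(now), int((now % 1) * 1000000))
    # Keep the result on one line: escape backslashes and embedded newlines.
    safe_output = output.replace("\\", "\\\\").replace("\n", "\\n")
    lines = [
        "### Passive Check Result File ###",
        "file_time=%d" % int(now),
        "",
        "### Nagios Service Check Result ###",
        "host_name=%s" % host,
        "service_description=%s" % service,
        "check_type=1",            # assumed: 1 = passive, 0 = active
        "check_options=0",
        "scheduled_check=0",
        "reschedule_check=0",
        "latency=0.0",
        "start_time=%s" % stamp,
        "finish_time=%s" % stamp,
        "early_timeout=0",
        "exited_ok=1",
        "return_code=%d" % return_code,
        "output=%s" % safe_output,
        "",                        # trailing blank line closes the record
    ]
    return "\n".join(lines) + "\n"


def write_check_result(spool_dir, host, service, return_code, output, now=None):
    """Write a result file plus its .ok marker; return the result file path."""
    body = format_service_result(host, service, return_code, output, now)
    alphabet = string.ascii_letters + string.digits
    while True:
        # Assumed naming rule: 'c' followed by six characters.
        name = "c" + "".join(random.choice(alphabet) for _ in range(6))
        path = os.path.join(spool_dir, name)
        try:
            # O_EXCL so two writers can never share a file name.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
            break
        except FileExistsError:
            continue
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    # The marker is created last, so a reader that waits for it
    # never sees a half-written result file.
    with open(path + ".ok", "w"):
        pass
    return path
```

A call such as write_check_result("/path/to/checkresults", "web01", "HTTP", 2, "CRITICAL - timeout") would drop one result into the spool. The writer has to run as a user whose files the nagios process can read and delete. Whether this actually beats the nagios.cmd pipe under load is exactly the open question above; the sketch only shows the mechanics.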