> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-[EMAIL PROTECTED] On Behalf Of Michael W. Lucas
> Sent: Tuesday, June 19, 2007 5:16 AM
> To: Kyle Sexton
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios
>
> On Mon, Jun 18, 2007 at 06:42:18PM -0500, Kyle Sexton wrote:
> > On 12/14/06, Andreas Ericsson <[EMAIL PROTECTED]> wrote:
> > > Jonathan Call wrote:
> > > >
> > > > Given your ideas and some google work I seem to have found my
> > > > problem:
> > > >
> > > > http://lists.freebsd.org/pipermail/freebsd-hackers/2005-August/013247.html
> > > >
> > > > Not a pretty discussion. :(
> > > >
> > >
> > > Nope. Definitely not.
> > >
> > > The problem for Nagios is that threading was added after the fact so
> > > Nagios actually breaks some of the *strong* recommendations on what to
> > > do and what not to do in a threaded application after a fork().
> > >
> > > The problem for *BSD and their implementation of the thread library
> > > is that Nagios actually works everywhere but on *BSD, and it *often*
> > > works there too, but not always. This "often-but-not-always" is
> > > usually a sign of a broken implementation, although exactly
> > > "often-but-not-always" is a sign of the errors you'll run into when
> > > you do what Nagios does post-fork().
> > >
> > > I don't know of any other program that has the same problem on *BSD,
> > > but it would be interesting to see if there's a common pattern so one
> > > can pinpoint the exact pattern that causes the lock contention and
> > > races. It would, from a practical point of view, be best to patch it
> > > in the library, as that is a fix that would work for all possible
> > > future problems as well, although it's technically more correct to
> > > fix it in Nagios.
> > >
> > > Ugly discussion indeed.
> > >
> > > >
> > > > I'll try using a non SMP kernel to see if it might help. If it
> > > > doesn't this pretty much renders Nagios useless on FreeBSD. (Which
> > > > makes me wonder why they even bother maintaining it in ports?)
> > > >
> > >
> > > Out of curiosity, do you use passive checks, active checks or a mix
> > > of both in your setup?
> >
> > Was there ever a solution found to this problem?

No. I was forced to implement a distributed model and limit each server
to fewer than 1000 service checks. Even then I still have to run a cron
job that checks for Nagios children that are spinning on the CPU as a
result of this fork issue. I've found that somewhere beyond 1500 service
checks there will be a random, roughly weekly event that causes almost a
hundred Nagios checks to hit this fork issue all at the same time and
promptly tank the FreeBSD server.

>
> Skimming the (long) discussion thread, my first thought is to try
> libthr instead of libkse. The discussion seems to be on 5.x, I'd
> definitely try libthr on 6.x. Check libmap.conf for details.

Are you referring to this type of mapping within /etc/libmap.conf?

    [/usr/local/bin/nagios]
    libpthread.so.2 libthr.so.2
    libpthread.so   libthr.so

If so I'd be willing to try it on my FreeBSD 6.2 server.

Jonathan
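
For readers of this thread: the cron job mentioned above is not shown in the message, so the following is only a hypothetical sketch of that kind of watchdog, not the script actually in use. It assumes that a "spinning" child can be recognised as a process named `nagios` whose parent is also named `nagios` and whose %CPU is at or above a cutoff. The function names, the 90% threshold and the `ps` column selection are all assumptions made for illustration. The sketch only reports PIDs; it does not kill anything.

```python
import subprocess


def find_spinning_children(ps_output, name="nagios", cpu_threshold=90.0):
    """Return sorted PIDs of `name` processes that look like stuck children.

    Assumed heuristic (not from the original message): a stuck child is a
    process called `name` whose parent is also called `name` and whose
    %CPU is at or above `cpu_threshold`.
    Expects four whitespace-separated columns: PID, PPID, %CPU, COMMAND.
    """
    rows = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        try:
            pid, ppid, cpu = int(fields[0]), int(fields[1]), float(fields[2])
        except ValueError:
            continue  # header row or a line that does not parse
        rows.append((pid, ppid, cpu, fields[3].strip()))

    # PIDs of every process with the target name; used to recognise children.
    named = {pid for pid, _, _, comm in rows if comm == name}

    return sorted(
        pid
        for pid, ppid, cpu, comm in rows
        if comm == name and ppid in named and cpu >= cpu_threshold
    )


def read_ps():
    """Capture a process listing with the four columns the parser expects."""
    result = subprocess.run(
        ["ps", "-axo", "pid,ppid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def main():
    # Print candidate PIDs, one per line, for a human or wrapper to act on.
    for pid in find_spinning_children(read_ps()):
        print(pid)
```

Run from cron, `main()` would print one candidate PID per line. A single %CPU sample can also catch a legitimately busy check, so a real watchdog would probably want to confirm that a PID stays above the cutoff across several runs before acting on it.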