Re: Nagios + 6.3-RELEASE == Hung Process
On 03/01/2008, at 11:56 AM, Marc G. Fournier wrote:

> As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...

A patch to force the package to link against libthr has been committed [1] and should be available, once mirrors update, as net-mgmt/nagios 2.10_1. This has been tested in the net-mgmt/nagios-devel port [2] since this conversation started, without any negative feedback being received. Bundled in the update is the inclusion of libltdl as requested by Tom Judge.

I'd be interested to know how people go with the updated port. Thanks for your patience.

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=120150
[2] http://www.freebsd.org/cgi/query-pr.cgi?pr=119246

Jarrod.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Nagios + 6.3-RELEASE == Hung Process
At 06:17 PM 2/4/2008, Jarrod Sayers wrote:

> On 03/01/2008, at 11:56 AM, Marc G. Fournier wrote:
>
>> As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...
>
> A patch to force the package to link against libthr has been committed [1] and should be available, once mirrors update, as net-mgmt/nagios 2.10_1. This has been tested in the net-mgmt/nagios-devel port [2] since this conversation started, without any negative feedback being

We have been using nagios linked against libthr via libmap.conf since the end of November and it's been working great since then. Prior to that, we would see 100% CPU usage a couple of times a week on various nagios procs. Hasn't happened since.

---Mike
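For anyone wanting to watch for the symptom described above (a nagios process pinned at or near 100% CPU), here is a minimal sketch, not a tested monitoring tool. The function name spinning_nagios and the default threshold of 90 are assumptions made for illustration; it only filters ps-style text (a header line, then PID, %CPU, COMMAND columns) and prints the matching PIDs.

```shell
#!/bin/sh
# spinning_nagios: hypothetical helper, not part of Nagios or FreeBSD.
# Reads ps-style output on stdin (header line, then: PID %CPU COMMAND)
# and prints the PID of every nagios process at or above the CPU threshold.
spinning_nagios() {
    threshold="${1:-90}"    # assumed default; pick what suits your host
    awk -v t="$threshold" '
        NR > 1 && $3 ~ /nagios/ && ($2 + 0) >= (t + 0) { print $1 }
    '
}

# Example, on the monitored host:
#   ps -axo pid,pcpu,comm | spinning_nagios 90
```

A single sample can catch a process that is legitimately busy, so it is probably worth sampling a few times before killing anything.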
Re: Nagios + 6.3-RELEASE == Hung Process
On Thu, Jan 03, 2008 at 01:50:15PM -0500, Mike Tancsa wrote:

> At 10:55 AM 1/3/2008, Vivek Khera wrote:
>
>>> As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...
>>
>> For years now I've been running with libthr as the default threading library as set in libmap.conf. The *only* issue I've run into is with Java, and that requires libpthread. So my libmap.conf looks like this, and everything works really well (including Nagios, mysql, etc.)
>
> Same here. We were getting quite a few Nagios threads spinning their wheels (almost 1 per day) with 6.3-PRERELEASE
>
> FreeBSD 6.3-PRERELEASE #0: Sun Dec 2
>
> running Nagios 2.5. Changing to libthr fixed the problem and we have yet to see a stuck thread since making the change.

I'm going to pick up PR 119246 and hopefully get it committed in a few days (pending mentor approval). According to my understanding of this thread, this will fix the threading problems people have been discussing here, and will hopefully be picked up in net-mgmt/nagios sooner rather than later. :)

-- WXS
Re: Nagios + 6.3-RELEASE == Hung Process
On Jan 2, 2008, at 8:26 PM, Marc G. Fournier wrote:

>> My gut feeling is that it's not an architecture issue but more an interoperability issue between the Nagios threading code and the libpthread threading library.
>
> As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...

For years now I've been running with libthr as the default threading library as set in libmap.conf. The *only* issue I've run into is with Java, and that requires libpthread. So my libmap.conf looks like this, and everything works really well (including Nagios, mysql, etc.)

--cut here--
# use libthr instead of pthread lib
libpthread.so.2 libthr.so.2
libpthread.so libthr.so

# JDK HotSpot compiler fails randomly with libthr.
[java]
libpthread.so libpthread.so
libpthread.so.2 libpthread.so.2
--cut here--
Re: Nagios + 6.3-RELEASE == Hung Process
At 10:55 AM 1/3/2008, Vivek Khera wrote:

>> As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...
>
> For years now I've been running with libthr as the default threading library as set in libmap.conf. The *only* issue I've run into is with Java, and that requires libpthread. So my libmap.conf looks like this, and everything works really well (including Nagios, mysql, etc.)

Same here. We were getting quite a few Nagios threads spinning their wheels (almost 1 per day) with 6.3-PRERELEASE

FreeBSD 6.3-PRERELEASE #0: Sun Dec 2

running Nagios 2.5. Changing to libthr fixed the problem and we have yet to see a stuck thread since making the change.

---Mike
Re: Nagios + 6.3-RELEASE == Hung Process
Michael Butler wrote:

> Marc G. Fournier wrote:
>
>> G'day ...
>>
>> Yesterday, I set up nagios to do some system monitoring ... installed the latest version from ports into a jail, so that I could easily move it around between machines as I upgrade, without losing data ... after about 30 minutes running, I get a second nagios process running (fork?) that takes up as much CPU time as is available, and just hangs there until I kill -9 it ...
>
> [ .. ]
>
>> After searching the 'Net a bit, came across this thread:
>>
>> http://www.nagiosexchange.org/nagios-users.34.0.html?tx_maillisttofaq_pi1%5Bmode%5D=1tx_maillisttofaq_pi1%5BshowUid%5D=7694
>>
>> That recommends modifying libmap.conf with:
>>
>> [/usr/local/bin/nagios]
>> libpthread.so.2 libthr.so.2
>> libpthread.so libthr.so
>
> Thanks for pointing this out. I've had similar problems with nagios but hadn't found a solution until I saw your pointer. Sadly, my expertise with both thread libraries is sufficiently lacking that I have no clue where to start looking for the cause :-(

I have also seen this issue, but have always put it down to the way that we manage our nagios deployments with cfengine. I will try to deploy this change and monitor for the problem to see if it persists.

On a side note, if you want to use broker modules with nagios from the port, you need to change the following in the port Makefile in order to make them load properly:

From:
USE_AUTOTOOLS= autoconf:259

To:
USE_AUTOTOOLS= autoconf:259 libltdl:15

I sent an email to the maintainer but got no response, and my email did not seem to have affected the last commit to upgrade to 2.10.

Tom
Re: Nagios + 6.3-RELEASE == Hung Process
On Wed, 2008-01-02 at 15:26 +, Tom Judge wrote:

> Michael Butler wrote:

I'm pretty sure that he's getting ready to ship the 3.0 release w/ broken threading on FreeBSD. I haven't had time to test it on NetBSD yet, but since it can be fixed by switching up which threading engine you link against on the Free* side of *BSD, it's likely a long-term fix in Ports instead of polluting the code with #IFDEF's for FreeBSD-specific POSIX thread nits (it's a hard sell, since the same code works fine on Solaris and Linux w/o issue).

What we need is:

1) Nightly builds of Nagios against various releng trees
2) Serious BSD involvement in the project to look at the threading code (beyond me)
3) Bug tracking on the Nagios side

I recently proposed #1 and #2 on nagios-user@, but I got a lot of push-back from Andreas. Any type of professional project management improvements that, quote, "aren't fun" are heavily frowned upon.

~BAS
Re: Nagios + 6.3-RELEASE == Hung Process
On 03/01/2008, at 1:56 AM, Tom Judge wrote:

> I have also seen this issue, but have always put it down to the way that we manage our nagios deployments with cfengine. I will try to deploy this change and monitor for the problem to see if it persists.

I hope I can confirm your frustrations. There is a threading issue with Nagios when its binaries are linked against the libpthread(3) threading library, the default on recent FreeBSD 5.x releases and all 6.x releases. The issue is random and extremely difficult to track down, with the symptoms being a second Nagios process sitting on the system hanging a CPU. Rest assured that I have been working on it, and have seen it on one system of mine.

Changes have been submitted for net-mgmt/nagios-devel (aka Nagios 3.0.r1) to force the build process to link against libthr(3) where available, removing the need to map libpthread out with /etc/libmap.conf. If this goes well, as stated in the PR, I'll back-port it to net-mgmt/nagios (aka Nagios 2.10) in the next few days. If anyone out there is running net-mgmt/nagios-devel and feels like trying it for me, see ports/119246 and drop me an email with a before and after ldd /usr/local/bin/nagios.

> On a side note if you want to use broker modules with nagios from port you need to change the following in the port Makefile in order to make them load properly:
>
> From:
> USE_AUTOTOOLS= autoconf:259
>
> To:
> USE_AUTOTOOLS= autoconf:259 libltdl:15
>
> I sent an email to the maintainer but got no response and my email did not seem to have affected the last commit to upgrade to 2.10

I did receive that email and the changes went in with the last commit of net-mgmt/nagios-devel to test. No issues have arisen, so I'll be back-porting it to net-mgmt/nagios soon for you. There also has been a rather large ports freeze which delayed the upgrade to Nagios 2.10; that PR was submitted on the 1st of November and committed on the 13th of December. Unfortunately your email fell somewhere in the middle, apologies for not letting you know.

Jarrod.
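For the before-and-after ldd comparison requested above, a small sketch like the following may help summarise each run. The helper name thread_lib is an assumption made for illustration. It only pattern-matches library names in whatever ldd prints, so it is a rough text check rather than an authoritative statement of what the runtime linker loaded. If both names appear (for example, when a libmap.conf mapping is in effect), it reports "both" and the raw output should be read by hand.

```shell
#!/bin/sh
# thread_lib: hypothetical helper, not part of FreeBSD or the Nagios port.
# Reads ldd output on stdin and prints which threading library names appear:
# "libthr", "libpthread", "both", or "none".
thread_lib() {
    awk '
        /libthr\.so/     { thr = 1 }
        /libpthread\.so/ { pth = 1 }
        END {
            if (thr && pth) print "both"
            else if (thr)   print "libthr"
            else if (pth)   print "libpthread"
            else            print "none"
        }
    '
}

# Example, run once before and once after updating the port:
#   ldd /usr/local/bin/nagios | thread_lib
```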
Re: Nagios + 6.3-RELEASE == Hung Process
Jarrod Sayers wrote:

> On 03/01/2008, at 1:56 AM, Tom Judge wrote:
>
>> I have also seen this issue, but have always put it down to the way that we manage our nagios deployments with cfengine. I will try to deploy this change and monitor for the problem to see if it persists.
>
> I hope I can confirm your frustrations. There is a threading issue with Nagios when its binaries are linked against the libpthread(3) threading library, the default on recent FreeBSD 5.x releases and all 6.x releases. The issue is random and extremely difficult to track down, with the symptoms being a second Nagios process sitting on the system hanging a CPU. Rest assured that I have been working on it, and have seen it on one system of mine.

Not sure if this is related at all, but out of the 3 nagios deployments we have here I have only ever seen it on one (it currently has 2 nagios threads spinning CPU time atm). The differences on that server are:

* It is amd64 compared to i386
* It also runs ndo2db from ndoutils 1.4b7

All the systems run 6.2-RELEASE-p5 and nagios-2.9_1; they are also all patched with the GNU libltdl patch below.

Don't know if that info is of any use to you.

> Changes have been submitted for net-mgmt/nagios-devel (aka Nagios 3.0.r1) to force the build process to link against libthr(3) where available, removing the need to map libpthread out with /etc/libmap.conf. If this goes well, as stated in the PR, I'll back-port it to net-mgmt/nagios (aka Nagios 2.10) in the next few days. If anyone out there is running net-mgmt/nagios-devel and feels like trying it for me, see ports/119246 and drop me an email with a before and after ldd /usr/local/bin/nagios.
>
>> On a side note if you want to use broker modules with nagios from port you need to change the following in the port Makefile in order to make them load properly:
>>
>> From:
>> USE_AUTOTOOLS= autoconf:259
>>
>> To:
>> USE_AUTOTOOLS= autoconf:259 libltdl:15
>>
>> I sent an email to the maintainer but got no response and my email did not seem to have affected the last commit to upgrade to 2.10
>
> I did receive that email and the changes went in with the last commit of net-mgmt/nagios-devel to test. No issues have arisen, so I'll be back-porting it to net-mgmt/nagios soon for you. There also has been a rather large ports freeze which delayed the upgrade to Nagios 2.10; that PR was submitted on the 1st of November and committed on the 13th of December. Unfortunately your email fell somewhere in the middle, apologies for not letting you know.

Thanks for this, I currently maintain the patch on our build servers.
Re: Nagios + 6.3-RELEASE == Hung Process
--On Wednesday, January 02, 2008 22:54:33 + Tom Judge [EMAIL PROTECTED] wrote:

> Not sure if this is related at all but out of the 3 nagios deployments we have here I have only ever seen it on one (It currently has 2 nagios threads spinning CPU time atm). The differences on that server are:
>
> * It is amd64 compared to i386

I never tried on i386, but in my case it was an amd64 system as well ... not sure if that is relevant or not ... has anyone seen this problem *with* i386?

-
Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy  Skype: hub.org  ICQ . 7615664
Re: Nagios + 6.3-RELEASE == Hung Process
On Wed, Jan 02, 2008 at 07:24:28PM -0400, Marc G. Fournier wrote:

> --On Wednesday, January 02, 2008 22:54:33 + Tom Judge [EMAIL PROTECTED] wrote:
>
>> Not sure if this is related at all but out of the 3 nagios deployments we have here I have only ever seen it on one (It currently has 2 nagios threads spinning CPU time atm). The differences on that server are:
>>
>> * It is amd64 compared to i386
>
> I never tried on i386, but in my case it was an amd64 system as well ... not sure if that is relevant or not ... has anyone seen this problem *with* i386?

Yes. We run Nagios on an i386 machine (dual Athlon MP 1800+), and I first saw this problem with a build of 6-STABLE as of 2007-10-04, and it continues (if I don't use the libmap.conf settings) with the running system of 6.3-PRERELEASE as of 2007-12-18 and nagios-2.10 (from ports of same date).

--
greg byshenk - [EMAIL PROTECTED] - Leiden, NL
Re: Nagios + 6.3-RELEASE == Hung Process
On Wed, 2 Jan 2008, Tom Judge wrote:

> Jarrod Sayers wrote:
>
>> I hope I can confirm your frustrations. There is a threading issue with Nagios when its binaries are linked against the libpthread(3) threading library, the default on recent FreeBSD 5.x releases and all 6.x releases. The issue is random and extremely difficult to track down, with the symptoms being a second Nagios process sitting on the system hanging a CPU. Rest assured that I have been working on it, and have seen it on one system of mine.
>
> Not sure if this is related at all but out of the 3 nagios deployments we have here I have only ever seen it on one (It currently has 2 nagios threads spinning CPU time atm). The differences on that server are:
>
> * It is amd64 compared to i386
> * It also runs ndo2db from ndoutils 1.4b7
>
> All the systems run 6.2-RELEASE-p5 and nagios-2.9_1, they are also all patched with gnu libltdl patch below.
>
> Don't know if that info is of any use to you.

That's actually good to know, as you're now (unless I am mistaken) the first user to contact me about this problem on non-i386 systems. One user, plus myself, have also seen the issue under Nagios 3.x, both on i386 systems though. I also have a net-mgmt/ndoutils port in the works (less the database support for now) which also has the same issue, so using broker modules doesn't seem to affect the outcome.

My gut feeling is that it's not an architecture issue but more an interoperability issue between the Nagios threading code and the libpthread threading library.

[yoink]

>> I did receive that email and the changes went in with the last commit of net-mgmt/nagios-devel to test. No issues have arisen, so I'll be back-porting it to net-mgmt/nagios soon for you. There also has been a rather large ports freeze which delayed the upgrade to Nagios 2.10; that PR was submitted on the 1st of November and committed on the 13th of December. Unfortunately your email fell somewhere in the middle, apologies for not letting you know.
>
> Thanks for this, I currently maintain the patch on our build servers.

No worries, I will look at bundling in the change with the libthr fix over the next few days. Thanks for pointing that out too, as it was a bug instead of a feature request: on systems where the library was available, the build process would link to it. Hmm...

Jarrod.
Re: Nagios + 6.3-RELEASE == Hung Process
Marc G. Fournier wrote:

> I never tried on i386, but in my case it was an amd64 system as well ... not sure if that is relevant or not ... has anyone seen this problem *with* i386?

When I read about it, I was in the middle of upgrading the problem machine to 7-stable - which now reports as follows:

FreeBSD 7.0-PRERELEASE #0: Tue Jan 1 22:12:02 EST 2008
    [EMAIL PROTECTED]:/usr/obj/usr/src/sys/AARON
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel Pentium III (701.59-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x681  Stepping = 1
  Features=0x387f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 1073479680 (1023 MB)
avail memory = 1041297408 (993 MB)
kbd1 at kbdmux0
acpi0: INTEL TR440BXA on motherboard
Re: Nagios + 6.3-RELEASE == Hung Process
--On Thursday, January 03, 2008 11:05:16 +1030 Jarrod Sayers [EMAIL PROTECTED] wrote:

> That's actually good to know, as you're now (unless I am mistaken) the first user to contact me about this problem on non-i386 systems. One user, plus myself, have also seen the issue under Nagios 3.x, both on i386 systems though. I also have a net-mgmt/ndoutils port in the works (less the database support for now) which also has the same issue so using broker modules doesn't seem to affect the outcome.
>
> My gut feeling is that it's not an architecture issue but more an interoperability issue between the Nagios threading code and the libpthread threading library.

As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...

-
Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy  Skype: hub.org  ICQ . 7615664
Re: Nagios + 6.3-RELEASE == Hung Process
Marc G. Fournier wrote:

> G'day ...
>
> Yesterday, I set up nagios to do some system monitoring ... installed the latest version from ports into a jail, so that I could easily move it around between machines as I upgrade, without losing data ... after about 30 minutes running, I get a second nagios process running (fork?) that takes up as much CPU time as is available, and just hangs there until I kill -9 it ...

[ .. ]

> After searching the 'Net a bit, came across this thread:
>
> http://www.nagiosexchange.org/nagios-users.34.0.html?tx_maillisttofaq_pi1%5Bmode%5D=1tx_maillisttofaq_pi1%5BshowUid%5D=7694
>
> That recommends modifying libmap.conf with:
>
> [/usr/local/bin/nagios]
> libpthread.so.2 libthr.so.2
> libpthread.so libthr.so

Thanks for pointing this out. I've had similar problems with nagios but hadn't found a solution until I saw your pointer. Sadly, my expertise with both thread libraries is sufficiently lacking that I have no clue where to start looking for the cause :-(

Michael
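The per-binary libmap.conf stanza quoted above can also be generated rather than typed by hand, which may be handy when deploying it to several hosts. This is a sketch only: the function name libmap_stanza is an assumption for illustration, and the library version suffix (.so.2) is copied from the thread, so it should be checked against the target release before use.

```shell
#!/bin/sh
# libmap_stanza: hypothetical helper, not part of FreeBSD.
# Prints a libmap.conf stanza that maps libpthread to libthr
# for the single binary whose path is given as $1.
libmap_stanza() {
    printf '[%s]\n' "$1"
    printf 'libpthread.so.2 libthr.so.2\n'   # versions as quoted in the thread
    printf 'libpthread.so libthr.so\n'
}

# Example (as root; review the output before appending):
#   libmap_stanza /usr/local/bin/nagios >> /etc/libmap.conf
```

Scoping the mapping to one binary, as the quoted recommendation does, leaves every other program on the system with its default threading library.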