On 10/13/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> [EMAIL PROTECTED] wrote on 12/10/2006 21:27:59:
>
> > I'm running NRPE 2.5.1 on a number of machines and occasionally I'll
> > see the processes get stuck just spinning their wheels. It ends up
> > utilizing the CPU heavily and needs to be killed. I haven't been able
> > to track down what is causing this, so I wanted to ask if anyone on
> > this list has seen this behavior before. Also, I've seen that NRPE
> > 2.5.2 fixes "a number of bugs" but do not know if it addresses this
> > specifically. I've included output from top and truss below.
> >
> > top:
> >   PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
> > 20950 nagios     1  20    0 3792K 1688K run    78.6H 49.58% nrpe
> > 20675 nagios     1  20    0 3792K 1720K run    79.0H 48.62% nrpe
> >
> > truss:
> > getpid()                                = 20675 [1]
> > time()                                  = 1160684094
> > getpid()                                = 20675 [1]
> > read(8, 0x00050FF8, 5)                  Err#11 EAGAIN
> > getpid()                                = 20675 [1]
> > time()                                  = 1160684094
> > getpid()                                = 20675 [1]
> > read(8, 0x00050FF8, 5)                  Err#11 EAGAIN
> > getpid()                                = 20675 [1]
> > time()                                  = 1160684094
> > getpid()                                = 20675 [1]
> > read(8, 0x00050FF8, 5)                  Err#11 EAGAIN
> > getpid()                                = 20675 [1]
> > time()                                  = 1160684094
> > getpid()                                = 20675 [1]
> > read(8, 0x00050FF8, 5)                  Err#11 EAGAIN
> > getpid()                                = 20675 [1]
> > time()                                  = 1160684094
> > getpid()                                = 20675 [1]
> > read(8, 0x00050FF8, 5)                  Err#11 EAGAIN
> > getpid()                                = 20675 [1]
>
> Hi,
> I don't have any firm answers for you, but I have seen similar weirdness
> in NRPE on Solaris 8 and 10 when it's compiled with lots of optimisation
> flags: -O2 -funroll-loops etc.
>
> My only recommendation would be to recompile the same source without running
> configure again (make clean; vi Makefile; make), but remove all optimisation
> flags and install just the new nrpe binary on one of the affected servers.
>
> If you're still having problems, please go back through the truss output
> and tell me which open() call (on a file or a socket) returns file
> descriptor 8, and paste in some more info.
>
> Other things to try: If you're compiling 64bit, try 32bit. If you have an
> older version of OpenSSL than 0.9.8b, try updating. If you have an older
> GCC 3.x than 3.3 or 3.4, then try updating.
>
> Cheers
> rob/mossko
>
> _______________________________________________
> Nagios-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting
> any issue.
> ::: Messages without supporting info will risk being sent to /dev/null

Unfortunately, I won't be able to collect all of the truss data because I
am not able to replicate this situation at will. It just seems to happen
after NRPE has been running for a long time. The output I included in the
previous post was just the truss of the running process, captured after I
noticed it happening. Thanks for the suggestions, I'll give them a try.

Mike
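
Editorial note: the truss output quoted above shows read(8, ..., 5) failing with EAGAIN over and over while time() keeps returning the same second, which is the usual signature of a non-blocking read being retried immediately with no wait in between. The sketch below is not NRPE's code and makes no claim about where the bug is. It only illustrates the general pattern of sleeping in select() with a timeout instead of spinning. The function name read_with_timeout and the socketpair-based demo are hypothetical, chosen purely for illustration.

```python
import select
import socket


def read_with_timeout(sock, nbytes, timeout):
    """Read up to nbytes from a non-blocking socket.

    Hypothetical helper for illustration only. Instead of retrying
    recv() immediately when it would block (the EAGAIN case seen in
    the truss output), wait in select() until the socket is readable
    or the timeout expires. Returns None if nothing arrived in time.
    """
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: no data available yet.
            pass
        # Sleep in the kernel rather than burning CPU in a retry loop.
        # Note: the timeout applies per wait, not to the whole call.
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            return None  # give up instead of spinning forever


def demo():
    # A connected socket pair stands in for the client connection.
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    try:
        print(read_with_timeout(reader, 5, 0.2))  # nothing sent yet
        writer.sendall(b"hello")
        print(read_with_timeout(reader, 5, 0.2))  # data is available
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    demo()
```

The point of the timeout is that a peer which never sends anything causes the read to be abandoned after a bounded wait, rather than leaving a process running at full CPU for days as in the top output.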
