It's hard to tell, as I had to reboot the box to get it working properly again. I do know that sendmail was not the first of the defunct processes. I do have triggers set, and I was receiving trigger emails while these defunct processes existed. There are no gaps in the graphs for any services, so JFFNMS was still running, though very slowly because of these orphans. I have enabled SNMP monitoring of the JFFNMS server to try to track the defunct processes if they reappear. If they come back, I will add more information if possible.
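One way to track defunct processes if they reappear is to tally them by command name, which would show whether php, fping, id or sendmail shows up first. What follows is only a rough Python sketch and not part of JFFNMS. It assumes a Linux procps-style `ps` that accepts `-eo stat=,comm=` and marks zombies with a state starting with `Z`; the function names `count_defunct` and `snapshot` are made up for illustration.

```python
import subprocess
from collections import Counter


def count_defunct(ps_output):
    """Tally zombie (defunct) processes per command name.

    Expects text in the shape produced by `ps -eo stat=,comm=`:
    one process per line, state first, command name second.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        stat, comm = parts
        if stat.startswith("Z"):
            # Some ps versions append "<defunct>" to the name; drop it.
            counts[comm.replace("<defunct>", "").strip()] += 1
    return counts


def snapshot():
    """Run ps once and return the current zombie tally (assumes procps ps)."""
    out = subprocess.run(
        ["ps", "-eo", "stat=,comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return count_defunct(out)
```

Calling `snapshot()` from cron every few minutes and logging the result with a timestamp would give a record of which command starts accumulating first, even if the box has to be rebooted again.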

Brad

On Tue, 2005-09-13 at 19:23 -0300, Javier Szyszlican wrote:
> If Sendmail was involved you should check what was happening to it.
>
> Because if you have triggers set, JFFNMS consolidate will be blocked until it
> can send the email and that can cause all sorts of issues.
>
> Javier
>
> Brad Hudson wrote:
> > There appears to be an issue with jffnms leaving orphans behind. My
> > system was running fine for 31 days, but this morning there were over
> > 34,000 orphaned. These processes were mostly php or fping with a
> > smattering of id and sendmail. These defunct processes are unkillable
> > and brought my system to a crawl. I had to reboot to clear them.
> >
> > From what I understand the orphans are caused when a process forks a
> > child and then exits before the child finishes. When this happens the
> > forked process' parent changes to init, which should normally clean them
> > up when they finish running. When a process is defunct before init
> > inherits them, init is unable to clean them and they stay in the system
> > forever or until the system is rebooted.
> >
> > The only reason I can think of for the child to be passed to init as
> > defunct is high load causing a delay between when the real parent exits
> > and the child finishes running with another delay between the parent
> > exiting and the child being passed to init. The only solution I can see
> > to this is to ensure that all children have completed before allowing
> > the parent process to exit. I do not know which specific module left
> > the orphans behind, but it must be one of the pollers as I am not
> > running auto discovery on any hosts.
> >
> > Little problems like these are hardly enough to make me stop using this
> > great product Javier, you should be proud of it. I just wanted to bring
> > it to your attention.
> >
> > Regards;
> >
> > Brad
> >
> >
>
--
Brad Hudson
Systems Analyst
[EMAIL PROTECTED]
Telephone: (613) 694-2681
Fax: (613) 759-1651
