Howdy Mark, Thanks! It seems much better. I've had it running on a few systems here for about 5 hours now and haven't seen any "MAXFD" messages in the logs or any of the accompanying "zombie" processes (I didn't notice it was creating a zombie for each fd until after I sent this report).
BTW/FYI, SVN is missing a few files needed to make the doc folder. In particular, scan.eps and n_loadavg.eps for cfengine-Anomalies.texinfo and the cfcomdoc.css referenced in doc/Makefile.

jack/Slick

On Tue, 2008-04-22 at 08:43 +0200, Mark Burgess wrote:
> I found a file descriptor leak that would only occur on linux from the
> latest change that added cpu utilization monitoring. I am testing it
> now. Please check too and tell me if things get better.
>
> SiliconSlick wrote:
> > Howdy all,
> >
> > I rebuilt an RPM using the latest SVN code
> > late last week and installed it here. I'm
> > now getting a lot of cfenvd failures. It
> > appears to be leaking file descriptors.
> >
> > A sample from the log gives the following:
> >
> > Apr 19 22:34:52 gemenon cfenvd[22964]: File descriptor 1021 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 22:34:52 gemenon cfenvd[22964]: File descriptor 1022 of child
> > 13520 higher than MAXFD, check for defunct children
> > Apr 19 22:34:52 gemenon cfenvd[22964]: File descriptor 1022 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 22:37:22 gemenon cfenvd[22964]: File descriptor 1022 of child
> > 13533 higher than MAXFD, check for defunct children
> > Apr 19 22:37:22 gemenon cfenvd[22964]: File descriptor 1022 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 22:40:05 gemenon cfenvd[22964]: Couldn't open average
> > database /var/cfengine/state/cf_observations.db
> > Apr 19 22:40:05 gemenon cfenvd[22964]: db_open: Too many open files
> > Apr 19 22:40:05 gemenon cfenvd[22964]: Error reading average database
> > Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Executing
> > shell command: /etc/rc.d/init.d/cfenvd restart
> > Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Restart:
> > Stopping cfengine anomaly detection service (cfenvd): [FAILED]
> > Apr 19 23:01:43 gemenon cfenvd[14285]: Lock
> > lock.db.localhost.cfenvd.daemon_2743 expired (after 2575/1 minutes)
> > Apr 19 23:01:43 gemenon cfenvd[14283]: cfenvd: starting
> > Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Restart:
> > Starting cfengine anomaly detection service (cfenvd): [ OK ]
> > Apr 19 23:01:44 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: (Done
> > with /etc/rc.d/init.d/cfenvd restart)
> > Apr 19 23:24:21 gemenon cfenvd[14285]: LDT Buffer full at 10
> > Apr 19 23:39:22 gemenon cfenvd[14285]: File descriptor 20 of child
> > 14584 higher than MAXFD, check for defunct children
> > Apr 19 23:39:22 gemenon cfenvd[14285]: File descriptor 20 of child
> > higher than MAXFD, check for defunct children
> > Apr 19 23:41:52 gemenon cfenvd[14285]: File descriptor 20 of child
> > 14602 higher than MAXFD, check for defunct children
> > Apr 19 23:41:52 gemenon cfenvd[14285]: File descriptor 20 of child
> > higher than MAXFD, check for defunct children
> >
> > It appears after 40 minutes, it has reached MAXFD==20. It goes
> > along for a while and then eventually dies a day and a half later
> > (at ~30/hour and with 1024 fds avail, about 34 hours).
> >
> > Given Mark's recent changes and request for help with cfenvd,
> > I thought it might be related. Looking at the diff between
> > revision 550 and 553 of cfenvd.c[*], I'm thinking the culprit
> > might be a return without "fclose(fp)" on line 1404[**]. I haven't
> > tested a fix yet since I'm not sure what the fix is (close
> > the file first?... don't return?).
> >
> > Does this seem like it could be the cause of the problem
> > I'm seeing above? Anyone else having similar problems?
> >
> > jack/SiliconSlick
> >
> > [*]
> > http://svn.iu.hio.no/viewvc/trunk/src/cfenvd.c?root=Cfengine-2&r1=550&r2=553
> >
> > [**] this bit:
> >
> > else
> >    {
> >    Verbose("Found nothing (%s)\n",cpuname);
> >    index = ob_spare;
> >    return;
> >    }
> >
> > _______________________________________________
> > Bug-cfengine mailing list
> > [email protected]
> > https://cfengine.org/mailman/listinfo/bug-cfengine
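
For anyone following the thread, here is a minimal sketch of the kind of leak suspected above and one common way to avoid it. This is not the actual cfenvd.c code or Mark's fix: the function name find_cpu_line, its parameters, and the file layout it parses are all hypothetical, chosen only to illustrate the pattern. The idea is to route the "found nothing" case through the same exit as the success case, so fclose(fp) runs on every path once the file has been opened.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration, not cfenvd.c):
   look for a line starting with cpuname in the file at path and store
   the first number that follows it. Returns 0 if found, -1 otherwise. */
int find_cpu_line(const char *path, const char *cpuname, long *first_counter)
{
    FILE *fp = fopen(path, "r");
    char line[256];
    int result = -1;                 /* start from "found nothing" */
    size_t len = strlen(cpuname);

    if (fp == NULL)
    {
        return -1;                   /* nothing was opened, nothing to close */
    }

    while (fgets(line, sizeof line, fp) != NULL)
    {
        /* The trailing-space check makes "cpu" match "cpu ..." but not "cpu0 ..." */
        if (strncmp(line, cpuname, len) == 0 && line[len] == ' ')
        {
            if (sscanf(line + len, "%ld", first_counter) == 1)
            {
                result = 0;
            }
            break;
        }
    }

    /* Single exit: the stream is closed whether or not a match was found.
       A bare "return;" in the not-found branch would skip this call and
       leak one descriptor per invocation. */
    fclose(fp);
    return result;
}
```

Either of the options raised in the question (close the file before returning, or don't return early) gives the same result; the single-exit form just makes it harder to reintroduce the leak when another branch is added later.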
