Hey folks. I recently had a co-worker present me with a problem regarding the NSCA plugin. It seems that under certain circumstances (unfortunately, those circumstances are unknown to him and thus to me as well), NSCA just kind of hangs (an strace shows basically an idle screen) and errors like this start flooding the daemon log:
nsca[28640]: Network server accept failure (9: Bad file descriptor)

The quick fix is to restart NSCA, and then everything hums along until the next incident.

It's possible there's a bad block on the disk or something, and an fsck might yield some clues, but I haven't had the chance to schedule downtime to do that yet.

It's also possible it's hitting the fd limit, but in the time I've been monitoring it, I haven't seen any leaking of fds that would point to that as a suspect (the limit is the default of 1024).

Additionally, according to ulimit, the pipe size is 4k, which could be an issue since the NSCA clients write to a pipe on the server (nagios.cmd). But that's only configurable at kernel compile time, and I'd expect to see more widespread reports of problems from other folks in the community if overflowing the default pipe buffer were really the issue.

I've seen some sparse reports on Google of a similar problem, but they're just that: sparse. That kind of makes me think it's not Nagios or NSCA, but a bad block on the hard drive.

Anybody have a similar experience or opinion?

Thanks!
Ryan
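P.S. In case anyone wants to watch for the same fd leak, here is a rough sketch of how the check could be done on Linux by reading /proc. The function names and the proc_root parameter are just made up for illustration (nothing here comes from NSCA or Nagios), and it assumes the usual /proc/<pid>/fd and /proc/<pid>/limits layout.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count the entries in <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(f"{proc_root}/{pid}/fd"))


def soft_fd_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit, or None if it is unlimited."""
    with open(f"{proc_root}/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Columns: "Max open files  <soft>  <hard>  files"
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    raise RuntimeError("no 'Max open files' line found")


def fd_usage(pid, proc_root="/proc"):
    """Return (open descriptor count, soft limit) for the given process."""
    return count_open_fds(pid, proc_root), soft_fd_limit(pid, proc_root)
```

Calling fd_usage() with the nsca PID (28640 in the log line above) every minute or so and logging the pair should show whether the count creeps toward 1024 before the accept failures start. It needs to run as root or as the user that owns the nsca process.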
