On Tue, 9 Nov 2010, John Feuerstein wrote:
On 11/09/2010 05:34 PM, [email protected] wrote:
On Tue, 9 Nov 2010, John Feuerstein wrote:
BTW, thinking more about it... since the problem always occurs after
some days - perhaps it is related to using %$now% within the filename
templates in combination with the $DynaFileCacheSize?
The lsof shows a lot of open FDs ("REG") for "(deleted)" files of
previous days. FD leaking?
does a kill -HUP (which tells rsyslog to close and re-open all output
files) clear up the problem?
This does indeed do a *LOT* to lsof output. The difference between the
pre- and post- lsof outputs is huge. I guess that we are on the right
track here: the old FDs marked as "(deleted)" are not closed properly by
rsyslogd?
lsof output before sending SIGHUP:
http://biz.baze.de/debug/rsyslog/pre-post-SIGHUP/pre-lsof.txt
lsof output after sending SIGHUP:
http://biz.baze.de/debug/rsyslog/pre-post-SIGHUP/post-lsof.txt
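To put a number on the difference between the two lsof dumps, something like the following might help. It is only a rough sketch: it assumes lsof's default nine-column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) with every column filled in, and the function name is made up for illustration.

```python
def deleted_regular_files(lsof_text):
    """Return the NAME column of every REG entry that lsof marks as (deleted).

    Assumes the default nine-column lsof layout; lines with fewer
    columns (or a blank column) are skipped rather than guessed at.
    """
    names = []
    for line in lsof_text.splitlines():
        # NAME is the 9th column and may itself contain spaces,
        # e.g. "/path/to/file (deleted)", so split at most 8 times.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        if fields[4] == "REG" and fields[8].endswith("(deleted)"):
            names.append(fields[8])
    return names
```

Running it over the saved pre- and post-SIGHUP dumps (for example `len(deleted_regular_files(open("pre-lsof.txt").read()))`) and comparing the two counts would show how many stale FDs the SIGHUP released.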
Note that I'm using rsyslogd's %$now% dynamic file template only to
rotate logs without the need of any external helper. So I never send
SIGHUP, because I expect rsyslogd to handle that internally?
However, the original problem of the permanent 100% CPU thread is still
there, see the pre- and post-process-tree.txt files at:
http://biz.baze.de/debug/rsyslog/pre-post-SIGHUP/
one other thing that I find useful to understand things is to do a strace
on the other threads to identify what each of them is.
you have the parent thread (doesn't do much)
each input will have its own thread listening.
you will have one (or more if you have multiple worker threads configured)
output threads.
if you can identify what is happening on the other threads, you may be
able to figure out what's happening on the problem thread by process of
elimination.
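As a starting point for that elimination, a sketch along these lines could list the per-thread CPU counters so the busy thread stands out before strace is attached. It is Linux-only, reads /proc/<pid>/task/<tid>/stat, and relies on utime and stime being fields 14 and 15 of that file as described in proc(5); the function names are made up for illustration.

```python
import os


def parse_stat_cpu_ticks(stat_line):
    """Return (comm, utime, stime) from one /proc/.../stat line.

    The comm field is wrapped in parentheses and may contain spaces,
    so split on the last ')' instead of on whitespace.
    """
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # tail starts at field 3 (state), so field 14 (utime) is index 11
    # and field 15 (stime) is index 12.
    return comm, int(fields[11]), int(fields[12])


def thread_cpu_ticks(pid):
    """Map each thread ID of pid to its (comm, utime, stime) tuple."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as fh:
                result[int(tid)] = parse_stat_cpu_ticks(fh.read())
        except FileNotFoundError:
            continue  # thread exited between listdir() and open()
    return result
```

Calling `thread_cpu_ticks()` twice a few seconds apart and comparing the utime/stime values should show which thread ID is accumulating CPU time; that ID can then be passed to `strace -p`.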
David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com