Hello,
posting for the first time in this forum, I hope I am providing enough
details.
I have been using rsyslog v7.4.10 in our embedded system for over a year
now and only recently started noticing a memory leak. Less than 1Mb /
day. In the process of debugging this issue I have also tried v7.6.1,
v8.4.2 and the most recent v8.6.0, all of which exhibited the same
memory leak.
I am now able to speed up this memory leak (multiple Mbs/day) under
certain conditions, which I will try to describe here.
I am using over 50 outchannels in this setup, which are all setup
similar to this:
$template logfmt,"%timegenerated:::date-rfc3339% [%syslogpriority%]
%syslogtag% %msg%\n"
$outchannel sshd_log, /var/log/sshd.log, 500000,
/usr/sbin/logrotate /etc/logrotate.conf
if $programname == 'sshd' then :omfile:$sshd_log;logfmt
The log rotations are setup this way in /etc/logrotate.conf:
/var/log/sshd.log {
rotate 2
size=400k
postrotate
/etc/init.d/rsyslog.sh reload > /dev/null 2>&1 || true
endscript
}
I then generate lots of syslog messages (~200/sec) from various
applications to stress the system.
The initial memory leak I observed was due to the rate-limiting
configuration (after lots of trial and error in rsyslog.conf):
module(load="imuxsock" SysSock.RateLimit.Interval="2"
SysSock.RateLimit.Burst="200" SysSock.RateLimit.Severity="5"
SysSock.UsePIDFromSystem="on")
When I eliminated the rate limiting from the config, part of the memory
leak has disappeared, but not all.
I also have a remote-host configured for forwarding to a syslog server,
but removing that configuration did not make any difference.
I then tried reducing the maxsize for the 50+ outchannels from 500k to
50k and generated again ~200/sec messages (total) in an attempt to speed
up the memory leak. This increased the speed of memory loss. ~1Mb/hour.
I then removed the maxsize from the outchannels and called the logrotate
program from a cron job every 5 minutes. This eliminated the memory
leak. Ran it overnight with no memory loss. So, the problem seems to be
related to (the frequency of) calling logrotate when maxsize is reached,
and possibly related to the fact that logrotate in turn HUPs rsyslog
upon rotating log files. If I just run rsyslog.sh reload (which results
in HUP), I don't see a memory leak either. It has to be rsyslog calling
the maxsize action (logrotate).
I ran valgrind on rsyslog with some success (logrotate's HUP doesn't
seem to be passed down to rsyslog from valgrind). Through valgrind, I
noticed at least two leaks (based on 8.6.0 sources):
1. 15k leak at ochConstruct during initialization:
==18966== 15,891 (32 direct, 15,859 indirect) bytes in 1 blocks are
definitely lost in loss record 88 of 88
==18966== at 0xFF88B9C: calloc (in
/usr/lib/valgrind/vgpreload_memcheck-ppc32-linux.so)
==18966== by 0x1005F9EF: ochConstruct (outchannel.c:47)
==18966== by 0x1005FAAF: ochAddLine (outchannel.c:207)
==18966== by 0x1002B4AF: doNameLine (conf.c:190)
==18966== by 0x1005857B: processCfSysLineCommand (cfsysline.c:760)
==18966== by 0x1002B63B: cfsysline (conf.c:233)
==18966== by 0x1002F08F: cnfDoCfsysline (rsconf.c:453)
==18966== by 0x1001DE5B: yylex (lexer.l:251)
==18966== by 0x1001B0AF: yyparse (grammar.c:1536)
==18966== by 0x1002E593: load (rsconf.c:1278)
==18966== by 0x1000D117: initAll (rsyslogd.c:930)
==18966== by 0x1000A6A3: main (rsyslogd.c:1318)
The pOch pointer doesn't seem to be registered anywhere to be freed
later. This is not a huge issue, because it is only done once during
initialization, not a continuous leak like the other case(s).
2. The next one is in tools/omfile.c in prepareFile:
==16419== 280 bytes in 7 blocks are definitely lost in loss record 81 of 91
==16419== at 0xFF86274: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-ppc32-linux.so)
==16419== by 0x40934FF: strdup (strdup.c:42)
==16419== by 0x10013D2F: prepareFile (unicode-helper.h:46)
==16419== by 0x10014A23: commitTransaction (omfile.c:831)
==16419== by 0x1005B687: actionCommit (action.c:1088)
==16419== by 0x1005C9A3: actionCommitAllDirect (action.c:1256)
==16419== by 0x10054A03: processBatch (ruleset.c:484)
==16419== by 0x1000B7BF: msgConsumer (rsyslogd.c:342)
==16419== by 0x10050347: ConsumerReg (queue.c:1897)
==16419== by 0x1004C3D3: wtiWorker (wti.c:334)
==16419== by 0x1004B077: wtpWorker (wtp.c:390)
==16419== by 0xFE17BE7: start_thread (pthread_create.c:314)
The strdup'd pszSizeLimitCmd (which translates to "/usr/bin/logrotate
/etc/logrotate.conf") doesn't seem to be freed later. I am not sure if
this is a major leak or not. But, it appears that a free for
pszSizeLimitCmd is missing in runtime/stream.c: CODESTARTobjDestruct
routine.
Neither of the above findings explain the rate and amount of memory loss
I am experiencing, so I think some of the codepath is not being executed
when rsyslog is run in the context of valgrind (e.g. the HUP case I
mentioned earlier).
I also have impstats enabled and monitoring the maxrss from there. The
queue sizes are *not* growing.
I will be running mtrace (and maybe coverity) on this setup as well, and
will post further findings. If anyone else knows about any rsyslog
memory leaks due to frequent log rotations, please post to this forum.
Thanks and regards,
Koral
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.