Sorry for the late reply, the busy holiday period and then break took it's
tool...

On Sat, Dec 6, 2014 at 12:03 AM, Koral Ilgun <[email protected]> wrote:

> Hello,
> posting for the first time in this forum, I hope I am providing enough
> details.
>
> I have been using rsyslog v7.4.10 in our embedded system for over a year
> now and only recently started noticing a memory leak. Less than 1Mb / day.
> In the process of debugging this issue I have also tried v7.6.1, v8.4.2 and
> the most recent v8.6.0, all of which exhibited the same memory leak.
> I am now able to speed up this memory leak (multiple Mbs/day) under
> certain conditions, which I will try to describe here.
>
> I am using over 50 outchannels in this setup, which are all setup similar
> to this:
>
> $template logfmt,"%timegenerated:::date-rfc3339% [%syslogpriority%]
> %syslogtag% %msg%\n"
> $outchannel sshd_log, /var/log/sshd.log,             500000,
> /usr/sbin/logrotate /etc/logrotate.conf
> if $programname == 'sshd'               then :omfile:$sshd_log;logfmt
>
>
> The log rotations are setup this way in /etc/logrotate.conf:
>
> /var/log/sshd.log {
>     rotate 2
>     size=400k
>     postrotate
>     /etc/init.d/rsyslog.sh reload > /dev/null 2>&1 || true
>     endscript
> }
>
> I then generate lots of syslog messages (~200/sec) from various
> applications to stress the system.
> The initial memory leak I observed was due to the rate-limiting
> configuration (after lots of trial and error in rsyslog.conf):
>
> module(load="imuxsock" SysSock.RateLimit.Interval="2"
> SysSock.RateLimit.Burst="200" SysSock.RateLimit.Severity="5"
> SysSock.UsePIDFromSystem="on")
>
> When I eliminated the rate limiting from the config, part of the memory
> leak has disappeared, but not all.
>
> I also have a remote-host configured for forwarding to a syslog server,
> but removing that configuration did not make any difference.
>
> I then tried reducing the maxsize for the 50+ outchannels from 500k to 50k
> and generated again ~200/sec messages (total) in an attempt to speed up the
> memory leak. This increased the speed of memory loss. ~1Mb/hour.
>
> I then removed the maxsize from the outchannels and called the logrotate
> program from a cron job every 5 minutes. This eliminated the memory leak.
> Ran it overnight with no memory loss. So, the problem seems to be related
> to (the frequency of) calling logrotate when maxsize is reached, and
> possibly related to the fact that logrotate in turn HUPs rsyslog upon
> rotating log files. If I just run rsyslog.sh reload (which results in HUP),
> I don't see a memory leak either. It has to be rsyslog calling the maxsize
> action (logrotate).
>
> I ran valgrind on rsyslog with some success (logrotate's HUP doesn't seem
> to be passed down to rsyslog from valgrind). Through valgrind, I noticed at
> least two leaks (based on 8.6.0 sources):
>
> 1. 15k leak at ochConstruct during initialization:
>
> ==18966== 15,891 (32 direct, 15,859 indirect) bytes in 1 blocks are
> definitely lost in loss record 88 of 88
> ==18966==    at 0xFF88B9C: calloc (in /usr/lib/valgrind/vgpreload_
> memcheck-ppc32-linux.so)
> ==18966==    by 0x1005F9EF: ochConstruct (outchannel.c:47)
> ==18966==    by 0x1005FAAF: ochAddLine (outchannel.c:207)
> ==18966==    by 0x1002B4AF: doNameLine (conf.c:190)
> ==18966==    by 0x1005857B: processCfSysLineCommand (cfsysline.c:760)
> ==18966==    by 0x1002B63B: cfsysline (conf.c:233)
> ==18966==    by 0x1002F08F: cnfDoCfsysline (rsconf.c:453)
> ==18966==    by 0x1001DE5B: yylex (lexer.l:251)
> ==18966==    by 0x1001B0AF: yyparse (grammar.c:1536)
> ==18966==    by 0x1002E593: load (rsconf.c:1278)
> ==18966==    by 0x1000D117: initAll (rsyslogd.c:930)
> ==18966==    by 0x1000A6A3: main (rsyslogd.c:1318)
>
> The pOch pointer doesn't seem to be registered anywhere to be freed later.
> This is not a huge issue, because it is only done once during
> initialization, not a continuous leak like the other case(s).
>

Yeah, that's cosmetic. Haven't fixed it for now.


> 2. The next one is in tools/omfile.c in prepareFile:
>
> ==16419== 280 bytes in 7 blocks are definitely lost in loss record 81 of 91
> ==16419==    at 0xFF86274: malloc (in /usr/lib/valgrind/vgpreload_
> memcheck-ppc32-linux.so)
> ==16419==    by 0x40934FF: strdup (strdup.c:42)
> ==16419==    by 0x10013D2F: prepareFile (unicode-helper.h:46)
> ==16419==    by 0x10014A23: commitTransaction (omfile.c:831)
> ==16419==    by 0x1005B687: actionCommit (action.c:1088)
> ==16419==    by 0x1005C9A3: actionCommitAllDirect (action.c:1256)
> ==16419==    by 0x10054A03: processBatch (ruleset.c:484)
> ==16419==    by 0x1000B7BF: msgConsumer (rsyslogd.c:342)
> ==16419==    by 0x10050347: ConsumerReg (queue.c:1897)
> ==16419==    by 0x1004C3D3: wtiWorker (wti.c:334)
> ==16419==    by 0x1004B077: wtpWorker (wtp.c:390)
> ==16419==    by 0xFE17BE7: start_thread (pthread_create.c:314)
>
> The strdup'd pszSizeLimitCmd (which translates to "/usr/bin/logrotate
> /etc/logrotate.conf") doesn't seem to be freed later. I am not sure if this
> is a major leak or not. But, it appears that a free for pszSizeLimitCmd is
> missing in runtime/stream.c: CODESTARTobjDestruct routine.
>
>
Yup, and it is not only cosmetic, but not big either (as you say). In any
case, I have now fixed it. The fix will be part of the 8.7.0 release next
Tuesday.


> Neither of the above findings explain the rate and amount of memory loss I
> am experiencing, so I think some of the codepath is not being executed when
> rsyslog is run in the context of valgrind (e.g. the HUP case I mentioned
> earlier).
>
> I also have impstats enabled and monitoring the maxrss from there. The
> queue sizes are *not* growing.
>
>
I have also reviewed the code in question, but I don't see anything that
looks like it could become a memory leak. Also, valgrind usually is quite
good in reporting those leaks. I usually also get the HUP signal when
running under valgrind (as well as others), so you might want to check your
setup (including valgrind versions, maybe that matters -- but I use it for
many years now).

So in short: I have no idea where the remaining problem you see stems back
from. It would probably be good if you could re-check with 8.7.0 when it is
out (or git master-candidate branch if you would like to check immediately).

Again, my apologies for the long silence.
Rainer


> I will be running mtrace (and maybe coverity) on this setup as well, and
> will post further findings. If anyone else knows about any rsyslog memory
> leaks due to frequent log rotations, please post to this forum.
>
> Thanks and regards,
> Koral
>
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to