Sorry for the late reply, the busy holiday period and then break took it's tool...
On Sat, Dec 6, 2014 at 12:03 AM, Koral Ilgun <[email protected]> wrote: > Hello, > posting for the first time in this forum, I hope I am providing enough > details. > > I have been using rsyslog v7.4.10 in our embedded system for over a year > now and only recently started noticing a memory leak. Less than 1Mb / day. > In the process of debugging this issue I have also tried v7.6.1, v8.4.2 and > the most recent v8.6.0, all of which exhibited the same memory leak. > I am now able to speed up this memory leak (multiple Mbs/day) under > certain conditions, which I will try to describe here. > > I am using over 50 outchannels in this setup, which are all setup similar > to this: > > $template logfmt,"%timegenerated:::date-rfc3339% [%syslogpriority%] > %syslogtag% %msg%\n" > $outchannel sshd_log, /var/log/sshd.log, 500000, > /usr/sbin/logrotate /etc/logrotate.conf > if $programname == 'sshd' then :omfile:$sshd_log;logfmt > > > The log rotations are setup this way in /etc/logrotate.conf: > > /var/log/sshd.log { > rotate 2 > size=400k > postrotate > /etc/init.d/rsyslog.sh reload > /dev/null 2>&1 || true > endscript > } > > I then generate lots of syslog messages (~200/sec) from various > applications to stress the system. > The initial memory leak I observed was due to the rate-limiting > configuration (after lots of trial and error in rsyslog.conf): > > module(load="imuxsock" SysSock.RateLimit.Interval="2" > SysSock.RateLimit.Burst="200" SysSock.RateLimit.Severity="5" > SysSock.UsePIDFromSystem="on") > > When I eliminated the rate limiting from the config, part of the memory > leak has disappeared, but not all. > > I also have a remote-host configured for forwarding to a syslog server, > but removing that configuration did not make any difference. > > I then tried reducing the maxsize for the 50+ outchannels from 500k to 50k > and generated again ~200/sec messages (total) in an attempt to speed up the > memory leak. This increased the speed of memory loss. ~1Mb/hour. > > I then removed the maxsize from the outchannels and called the logrotate > program from a cron job every 5 minutes. This eliminated the memory leak. > Ran it overnight with no memory loss. So, the problem seems to be related > to (the frequency of) calling logrotate when maxsize is reached, and > possibly related to the fact that logrotate in turn HUPs rsyslog upon > rotating log files. If I just run rsyslog.sh reload (which results in HUP), > I don't see a memory leak either. It has to be rsyslog calling the maxsize > action (logrotate). > > I ran valgrind on rsyslog with some success (logrotate's HUP doesn't seem > to be passed down to rsyslog from valgrind). Through valgrind, I noticed at > least two leaks (based on 8.6.0 sources): > > 1. 15k leak at ochConstruct during initialization: > > ==18966== 15,891 (32 direct, 15,859 indirect) bytes in 1 blocks are > definitely lost in loss record 88 of 88 > ==18966== at 0xFF88B9C: calloc (in /usr/lib/valgrind/vgpreload_ > memcheck-ppc32-linux.so) > ==18966== by 0x1005F9EF: ochConstruct (outchannel.c:47) > ==18966== by 0x1005FAAF: ochAddLine (outchannel.c:207) > ==18966== by 0x1002B4AF: doNameLine (conf.c:190) > ==18966== by 0x1005857B: processCfSysLineCommand (cfsysline.c:760) > ==18966== by 0x1002B63B: cfsysline (conf.c:233) > ==18966== by 0x1002F08F: cnfDoCfsysline (rsconf.c:453) > ==18966== by 0x1001DE5B: yylex (lexer.l:251) > ==18966== by 0x1001B0AF: yyparse (grammar.c:1536) > ==18966== by 0x1002E593: load (rsconf.c:1278) > ==18966== by 0x1000D117: initAll (rsyslogd.c:930) > ==18966== by 0x1000A6A3: main (rsyslogd.c:1318) > > The pOch pointer doesn't seem to be registered anywhere to be freed later. > This is not a huge issue, because it is only done once during > initialization, not a continuous leak like the other case(s). > Yeah, that's cosmetic. Haven't fixed it for now. > 2. The next one is in tools/omfile.c in prepareFile: > > ==16419== 280 bytes in 7 blocks are definitely lost in loss record 81 of 91 > ==16419== at 0xFF86274: malloc (in /usr/lib/valgrind/vgpreload_ > memcheck-ppc32-linux.so) > ==16419== by 0x40934FF: strdup (strdup.c:42) > ==16419== by 0x10013D2F: prepareFile (unicode-helper.h:46) > ==16419== by 0x10014A23: commitTransaction (omfile.c:831) > ==16419== by 0x1005B687: actionCommit (action.c:1088) > ==16419== by 0x1005C9A3: actionCommitAllDirect (action.c:1256) > ==16419== by 0x10054A03: processBatch (ruleset.c:484) > ==16419== by 0x1000B7BF: msgConsumer (rsyslogd.c:342) > ==16419== by 0x10050347: ConsumerReg (queue.c:1897) > ==16419== by 0x1004C3D3: wtiWorker (wti.c:334) > ==16419== by 0x1004B077: wtpWorker (wtp.c:390) > ==16419== by 0xFE17BE7: start_thread (pthread_create.c:314) > > The strdup'd pszSizeLimitCmd (which translates to "/usr/bin/logrotate > /etc/logrotate.conf") doesn't seem to be freed later. I am not sure if this > is a major leak or not. But, it appears that a free for pszSizeLimitCmd is > missing in runtime/stream.c: CODESTARTobjDestruct routine. > > Yup, and it is not only cosmetic, but not big either (as you say). In any case, I have now fixed it. The fix will be part of the 8.7.0 release next Tuesday. > Neither of the above findings explain the rate and amount of memory loss I > am experiencing, so I think some of the codepath is not being executed when > rsyslog is run in the context of valgrind (e.g. the HUP case I mentioned > earlier). > > I also have impstats enabled and monitoring the maxrss from there. The > queue sizes are *not* growing. > > I have also reviewed the code in question, but I don't see anything that looks like it could become a memory leak. Also, valgrind usually is quite good in reporting those leaks. I usually also get the HUP signal when running under valgrind (as well as others), so you might want to check your setup (including valgrind versions, maybe that matters -- but I use it for many years now). So in short: I have no idea where the remaining problem you see stems back from. It would probably be good if you could re-check with 8.7.0 when it is out (or git master-candidate branch if you would like to check immediately). Again, my apologies for the long silence. Rainer > I will be running mtrace (and maybe coverity) on this setup as well, and > will post further findings. If anyone else knows about any rsyslog memory > leaks due to frequent log rotations, please post to this forum. > > Thanks and regards, > Koral > _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

