And finally a reply to "the meat" ;)

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Mr. Demeanour
> Sent: Monday, November 09, 2009 5:25 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Rsyslog 4.4.2: server out-of-memory with gnutls
>
> Rainer Gerhards wrote:
> > Jack,
> >
> > Quick update: I was able to run a (relatively quick) test on a
> > configuration that hopefully later becomes part of the testbench (I
> > am working towards that goal and thought doing a manual check at that
> > stage would not hurt). It is not based on your config, and it is
> > relatively simple and straightforward. Still, it uses TLS, and uses
> > it in anon mode like you did. I used 4.4.2 on the server. I processed
> > around 500,000 messages (I did not keep track of the actual number).
> > I did three such runs, all running under the excellent valgrind[1]
> > memory debugger.
> >
> > For none of the runs did valgrind report any memory leaks. While this
> > may not be an ultimate indication, valgrind is *very* effective in
> > finding leaks and based on what you wrote I would have expected a
> > small chunk of memory to be lost per message.
>
> OK (thanks).
>
> I have formed the impression that the problems are occurring after some
> period of running time; free memory decreases quite slowly, until some
> unknown event causes a rather rapid degeneration. That is: I suspect
> that any leak is not likely to be observable on a per-message basis,
> until this unknown event has occurred.
This would explain what we currently see. It seems to be TLS-related, as you
said. So it is unlikely to be caused by a message at all, but rather by a TLS
event that triggers the memory leak. With that said, I should probably see if
a connection abort can trigger one...

> > So the bottom line is that I currently cannot reproduce the bug. This
> > may change when I finally import your config. However, it would be
> > useful for me if you could run valgrind on rsyslogd in your
> > environment and let me know if valgrind reports any memory leaks.
> > Doing so considerably slows down rsyslogd, but given your load, I'd
> > expect that it would be acceptable.
> >
> > Running under valgrind is relatively simple. Valgrind is available as
> > a package on almost all distros. All you need to do is run valgrind
> > and specify your usual rsyslogd command line as the parameter. It is
> > recommended to do this in the foreground (see rsyslog troubleshooting
> > doc).
> >
> > So, for example, if you start rsyslogd usually by
> >
> > /sbin/rsyslogd -c4 -...
>
> So "valgrind /usr/sbin/rsyslogd -c4" results in rsyslogd backgrounding
> promptly, at which point valgrind prints its report (which shows no
> leaks - unsurprisingly).
>
> "valgrind /usr/sbin/rsyslogd -c4 -n" results in a hang. CTRL-C fails to
> kill the foreground task.

That's intended behavior: ctrl-c is only enabled in debug mode (I took this
over from sysklogd and never questioned it).

> "kill -9 <pid>" kills the task, but no valgrind report is produced.

You must not use -9: it sends SIGKILL, which is untrappable, so valgrind never
gets a chance to write its report. Just send the default SIGTERM:

$ kill <pid> # no signal specified at all!

> The same command without valgrind also results in a hung foreground
> task. If run under valgrind, memcheck-x86-li goes to 99% CPU on CTRL-C.
>
> I currently suspect problems with mySQL as the origin of these problems.
> I was this morning getting messages of the form
> "Lost connection to MySQL server at 'reading authorization packet'".
> I was also observing MySQL aborted clients and connects. I have
> increased the MySQL connect timeout, and can no longer reproduce these
> reports. For now, I assume that problem is fixed, but I can't yet say if
> the rsyslog hangs have stopped.
>
> I wonder if what was happening was that MySQL was "going away" in some
> sense, and that rsyslog was not reconnecting to it successfully, *and*
> not retrying?

Hard to guess...

> I noticed that although the ActionQueue for the mysql output module is
> not disk-assisted, the debug log records:
> action 4 queue: save on shutdown 1, max disk space allowed 0

The debug output just spits out the values of the variables. Saveonshutdown is
true by default, but if there is no disk queue, that won't help anything at
all ;) In short: that's OK.

> So I've set $ActionQueueSaveOnShutdown off. However this hasn't changed
> the hang behaviour with -n.
>
> Since I can't get rsyslog to run in the foreground under valgrind, I am
> now running daemonized without valgrind (but with encryption); perhaps
> these changes have fixed the problem. I should know by late this evening
> - when the problem is observed, the server never lives for more than a
> few hours.

I hope this mail, and the other one, will help straighten out the issue. You
can send any logs to my private mail address, as you have already done ;)

Rainer

> --
> Jack.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
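As an illustration of the "kill" versus "kill -9" point in the message above, here is a minimal sketch of why a trappable signal lets a process write its final report while SIGKILL does not. It uses a stand-in shell worker, not rsyslogd or valgrind; the helper name start_worker, the report text and the temporary file are assumptions made up for this example.

```shell
#!/bin/sh
# Stand-in demo (NOT rsyslogd or valgrind): a worker that writes a
# "report" on SIGTERM, which is only possible for trappable signals.

report=$(mktemp)   # hypothetical report file, for illustration only

# Start a background worker that traps SIGTERM, writes a report, exits.
start_worker() {
    sh -c '
        trap "echo \"clean shutdown\" > \"$1\"; exit 0" TERM
        while :; do sleep 1; done
    ' worker "$report" &
    worker_pid=$!
    sleep 1   # give the worker time to install its trap
}

# Case 1: plain "kill" sends SIGTERM, the trap runs, the report appears.
: > "$report"
start_worker
kill "$worker_pid"
wait "$worker_pid" 2>/dev/null
term_result=$(cat "$report")

# Case 2: "kill -9" sends SIGKILL, which cannot be trapped: no report.
: > "$report"
start_worker
kill -9 "$worker_pid"
wait "$worker_pid" 2>/dev/null
kill_result=$(cat "$report")

echo "after SIGTERM: '${term_result}'"
echo "after SIGKILL: '${kill_result}'"
rm -f "$report"
```

When run, the first line should show the report text and the second should show an empty string. The same reasoning applies to a program running under valgrind: the leak summary is produced at exit, so the process has to be allowed to shut down rather than being killed outright.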

