And finally a reply to "the meat" ;) 

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Mr. Demeanour
> Sent: Monday, November 09, 2009 5:25 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Rsyslog 4.4.2: server out-of-memory with gnutls
> 
> Rainer Gerhards wrote:
> > Jack,
> > 
> > Quick update: I was able to run a (relatively quick) test on a
> > configuration that hopefully later becomes part of the testbench (I
> > am working towards that goal and thought doing a manual 
> check at that
> > stage would not hurt). It is not based your config, and it is
> > relatively simple and straightforward. Still, it uses TLS, and uses
> > it in anon mode like you did. I used 4.4.2 on the server. I 
> processed
> > around 500,000 message (not kept track of the actual number). I did
> > three such runs, all runing under the excellent valgrind[1] memory
> > debugger.
> > 
> > For none of the runs, valgrind reported any memory leaks. While this
> > may not be an ultimate indication, valgrind is *very* effective in
> > finding leaks and based on what you wrote I would have expected a
> > small chunk of memory to be lost per message.
> 
> OK (thanks).
> 
> I have formed the impression that the problems are occurring 
> after some
> period of running time; free memory decreases quite slowly, until some
> unknown event causes a rather rapid degeneration. That is: I suspect
> that any leak is not likely to be observable on a per-message basis,
> until this unknown event has occured.

This would explain what we currently see. It seems to be TLS-related, as you
said. So it is unlikely to be a message at all, but rather a TLS event that
causes the mem leak. With that said, I should probbably see if a connection
abort can trigger one... 

> > So the bottom line is that I currently cannot reproduce the 
> bug. This
> > may change when I finally import your config. However, it would be
> > useful for me if you could run valgrind on rsyslogd in your
> > environment and let me know if valgrind reports any memory leaks.
> > Doing so considerably slows down rsyslogd, but given your load, I'd
> > expect that it would be acceptable.
> > 
> > To run under valgrind is relatively simply. Valgrind is available as
> > a package on almost all distros. All you need to do is run valgrind
> > and specify your usual rsyslogd command line as the parameter. It is
> > recommended to do this in the foreground (see rsyslog 
> troubleshooting
> > doc).
> > 
> > So, for example, if you start rsyslogd usually by
> > 
> > /sbin/rsyslogd -c4 -...
> 
> So "valgrind /usr/sbin/rsyslogd -c4" results in rsyslogd backgrounding
> promptly, at which point valgrind prints its report (which shows no
> leaks - unsurprisingly).
> 
> "valgrind /usr/sbin/rsyslogd -c4 -n" results in a hang. 
> CTRL-C fails to
> kill the foreground task. 

That's intended behavior, ctrl-c is only enabled in debug mode (I took this
over from sysklogd and never question it".

> "kill -9 <pid>" kills the task, but no
> valgrind report is produced. 

You must not use -9, which is untrappable. Just send the default SIGTERM:

$ kill <pid>  # no signal specified at all!

>The same command without valgrind also
> results in a hung foreground task. If run under valgrind,
> memcheck-x86-li goes to 99% CPU on CTRL-C.
> 
> I currently suspect problems with mySQL as the origin of 
> these problems.
> I was this morning getting messages of the form
> "Lost connection to MySQL server at 'reading authorization packet'".
> I was also observing MySQL aborted clients and connects. I have
> increased the MySQL connect timeout, and can no longer reproduce these
> reports. For now, I assume that problem is fixed, but I can't 
> yet say if
> the rsyslog hangs have stopped.
> 
> I wonder if what was happening was that MySQL was "going away" in some
> sense, and that rsyslog was not reconnecting to it successfully, *and*
> not retrying?

Hard to guess...

> 
> I noticed that although the ActionQueue for the mysql output module is
> not disk-assisted, the debug log records:
> action 4 queue: save on shutdown 1, max disk space allowed 0

The debug output just spits out the values of the variables. Saveonshutdown
is true by default, but if there is no disk queue, that won't help anything
at all ;) In short: that's OK.

> So I've set $ActionQueueSaveOnShutdown off. However this 
> hasn't changed
> the hang behaviour with -n.
> 
> Since I can't get rsyslog to run in the foreground under 
> valgrind, I am
> now running daemonized without valgrind (but with encryption); perhaps
> these changes have fixed the problem. I should know by late 
> this evening
> - when the problem is observed, the server never lives for more than a
> few hours.

I hope this -and the other- mail will help straighten out the issue. Any logs
you can send to my private mail address as you have already done ;)

Rainer
> -- 
> Jack.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
> 
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com

Reply via email to