Re: [rsyslog] rsyslog bringing machines down due amount of messages (?)

David Lang Tue, 01 Oct 2013 06:33:32 -0700

On Tue, 1 Oct 2013, Erik van Dam wrote:

Hi David,


You are right, we are doing TCP/SSL. The config is:

rsyslog-gnutls-5.8.10-2.el6.x86_64
rsyslog-5.8.10-2.el6.x86_64


$ModLoad imuxsock.so    # provides support for local system logging (e.g. via logger command)
$ModLoad imklog.so      # provides kernel logging support (previously done by rklogd)
$IMUXSockRateLimitInterval 0
$ModLoad imudp.so
$UDPServerRun 514
$ModLoad imtcp.so
$PreserveFQDN on
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

$DefaultNetstreamDriver gtls
$DefaultNetstreamDriverCAFile /etc/rsyslog/protected/ca.pem
$DefaultNetstreamDriverCertFile /etc/rsyslog/protected/cert.pem
$DefaultNetstreamDriverKeyFile /etc/rsyslog/protected/key.pem

$InputTCPServerStreamDriverPermittedPeer machine1
$InputTCPServerStreamDriverPermittedPeer machine2
$InputTCPServerStreamDriverPermittedPeer machine3
$InputTCPServerStreamDriverPermittedPeer machine4
$InputTCPServerStreamDriverPermittedPeer machine5
$InputTCPServerStreamDriverPermittedPeer machine6
$InputTCPServerStreamDriverPermittedPeer machine7
$InputTCPServerStreamDriverMode 1
$InputTCPServerRun 514


$template DailyPerHostLogs,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_messages.log"
$template DailyrootshPerHostLogs,"/bigdisk/syslog/rootsh/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_messages.log"
local5.info                                             -?DailyrootshPerHostLogs
& ~

$template cactilog,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_cacti-access.log"
if $syslogfacility-text == 'local0' and $msg contains '/cacti' then -?cactilog
& ~

On 5.x the if..then is very slow. You would want to upgrade to 7.x or refactor this entire section into a ruleset, so that you do one test to see if the facility is local0 and, if it is, call the ruleset that does all the other tests.
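
A rough sketch of the "test the facility once, then call a ruleset" idea, written in the 7.x RainerScript syntax (so it assumes the upgrade has been done). The ruleset name local0rules is made up for illustration; the templates are the ones defined in the config above, and only a few of the local0 tests are shown.

```
# Sketch only, assuming rsyslog 7.x. "local0rules" is a hypothetical name.
ruleset(name="local0rules") {
    if $msg contains '/cacti' then {
        action(type="omfile" dynaFile="cactilog")
        # "stop" discards the message, like the old "& ~"
        stop
    }
    if $msg contains '/nagios' then {
        action(type="omfile" dynaFile="nagioslog")
        stop
    }
    if $msg contains 'somedomainname' then {
        action(type="omfile" dynaFile="somedomainname")
        stop
    }
    # same effect as the final "local0.* ~" in the original config
    stop
}

# The facility is tested once per message; only local0 traffic
# ever reaches the string comparisons inside the ruleset.
if $syslogfacility-text == 'local0' then {
    call local0rules
}
```

Messages from other facilities skip all of the "contains" tests and fall through to the remaining rules.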

Another thing is that you are making a lot of use of dynamic filename generation. The number of files that rsyslog keeps open by default is tiny; you need to set the parameter $DynaFileCacheSize to something larger than the number of files that you are going to create, otherwise your system spends all its time opening and closing files under load.
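
For illustration, the setting could look like this in the legacy syntax used by the config above. The value 100 is only a placeholder; it should be chosen from the number of distinct files your templates actually produce.

```
# Place this before the first dynafile ("-?template") action.
# 100 is an illustrative value, not a recommendation.
$DynaFileCacheSize 100
```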

$template nagioslog,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_nagios-access.log"
if $syslogfacility-text == 'local0' and $msg contains '/nagios' then -?nagioslog
& ~

$template somedomainname,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_somedomainname.log"
if $syslogfacility-text == 'local0' and $msg contains 'somedomainname' then -?somedomainname
& ~

$template somedomainname,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_somedomainname.log"
if $syslogfacility-text == 'local0' and $msg contains 'somedomainname' then -?somedomainname
& ~

$template somedomainname,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_somedomainname.log"
if $syslogfacility-text == 'local0' and $msg contains 'somedomainname' then -?somedomainname
& ~

$template somedomainname,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_somedomainname.log"
if $syslogfacility-text == 'local0' and $msg contains 'somedomainname' then -?somedomainname
& ~

$template nagiosandcactierror,"/bigdisk/syslog/%$YEAR%/%$MONTH%/%$DAY%/%FROMHOST-IP%_nagiosandcactierror.log"
if $syslogfacility-text == 'local1' then -?nagiosandcactierror
& ~

local0.* ~

*.*                                                     -?DailyPerHostLogs
-------------------

You and Rainer already offered a couple of tweaks that I can use, but I wanted
to show the stats as the problem is now occurring. Is the exiting of rsyslog by design?

I'm not sure what you're referring to when you are asking about the exiting of rsyslog.

It looks like you do not have the config parameter telling rsyslog not to do a full shutdown/restart when it gets a HUP, so every time you go to roll the files you will be doing a full restart (and losing some logs in the process). Not doing a full restart was a capability introduced in 4.x and made the default before 7.x.
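
As far as I know, the parameter in question is the legacy $HUPisRestart directive; treat that name as an assumption and check it against the documentation for your exact 5.8.10 build before relying on it. A minimal sketch:

```
# Assumption: $HUPisRestart is supported by this 5.x build.
# With "off", a HUP only closes and reopens output files
# instead of doing a full daemon restart.
$HUPisRestart off
```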


you really should upgrade to 7.x

David Lang

Regards,
Erik

On Tue, 1 Oct 2013 05:34:40 -0700 (PDT)
David Lang <[email protected]> wrote:

Erik, am I remembering correctly that you are using TCP for communication
between the client and server?

can you post your server rsyslog.conf config (since it's been a while, I don't
remember details)

moving to rsyslog 7.x should help

setting the clients to have a disk assisted queue so that when the server falls
behind they can keep running would help
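
A minimal sketch of such a client-side disk-assisted queue in the legacy 5.x syntax. The spool directory, queue file name, disk limit and server name are all placeholders, and the TLS (gtls) settings the clients need are left out for brevity.

```
# Directory where queue spool files are written (placeholder path)
$WorkDirectory /var/spool/rsyslog

# In-memory linked-list queue for the forwarding action
$ActionQueueType LinkedList
# Giving the queue a file name is what makes it disk-assisted
$ActionQueueFileName fwdqueue
# Upper bound for the on-disk spool (placeholder value)
$ActionQueueMaxDiskSpace 1g
# Keep queued messages across a client restart
$ActionQueueSaveOnShutdown on
# Retry forever while the server is unreachable or slow
$ActionResumeRetryCount -1

# Forward everything over TCP (placeholder server name)
*.* @@logserver.example.com:514
```

The $ActionQueue* directives apply to the next action only, so they have to sit directly above the forwarding rule.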

and we can look to try and figure out what the bottleneck on the server is.

David Lang

On Tue, 1 Oct 2013, Erik van Dam wrote:

We are running (client & server):

rsyslog-5.8.10-2.el6.x86_64
rsyslog-gnutls-5.8.10-2.el6.x86_64

Regards,
Erik


On Tue, 1 Oct 2013 14:12:18 +0200
Rainer Gerhards <[email protected]> wrote:

On Tue, Oct 1, 2013 at 2:00 PM, Erik van Dam <[email protected]> wrote:

Hi Rainer,

Finally I got some data. Today at 10:12 rsyslog (client) died, probably due to
the high volume of messages. I captured the stats from server and client.
However, I was not able to run top on the syslog server.

server = https://defuse.ca/b/ivGdutJMwFjZWkpys7F7F1
client = https://defuse.ca/b/Lt6l6BzuqVm0bPNfjJmXnL

Thanks for your help.

It looks like the server's main queue went full and for some reason is not
able to drain quickly enough. Unfortunately, it is not clear what may cause
this.

Which version of rsyslog is that? I notice, for example, that imtcp stats
counters are missing (which would be useful).

Rainer

Regards



On Thu, 12 Sep 2013 14:42:51 +0200
Rainer Gerhards <[email protected]> wrote:

On Thu, Sep 12, 2013 at 2:40 PM, Rainer Gerhards <[email protected]> wrote:

On Thu, Sep 12, 2013 at 2:39 PM, Rainer Gerhards <[email protected]> wrote:

On Thu, Sep 12, 2013 at 2:31 PM, Erik van Dam <[email protected]> wrote:

Sure!

http://pastebin.com/tBb2NWUR

Do you restart rsyslog every hour? From the stats, it looks so...

I guess I can answer that myself: of course you do, trying to circumvent
the problem ;) Sorry for the noise...

mhhh... unfortunately, this means we never see the error, and so we
cannot see what triggered it. The stats I got look fine and provide no
indication of a problem. Do I guess right that there was no problem in
that timeframe? If there was, could you point me to the time the problem
occurred?

If there was no problem, you need to re-run impstats, but this time let
rsyslog run into trouble. Then we can see if something fills up. For best
results, I suggest using a stats reporting interval of 1 minute.
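
For reference, a 1-minute impstats setup in the legacy syntax might look like the sketch below. The output file path is a placeholder, and the selector line needs to come before any rule that discards syslog-facility messages.

```
$ModLoad impstats
# Emit statistics every 60 seconds
$PStatInterval 60
# Log the statistics at severity 7 (debug)
$PStatSeverity 7
# impstats uses the syslog facility by default; write its
# messages to a dedicated file (placeholder path)
syslog.=debug    /var/log/rsyslog-stats
```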

Rainer



--
Met vriendelijke groet,

Erik van Dam
RedBee / FortyTwo

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
