On 11/25/2013 02:45 PM, David Lang wrote:
On Mon, 25 Nov 2013, Erik Steffl wrote:

On 11/25/2013 12:18 AM, David Lang wrote:
On Mon, 25 Nov 2013, Pavel Levshin wrote:

Date: Mon, 25 Nov 2013 12:05:15 +0400
From: Pavel Levshin <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: [email protected]
Subject: Re: [rsyslog] Rsyslog with RELP not sending/receiving messages for long intervals

23.11.2013 7:01, Erik Steffl:

Another possibly interesting message is:

7975.038523942:7fe2064cf700: main Q: doEnqSingleObject: LightDelay mark reached for light delayable message - blocking a bit.

  which was logged approximately once per second during the following
interval (this is also when the traffic dropped to zero):

1385087975 Fri Nov 22 02:39:35 UTC 2013
1385088546 Fri Nov 22 02:49:06 UTC 2013

  Does this shed any light on what's going on?

I bet this is the problem. This blocks the main queue for a second.

There are possible workarounds, including increasing main queue size
(this watermark is set to 70% by default).
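
A minimal sketch of the "increase the main queue size" workaround, assuming the legacy $MainMsgQueueSize directive and a setup that includes /etc/rsyslog.d/*. The helper name, the drop-in file name and the value 100000 are placeholders for illustration, not recommendations:

```shell
# Hypothetical helper: write a drop-in that raises the main queue size.
# The default path and the default size are assumptions of this sketch.
write_queue_dropin() {
    target="${1:-/etc/rsyslog.d/05-main-queue.conf}"
    size="${2:-100000}"
    # Single quotes keep the leading "$" of the directive literal.
    printf '$MainMsgQueueSize %s\n' "$size" > "$target"
}
```

Something like `write_queue_dropin /etc/rsyslog.d/05-main-queue.conf 100000` followed by `rsyslogd -N1` (config check) and a restart should apply it. With the watermark at 70%, a larger queue only moves the point where the light delay kicks in; it does not remove it.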

searching through the documentation, this seems to be a tcp input option

old style

$InputTCPFlowControl (defaults to on; set it to off to disable)

new style
FlowControl='off' on the module load

 is this for RELP too? We don't use tcp module at all, these are the
loaded modules:

it's not documented as being part of RELP.

ubuntu@domU-12-31-39-06-75-11:~$ grep -E '^\$ModLoad|module' /etc/rsyslog.conf /etc/rsyslog.d/*
/etc/rsyslog.conf:$ModLoad imuxsock # provides support for local system logging
/etc/rsyslog.conf:$ModLoad imklog   # provides kernel logging support (previously done by rklogd)
/etc/rsyslog.d/31-yummly.conf:module(load="mmjsonparse")
/etc/rsyslog.d/31-yummly.conf:module(load="omrelp")
/etc/rsyslog.d/31-yummly.conf:module(load="imudp")
/etc/rsyslog.d/41-yummly-collector.conf:module(load="mmjsonparse")
/etc/rsyslog.d/41-yummly-collector.conf:module(load="imrelp" ruleset="collector")

 On the affected system the bursts are coming over /dev/log and going
out over RELP; steady traffic is coming via RELP and going into a set of
files.

if the copy that's receiving the burst of logs via RELP gets
bottlenecked, that bottleneck will propagate back to the sender.

not sure that's happening. The latest scenario, which I think is the most revealing, is this (described before, but summarized here because it's hard to follow which is which in the thread):

  1 host steady traffic -> RELP -> collector-test -> file

  6 hosts steady traffic -> RELP -> collector-prod -> file

collector-test 200 messages burst -> /dev/log -> collector-test -> RELP -> collector-prod -> file

problem: when the burst of messages is sent, all traffic to collector-test stops until the next burst of messages. I.e. the 1 host that was sending traffic to collector-test stops sending data. All other traffic works as before, e.g. the 6 hosts that are sending traffic to collector-prod continue to work fine.

That's why I don't think it's the receiving side of the burst that has the problem (that's collector-prod, which continues to receive RELP messages from the 6 hosts).

Hope this makes it clear what is actually getting stuck. I think this is the most revealing scenario because it shows that it's the sender of the burst of messages (still not sure which part of it) that is having the problem: it could be the part that handles /dev/log or the part that handles sending over RELP.
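
For anyone who wants to reproduce the burst step of the scenario above, a rough sketch using logger, which writes to the local syslog socket (/dev/log). The function name, the burst-test tag and the LOGGER_CMD override are assumptions of this sketch, not what was actually used in the tests described here:

```shell
# Hypothetical helper: send a burst of N test messages via the local syslog socket.
# LOGGER_CMD is an assumed override so the command can be swapped for a dry run.
send_burst() {
    count="${1:-200}"
    i=1
    while [ "$i" -le "$count" ]; do
        # -t sets the tag so the burst is easy to find in the collector's files.
        ${LOGGER_CMD:-logger} -t burst-test "burst message $i of $count"
        i=$((i + 1))
    done
}
```

Running `send_burst 200` on collector-test while watching the file written for the steady-traffic host should show whether that traffic stalls after the burst. `LOGGER_CMD=echo send_burst 5` prints the messages instead of sending them.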

        erik
