btw I have a different instance with a slightly older version of my log mover and a slightly older version of rsyslog which does not exhibit the same problem (just checked the Amazon metrics and the traffic to those servers is even, no drops like I see with my current instance).

  no problem: 7.5.4-0adiscon1

  problem: 7.5.5-0adiscon2

looking at http://www.rsyslog.com/changelog/ I don't see anything that looks related (no RELP), any ideas whether something changed that could potentially be related?

I also see that there is a new rsyslog_7.5.6-0adiscon2_i386.deb package (saucy only) so I'll try that as well...

        erik

On 11/01/2013 01:02 AM, David Lang wrote:
the purpose of RELP is to tell the senders if the receiver was able to
accept the message or not, so that's working as designed.

When you do a full restart of rsyslog, it stops processing new messages
and disconnects all senders (you are doing a full stop of rsyslog and
then starting it from scratch, this takes time). My guess is that the
senders are detecting 'too many failures' when trying to send messages,
so they are backing off and sleeping for a while rather than performing
a mini DOS attack on the network and server. Every third restart is
probably triggering a threshold.

Instead of doing a full restart, just send the rsyslog daemon a HUP
signal instead. That will tell rsyslog to flush and close all files so
that you can rotate them (If you are doing compression on the files, you
may need to sleep for a few seconds to let the flush complete). Rsyslog
can continue to receive new messages during this time, so your senders
will not see an outage.
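
A minimal sketch of the HUP approach in Python, for illustration only. The pid file location /var/run/rsyslogd.pid and the helper name hup_rsyslog are assumptions, not taken from this thread; adjust them to your install. The optional pause corresponds to the "sleep for a few seconds" mentioned above.

```python
import os
import signal
import time


def hup_rsyslog(pidfile: str = "/var/run/rsyslogd.pid",
                settle_seconds: float = 0.0) -> int:
    """Send SIGHUP to the process named in pidfile and return its pid.

    pidfile is an assumed location; settle_seconds is an optional pause
    to give the daemon time to flush and close its files.
    """
    with open(pidfile) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGHUP)  # ask the daemon to close and reopen files
    if settle_seconds > 0:
        time.sleep(settle_seconds)
    return pid
```

Sending the signal normally requires the same privileges as the rsyslog process, so a script like this would typically run as root.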

By the way, I'm needing a script to upload rsyslog archives to S3, could
you send me a copy of yours? (remove any passwords first please :-)

David Lang


  On Thu, 31 Oct 2013, Erik Steffl wrote:

Date: Thu, 31 Oct 2013 18:44:23 -0700
From: Erik Steffl <[email protected]>
Reply-To: rsyslog-users <[email protected]>
To: rsyslog-users <[email protected]>
Subject: [rsyslog] Rsyslog with RELP not sending/receiving messages for long intervals

We have a fairly simple setup of 6 hosts sending syslog messages to
one collector host, all of these run rsyslog 7.5.5-0adiscon2 from
adiscon repo and use RELP to transfer messages. There is also a load
balancer in front of the collector machine but I don't think it
matters in this case.

Rsyslog on collector machine is configured to write to files,
switching to a new file every 15 minutes, using config like this
(abbreviated a bit):

template(name="jsonFilename" type="list") {
 constant(value="/path/")
 property(name="$now")
 constant(value="/")
 property(name="$hour")
 constant(value="/")
 property(name="$qhour")
 constant(value="/")
 constant(value="log.json")
}

action(type="omfile" DynaFile="jsonFilename" Template="jsonFormat")
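
To make the resulting file layout concrete, here is a small Python sketch that approximates the path this template expands to for a given timestamp. It assumes the quarter-hour component is zero-padded to two digits, as in the example paths further down in this message; the function name dynafile_path is made up for illustration.

```python
from datetime import datetime


def dynafile_path(ts: datetime, base: str = "/path/") -> str:
    """Approximate the file name the jsonFilename template produces."""
    day = ts.strftime("%Y-%m-%d")     # $now
    hour = ts.strftime("%H")          # $hour
    qhour = f"{ts.minute // 15:02d}"  # $qhour, assumed zero-padded
    return f"{base}{day}/{hour}/{qhour}/log.json"


print(dynafile_path(datetime(2013, 10, 10, 3, 2)))
# /path/2013-10-10/03/00/log.json
```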

 We run a script at minutes 2, 17, 32 and 47 of every hour and upload
the just finished file to S3. The uploading works like this:

 - let's say it's 3:02:00, rsyslog is writing to
/path/2013-10-10/03/00/log.json

 - get the filename log.json (anything that's not current, usually
just one previous file which in the example would be
/path/2013-10-10/02/03/log.json)

 - rename /path/2013-10-10/02/03/log.json to
/path/2013-10-10/02/03/log.json.uploading.0

 - reload rsyslog (to make sure that even if for some reason it was
writing to just renamed file it would close it and open a new file)

 - upload /path/2013-10-10/02/03/log.json.uploading.0 to S3

 - remove /path/2013-10-10/02/03/log.json.uploading.0
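
The steps above can be sketched roughly as follows in Python. This is not the actual script, only an illustration of the sequence; reopen and upload are hypothetical callables standing in for "reload rsyslog" and "upload to S3", and the function names are made up.

```python
import os
from typing import Callable, List


def finished_logs(root: str, current: str) -> List[str]:
    """Return every log.json under root except the one being written now."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        if "log.json" in files:
            path = os.path.join(dirpath, "log.json")
            if os.path.abspath(path) != os.path.abspath(current):
                found.append(path)
    return sorted(found)


def rotate_and_upload(root: str, current: str,
                      reopen: Callable[[], None],
                      upload: Callable[[str], None]) -> List[str]:
    """Rename finished files, have rsyslog reopen its files, upload, remove."""
    staged = []
    for path in finished_logs(root, current):
        target = path + ".uploading.0"
        os.rename(path, target)   # rename the finished file
        staged.append(target)
    if staged:
        reopen()                  # stand-in for the rsyslog reload step
    for target in staged:
        upload(target)            # stand-in for the S3 upload
        os.remove(target)         # remove the local copy afterwards
    return staged
```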

Here's what happens every third run (yes, regularly EVERY THIRD RUN)
of this script:

 - rsyslog stops writing to the CURRENT file
(/path/2013-10-10/03/00/log.json, the one that is NOT being renamed)
a few seconds into the run of the script (e.g. 3:02:04)

 - 6 hosts that were sending syslog messages to the log collector STOP
sending anything (as verified by stracing rsyslogd, tcpdump and in
amazon AWS console metric for network in)

 - after this nothing is ever written into
/path/2013-10-10/03/00/log.json

 - the 6 clients start sending syslog messages again when the next file
is created (in this example it would be /path/2013-10-10/03/01/log.json)

 I checked and double-checked the files, dates, verified that the
current file is not touched but can't figure out what's going on. I
tried the script without reload rsyslog but it didn't make any
difference. If I don't run this script rsyslog works flawlessly.

 any ideas how to troubleshoot this? What could be causing the rsyslog
to stop writing to the file and for the senders to stop sending syslog
messages? I assume the rsyslog on the collector host somehow signals
to the 6 hosts that send messages that it's not ready or something...

 thanks!

    erik
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
if you DON'T LIKE THAT.
