On Wed, Nov 20, 2013 at 1:22 PM, Erik Steffl <[email protected]> wrote:

>   we use the same packages (version 7.5.6-0adiscon2 from
> http://ubuntu.adiscon.com/v7-devel saucy/) on all hosts (6 senders and 1
> collector at the moment)
>
>   installed packages (ii means installed, un means not installed):
>
> ubuntu@domU-12-31-39-06-75-11:~$ dpkg -l rsyslog\*
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> ||/ Name                   Version          Architecture  Description
> +++-======================-================-=============-=======================================================
> ii  rsyslog                7.5.6-0adiscon2  amd64         reliable system and kernel logging daemon
> un  rsyslog-doc            <none>                         (no description available)
> un  rsyslog-elasticsearch  <none>                         (no description available)
> ii  rsyslog-gnutls         7.5.6-0adiscon2  amd64         TLS protocol support for rsyslog
> un  rsyslog-gssapi         <none>                         (no description available)
> ii  rsyslog-imptcp         7.5.6-0adiscon2  amd64         High-performance, threaded TCP input module for rsyslog
> ii  rsyslog-mmjsonparse    7.5.6-0adiscon2  amd64         Parsing/handling of CEE/Lumberjack JSON messages in rsy
> un  rsyslog-mysql          <none>                         (no description available)
> un  rsyslog-pgsql          <none>                         (no description available)
> ii  rsyslog-relp           7.5.6-0adiscon2  amd64         RELP protocol support for rsyslog
>
>
these are the rsyslog packages. I need librelp ;)

Rainer


>
>         erik
>
>
> On 11/20/2013 03:09 AM, Rainer Gerhards wrote:
>
>> On Tue, Nov 19, 2013 at 10:08 PM, Erik Steffl <[email protected]> wrote:
>>
>>     using 7.5.6-0adiscon2 from http://ubuntu.adiscon.com/v7-devel saucy/
>>>
>>>    it's not obvious (to me) which relp version it's using but strace of
>>> rsyslogd on the collector machine says:
>>>
>>> 01:32:54.082301 sendto(2, "1 open 85 relp_version=0\nrelp_software=librelp,1.2.0,http://librelp.adiscon.com\ncommands=syslog\n", 96, 0, NULL, 0) = 96
>>>
>>>    so I guess it's 1.2.0?
>>>
>>>
>>>  looks so, but that's the collector version.
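
As a rough sketch (Python, not part of rsyslog or librelp), the open frame from the strace output above can be split into its fields. It assumes the RELP frame layout TXNR SP COMMAND SP DATALEN [SP DATA] LF; the function names are made up for illustration. Note that it only reports the version of whichever side sent the frame, here the collector.

```python
def parse_relp_frame(raw: bytes) -> dict:
    """Split one RELP frame laid out as TXNR SP COMMAND SP DATALEN [SP DATA] LF."""
    txnr, command, rest = raw.split(b" ", 2)
    # DATALEN ends at the next space (data follows) or at the trailing LF (no data).
    digits = rest.split(b" ", 1)[0].rstrip(b"\n")
    datalen = int(digits)
    data = rest[len(digits) + 1 : len(digits) + 1 + datalen]
    return {
        "txnr": int(txnr),
        "command": command.decode("ascii"),
        "datalen": datalen,
        "data": data,
    }


def parse_offers(data: bytes) -> dict:
    """Turn the 'name=value' lines carried by an open frame into a dict."""
    offers = {}
    for line in data.decode("ascii").splitlines():
        name, _, value = line.partition("=")
        offers[name] = value
    return offers


# The frame from the strace line above (96 bytes on the wire).
raw = (
    b"1 open 85 relp_version=0\n"
    b"relp_software=librelp,1.2.0,http://librelp.adiscon.com\n"
    b"commands=syslog\n"
)
frame = parse_relp_frame(raw)
offers = parse_offers(frame["data"])
software, version, url = offers["relp_software"].split(",")
print(software, version)  # prints: librelp 1.2.0
```

Running the same parser over a capture of the senders' open frames would show the client-side version in the same field.
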
>>
>> In Wireshark, you can also see the client open, which includes their
>> version. It's also included in the (server) debug log. However, just
>> checking the packages should also be sufficient. Something along the lines
>> of
>>
>> yum list librelp*
>>
>> which should give you the exact version.
>>
>> HTH
>> Rainer
>>
>>
>>     The 6 sender machines also use same 7.5.6-0adiscon2 packages.
>>>
>>>          erik
>>>
>>>
>>> On 11/19/2013 06:57 AM, Rainer Gerhards wrote:
>>>
>>>  I have checked the debug log that was on the probably related bug
>>>> tracker
>>>> at:
>>>>
>>>> http://bugzilla.adiscon.com/show_bug.cgi?id=208
>>>>
>>>> There, it looks like the client librelp version causes problems (just a
>>>> hypothesis so far).
>>>>
>>>> That brings me to: which version of librelp do you use on client and
>>>> server?
>>>>
>>>> Rainer
>>>>
>>>>
>>>> On Tue, Nov 19, 2013 at 2:45 PM, Rainer Gerhards
>>>> <[email protected]> wrote:
>>>>
>>>>   OK, I am slowly catching up and can most probably look into this
>>>> problem
>>>>
>>>>> now.
>>>>>
>>>>> Erik, I think I need to create an instrumented version in any case so
>>>>> that
>>>>> we can get a bit more insight. Are you able to deploy such a version
>>>>> from
>>>>> git or source tarball? Also, I will need at least a partial debug log,
>>>>> from
>>>>> before things go wrong up until the situation recovers. If the debug
>>>>> log
>>>>> gets too large, "debug on demand" (turn on via signal) will probably be
>>>>> helpful.
>>>>>
>>>>> Could you assist with these things?
>>>>>
>>>>> Thx,
>>>>> Rainer
>>>>>
>>>>>
>>>>> On Tue, Nov 19, 2013 at 9:08 AM, David Lang <[email protected]> wrote:
>>>>>
>>>>>   On Mon, 18 Nov 2013, Erik Steffl wrote:
>>>>>
>>>>>>
>>>>>>    On 11/18/2013 08:06 PM, David Lang wrote:
>>>>>>
>>>>>>
>>>>>>>   could you run impstats with a rapid reporting cycle while you
>>>>>>>
>>>>>>>> duplicate
>>>>>>>> the problem?
>>>>>>>>
>>>>>>>>
>>>>>>>>     run impstats how? I can add it, any suggested configuration?
>>>>>>> Just log
>>>>>>> into a file every minute?
>>>>>>>
>>>>>>>
>>>>>>>  yes, log into a file every minute or so. We may end up shrinking
>>>>>> this to
>>>>>> a faster interval
>>>>>>
>>>>>> When you log as fast as you can, I expect that the queue will grow,
>>>>>> and
>>>>>> then take a little bit of time to drain off. If it gets stuck while
>>>>>> draining, the pstats output will show us.
>>>>>>
>>>>>> David Lang
>>>>>>
>>>>>>
>>>>>>    With a rapid trigger like this, a new debug log may be helpful with
>>>>>> less
>>>>>>
>>>>>>  'stray noise' in it.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     will try to set that up, at least it will be easier to get
>>>>>>> close to
>>>>>>> the
>>>>>>> time the problem starts (sometime between 1st and 100th burst
>>>>>>> message)
>>>>>>>
>>>>>>>    Speculation:
>>>>>>>
>>>>>>>
>>>>>>>> My guess is that something is getting 'stuck' processing the last
>>>>>>>> message of the third batch, and this is stalling the system
>>>>>>>> (probably
>>>>>>>> something is holding a lock that's preventing RELP from finishing
>>>>>>>> processing messages, which is causing the senders to pause until
>>>>>>>> RELP
>>>>>>>> finally acknowledges the sent messages)
>>>>>>>>
>>>>>>>> knowing if all the messages from the script make it to the log or if
>>>>>>>> some of them don't make it until things startup again would be
>>>>>>>> useful
>>>>>>>> data. This may involve creating a custom output template to add
>>>>>>>> extra
>>>>>>>> time info (received time, now, etc) to the output line and see what
>>>>>>>> happens.
>>>>>>>>
>>>>>>>>
>>>>>>>>     sounds reasonable, will try to figure out if that's happening,
>>>>>>>
>>>>>>>    thanks!
>>>>>>>
>>>>>>>           erik
>>>>>>>
>>>>>>>
>>>>>>>   David Lang
>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 18 Nov 2013, Erik Steffl wrote:
>>>>>>>>
>>>>>>>>     figured out more specific scenario that is causing the silence.
>>>>>>>>
>>>>>>>>
>>>>>>>>>    to recap this is the current setup:
>>>>>>>>>
>>>>>>>>>    - 6 machines sending logs to collector machine using RELP
>>>>>>>>>
>>>>>>>>>    - 1 machine collecting all the logs on disk (20-30 files at a
>>>>>>>>> time,
>>>>>>>>> split by origin etc.)
>>>>>>>>>
>>>>>>>>>    There is fairly steady traffic of 200-300 messages per second,
>>>>>>>>> 200-300 kB/s.
>>>>>>>>>
>>>>>>>>>    In this scenario if I run a program that logs 100 messages as
>>>>>>>>> fast
>>>>>>>>> as
>>>>>>>>> possible the whole system goes silent (every THIRD time this
>>>>>>>>> happens),
>>>>>>>>> all six senders stop sending messages (or almost stop, the traffic
>>>>>>>>> goes down to maybe 1kB/s) and doesn't start again until I run
>>>>>>>>> another
>>>>>>>>> burst of messages (which is equally weird, why would it start again
>>>>>>>>> after next burst of messages?).
>>>>>>>>>
>>>>>>>>>    The program that sends 100 messages is running on the log
>>>>>>>>> collector
>>>>>>>>> host (not sure if it's relevant, likely not).
>>>>>>>>>
>>>>>>>>>    This produces the silences virtually 100% (every third time it
>>>>>>>>> runs),
>>>>>>>>> as long as I cron the 100-messages-log-sender the problem is there,
>>>>>>>>> if
>>>>>>>>> I comment it out it works fine (tested both a number of times,
>>>>>>>>> sometimes left it running for days).
>>>>>>>>>
>>>>>>>>>    Previously I thought this might be related to renaming of the
>>>>>>>>> files
>>>>>>>>> or reload rsyslog (HUP signal) but that is completely irrelevant.
>>>>>>>>> Regardless of renaming or HUP signal it's the burst of messages
>>>>>>>>> that
>>>>>>>>> makes it go silent (and somehow signaling to all senders to go
>>>>>>>>> silent,
>>>>>>>>> perhaps by not sending RELP confirmations?).
>>>>>>>>>
>>>>>>>>>    The program that logs burst of 100 messages is a perl script
>>>>>>>>> that
>>>>>>>>> does essentially this (some irrelevant parts left out, think it
>>>>>>>>> would
>>>>>>>>> run as is but it might need some fixes):
>>>>>>>>>
>>>>>>>>> use strict;
>>>>>>>>> use JSON;    # provides encode_json, used in the loop below
>>>>>>>>> use Log::Log4perl;
>>>>>>>>>
>>>>>>>>> my $logConfig = '
>>>>>>>>>    log4perl.rootLogger=DEBUG, SYSLOG
>>>>>>>>>    log4perl.appender.SYSLOG = Log::Dispatch::Syslog
>>>>>>>>>    log4perl.appender.SYSLOG.min_level = debug
>>>>>>>>>    log4perl.appender.SYSLOG.ident = myTag
>>>>>>>>>    log4perl.appender.SYSLOG.facility = local0
>>>>>>>>>    log4perl.appender.SYSLOG.layout = Log::Log4perl::Layout::PatternLayout
>>>>>>>>>    log4perl.appender.SYSLOG.layout.ConversionPattern=@cee:%m
>>>>>>>>> ';
>>>>>>>>>
>>>>>>>>> Log::Log4perl::init(\$logConfig);
>>>>>>>>> my $l = Log::Log4perl::get_logger();
>>>>>>>>> my $now = localtime();
>>>>>>>>>
>>>>>>>>> for(my $i = 0; $i < 100; $i++) {
>>>>>>>>>    $l->info(JSON::encode_json({now=>$now,i=>$i}));
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>    it needs liblog-log4perl-perl and liblog-dispatch-perl ubuntu
>>>>>>>>> packages (or equivalent on other distros).
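
For anyone who wants to reproduce the burst without the Perl modules, here is a minimal Python sketch of the same idea. It is an approximation, not Erik's script: it uses the standard `syslog` module, keeps the tag `myTag`, facility local0 and the `@cee:` prefix from the Perl configuration, and the function names are invented for illustration.

```python
import json
import syslog
import time


def burst_messages(count=100):
    """Build the burst: one '@cee:' JSON payload per message, as in the Perl loop."""
    now = time.ctime()  # same shape as Perl's scalar localtime()
    return ["@cee:" + json.dumps({"now": now, "i": i}) for i in range(count)]


def send_burst(count=100):
    """Log the whole burst to the local syslog socket as fast as possible."""
    syslog.openlog(ident="myTag", facility=syslog.LOG_LOCAL0)
    for message in burst_messages(count):
        syslog.syslog(syslog.LOG_INFO, message)
    syslog.closelog()
```

Calling `send_burst()` from cron on the collector host would stand in for the 100-message Perl sender described above.
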
>>>>>>>>>
>>>>>>>>>    This does not seem to be related to rate-limiting (as
>>>>>>>>> documented)
>>>>>>>>> since there are no dropped messages, didn't find any rsyslog
>>>>>>>>> messages
>>>>>>>>> about dropping messages due to rate limiting (grep'd all files in
>>>>>>>>> /var/log/). Could this be related to Queue slowdown? We do not set
>>>>>>>>> dequeueslowdown for any queues.
>>>>>>>>>
>>>>>>>>>    Any ideas why would a short burst of 100 messages cause this
>>>>>>>>> problem?
>>>>>>>>>
>>>>>>>>>    thanks!
>>>>>>>>>
>>>>>>>>>       erik
>>>>>>>>>
>>>>>>>>> On 11/14/2013 01:53 AM, Rainer Gerhards wrote:
>>>>>>>>>
>>>>>>>>>   On Thu, Nov 14, 2013 at 6:58 AM, Erik Steffl <[email protected]>
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>       did you have a chance to look at debug log?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    not yet, but I am finalizing the 8.1.0 release. Bear with me
>>>>>>>>>>> another
>>>>>>>>>>>
>>>>>>>>>>>  couple
>>>>>>>>>> of days pls.
>>>>>>>>>> Rainer
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>       did more testing in the meantime, figured out that:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>      6 senders, 1 collector version 7.5.4, 300 kB/s  always works
>>>>>>>>>>>
>>>>>>>>>>>      6 senders, 1 collector versions 7.5.5, 7.5.6 300 kB/s do not
>>>>>>>>>>> work
>>>>>>>>>>>
>>>>>>>>>>>      6 senders, 2 collectors versions 7.5.5, 7.5.6 300 kB/s
>>>>>>>>>>> mostly
>>>>>>>>>>> work but
>>>>>>>>>>> not always (periods of silence are much more predictable if there
>>>>>>>>>>> is no reload/HUP signal)
>>>>>>>>>>>
>>>>>>>>>>>      1 sender, one collector any version always works (this has
>>>>>>>>>>> less
>>>>>>>>>>> files
>>>>>>>>>>> open plus the total traffic is around 100 kB/s, not sure if that's
>>>>>>>>>>> significant)
>>>>>>>>>>>
>>>>>>>>>>>      Looked at strace on both sides (sender and receiver) during
>>>>>>>>>>> time
>>>>>>>>>>> when
>>>>>>>>>>> traffic went from 300 kB/s down to virtually zero but saw nothing
>>>>>>>>>>> suspicious: essentially a lot of messages are sent, then one message
>>>>>>>>>>> confirms all of them. Then at some point there is a timeout, then a
>>>>>>>>>>> few messages are confirmed one by one, then nothing, then it starts
>>>>>>>>>>> again.
>>>>>>>>>>>
>>>>>>>>>>>      As far as I can tell it's the rename of the files that
>>>>>>>>>>> causes
>>>>>>>>>>> the
>>>>>>>>>>> problem (we rename then remove them but if I comment out the
>>>>>>>>>>> remove
>>>>>>>>>>> part
>>>>>>>>>>> the problem is still there). We never rename the files that
>>>>>>>>>>> rsyslogd
>>>>>>>>>>> is
>>>>>>>>>>> actually writing to (even though they are still open).
>>>>>>>>>>>
>>>>>>>>>>>      Given the behaviour in the scenarios above it seems
>>>>>>>>>>> something
>>>>>>>>>>> got
>>>>>>>>>>> broken
>>>>>>>>>>> between 7.5.4 and 7.5.5. Did the code that handles renamed files
>>>>>>>>>>> change in
>>>>>>>>>>> any way? I looked at changelog
>>>>>>>>>>> (http://www.rsyslog.com/changelog-for-7-5-5-v7-devel/) before but
>>>>>>>>>>> didn't see anything
>>>>>>>>>>> that
>>>>>>>>>>> looks
>>>>>>>>>>> related.
>>>>>>>>>>>
>>>>>>>>>>>      Any ideas how to troubleshoot this further?
>>>>>>>>>>>
>>>>>>>>>>>      thanks!
>>>>>>>>>>>
>>>>>>>>>>>            erik
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/05/2013 10:04 AM, Rainer Gerhards wrote:
>>>>>>>>>>>
>>>>>>>>>>>    On Tue, Nov 5, 2013 at 3:41 AM, Erik Steffl <[email protected]>
>>>>>>>>>>>
>>>>>>>>>>>  wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>        See the debug log attached, the first line is the last
>>>>>>>>>>>> message
>>>>>>>>>>>> in one
>>>>>>>>>>>>
>>>>>>>>>>>>   of
>>>>>>>>>>>>
>>>>>>>>>>>>> the files (maybe not last message overall, but all
>>>>>>>>>>>>> communication
>>>>>>>>>>>>> stops at
>>>>>>>>>>>>> that second, there's number of files so it's hard to figure out
>>>>>>>>>>>>> which
>>>>>>>>>>>>> message is the last one).
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Also noticed that number of messages in log with same
>>>>>>>>>>>>> number
>>>>>>>>>>>>> at the
>>>>>>>>>>>>> beginning drops right after this message, I assume the
>>>>>>>>>>>>> beginning
>>>>>>>>>>>>> of the
>>>>>>>>>>>>> line has something to do with time, these are the counts and
>>>>>>>>>>>>> values at
>>>>>>>>>>>>> the
>>>>>>>>>>>>> beginning of the debug log line:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>    It's the last 5 digits of a Unix timestamp - so you don't
>>>>>>>>>>>>> get
>>>>>>>>>>>>> an
>>>>>>>>>>>>>
>>>>>>>>>>>>>  absolute
>>>>>>>>>>>> time, but an idea how things progress. After the dot I think
>>>>>>>>>>>> there
>>>>>>>>>>>> are ms
>>>>>>>>>>>> (or so) and after the colon there is a thread ID (same ID = same
>>>>>>>>>>>> thread).
>>>>>>>>>>>>
>>>>>>>>>>>> Hope that helps a bit, will try to look at the debug log
>>>>>>>>>>>> tomorrow
>>>>>>>>>>>> morning.
>>>>>>>>>>>>
>>>>>>>>>>>> Rainer
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>        21264 9547.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>        22378 9548.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>         2725 9549.
>>>>>>>>>>>>
>>>>>>>>>>>>         1877 9550.
>>>>>>>>>>>>> ...9551 to 9969 left out, low counts around 2000
>>>>>>>>>>>>> ...9970 to 0519 left out, low counts around  6
>>>>>>>>>>>>>           6 0520.
>>>>>>>>>>>>>           6 0521.
>>>>>>>>>>>>>       39302 0522.
>>>>>>>>>>>>>       51997 0523.
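
The per-second counts above can be reproduced with a few lines of Python. This is only a sketch: it assumes each debug line starts with `<seconds>.<fraction>:<thread id>:`, as described above, and the sample lines and the file name in the comment are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape: "<seconds>.<fraction>:<thread id>: <text>"
PREFIX = re.compile(r"^(\d+)\.\d+:")


def lines_per_second(lines):
    """Count debug-log lines per leading seconds value, in first-seen order."""
    counts = Counter()
    for line in lines:
        match = PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample; for a real log, pass the open file instead, e.g.
#   with open("rsyslog-debug.log") as log: counts = lines_per_second(log)
sample = [
    "9547.100000000:7f01: worker dequeues batch",
    "9547.200000000:7f02: action called",
    "9548.000000000:7f01: worker dequeues batch",
    "not a timestamped line",
]
for second, count in lines_per_second(sample).items():
    print(f"{count:8d} {second}.")
```

Because the seconds value wraps around (9969 is followed later by 0519 in the counts above), the sketch keeps first-seen order rather than sorting numerically.
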
>>>>>>>>>>>>>
>>>>>>>>>>>>>       thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>>             erik
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/03/2013 01:05 PM, David Lang wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     I would post everything in the log file starting with the
>>>>>>>>>>>>> last
>>>>>>>>>>>>> log that
>>>>>>>>>>>>>
>>>>>>>>>>>>>   it successfully sent and continuing a while after that
>>>>>>>>>>>>> message
>>>>>>>>>>>>>
>>>>>>>>>>>>>> (a
>>>>>>>>>>>>>> while
>>>>>>>>>>>>>> being a couple hundred lines or so to be safe)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> once we look at the log we may ask for more.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David Lang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       On Sun, 3 Nov 2013, Erik Steffl wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      Date: Sun, 03 Nov 2013 11:45:03 -0800
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    From: Erik Steffl <[email protected]>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Reply-To: rsyslog-users <[email protected]>
>>>>>>>>>>>>>>> To: rsyslog-users <[email protected]>
>>>>>>>>>>>>>>> Subject: Re: [rsyslog] Rsyslog with RELP not
>>>>>>>>>>>>>>> sending/receiving
>>>>>>>>>>>>>>> messages for
>>>>>>>>>>>>>>>         long intervals
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      I have debug log where I was able to find last message
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>> logs files (right before the silence period) but it's 3 GB :)
>>>>>>>>>>>>>>> Any hint
>>>>>>>>>>>>>>> how much to pick the relevant part of the debug log? Is the
>>>>>>>>>>>>>>> timestamp
>>>>>>>>>>>>>>> part of the log message? There is something in the beginning
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>> debug log line but I don't know how to translate it to time
>>>>>>>>>>>>>>> (part of
>>>>>>>>>>>>>>> it is transaction id, the rest not sure about)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      thanks! (and of course I understand you can't drop
>>>>>>>>>>>>>>> everything
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> work on my problem, thanks for answers!)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         erik
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 11/02/2013 08:51 AM, Rainer Gerhards wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     On Fri, Nov 1, 2013 at 11:31 PM, Erik Steffl <
>>>>>>>>>>>>>>> [email protected]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>          running reload every 43 seconds, behaviour is
>>>>>>>>>>>>>>>> exactly
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    (30 min
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  traffic, 15 min silence) so I think it's not related to
>>>>>>>>>>>>>>>>> reload.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>        will test the latest version and I guess re-test the
>>>>>>>>>>>>>>>>> older
>>>>>>>>>>>>>>>>> version and
>>>>>>>>>>>>>>>>> go from there...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>      It would probably be a good idea to try to capture a
>>>>>>>>>>>>>>>>> debug
>>>>>>>>>>>>>>>>> log so
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    that we
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  can see what rsyslog itself thinks what happens. Note
>>>>>>>>>>>>>>>> that you
>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>> just enable it shortly before you expect the problem to
>>>>>>>>>>>>>>>> happen
>>>>>>>>>>>>>>>> (google for
>>>>>>>>>>>>>>>> "debug on demand").
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Right now, I am totally busy with complex work and cannot
>>>>>>>>>>>>>>>> look
>>>>>>>>>>>>>>>> at it.
>>>>>>>>>>>>>>>> If
>>>>>>>>>>>>>>>> this is for a business, I would strongly suggest to purchase
>>>>>>>>>>>>>>>> one of
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> low-priced support options. That would enable an engineer to
>>>>>>>>>>>>>>>> begin
>>>>>>>>>>>>>>>> working
>>>>>>>>>>>>>>>> on the case within half a day or so. I would *really love*
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> drop
>>>>>>>>>>>>>>>> this to
>>>>>>>>>>>>>>>> someone else, as I really don't like the smell of this bug
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> but if
>>>>>>>>>>>>>>>> I do,
>>>>>>>>>>>>>>>> I am pretty sure other bugs will creep up, and that would
>>>>>>>>>>>>>>>> again
>>>>>>>>>>>>>>>> be the
>>>>>>>>>>>>>>>> death of the new work I am doing (and it took me 3 years or
>>>>>>>>>>>>>>>> so
>>>>>>>>>>>>>>>> to get
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> this point!).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So it looks like non-paid work needs to stay focused on that
>>>>>>>>>>>>>>>> work,
>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>> will be useful for all users in the hopefully not so distant
>>>>>>>>>>>>>>>> future.
>>>>>>>>>>>>>>>> All
>>>>>>>>>>>>>>>> non-emergency bugs (usually security related things) or
>>>>>>>>>>>>>>>> request
>>>>>>>>>>>>>>>> thus
>>>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>>> to be on hold for a couple of days or maybe till the end of
>>>>>>>>>>>>>>>> month.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you get a debug log, I will have time to at least look
>>>>>>>>>>>>>>>> quickly at
>>>>>>>>>>>>>>>> it,
>>>>>>>>>>>>>>>> but I can't say if that will be sufficient to provide a
>>>>>>>>>>>>>>>> solution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> sorry for that and I hope for your understanding.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Rainer
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>        thanks!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>               erik
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 11/01/2013 02:29 PM, David Lang wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>      Ok, this is definitely sounding like a bug in RELP, it
>>>>>>>>>>>>>>>>> worked on an
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    earlier version, check if it works on the latest
>>>>>>>>>>>>>>>>> version,
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  then I
>>>>>>>>>>>>>>>>>> would suggest filing a bug on the
>>>>>>>>>>>>>>>>>> www.rsyslog.com bugtracker
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This sounds like something Rainer is going to need to look
>>>>>>>>>>>>>>>>>> at. He's
>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> Germany, so I believe his weekend has started, he may see
>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>> over
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> weekend or it may wait until his work week starts.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David Lang
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>        On Fri, 1 Nov 2013, Erik Steffl wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>        pretty sure reload is sending HUP (that's upstart
>>>>>>>>>>>>>>>>>> job
>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>> included in
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     adiscon packages), the rsyslog keeps the same PID, I
>>>>>>>>>>>>>>>>>> even
>>>>>>>>>>>>>>>>>> tried to
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>   strace it while running reload, here's syslog message:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Nov  1 10:47:25 ip-10-238-198-103 rsyslogd: [origin
>>>>>>>>>>>>>>>>>>> software="rsyslogd" swVersion="7.5.5" x-pid="3865"
>>>>>>>>>>>>>>>>>>> x-info="http://www.rsyslog.com"] rsyslogd was HUPed
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>       but I think it must be related to something else
>>>>>>>>>>>>>>>>>>> cause
>>>>>>>>>>>>>>>>>>> when I
>>>>>>>>>>>>>>>>>>> removed
>>>>>>>>>>>>>>>>>>> the reload (so the log mover was doing only rename,
>>>>>>>>>>>>>>>>>>> upload,
>>>>>>>>>>>>>>>>>>> remove) it
>>>>>>>>>>>>>>>>>>> was still exhibiting the same pattern: works for 30 min, silent
>>>>>>>>>>>>>>>>>>> for 15 min, etc.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>       running reload every 43 seconds now and will see
>>>>>>>>>>>>>>>>>>> what
>>>>>>>>>>>>>>>>>>> happens...
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>          erik
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 11/01/2013 12:57 PM, David Lang wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>      a HUP of the server should not cause any
>>>>>>>>>>>>>>>>>>> interruption
>>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>> processing
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    messages.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> are you sure upstart is sending HUP, not doing a full
>>>>>>>>>>>>>>>>>>>> stop/start?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> could you try sending HUP to rsyslog manually a few
>>>>>>>>>>>>>>>>>>>> times
>>>>>>>>>>>>>>>>>>>> to see
>>>>>>>>>>>>>>>>>>>> if you
>>>>>>>>>>>>>>>>>>>> can duplicate the problem?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David Lang
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, 1 Nov 2013, Erik Steffl wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>        sorry, it wasn't entirely clear in my previous
>>>>>>>>>>>>>>>>>>>> email
>>>>>>>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>>>>>>>> are doing
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     ubuntu upstart reload which is sending HUP signal,
>>>>>>>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>>>>>>>> sure
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>   it's
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> doing the right thing since logrotate scripts use it,
>>>>>>>>>>>>>>>>>>>>> syslog
>>>>>>>>>>>>>>>>>>>>> message
>>>>>>>>>>>>>>>>>>>>> says rsyslog received HUP signal, rsyslogd keeps the
>>>>>>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>>>>> PID
>>>>>>>>>>>>>>>>>>>>> and all
>>>>>>>>>>>>>>>>>>>>> files are closed (as verified using lsof).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>       On top of that if, during one of these quiet
>>>>>>>>>>>>>>>>>>>>> periods,
>>>>>>>>>>>>>>>>>>>>> I try to
>>>>>>>>>>>>>>>>>>>>> send
>>>>>>>>>>>>>>>>>>>>> messages from another client (on which I just restarted
>>>>>>>>>>>>>>>>>>>>> rsyslog
>>>>>>>>>>>>>>>>>>>>> so it
>>>>>>>>>>>>>>>>>>>>> has no error counters etc.) I think it's also not
>>>>>>>>>>>>>>>>>>>>> sending
>>>>>>>>>>>>>>>>>>>>> anything,
>>>>>>>>>>>>>>>>>>>>> when stracing it (the sender) I see it received
>>>>>>>>>>>>>>>>>>>>> messages
>>>>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>>> logger
>>>>>>>>>>>>>>>>>>>>> but is not sending them anywhere. So I think that the
>>>>>>>>>>>>>>>>>>>>> collector
>>>>>>>>>>>>>>>>>>>>> rsyslog continues to send some kind of signal that it's
>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>> ready but
>>>>>>>>>>>>>>>>>>>>> not sure why or how to figure out whether it actually
>>>>>>>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>       Any ideas how to troubleshoot this? I think I'll
>>>>>>>>>>>>>>>>>>>>> record more
>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> strace output and try to reconstruct RELP messages, I
>>>>>>>>>>>>>>>>>>>>> guess
>>>>>>>>>>>>>>>>>>>>> there is
>>>>>>>>>>>>>>>>>>>>> some RELP response that has something else than
>>>>>>>>>>>>>>>>>>>>> RSP-CODE
>>>>>>>>>>>>>>>>>>>>> =
>>>>>>>>>>>>>>>>>>>>> 200
>>>>>>>>>>>>>>>>>>>>> (looking at http://www.rsyslog.com/doc/relp.html)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>       I was just about to publish the S3 upload scripts
>>>>>>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>> since
>>>>>>>>>>>>>>>>>>>>> they
>>>>>>>>>>>>>>>>>>>>> cause rsyslog to misbehave this way I didn't do it yet,
>>>>>>>>>>>>>>>>>>>>> maybe
>>>>>>>>>>>>>>>>>>>>> I'll
>>>>>>>>>>>>>>>>>>>>> post them as they are and fix later...
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>          erik
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 11/01/2013 01:02 AM, David Lang wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>      the purpose of RELP is to tell the senders if the
>>>>>>>>>>>>>>>>>>>>> receiver was
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    able to
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  accept the message or not, so that's working as
>>>>>>>>>>>>>>>>>>>>>> designed.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> When you do a full restart of rsyslog, it stops
>>>>>>>>>>>>>>>>>>>>>> processing new
>>>>>>>>>>>>>>>>>>>>>> messages
>>>>>>>>>>>>>>>>>>>>>> and disconnects all senders (you are doing a full stop
>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>> rsyslog and
>>>>>>>>>>>>>>>>>>>>>> then starting it from scratch, this takes time). My
>>>>>>>>>>>>>>>>>>>>>> guess
>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>> that the
>>>>>>>>>>>>>>>>>>>>>> senders are detecting 'too many failures' when trying
>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> send
>>>>>>>>>>>>>>>>>>>>>> messages,
>>>>>>>>>>>>>>>>>>>>>> so they are backing off and sleeping for a while
>>>>>>>>>>>>>>>>>>>>>> rather
>>>>>>>>>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>>>>>>> performing
>>>>>>>>>>>>>>>>>>>>>> a mini DOS attack on the network and server. Every
>>>>>>>>>>>>>>>>>>>>>> third
>>>>>>>>>>>>>>>>>>>>>> restart is
>>>>>>>>>>>>>>>>>>>>>> probably triggering a threshold.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Instead of doing a full restart, just send the rsyslog
>>>>>>>>>>>>>>>>>>>>>> daemon a
>>>>>>>>>>>>>>>>>>>>>> HUP
>>>>>>>>>>>>>>>>>>>>>> signal instead. That will tell rsyslog to flush and
>>>>>>>>>>>>>>>>>>>>>> close
>>>>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>> files so
>>>>>>>>>>>>>>>>>>>>>> that you can rotate them (If you are doing compression
>>>>>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> files, you
>>>>>>>>>>>>>>>>>>>>>> may need to sleep for a few seconds to let the flush
>>>>>>>>>>>>>>>>>>>>>> complete).
>>>>>>>>>>>>>>>>>>>>>> Rsyslog
>>>>>>>>>>>>>>>>>>>>>> can continue to receive new messages during this time,
>>>>>>>>>>>>>>>>>>>>>> so
>>>>>>>>>>>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>>>>>>> senders
>>>>>>>>>>>>>>>>>>>>>> will not see an outage.
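
A minimal Python sketch of the HUP approach described above, for use in a rotation script. It assumes the daemon's PID is stored in `/var/run/rsyslogd.pid` (an assumption, check your init configuration) and the function name is made up for illustration.

```python
import os
import signal


def hup_rsyslog(pidfile: str = "/var/run/rsyslogd.pid") -> int:
    """Send SIGHUP to the process whose PID is stored in pidfile; return that PID."""
    with open(pidfile) as handle:
        pid = int(handle.read().strip())
    # SIGHUP asks rsyslog to close and reopen its output files; it is not a restart.
    os.kill(pid, signal.SIGHUP)
    return pid
```

The caller needs permission to signal the daemon, so in practice this would run as root or as the rsyslog user.
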
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> By the way, I'm needing a script to upload rsyslog
>>>>>>>>>>>>>>>>>>>>>> archives to
>>>>>>>>>>>>>>>>>>>>>> S3,
>>>>>>>>>>>>>>>>>>>>>> could
>>>>>>>>>>>>>>>>>>>>>> you send me a copy of yours? (remove any passwords
>>>>>>>>>>>>>>>>>>>>>> first
>>>>>>>>>>>>>>>>>>>>>> please
>>>>>>>>>>>>>>>>>>>>>> :-)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> David Lang
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>        On Thu, 31 Oct 2013, Erik Steffl wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>       Date: Thu, 31 Oct 2013 18:44:23 -0700
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>     From: Erik Steffl <[email protected]>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   Reply-To: rsyslog-users <[email protected]>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> To: rsyslog-users <[email protected]>
>>>>>>>>>>>>>>>>>>>>>>> Subject: [rsyslog] Rsyslog with RELP not
>>>>>>>>>>>>>>>>>>>>>>> sending/receiving
>>>>>>>>>>>>>>>>>>>>>>> messages
>>>>>>>>>>>>>>>>>>>>>>> for long
>>>>>>>>>>>>>>>>>>>>>>>          intervals
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> We have a fairly simple setup of 6 hosts sending
>>>>>>>>>>>>>>>>>>>>>>> syslog
>>>>>>>>>>>>>>>>>>>>>>> messages to
>>>>>>>>>>>>>>>>>>>>>>> one collector host, all of these run rsyslog
>>>>>>>>>>>>>>>>>>>>>>> 7.5.5-0adiscon2
>>>>>>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>>>>> adiscon repo and use RELP to transfer messages. There
>>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>>>>>>>>> load
>>>>>>>>>>>>>>>>>>>>>>> balancer in front of the collector machine but I
>>>>>>>>>>>>>>>>>>>>>>> don't
>>>>>>>>>>>>>>>>>>>>>>> think it
>>>>>>>>>>>>>>>>>>>>>>> matters in this case.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Rsyslog on collector machine is configured to write
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> files,
>>>>>>>>>>>>>>>>>>>>>>> switching to a new file every 15 minutes, using config
>>>>>>>>>>>>>>>>>>>>>>> like this
>>>>>>>>>>>>>>>>>>>>>>> (abbreviated a bit):
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> template(name="jsonFilename" type="list") {
>>>>>>>>>>>>>>>>>>>>>>>       constant(value="/path/")
>>>>>>>>>>>>>>>>>>>>>>>       property(name="$now")
>>>>>>>>>>>>>>>>>>>>>>>       constant(value="/")
>>>>>>>>>>>>>>>>>>>>>>>       property(name="$hour")
>>>>>>>>>>>>>>>>>>>>>>>       constant(value="/")
>>>>>>>>>>>>>>>>>>>>>>>       property(name="$qhour")
>>>>>>>>>>>>>>>>>>>>>>>       constant(value="/")
>>>>>>>>>>>>>>>>>>>>>>>       constant(value="log.json")
>>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> action(type="omfile" DynaFile="jsonFilename"
>>>>>>>>>>>>>>>>>>>>>>> Template="jsonFormat")
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       We run a script at every 2, 17, 32, 47 minute
>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> hour
>>>>>>>>>>>>>>>>>>>>>>> and upload
>>>>>>>>>>>>>>>>>>>>>>> the just finished file to S3. The uploading works
>>>>>>>>>>>>>>>>>>>>>>> like
>>>>>>>>>>>>>>>>>>>>>>> this:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - let's say it's 3:02:00, rsyslog is writing to
>>>>>>>>>>>>>>>>>>>>>>> /path/2013-10-10/03/00/log.json
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - get the filename log.json (anything that's not current, usually
>>>>>>>>>>>>>>>>>>>>>>> just one previous file which in the example would be
>>>>>>>>>>>>>>>>>>>>>>> /path/2013-10-10/02/03/log.json)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - rename /path/2013-10-10/02/03/log.json to
>>>>>>>>>>>>>>>>>>>>>>> /path/2013-10-10/02/03/log.json.uploading.0
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - reload rsyslog (to make sure that even if for some reason it was
>>>>>>>>>>>>>>>>>>>>>>> writing to just renamed file it would close it and open a new file)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - upload /path/2013-10-10/02/03/log.json.uploading.0 to S3
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - remove /path/2013-10-10/02/03/log.json.uploading.0
>>>>>>>>>>>>>>>>>>>>>>> Here's what happens every third run (yes, regularly EVERY THIRD RUN)
>>>>>>>>>>>>>>>>>>>>>>> of this script:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - rsyslog stops writing to the CURRENT file
>>>>>>>>>>>>>>>>>>>>>>> (/path/2013-10-10/03/00/log.json, the one that is NOT being renamed)
>>>>>>>>>>>>>>>>>>>>>>> a few seconds into the run of the script (e.g. 3:02:04)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - the 6 hosts that were sending syslog messages to the log
>>>>>>>>>>>>>>>>>>>>>>> collector STOP sending anything (as verified by stracing rsyslogd,
>>>>>>>>>>>>>>>>>>>>>>> tcpdump and the "network in" metric in the Amazon AWS console)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - after this nothing is ever written into
>>>>>>>>>>>>>>>>>>>>>>> /path/2013-10-10/03/00/log.json
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       - the 6 clients start sending syslog messages again when the next
>>>>>>>>>>>>>>>>>>>>>>> file is created (in this example it would be
>>>>>>>>>>>>>>>>>>>>>>> /path/2013-10-10/03/01/log.json)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       I checked and double-checked the files and dates, and verified
>>>>>>>>>>>>>>>>>>>>>>> that the current file is not touched, but can't figure out what's
>>>>>>>>>>>>>>>>>>>>>>> going on. I tried the script without the rsyslog reload but it didn't
>>>>>>>>>>>>>>>>>>>>>>> make any difference. If I don't run this script rsyslog works
>>>>>>>>>>>>>>>>>>>>>>> flawlessly.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       any ideas how to troubleshoot this? What could be causing rsyslog
>>>>>>>>>>>>>>>>>>>>>>> to stop writing to the file and the senders to stop sending syslog
>>>>>>>>>>>>>>>>>>>>>>> messages? I assume the rsyslog on the collector host somehow signals
>>>>>>>>>>>>>>>>>>>>>>> to the 6 hosts that send messages that it's not ready or
>>>>>>>>>>>>>>>>>>>>>>> something...
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>       thanks!
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>          erik
>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>> rsyslog mailing list
>>>>>>>>>>>>>>>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>>>>>>>>>>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>>>>>>>>>>>>>>>>> What's up with rsyslog? Follow
>>>>>>>>>>>>>>>>>>>>>>> https://twitter.com/rgerhards
>>>>>>>>>>>>>>>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>>>>>>>>>>>>>>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>>>>>>>>>>>>>>>>>>>>>>> POST if you DON'T LIKE THAT.