Hi Todd,

thanks for the detailed report. Unfortunately, I do not have time at the moment for a longer debugging effort (I need to create slides for a conference next week, plus I have to do some paid work...). I'd appreciate it if you could open a bug tracker entry with the info. I will look at it ASAP, but that will probably be after next week.

Rainer

> -----Original Message-----
> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Todd Michael Bushnell
> Sent: Friday, March 18, 2011 6:34 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] Back logs from disk assisted queue not flowing to centralloghost after service restored
>
> Rainer,
>
> Will send you additional debug to your private email momentarily. Here's
> what I'm seeing: As expected, rsyslog starts to locally queue logs in file
> identified by ActionQueueFileName (e.g. failqueue-loghost#.0000n) if central
> loghost is inaccessible. This is good. To simulate, I use iptables to block
> traffic to one of my loghosts and then blast 10,000 messages on that client.
> Here's what $WorkDirectory looks like when I do this:
>
> [root@server1 rsyslog]# ls -al
> total 4988
> drwxr-x--- 2 root wheel    4096 Mar 17 21:34 .
> drwxr-xr-x 7 root root     4096 Mar 17 04:08 ..
> -rw------- 1 root root   619948 Mar 17 21:34 failqueue-loghost1.00000002
> -rw------- 1 root root  1048800 Mar 17 21:34 failqueue-loghost2.00000001
> -rw------- 1 root root  1048850 Mar 17 21:34 failqueue-loghost2.00000002
> -rw------- 1 root root  1048581 Mar 17 21:34 failqueue-loghost2.00000003
> -rw------- 1 root root  1048988 Mar 17 21:34 failqueue-loghost2.00000004
> -rw------- 1 root root   234515 Mar 17 21:34 failqueue-loghost2.00000005
>
> Note: loghost2 is the server I make inaccessible. loghost1 is still
> accessible. assume it's queuing because loghost can't keep up with message
> blast.
>
> I then restart iptables to make loghost2 accessible again. after a minute or
> so I check $WorkDirectory and it looks like this:
>
> [root@server1 rsyslog]# ls -al
> total 860
> drwxr-x--- 2 root wheel   4096 Mar 17 21:36 .
> drwxr-xr-x 7 root root    4096 Mar 17 04:08 ..
> -rw------- 1 root root  621295 Mar 17 21:36 failqueue-loghost1.00000002
> -rw------- 1 root root  236716 Mar 17 21:36 failqueue-loghost2.00000005
>
> So as you can see, most of the logs clear out as expected, but I'm always
> left with one logfile for each of my logservers. When I check the central
> loghosts they have already received all of the test messages so these
> remaining files contain messages that the central loghosts already have.
> Furthermore, future logs destined for the central loghosts get appended to
> these files even though they are arriving at the central loghosts.
>
> I then stop rsyslog (I clearly identify where I do this by echoing "RSYSLOG
> RESTART" in debug file) and start it back up. When I do this, both files go
> away.
>
> Note: though not represented in this debug, I'm sometimes seeing the same
> behavior with my MainMsgQueue. The file will stick around and all new log
> entries get copied to it until rsyslog is restarted and the files go away.
>
> Hopefully the debug log will provide some answers. Thx.
>
> Todd
>
> On Mar 17, 2011, at 8:53 AM, Rainer Gerhards wrote:
>
> > I have had a quick look at the debug log. Check line 133. It looks
> > like there is some problem within the queue file. This makes rsyslog
> > switch over to using a pure memory queue.
> >
> > Rainer
> >
> >> -----Original Message-----
> >> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Rainer Gerhards
> >> Sent: Thursday, March 17, 2011 4:01 PM
> >> To: rsyslog-users
> >> Subject: Re: [rsyslog] Back logs from disk assisted queue not flowing to centralloghost after service restored
> >>
> >> Please feel free to send to my private email address (the list will
> >> probably reject due to size anyway).
> >> I promise to have a quick look, but I will probably not be able to
> >> have an in-depth look until some time next week (but hopefully the
> >> quick look helps ;))
> >>
> >> Rainer
> >>
> >>> -----Original Message-----
> >>> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Todd Michael Bushnell
> >>> Sent: Thursday, March 17, 2011 3:53 PM
> >>> To: rsyslog-users
> >>> Subject: Re: [rsyslog] Back logs from disk assisted queue not flowing to centralloghost after service restored
> >>>
> >>> Will do Rainer. Just confirming, I should send zipped debug logs to
> >>> this list or is there a private email address you prefer? Also, I ran
> >>> debug on an existing system moments ago - a system that currently has
> >>> several of these "stuck" failqueue logfiles. Want to make sure that
> >>> will give you what you need or if I need to start over, simulate a
> >>> central loghost outage and grab that information? If the former, I
> >>> have what you need and will send once I get confirm on location to
> >>> send. The latter will take some time so I can simulate worthwhile
> >>> test. Thx.
> >>>
> >>> todd
> >>>
> >>> On Mar 17, 2011, at 12:39 AM, Rainer Gerhards wrote:
> >>>
> >>>> This looks like we need a debug log...
> >>>>
> >>>> Rainer
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: [email protected] [mailto:rsyslog-[email protected]] On Behalf Of Todd Michael Bushnell
> >>>>> Sent: Thursday, March 17, 2011 1:18 AM
> >>>>> To: rsyslog-users
> >>>>> Subject: [rsyslog] Back logs from disk assisted queue not flowing to centralloghost after service restored
> >>>>>
> >>>>> Have central loghost configured with disk assisted queue like so:
> >>>>>
> >>>>> $WorkDirectory /var/log/rsyslog
> >>>>> $ActionQueueType LinkedList
> >>>>> $ActionQueueFileName failqueue-loghost2
> >>>>> $ActionResumeRetryCount -1
> >>>>> $ActionQueueSaveOnShutdown on
> >>>>>
> >>>>> # remote logging of everything
> >>>>> *.* @@loghost1:5140
> >>>>>
> >>>>> Central loghost still running syslog-ng. Had a problem with it
> >>>>> that caused it to fail on multiple occasions over the past couple
> >>>>> days. Resolved the problem and logs are now flowing to it, but the
> >>>>> files that were created on the clients during this period are not
> >>>>> going away, nor are the back logs flowing to the central loghost.
> >>>>> For example:
> >>>>>
> >>>>> # syslog client
> >>>>> #/var/log/syslog
> >>>>> -rw------- 1 root root 1049189 Mar 16 01:13 failqueue-loghost2.00000002
> >>>>> -rw------- 1 root root 1048848 Mar 14 13:25 failqueue-loghost2.00000003
> >>>>> -rw------- 1 root root 1048648 Mar 14 17:20 failqueue-loghost2.00000004
> >>>>> -rw------- 1 root root 1049066 Mar 15 00:19 failqueue-loghost2.00000005
> >>>>> -rw------- 1 root root 1048619 Mar 15 00:27 failqueue-loghost2.00000006
> >>>>> -rw------- 1 root root 1048907 Mar 15 13:20 failqueue-loghost2.00000007
> >>>>> -rw------- 1 root root  949887 Mar 16 01:13 failqueue-loghost2.00000008
> >>>>> -rw------- 1 root root    1653 Mar 16 01:13 failqueue-loghost2.qi
> >>>>>
> >>>>> Running rsyslog-5.6.4.
> >>>>>
> >>>>> _______________________________________________
> >>>>> rsyslog mailing list
> >>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> >>>>> http://www.rsyslog.com
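
For anyone reproducing the check described in this thread (looking at $WorkDirectory before and after the loghost comes back), the following is a minimal sketch in Python that groups leftover queue segment files by queue name and totals their sizes. It is not part of rsyslog: the function names are hypothetical, and the eight-digit suffix pattern is only an assumption taken from the `ls -al` listings quoted above.

```python
import os
import re
from collections import defaultdict

# Assumption: queue segment files look like "<ActionQueueFileName>.<8 digits>",
# as in the listings in this thread (e.g. "failqueue-loghost2.00000005").
SEGMENT_RE = re.compile(r"^(?P<queue>.+)\.(?P<seq>\d{8})$")


def summarize_queue_files(names_and_sizes):
    """Group (filename, size) pairs by queue name.

    Returns {queue_name: (segment_count, total_bytes)}. Files that do not
    match the segment pattern (for example the ".qi" state file) are skipped.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for name, size in names_and_sizes:
        match = SEGMENT_RE.match(name)
        if match is None:
            continue
        queue = match.group("queue")
        counts[queue] += 1
        totals[queue] += size
    return {queue: (counts[queue], totals[queue]) for queue in counts}


def scan_work_directory(path):
    """Summarize the queue segment files found directly inside `path`."""
    with os.scandir(path) as entries:
        return summarize_queue_files(
            (entry.name, entry.stat().st_size)
            for entry in entries
            if entry.is_file()
        )
```

Calling `scan_work_directory("/var/log/rsyslog")` (the $WorkDirectory from the configuration quoted above) once while the loghost is blocked and again after it is reachable gives two summaries to compare. A queue that still reports a segment after the backlog has drained would match the "stuck" file symptom described in the report.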

