I am not sure if I understood your scenario correctly. Is this your lab setup:
- make rsyslog monitor a file f - continuously append lines to f - send messages both to remote server and log them locally - break connection to remote server - some more lines from f are logged locally, and then no more lines form f are processed If so, I think I can explain what happens. The file monitor is flagged as a full delayable message source. That means if it begins to fill queues, the input can be blocked without causing harm. This is most useful in (almost) all setups, because otherwise a large file monitor would copy over potentially huge amounts of data to a queue file if a remote destination goes down (in fact, we have seen this in earlier releases). However, this means that the file monitor is actually blocked, and thus it will also not submit messages for local submission. I think this is what happens here (it took me a while to actually understand the scenario...). I can see that the assumption that a file monitor message can be delayed may not be valid in some cases, e.g. with extended offline time of a receiver. However, trust me that it is useful in many more cases. So what could be done, if I am right with my analysis, is to create a config setting so that imfile's priority can be changed to non-delayable. This probably solves your situation, but you should note that this can place considerable burden on your server in case of failover. Also, you probably need around two to three times the disk space of those files that are to be enqueued. Am I on the right path? Rainer > -----Original Message----- > From: [email protected] [mailto:rsyslog- > [email protected]] On Behalf Of Andreas Piesk > Sent: Sunday, October 16, 2011 8:52 PM > To: [email protected] > Subject: Re: [rsyslog] rsyslog stops local logging and local logging hangsif > remote destination is unresponsive > > On 15.10.2011 12:49, Rainer Gerhards wrote: > > Thanks for the debug log. I've so far just have a quick look, but there seem > > to come no more messages in. Can you point me to exactly what happens > and > > what is problematic, if possible in sequence so that I can try to find that > > inside the debug log. > > OK, i try to explain how to reproduce it. i must say i had a hard time to > reproduce it in a virtual > environment but fortunately i succeeded :) > > one thing i didn't mention (shame on me) was the fact, that rsyslog on the > client also reads some > logfiles with imfile. my first attempt to reproduce it didn't work because > rsyslog was setup to read > logfiles with imfile but nothing was written to these files. after adapting the > test case by > simultanously generating syslog messages with logger and writing to a > monitored logfile i always got > a hanging rsyslog. > > here's my test case. first the used client rsyslog configs: > > # cat /etc/rsyslog.conf > $ModLoad imuxsock > $ModLoad imklog > $ModLoad immark > $MarkMessagePeriod 1200 > $SystemLogRateLimitInterval 0 > $ActionFileDefaultTemplate RSYSLOG_FileFormat > *.info;mail.none;authpriv.none;cron.none;local6.none;local0.none > /var/log/messages > authpriv.* > /var/log/secure > mail.* - > /var/log/mail > cron.* > /var/log/cron > *.emerg > * > uucp,news.crit > /var/log/spooler > local7.* > /var/log/boot > > # cat /etc/rsyslog.d/forwarder.conf > $ActionForwardDefaultTemplate RSYSLOG_ForwardFormat > $WorkDirectory /var/spool/rsyslog > $ActionQueueType LinkedList > $ActionQueueMaxDiskSpace 1024m > $ActionQueueCheckpointInterval 10 > $ActionQueueFileName forward > $ActionQueueMaxFileSize 1m > $ActionResumeRetryCount -1 > $ActionQueueSaveOnShutdown on > $ActionWriteAllMarkMessages on > *.* @@10.10.0.254:514;RSYSLOG_ForwardFormat > > # cat /etc/rsyslog.d/pickup.conf > $ModLoad imfile > $InputFilePollInterval 10 > $InputFileName /var/log/test.log > $InputFileTag test: > $InputFileStateFile test > $InputFilePersistStateInterval 10 > $InputFileSeverity info > $InputFileFacility local6 > $InputRunFileMonitor > > generate messages on the client: > > client # while true; do date; date >> /var/log/test.log; date | logger; sleep 1; > done > > the messages are logged on the client by rsyslog: > > client # tail -f messages test.log > ==> messages <== > 2011-10-16T19:34:34+02:00 gold-centos5-x86_64 logger: Sun Oct 16 19:34:34 > CEST 2011 > > ==> test.log <== > Sun Oct 16 19:34:34 CEST 2011 > > and on the server too: > > server # tail -f logger.log test.log > ==> logger.log <== > 2011-10-16T19:35:12.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:12 CEST 2011 > 2011-10-16T19:35:13.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:13 CEST 2011 > 2011-10-16T19:35:14.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:14 CEST 2011 > 2011-10-16T19:35:15.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:15 CEST 2011 > 2011-10-16T19:35:16.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:16 CEST 2011 > 2011-10-16T19:35:17.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:17 CEST 2011 > 2011-10-16T19:35:18.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:18 CEST 2011 > 2011-10-16T19:35:19.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:19 CEST 2011 > 2011-10-16T19:35:20.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:20 CEST 2011 > 2011-10-16T19:35:21.000000+02:00 gold-centos5-x86_64 logger: Sun Oct 16 > 19:35:21 CEST 2011 > > ==> test.log <== > 2011-10-16T19:35:21.100215+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:11 CEST 2011 > 2011-10-16T19:35:21.100307+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:12 CEST 2011 > 2011-10-16T19:35:21.100464+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:13 CEST 2011 > 2011-10-16T19:35:21.100608+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:14 CEST 2011 > 2011-10-16T19:35:21.100867+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:15 CEST 2011 > 2011-10-16T19:35:21.101008+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:16 CEST 2011 > 2011-10-16T19:35:21.101228+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:17 CEST 2011 > 2011-10-16T19:35:21.101409+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:18 CEST 2011 > 2011-10-16T19:35:21.101549+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:19 CEST 2011 > 2011-10-16T19:35:21.101680+02:00 gold-centos5-x86_64 test: Sun Oct 16 > 19:35:20 CEST 2011 > > now simulate an unresponsive server: > > server # iptables -A INPUT -s gold-centos5-x86_64 -p tcp --dport 514 -j DROP > > the logging on the server stops immediately, of course. > > the logging on the client continues but not for long: > > ==> messages <== > 2011-10-16T19:41:09+02:00 gold-centos5-x86_64 logger: Sun Oct 16 19:41:09 > CEST 2011 > > ==> test.log <== > Sun Oct 16 19:41:09 CEST 2011 > > ==> messages <== > 2011-10-16T19:41:10+02:00 gold-centos5-x86_64 logger: Sun Oct 16 19:41:10 > CEST 2011 > > ==> test.log <== > Sun Oct 16 19:41:10 CEST 2011 > Sun Oct 16 19:41:11 CEST 2011 > Sun Oct 16 19:41:12 CEST 2011 > Sun Oct 16 19:41:13 CEST 2011 > Sun Oct 16 19:41:14 CEST 2011 > Sun Oct 16 19:41:15 CEST 2011 > Sun Oct 16 19:41:16 CEST 2011 > Sun Oct 16 19:41:17 CEST 2011 > Sun Oct 16 19:41:18 CEST 2011 > Sun Oct 16 19:41:19 CEST 2011 > Sun Oct 16 19:41:20 CEST 2011 > Sun Oct 16 19:41:21 CEST 2011 > Sun Oct 16 19:41:22 CEST 2011 > Sun Oct 16 19:41:23 CEST 2011 > Sun Oct 16 19:41:24 CEST 2011 > Sun Oct 16 19:41:25 CEST 2011 > Sun Oct 16 19:41:26 CEST 2011 > Sun Oct 16 19:41:27 CEST 2011 > Sun Oct 16 19:41:29 CEST 2011 > Sun Oct 16 19:41:30 CEST 2011 > Sun Oct 16 19:41:31 CEST 2011 > > logging to messages stopped. there's no spool file: > > client # ls -l /var/spool/rsyslog/ > total 8 > -rw------- 1 root root 195 Oct 16 19:42 test > > > to make it easier for you i'll send the logs of such a test run along with > descriptions to > [email protected]. > > the logging stops at or around these messages in rsyslog_debug.log: > > 7644.220848000:44b3c940: strm 0x2acdd9c6a0e0: file 7 read 0 bytes > 7644.220858000:44b3c940: stream checking for file change on > '/var/log/test.log', inode 2359447/23594 > 47 > 7644.220897000:40cc0940: wti 0x2acdd9c532d0: worker awoke from idle > processing > 7644.220911000:40cc0940: we deleted 0 objects and enqueued 0 objects > 7644.220916000:40cc0940: delete batch from store, new sizes: log 10, phys 10 > 7644.220924000:40cc0940: processBatch: batch of 10 elements must be > processed > .. > 7644.221601000:40cc0940: Processing next action > 7644.221606000:40cc0940: Called action(Batch), logging to builtin-fwd > 7644.221611000:40cc0940: action 0x2acdd9c4d030: filterOK:1 state:0 > execWhenPrev:0 prevWasSusp:0 > 7644.221626000:40cc0940: action 8 queue: enqueueMsg: FullDelay mark > reached for full delayable > message - blocking. > > > if i run the same test with exactly the same config but _without_ writing to > /var/log/test.log > ("while true; do date; date | logger; sleep 1; done"), rsyslog creates a spool > file and doesn't block. > > regards, > -ap > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com

