[rsyslog] Forwarding logs to downed server, not queuing

Jonas Courteau Thu, 04 Aug 2011 17:07:32 -0700

Hello:

First of all, apologies if this has been addressed before!


We're forwarding our logs to a central server using the following config:

$WorkDirectory /var/lib/rsyslog

$ActionQueueType LinkedList
$ActionQueueFileName remoteq
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.* @@172.16.0.75:514
# -- a bunch of rules also logging to local files follow

This normally works great; and the queuing works fine when the serverrsyslog process is shut down. But, when we reboot the server, or shutit down, huge chunks of forwarded data fail to get queued. Looking atthe debug logs, it looks like the client side thinks the server's stillaround, and isn't queuing the logs:


6939.784970000:42790940: Message from UNIX socket: #3

6939.785145000:42790940: logmsg: flags 4, from 'testclient', msg Aug 415:28:59 root: new test message # 47

6939.785236000:42790940: Message has legacy syslog format.
6939.785330000:42790940: main Q: entry added, size now 1 entries
6939.785440000:42790940: wtpAdviseMaxWorkers signals busy
6939.785530000:42790940: main Q: EnqueueMsg advised worker start

6939.785640000:42790940: --------imuxsock calling select, active filedescriptors (max 3): 3

6939.786501000:41d8f940: main Q: entry deleted, state 0, size now 0 entries
6939.786616000:41d8f940: testing filter, f_pmask 255
6939.786699000:41d8f940: Called action, logging to builtin-fwd
6939.786793000:41d8f940: action 1 queue: entry added, size now 1 entries
6939.786905000:41d8f940: wtpAdviseMaxWorkers signals busy
6939.786989000:41d8f940: action 1 queue: EnqueueMsg advised worker start
6939.787099000:41d8f940: testing filter, f_pmask 127
6939.787186000:41d8f940: Called action, logging to builtin-file
6939.787274000:41d8f940: file to log to: /var/log/messages
6939.787360000:41d8f940: doWrite, pData->pStrm 0x2b237853a650, lenBuf 65

6939.787445000:41d8f940: strm 0x2b237853a650: file 6(messages) flush,buflen 65

6939.787575000:41d8f940: strm 0x2b237853a650: file 6 write wrote 65 bytes
6939.787690000:41d8f940: testing filter, f_pmask 0
6939.787775000:41d8f940: testing filter, f_pmask 0
6939.787855000:41d8f940: testing filter, f_pmask 0
6939.787934000:41d8f940: testing filter, f_pmask 1
6939.788014000:41d8f940: testing filter, f_pmask 0
6939.788096000:41d8f940: testing filter, f_pmask 0
6939.788195000:41d8f940: main Q:Reg/w0: worker IDLE, waiting for work.

6939.788288000:4138e940: action 1 queue: entry deleted, state 0, sizenow 0 entries

6939.788401000:4138e940:  172.16.0.75
6939.788481000:4138e940:  172.16.0.75:514/tcp
6939.788571000:4138e940: TCP sent 69 bytes, requested 69

6939.788659000:4138e940: action 1 queue:Reg/w0: worker IDLE, waiting forwork.

Sending a log message every second from the client, on the server,looking at a combined log, this is what we see during a reboot scenario(only listing remote test messages and logging-related server messages):


Aug  4 16:25:49 testclient root: Test message # 5
Aug  4 16:25:50 testclient root: Test message # 6
Aug  4 16:25:51 logserver shutdown[2104]: shutting down for system reboot
Aug  4 16:25:51 logserver init: Switching to runlevel: 6
Aug  4 16:25:51 testclient root: Test message # 7
Aug  4 16:25:52 testclient root: Test message # 8
Aug  4 16:25:53 testclient root: Test message # 9

Aug 4 16:25:53 logserver rsyslogd: db error (2002): Can't connect tolocal MySQL server through socket '/var/lib/mysql/mysql.sock' (2)

Aug  4 16:25:54 testclient root: Test message # 10
Aug  4 16:25:55 testclient root: Test message # 11
Aug  4 16:25:56 testclient root: Test message # 12
Aug  4 16:25:57 testclient root: Test message # 13
Aug  4 16:25:58 logserver kernel: Kernel logging (proc) stopped.

Aug 4 16:25:58 logserver rsyslogd: [origin software="rsyslogd"swVersion="4.6.2" x-pid="1282" x-info="http://www.rsyslog.com";] exitingon signal 15.Aug 4 16:26:44 logserver kernel: imklog 4.6.2, log source = /proc/kmsgstarted.Aug 4 16:26:44 logserver rsyslogd: [origin software="rsyslogd"swVersion="4.6.2" x-pid="1282" x-info="http://www.rsyslog.com";] (re)startAug 4 16:26:44 logserver kernel: Bootdata ok (command line isroot=/dev/sys.vg/root.lv ro panic=30 time)Aug 4 16:26:44 logserver kernel: Linux version2.6.18-194.32.1.1.el5.xen (mockbuild@*****.com) (gcc version 4.1.220080704 (Red Hat 4.1.2-48)) #1 SMP Fri May 20 11:57:35 PDT 2011

---- whole boot process here ----

Aug 4 16:26:44 logserver rsyslogd: db error (2002): Can't connect tolocal MySQL server through socket '/var/lib/mysql/mysql.sock' (2)

Aug  4 16:26:50 testclient root: Test message # 66
Aug  4 16:26:51 testclient root: Test message # 67
---- back to normal ---

The DB errors look fine/unrelated - even with db logging turned off andonly spitting logs out to file, we get this problem.

Is this just a consequence of using TCP? I don't mind losing a few logentries, but potentially losing a few hundred is another matter. Is theanswer to switch to RELP? Or am I missing some key configuration setting?

We're using rsyslog 4.6.2 as packaged for RHEL 6, on CentOS 5.6 (we'repreparing to move to CentOS 6, hence the version weirdness...). Icertainly don't mind running a different version if that's what it takesthough.


Thanks in advance for any ideas, feedback, etc.

-Jonas Courteau
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com

[rsyslog] Forwarding logs to downed server, not queuing

Reply via email to