Hi. 

I use rsyslogd to centralize my webservers' logs.
The webservers also run rsyslogd to process the logs locally and send them to the
central log server with the omrelp module.

Here is the configuration: 

$template DailyAccess,"/data/opt/logs/apache/access.%$YEAR%-%$MONTH%-%$DAY%.log"
$template DailyError,"/data/opt/logs/apache/error.%$YEAR%-%$MONTH%-%$DAY%.log" 
$template apache,"%HOSTNAME% -%msg%\n" 
$template tracedigitick,"/data/opt/logs/apache/%app-name%/%$now%.log" 

$WorkDirectory /data/rsyslog # where to place spool files 
$MainMsgQueueSaveOnShutdown on # save messages to disk on shutdown 
$ActionQueueFileName remotequeue # unique name prefix for spool files 
$ActionQueueMaxDiskSpace 10g # 10gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown 
$ActionQueueType LinkedList # run asynchronously 
$ActionResumeRetryCount -1 # infinite retries if host is down
$ActionResumeInterval 10 

local5.* :omrelp:192.168.13.10:20514;RSYSLOG_ForwardFormat 
& ~ 
local6.* :omrelp:192.168.13.10:20514;RSYSLOG_ForwardFormat 
& ~ 
if $programname == 'httpd' and $syslogfacility-text == 'user' then :omrelp:192.168.3.9:20514;RSYSLOG_ForwardFormat
& ~ 
local7.* :omrelp:192.168.3.9:20514;RSYSLOG_ForwardFormat 
local7.* ~ 
~ 

THE PROBLEM: 

This morning, my central rsyslog server was unreachable, and all the webservers
went bad (high load, not responding to requests).
I discovered that the Apache workers were stuck in the Logging ('L') state (as
shown on Apache's server-status page).
The 'logger' command on the system console was terribly slow too (4-5 seconds
per call).


*** System information: 

- RedHat Linux myname 2.6.9-89.0.26.ELsmp #1 SMP Sun May 30 09:31:52 EDT 2010 i686 i686 i386 GNU/Linux

RSyslogd custom compilation: 

/usr/local/rsyslogd/sbin/rsyslogd -v 
rsyslogd 5.6.3, compiled with: 
FEATURE_REGEXP: Yes 
FEATURE_LARGEFILE: Yes 
FEATURE_NETZIP (message compression): Yes 
GSSAPI Kerberos 5 support: No 
FEATURE_DEBUG (debug build, slow code): No 
Atomic operations supported: No 
Runtime Instrumentation (slow code): No 

*** Steps to reproduce:

1. Use "iptables -A OUTPUT -d IP.OF.CENTRAL.RSYSLOGD -p tcp --dport 20514 -j DROP"
to isolate the webservers from the central log server.
2. Run: while true; do logger "Hello from console logger loop"; done
3. Wait a couple of seconds, stop the loop from step 2, and test:

[root@webserver01 log]# time logger aa 
real 0m4.885s 
user 0m0.001s 
sys 0m0.000s 
[root@digul-ws03 log]# 
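
To put numbers on the slowdown, the single timing test above could be repeated
several times in a row. The following is only a minimal sketch: it assumes
Python 3 is available on the webserver and uses the standard `syslog` module in
place of the `logger` command. The function name `time_syslog_calls` and the
probe message are hypothetical.

```python
import syslog
import time


def time_syslog_calls(count, send=syslog.syslog, message="latency probe"):
    """Return the wall-clock duration, in seconds, of each of `count` log calls."""
    durations = []
    for _ in range(count):
        start = time.monotonic()
        send(message)  # may block while the local syslog daemon is not reading
        durations.append(time.monotonic() - start)
    return durations
```

Calling `time_syslog_calls(5)` before and after the iptables rule is added
should show whether every call is delayed or only some of them.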


I see strange messages when running rsyslogd in debug mode (-dn):

7399.141790000:b7f0eaa0: input thread term: had a timeout waiting on thread 
termination 
7399.154054000:b7f0eaa0: input thread term: had a timeout waiting on thread 
termination 
7399.166314000:b7f0eaa0: input thread term: had a timeout waiting on thread 
termination 
7399.177537000:b6109ba0: main Q: enqueueMsg: cond timeout, dropping message! 
7399.177562000:b6109ba0: main Q: EnqueueMsg advised worker start 
7399.177585000:b6109ba0: imudp: epoll_wait() returned with 1 fds 
7399.177609000:b6109ba0: thrdStarter: usrThrdMain 0xb6109ba0 returned with iRet 
0, exiting now. 
7399.177659000:b7f0eaa0: input thread term: thread returned normally and is 
terminated 
7399.177674000:b7f0eaa0: non-cancel input thread termination succeeded for 
thread 0xb6109ba0 
7399.177767000:b7f0eaa0: main Q: queue nearly full (10000 entries), but could 
not drop msg (iRet: 0, severity 6) 
7399.177785000:b7f0eaa0: main Q: enqueueMsg: queue FULL - waiting to drain. 
7401.179944000:b7f0eaa0: main Q: enqueueMsg: cond timeout, dropping message! 
7401.179983000:b7f0eaa0: main Q: EnqueueMsg advised worker start 
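
A longer debug trace like the one above can be summarized by counting the queue
events per thread. This is a minimal sketch, assuming Python 3 and that every
entry starts with "timestamp:thread-id:" as in the excerpt. The name
`summarize_debug_log` is hypothetical, and only the two message texts seen
above are matched.

```python
import re
from collections import Counter

# Entry prefix as seen in the excerpt: "<seconds>.<fraction>:<thread id>: <text>"
ENTRY = re.compile(r"^(\d+\.\d+):([0-9a-f]+): (.*)$")


def summarize_debug_log(lines):
    """Count dropped-message and queue-full events per thread id."""
    dropped = Counter()
    full = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # wrapped continuation line or unrelated output
        _, thread, text = match.groups()
        if "cond timeout, dropping message" in text:
            dropped[thread] += 1
        elif "queue FULL" in text:
            full[thread] += 1
    return dropped, full
```

On the excerpt above this reports one dropped message for each of the two
threads (b6109ba0 and b7f0eaa0) and one queue-full wait on b7f0eaa0.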


Can you tell me more about this behaviour?
How can I fix this so that other processes survive rsyslog communication
problems?

Thanks 


-- 
Olivier Fauchon 
Tel: 06.85.53.26.41 
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
