On 05/03/2011 19:26, Olivier Fauchon wrote:
Hi.
I use rsyslogd to centralize my webservers' logs.
The webservers run rsyslogd too, to process the logs locally and send them to the
central log server with the omrelp module.
Here is the configuration:
$template DailyAccess,"/data/opt/logs/apache/access.%$YEAR%-%$MONTH%-%$DAY%.log"
$template DailyError,"/data/opt/logs/apache/error.%$YEAR%-%$MONTH%-%$DAY%.log"
$template apache,"%HOSTNAME% -%msg%\n"
$template tracedigitick,"/data/opt/logs/apache/%app-name%/%$now%.log"
$WorkDirectory /data/rsyslog # where to place spool files
$MainMsgQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueFileName remotequeue # unique name prefix for spool files
$ActionQueueMaxDiskSpace 10g # 10 GB space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList # run asynchronously
$ActionResumeRetryCount -1 # infinite retries if host is down
$ActionResumeInterval 10
local5.* :omrelp:192.168.13.10:20514;RSYSLOG_ForwardFormat
& ~
local6.* :omrelp:192.168.13.10:20514;RSYSLOG_ForwardFormat
& ~
if $programname == 'httpd' and $syslogfacility-text == 'user' then :omrelp:192.168.3.9:20514;RSYSLOG_ForwardFormat
& ~
local7.* :omrelp:192.168.3.9:20514;RSYSLOG_ForwardFormat
local7.* ~
~
THE PROBLEM:
This morning, my central rsyslog server was unreachable, and all the webservers
went bad (high load, not responding to requests).
I discovered that the Apache workers were stuck in the 'L' (Logging) state, as
shown on Apache's server-status page.
The 'logger' command on the system console was terribly slow too (4-5 seconds
per call).
*** System information:
- RedHat Linux myname 2.6.9-89.0.26.ELsmp #1 SMP Sun May 30 09:31:52 EDT 2010
i686 i686 i386 GNU/Linux
RSyslogd custom compilation:
/usr/local/rsyslogd/sbin/rsyslogd -v
rsyslogd 5.6.3, compiled with:
FEATURE_REGEXP: Yes
FEATURE_LARGEFILE: Yes
FEATURE_NETZIP (message compression): Yes
GSSAPI Kerberos 5 support: No
FEATURE_DEBUG (debug build, slow code): No
Atomic operations supported: No
Runtime Instrumentation (slow code): No
*** steps to reproduce
1. use "iptables -A OUTPUT -d IP.OF.CENTRAL.RSYSLOGD -p tcp --dport 20514 -j DROP"
to isolate the webservers
2. run while true; do logger "Hello from console logger loop"; done
3. Wait a couple of seconds, stop the loop from step 2 and test:
[root@webserver01 log]# time logger aa
real 0m4.885s
user 0m0.001s
sys 0m0.000s
[root@digul-ws03 log]#
I see strange messages when running in debug mode (-dn):
7399.141790000:b7f0eaa0: input thread term: had a timeout waiting on thread termination
7399.154054000:b7f0eaa0: input thread term: had a timeout waiting on thread termination
7399.166314000:b7f0eaa0: input thread term: had a timeout waiting on thread termination
7399.177537000:b6109ba0: main Q: enqueueMsg: cond timeout, dropping message!
7399.177562000:b6109ba0: main Q: EnqueueMsg advised worker start
7399.177585000:b6109ba0: imudp: epoll_wait() returned with 1 fds
7399.177609000:b6109ba0: thrdStarter: usrThrdMain 0xb6109ba0 returned with iRet 0, exiting now.
7399.177659000:b7f0eaa0: input thread term: thread returned normally and is terminated
7399.177674000:b7f0eaa0: non-cancel input thread termination succeeded for thread 0xb6109ba0
7399.177767000:b7f0eaa0: main Q: queue nearly full (10000 entries), but could not drop msg (iRet: 0, severity 6)
7399.177785000:b7f0eaa0: main Q: enqueueMsg: queue FULL - waiting to drain.
7401.179944000:b7f0eaa0: main Q: enqueueMsg: cond timeout, dropping message!
7401.179983000:b7f0eaa0: main Q: EnqueueMsg advised worker start
Can you tell me more about this behaviour?
How can I fix this so that other processes survive rsyslog communication
problems?
Thanks
Why did it go down?
You have set up processes that do not work together with your system.
Read the log files;
maybe you will find the cause of the problem and its solution.
Too many timeouts, a full queue, and too much waiting.
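
One possible cause, offered as a guess rather than a confirmed diagnosis: in
the legacy configuration format, the $ActionQueue* directives apply only to
the next action that follows them. If that holds here, only the first omrelp
action in the posted configuration gets its own queue. The later ones would
run without a queue and could stall the main queue while their target is
unreachable. The sketch below repeats the queue settings before each
forwarding action. The spool file names (fwd_local5, fwd_local7) are made up
for illustration, and the enqueue timeout of 0 is an assumption that trades
lost messages for not blocking the applications that log.

```
# Sketch only, not a tested configuration.
$WorkDirectory /data/rsyslog

# Main queue: when it is full, discard instead of making senders wait.
# Assumption: 0 means "discard immediately"; messages can be lost.
$MainMsgQueueTimeoutEnqueue 0

# First forwarding action, with its own disk-assisted queue.
$ActionQueueType LinkedList
# Hypothetical spool prefix; it must be unique per action.
$ActionQueueFileName fwd_local5
$ActionQueueMaxDiskSpace 10g
$ActionQueueSaveOnShutdown on
$ActionResumeRetryCount -1
local5.* :omrelp:192.168.13.10:20514;RSYSLOG_ForwardFormat
& ~

# Next forwarding action: the queue settings are repeated on purpose,
# because the ones above are assumed to be consumed by the action above.
$ActionQueueType LinkedList
# Hypothetical spool prefix, different from the first one.
$ActionQueueFileName fwd_local7
$ActionQueueMaxDiskSpace 10g
$ActionQueueSaveOnShutdown on
$ActionResumeRetryCount -1
local7.* :omrelp:192.168.3.9:20514;RSYSLOG_ForwardFormat
& ~
```

The same pattern would be repeated for the local6 and httpd actions. Rerunning
the "time logger aa" test with the central server blocked should show whether
the delay goes away.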
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com