Hi,
I am running rsyslog on my clients, storing events locally and also
forwarding them to remote rsyslog servers. My clients are CentOS 5.5 running
rsyslogd 3.22.1. All of them forward data to two rsyslog servers using the
omrelp module. The central servers run RHEL 6 and rsyslog 4.6.2.
Everything works fine most of the time, but once every few weeks some of my
servers run into problems. When this happens, random clients stop logging
anything to disk and stop forwarding anything to the remote servers. This also
affects system responsiveness: it appears that many daemons wait for a syslog
operation to time out, which takes a while. For example, ssh logins are
delayed by about 10 seconds, and the same happens with sudo and some other
commands. The only way I have found to resolve the problem is to restart the
rsyslog process (kill -HUP does not work).
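To check from a client whether the local log socket is what the daemons are
blocking on, a probe along these lines could time a single write to it. This
is only a sketch under assumptions not confirmed above: that the socket lives
at /dev/log, that it is a datagram socket, and that Python 3 is available on
the box. probe_log_socket is a hypothetical helper name, not part of rsyslog.

```python
import socket
import time


def probe_log_socket(path="/dev/log", timeout=5.0):
    """Send one syslog datagram to `path`; return (ok, seconds_elapsed)."""
    start = time.monotonic()
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.settimeout(timeout)  # give up instead of blocking indefinitely
    try:
        sock.connect(path)
        # <14> = facility user (1) * 8 + severity info (6)
        sock.send(b"<14>probe: syslog latency test")
        ok = True
    except OSError:
        # Covers a missing socket, a wrong socket type, and a send timeout.
        ok = False
    finally:
        sock.close()
    return ok, time.monotonic() - start
```

Calling probe_log_socket() on an affected client and getting a failure after
roughly the full timeout would suggest that nothing is draining the socket,
which would match the delays seen with ssh and sudo.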
I attached strace to the hung process, and it was blocked in the following
system call:
select(1, NULL, NULL, NULL, {56238, 855000}
Strangely, kill -HUP did not work properly either. I kept strace attached to
the process while sending kill -HUP; the output is below:
) = -1 EINTR (Interrupted system call)
--- SIGHUP (Hangup) @ 0 (0) ---
rt_sigaction(SIGHUP, {0x409520, [], SA_RESTORER, 0x3b93e0eb10}, NULL, 8) = 0
rt_sigreturn(0x1) = -1 EINTR (Interrupted system call)
futex(0x2aaac0009d1c, FUTEX_WAIT_PRIVATE, 1, NULL
After that, the output hangs on the futex system call.
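Without -f, strace -p follows only the one thread it is attached to, so the
trace above may not show what the other rsyslogd threads are doing. A rough
sketch for listing the state of every thread, assuming a Linux /proc layout
with /proc/<pid>/task/<tid>/status and Python 3; thread_states is a
hypothetical helper name:

```python
import os


def thread_states(pid):
    """Map each thread ID of `pid` to its (name, state) as read from /proc."""
    states = {}
    task_dir = "/proc/%d/task" % pid
    for tid in sorted(os.listdir(task_dir), key=int):
        fields = {}
        try:
            with open(os.path.join(task_dir, tid, "status")) as fh:
                for line in fh:
                    key, _, value = line.partition(":")
                    fields[key] = value.strip()
        except OSError:
            continue  # thread exited while we were reading it
        states[int(tid)] = (fields.get("Name", "?"), fields.get("State", "?"))
    return states
```

Running thread_states() against the rsyslogd PID and then attaching strace to
each listed thread ID in turn would show whether the remaining threads are
also parked in futex waits.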
My configuration on the clients:
## Modules
$ModLoad immark.so # provides --MARK-- message capability
$ModLoad imuxsock.so # provides support for local system logging (e.g. via logger command)
$ModLoad imklog.so # kernel logging (formerly provided by rklogd)
$ModLoad imtcp.so # provides TCP syslog reception and GSS-API
$InputTCPServerRun 514 # start up TCP listener at port 514
$ModLoad imudp.so # provides UDP syslog reception
$UDPServerRun 514 # start a UDP syslog server at standard port 514
$ModLoad omrelp.so # provides RELP syslog transmission
## Global Directives
# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$EscapeControlCharactersOnReceive off
## Rules
#Log anything (except mail) of level info or higher.
#Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;user.none -/var/log/messages
#The authpriv file has restricted access.
authpriv.* /var/log/secure
#Log all the mail messages in one place.
mail.* -/var/log/maillog
#Log cron stuff
cron.* -/var/log/cron
#Everybody gets emergency messages
*.emerg *
#Save boot messages also to boot.log
local7.* /var/log/boot.log
## Include configs from the /etc/rsyslog.d directory
$IncludeConfig /etc/rsyslog.d/*.conf
## Log remotely with a fully-qualified hostname
$template fmtFQDN,"<%pri%>%timegenerated% hostname %syslogtag% %msg%\n"
$WorkDirectory /var/spool/rsyslog
$ActionQueueType LinkedList
$ActionQueueFileName rsyslog001-buffer
$ActionQueueMaxFileSize 10m
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.*; :omrelp:rsyslogserver1:20514;fmtFQDN
$ActionQueueType LinkedList
$ActionQueueFileName rsyslog002-buffer
$ActionQueueMaxFileSize 10m
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
*.*; :omrelp:rsyslogserver2:20514;fmtFQDN
Does anybody know what the problem could be and how to resolve it?
--
Regards,
Jacek
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com