Appreciate the feedback. Sorry last night's message was so sparse; I wasn't feeling too great and wanted to crash. A few points, followed by my config file:

1. Using TCP, not RELP, because I'm still using syslog-ng as the central loghost, with rsyslog on the servers.

2. Already configured to queue locally in the event of an outage. See config below. I've tested this successfully in the past, but yesterday, when there were problems, I checked the local queue and did not see any local queueing occurring. Perhaps forwarding was just slow enough to bring things to a crawl, killing Apache, but not quite slow enough to trigger local queueing. Is that possible? Should I look to tune this?

3. Running an "ancient" version of rsyslog because it is the latest in the CentOS 5 repo. I figured that's because it's stable, which is what I want. No need for some of the newer bells and whistles at this point. If I guessed wrong here and the latest version will give me better stability and performance, I'll build new RPMs.

4. I have a number of admitted design deficiencies with Apache and Tomcat that could be contributing to the performance problem. They did not cause trouble with sysklog, which is why I proceeded as-is until I could get engineering to fix them.

4a. Apache uses logger to send to the local syslog socket (where rsyslog writes locally and sends to 2 remote servers) and also writes to its own files locally, so we're logging twice locally for every message. Not good when traffic gets high, I presume. I just noticed this yesterday, so it needs to get fixed. To make matters worse, all logging is happening on the same volume. Until it's fixed, maybe I should just have rsyslog write local Apache logs to /dev/null and forward to remote syslog, nothing else. Thoughts?

4b. Log4j is sending directly to the syslog servers, writing to its own local files, and sending to localhost:514 for local logging. I would prefer that everything gets handed to rsyslog for local and remote logging. Need to get engineering to fix that too. As mentioned above, to reduce IO contention and avoid duplication, I might just configure rsyslog to write these to /dev/null for as long as Log4j is configured like this.
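
For 4a/4b, here is a rough sketch of the interim change, assuming the existing Apache and Log4j filters stay as they are and only their file actions change. Because the two *.* @@loghost forwarding rules sit above the filter rules in the config, these messages should already have been handed to the remote actions by the time they reach the filters, so discarding them there (with the same ~ action the config already uses) ought to skip only the local write. This is a sketch, not a tested config:

```
# Apache access logs: already forwarded by the *.* @@loghost rules above,
# so drop the local copy instead of writing /var/log/http
if $programname == 'logger' and ($msg contains 'access_log' or $msg contains 'cookie_log' or $msg contains 'request_log') then ~

# Log4j (App Logs): same idea, forward only, no local file
if $programname == 'com.redacted.infra.syslog.Log4jSystemLogger' then ~
```

Using the discard action rather than a literal /dev/null file target avoids the write altogether. The httpd error rule (local5/local6) is left alone in this sketch on the assumption that error volume is low.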

Only question with 4a/b is that this never posed a problem with sysklog, but is a problem with rsyslog. This is the reason I did not try to make any major changes in phase 1.

# Configuration File

# Provides kernel logging support (previously done by rklogd)
$ModLoad imklog

# Provides support for local system logging (e.g. via logger command)
$ModLoad imuxsock

# Max Message Size (default 2k)
$MaxMessageSize 8192

# Must listen on localhost for Log4j. Need engineering to change this
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514

# Use traditional timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

# ownership/permissions
$umask 0000
$FileOwner root
$FileGroup wheel
$FileCreateMode 0640

# include directory for breaking directives into separate files (future)
$IncludeConfig /etc/rsyslog.d/

# forward to remote host, queueing to local disk if host is down and memory fills up

# work (spool) files directory
$WorkDirectory /var/log/rsyslog

# loghost1
# in-memory queue; set for asynchronous processing (?)
$ActionQueueType LinkedList
# failover queue filename; also enables disk mode
$ActionQueueFileName failqueue-loghost1
# infinite retries on insert failure
$ActionResumeRetryCount -1
# save in-memory data if rsyslog shuts down
$ActionQueueSaveOnShutdown on
# remote logging of everything
*.* @@loghost1:5140

# loghost2
# in-memory queue; set for asynchronous processing (?)
$ActionQueueType LinkedList
# failover queue filename; also enables disk mode
$ActionQueueFileName failqueue-loghost2
# infinite retries on insert failure
$ActionResumeRetryCount -1
# save in-memory data if rsyslog shuts down
$ActionQueueSaveOnShutdown on
# remote logging of everything
*.* @@loghost2:5140

# Log Filtering Rules

# Emergency Messages
if $syslogseverity <= '0' then *
if $syslogseverity <= '0' then /var/log/messages
if $syslogseverity <= '0' then ~

# Apache
if $programname == 'logger' and ($msg contains 'access_log' or $msg contains 'cookie_log' or $msg contains 'request_log') then /var/log/http
& ~
if $programname == 'httpd' and ($syslogfacility-text == 'local5' or $syslogfacility-text == 'local6') then /var/log/http_err
& ~

# Log4j (App Logs)
if $programname == 'com.redacted.infra.syslog.Log4jSystemLogger' then /var/log/log4j
& ~

# Kernel & IPTables
if $programname == 'kernel' and ($msg contains 'LOGACCEPT' or $msg contains 'LOGDROP') then /var/log/iptables
& ~

# Auth Messages
if $syslogfacility-text == 'auth' or $syslogfacility-text == 'authpriv' then /var/log/secure
& ~

# Mail
if $syslogfacility-text == 'mail' then /var/log/maillog
& ~

# Catchall for remaining log messages
*.* /var/log/messages

On Mar 6, 2011, at 10:43 PM, Todd Michael Bushnell wrote:

> Been planning an rsyslog deployment for about a month. Everything performed
> as expected in my limited-use dev environment, but when I deployed rsyslog
> today to my production environment multiple systems yielded similar
> disastrous results:
>
> After a few hours Apache jumped up to 250+ processes (max=256, normal=~50)
> and then started hanging. At this time, rsyslog also stopped logging
> altogether. As soon as I killed rsyslog and started sysklog, httpd processes
> dropped to 50 and everything went back to normal.
>
> I'm not sure if this is a case where rsyslog froze and its state resulted in
> Apache's inability to close processes or if there is a problem with Apache
> and rsyslog when a decent volume of traffic is passed through. I'm happy to
> provide additional information if someone could give me some clues as to
> where to start looking. At this point we're reverting until I can diagnose
> this issue and assure my team that I've fixed the problem for good.
>
> Version: rsyslog-3.22.1-3.el5_5.1
> System: Linux ******* 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 11:57:43 EST 2008
> x86_64 x86_64 x86_64 GNU/Linux
>
> Todd Michael Bushnell
> [email protected]

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com

