David,

Thanks for the response. This was sent before I did the upgrade. Many of the problems were resolved with that upgrade. A couple of follow-up points on your thorough email:
1. I configured the main message queue on the recommendation of someone else on the list, as a way to prevent an rsyslog processing bottleneck from causing Apache to choke and die:

    $MainMsgQueueFileName mainqueue
    $MainMsgQueueType LinkedList
    $MainMsgQueueSaveOnShutdown on

After upgrading and testing performance, this might not even be necessary in my environment. This decision was pre-upgrade. The remaining bottleneck appears to be my TCP loghosts, but backup queuing for those, I presume, will go in the action queue, not the main queue, so I might end up removing this. Correct me if I'm wrong here.

2. You recommended configuring imfile for Apache rather than using logger. This is actually what I want to do for Log4j (and perhaps Apache), but my reading of the imfile documentation gives me the impression that it's geared toward gobbling up non-syslog-formatted data and turning it into syslog-formatted data, which is why it assigns a default facility and severity on a per-file basis. Is it also a viable solution for files containing logs from, for example, Log4j, where I have log entries that may be at different severity levels? If not, I'm sure I can break them up accordingly. Performance-wise, is this a production-grade alternative to using logger to throw log messages at the domain socket? I wasn't sure, given that the online docs for configuring Apache do not mention imfile.

Thanks again for the solid feedback.

Todd

On Mar 14, 2011, at 12:57 PM, [email protected] wrote:

> On Mon, 7 Mar 2011, Todd Michael Bushnell wrote:
>
>> Appreciate the feedback. Sorry last night's message so sparse - wasn't
>> feeling too great and wanted to crash. Few points followed by my config
>> file:
>>
>> 1. Using TCP, not RELP, because I'm still using syslog-NG as central
>> loghost with rsyslog on servers.
>
> Ok, this is still a mechanism that will stall if the server stops accepting
> messages
>
>> 2. Already have configured to queue locally in the event of outage. See
>> config below.
>> I've tested successfully in the past, but yesterday, when
>> there were problems, I checked the local queue and did not see local
>> queueing occurring. Perhaps it was just slow enough to slow things to a
>> crawl, killing Apache, but not quite slow enough to result in local
>> queueing. Is that possible? Should I look to tune this?
>
> no, if things are slow it will queue locally, and apache will only see things
> slow down if the queue fills up (or if you are writing the queue to disk, if
> the disk can't keep up)
>
> what are you looking at to decide that rsyslog is not queuing messages?
>
>> 3. Running "ancient" version of Rsyslog because this is the latest in
>> CentOS 5 repo. Figured this is because it's stable which is what I want.
>> No need for some of the newer bells at this point. If I guessed wrong here
>> and latest will give me better stability and performance I'll build new RPMs.
>
> the latest will definitely give you better performance, but the other part of
> the problem is that since it is so old, getting help here is a bit harder,
> simply because it's hard to remember back that far.
>
>> 4. Have a number of admitted design deficiencies with Apache and Tomcat that
>> could be contributing to performance although this does not impact sysklog
>> which is why I proceeded as-is until I could get engineering to fix.
>>
>> 4a. Apache uses logger to send to local syslog socket (where rsyslog writes
>> locally and sends to 2 remote servers) and also writes to its own files
>> locally so we're logging twice locally for every message. Not good when
>> traffic gets high, I presume. Just noticed this yesterday so need to get
>> fixed. To make matters worse, all logging is happening on the same volume.
>> Until fixed, maybe I should just have rsyslog write local Apache logs to
>> /dev/null and forward to remote syslog - nothing else. Thoughts?
>
> if you want rsyslog to write the queue to disk you will also have performance
> issues (the rsyslog disk queue is not very efficient)
>
> One question to ask is how critical it is that no logs get lost? you may want
> to configure rsyslog to discard messages if it gets too many queued rather
> than stopping apache.
>
> or you may want to have apache write log files and then have rsyslog use
> imfile to read the file.
>
>> 4b. Log4j sending directly to syslog servers, writing to its own local files
>> and sending to localhost:514 for local logging. Would prefer all gets
>> handed to rsyslog for local and remote logging. Need to get engineering to
>> fix that too. Like mentioned before, to reduce IO contention and avoid
>> duplication, might just configure rsyslog to write to /dev/null as long as
>> it's configured like this. Only question with 4a/b is that this never posed
>> a problem with sysklog, but is a problem with rsyslog. This is the reason I
>> did not try to make any major changes in phase 1.
>
> remember that sysklog didn't do TCP logging, it only did UDP logging, so it
> would send the messages out over the network as fast as it could, and if the
> receiver can't keep up the message is lost.
>
> you may want to do a test with rsyslog using UDP instead of TCP and see if
> the behavior is what you expect. If it is, then you are losing logs because
> your central server can't keep up with UDP, but with TCP you are stalling and
> killing apache instead.
>
>> # Configuration File
>>
>> # Provides kernel logging support (previously done by rklogd)
>> $ModLoad imklog
>> # Provides support for local system logging (e.g. via logger command)
>> $ModLoad imuxsock
>>
>> # Max Message Size (default 2k)
>> $MaxMessageSize 8192
>
> hmm, you may want to look at enabling jumbo packets on your network so that
> each log message can be pushed in a single packet.
>
>> # Must listen on localhost for Log4j.
>> # Need engineering to change this
>> $ModLoad imudp
>> $UDPServerAddress 127.0.0.1
>> $UDPServerRun 514
>>
>> # Use traditional timestamp format
>> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>>
>> # ownership/permissions
>> $umask 0000
>> $FileOwner root
>> $FileGroup wheel
>> $FileCreateMode 0640
>>
>> # include directory for breaking directives into separate files (future)
>> $IncludeConfig /etc/rsyslog.d/
>>
>> # forward to remote host, queueing to local disk if host is down and memory fills up
>> # work (spool) files directory
>> $WorkDirectory /var/log/rsyslog
>>
>> # loghost1
>> # in-memory queue; set for asynchronous processing (?)
>> $ActionQueueType LinkedList
>
> in my testing LinkedList was slower than the default. everything does
> asynchronous processing.
>
> but both the default and LinkedList are limited to memory size and the
> $MainMsgQueueSize or $ActionQueueSize (which I think default to 10000)
>
>> # failover queue filename; also enables disk mode
>> $ActionQueueFileName failqueue-loghost1
>
> I don't think this enables disk mode, you also need to set the
> $ActionQueueType to a disk related type.
>
> David Lang
>
>> # infinite retries on insert failure
>> $ActionResumeRetryCount -1
>> # save in-memory data if rsyslog shuts down
>> $ActionQueueSaveOnShutdown on
>> # remote logging of everything
>> *.* @@loghost1:5140
>>
>> # loghost2
>> # in-memory queue; set for asynchronous processing (?)
>> $ActionQueueType LinkedList
>> # failover queue filename; also enables disk mode
>> $ActionQueueFileName failqueue-loghost2
>> # infinite retries on insert failure
>> $ActionResumeRetryCount -1
>> # save in-memory data if rsyslog shuts down
>> $ActionQueueSaveOnShutdown on
>> # remote logging of everything
>> *.* @@loghost2:5140
>>
>> # Log Filtering Rules
>>
>> # Emergency Messages
>> if $syslogseverity <= '0' then *
>> if $syslogseverity <= '0' then /var/log/messages
>> if $syslogseverity <= '0' then ~
>>
>> # Apache
>> if $programname == 'logger' and ($msg contains 'access_log' or $msg contains 'cookie_log' or $msg contains 'request_log') then /var/log/http
>> & ~
>> if $programname == 'httpd' and ($syslogfacility-text == 'local5' or $syslogfacility-text == 'local6') then /var/log/http_err
>> & ~
>>
>> # Log4j (App Logs)
>> if $programname == 'com.redacted.infra.syslog.Log4jSystemLogger' then /var/log/log4j
>> & ~
>>
>> # Kernel & IPTables
>> if $programname == 'kernel' and ($msg contains 'LOGACCEPT' or $msg contains 'LOGDROP') then /var/log/iptables
>> & ~
>>
>> # Auth Messages
>> if $syslogfacility-text == 'auth' or $syslogfacility-text == 'authpriv' then /var/log/secure
>> & ~
>>
>> # Mail
>> if $syslogfacility-text == 'mail' then /var/log/maillog
>> & ~
>>
>> # Catchall for remaining log messages
>> *.* /var/log/messages
>>
>>
>> On Mar 6, 2011, at 10:43 PM, Todd Michael Bushnell wrote:
>>
>>> Been planning an rsyslog deployment for about a month. Everything
>>> performed as expected in my limited use dev environment, but when I
>>> deployed rsyslog today to my production environment multiple systems
>>> yielded similar disastrous results:
>>>
>>> After a few hours Apache jumped up to 250+ processes (max=256, normal=~50)
>>> and then started hanging. At this time, rsyslog also stopped logging
>>> altogether. As soon as I killed rsyslog and started sysklog, httpd
>>> processes dropped to 50 and everything went back to normal.
>>>
>>> I'm not sure if this is a case where rsyslog froze and its state resulted
>>> in Apache's inability to close processes or if there is a problem with
>>> Apache and Rsyslog when a decent volume of traffic is passed through. I'm
>>> happy to provide additional information if someone could give me some clues
>>> as to where to start looking. At this point we're reverting until I can
>>> diagnose this issue and assure my team that I've fixed the problem for good.
>>>
>>> Version: rsyslog-3.22.1-3.el5_5.1
>>> System: Linux ******* 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 11:57:43 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> Todd Michael Bushnell
>>> [email protected]
>>>
>>
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com
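
For reference, the imfile approach discussed in point 2 at the top of this thread (Apache writes its own log file and rsyslog reads it, instead of piping through logger) might look roughly like the sketch below in the legacy directive syntax used throughout the thread. This is only an illustration, not a tested configuration for the setup described: the log path, tag, state-file name, facility, and severity are all assumed values, and the directives should be checked against the documentation for the installed rsyslog version.

```
# Sketch only; every value below is an assumption for illustration.
$ModLoad imfile

# One block per monitored file. Each line read from the file becomes one
# message carrying the tag, facility, and severity configured here.
$InputFileName /var/log/httpd/access_log      # assumed Apache log path
$InputFileTag apache-access:                  # hypothetical tag
$InputFileStateFile stat-apache-access        # hypothetical name; kept under $WorkDirectory
$InputFileFacility local6                     # assumed facility
$InputFileSeverity info                       # assumed severity
$InputRunFileMonitor

# Seconds between polls of the monitored files (module-wide setting)
$InputFilePollInterval 10
```

As far as I understand it, the facility and severity are fixed per monitored file and are not derived from the content of each line, so a Log4j file that mixes levels would come through at a single severity unless the application writes one file per level. Filters that currently match `$programname == 'logger'` would also need to match the new tag instead.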

