On Fri, 11 Mar 2011, Todd Michael Bushnell wrote:

> Appreciate the stellar advice Rainer.  I built a 5.6.x (latest stable), but
> before upgrading I wanted to do some tests with my existing version/config
> and then apply some of the rule syntax changes you recommended to gauge the
> performance benefit.  I used the syslog_caller tool to perform a few tests.
> Here are those results:

> Running the following command on a test box and simply recording real time
> for comparison:
>
>     time ./syslog_caller -m 50000

> Initial (Apache killing) config w/ remote TCP logging:   5m38.163s
> Switch to UDP remote logging:                            4m22.497s
> Disable remote logging:                                  3m23.322s
> Initial w/ amended rules (see RULES section):            4m54.055s
> Amended rules w/ expression-based rules commented out:   2m36.023s
> Amended rules w/ MainMsg disk queue:                     7m43.498s
>
> Sysklogd (for comparison):                               2m13.986s

> Glad to see just changing my rules improved performance by about 13%.  My
> initial reaction was to send this info and ask a series of questions based on
> the data, but instead decided to give it a whirl with the latest stable
> version: 5.6.4.
>
> 5.6.4 w/ amended rules:  0m3.773s
>
> Wow - I almost fell off my chair!  This is AMAZING!  Thank you!  Given these
> results, I just have a couple of final questions:

we really mean it when we say that performance has improved with the later versions ;-)

now, given what you have been describing, I suspect that you are still going to have problems: I think your central log server just can't quite keep up, so with TCP logging you will still block eventually.

> In compliance-heavy environments (which I'm in), I assume the recommendation
> is to add disk queueing for the main queue. Is this correct? Something like:
>
> $MainMsgQueueFileName mainqueue
> $MainMsgQueueType LinkedList
> $MainMsgQueueSaveOnShutdown on
>
> I understand there is a performance tradeoff, but given PCI-DSS, it'll be
> worth it, I think.

the thing is that this isn't giving you the reliability that you think it is.

the process of logging with rsyslog has many steps:

1. write the log to /dev/log

if the system crashes here the log is lost. has the application already completed the action it's trying to log? if so you have no record of it.

2. rsyslog accepts the message and puts it in the main message queue

unless the main message queue is a disk queue (not a disk-assisted queue) and you have fsyncs enabled, if the box crashes at this point you lose the log

3. rsyslog decides if the message should go to a particular destination; if you have a separate action queue for that destination, the message is put into that queue.

  again, unless you are using a disk queue, a crash can lose the message

4a. rsyslog sends the log to the remote server and deletes it from the action queue.

unless you are using RELP, rsyslog may send the message to the TCP stack, but it has no way of knowing if the remote server has received the message.

4b. rsyslog writes the log to a local file

unless you have fsync enabled after each write, a crash at this time will lose the log message.
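
the gaps in steps 2, 4a and 4b map to specific settings. a sketch in legacy (v5) config syntax — the server name, port and file paths here are made up, and every one of these directives trades throughput for durability:

```
# step 2: a pure disk queue (not disk-assisted), with queue files synced
$MainMsgQueueType Disk
$MainMsgQueueFileName mainq
$MainMsgQueueSyncQueueFiles on
$MainMsgQueueSaveOnShutdown on

# step 4a: RELP, so the sender knows the receiver accepted each message
$ModLoad omrelp
*.* :omrelp:logserver.example.com:2514

# step 4b: allow sync on file actions (rsyslog ignores sync requests otherwise)
$ActionFileEnableSync on
*.* /var/log/messages
```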


note that disk queues are very slow, and fsync on ext3 with other write activity can stall for seconds at a time.

I did some testing a year or so ago with a very high-performance solid state drive (a Fusion-io PCI card that cost >$5K for 80G of storage). With that drive and ext2, without an action queue, I was able to process 8K logs/sec, compared to 400K logs/sec with memory queues (at the time I could only write out around 80K logs/sec, but faster bursts that fit in memory were handled just fine; since then there has been more improvement to rsyslog, and people are reporting write rates of several hundred thousand logs/sec).

doing the same test on a standard SATA drive resulted in around 10 (yes TEN) logs/sec being processed.

I also operate in a PCI environment; there are limits to what is expected of you in terms of preserving logs.

I would suggest that you end up with two copies of rsyslog running on your servers.

the first copy for compliance critical logs.

  these should be a relatively low volume

  the application should be double-logging everything

    i.e.  I am about to do X
          I just tried to do X and it succeeded/failed

this way you can tell if something failed in the middle of a transaction and can investigate if the transaction took place or not
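
as a minimal sketch of this double-logging pattern, in Python (the helper name and the optional `log` parameter are illustrative, just for testing — by default the messages go through the syslog module, i.e. /dev/log):

```python
import syslog

def do_transaction(desc, action, log=None):
    """Double-log a transaction: record intent before acting and the
    outcome after, so a crash mid-transaction still leaves the
    'about to' record behind for investigation."""
    if log is None:
        log = lambda msg: syslog.syslog(syslog.LOG_INFO, msg)
    log("about to do %s" % desc)
    try:
        result = action()
        log("just tried to do %s: succeeded" % desc)
        return result
    except Exception as e:
        log("just tried to do %s: failed (%s)" % (desc, e))
        raise
```

the point is that the "about to" record is written (and synced/acknowledged) before the action runs, so its absence or presence is meaningful after a crash.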

this instance of rsyslog can be configured to sync everything, use RELP, write to mirrored drives, etc to do everything you can to make sure the log does not get lost.
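
concretely, this compliance-critical instance might look something like the following (a sketch in legacy config syntax; the socket path, queue name and server name are made up):

```
# listen only on a dedicated socket for the compliance-critical apps
$ModLoad imuxsock
$AddUnixListenSocket /var/run/compliance-log

# pure disk queue, queue files synced, preserved across restarts
$MainMsgQueueType Disk
$MainMsgQueueFileName compliance-q
$MainMsgQueueSyncQueueFiles on
$MainMsgQueueSaveOnShutdown on

# acknowledged delivery to the central server
$ModLoad omrelp
*.* :omrelp:central.example.com:2514

# local copy on (ideally mirrored) disk, synced per write
$ActionFileEnableSync on
*.* /var/log/compliance.log
```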

the application needs to either use RELP to talk to rsyslog, or use /dev/log (writes to /dev/log do not return until the log is in the queue)

if this instance stops (runs out of disk space, crashes, etc) the application will halt.


the second copy is for normal activity logs (apache logs, etc)

  these will be a fairly high volume (especially by comparison)

  if systems fail you will lose some of these logs

at this point you can decide what reliability measures you deem prudent for these logs.


personally, for this second category, failover syslog servers running UDP on a fairly quiet network are good enough for me. I've tested this setup to hundreds of thousands of logs/sec without losing any logs (and the tests have involved sending billions of log messages), so while it is not guaranteed reliability, in practice it is 'good enough'.
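
since UDP gives the sender no failure signal (so suspend-based failover can't trigger), one simple way to get this kind of redundancy is to send every message to two collectors and tolerate the duplicates (host names here are made up):

```
# high-volume instance: fire-and-forget UDP to a pair of collectors
*.* @collector1.example.com
*.* @collector2.example.com
```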

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
