On Tue, 15 Feb 2011, Dirk wrote:

On 15.02.11 10:32, Rainer Gerhards wrote:
I guess you are looking for performance counters. Unfortunately, they are not
present before the introduction of impstats in v5.

We will have a look at that, thanks. We are starting to try to get v5 onto SLES10.

We stumbled upon a strange phenomenon: rsyslog reads messages from a log file and sends them to rsyslog on a central log server which parses them and writes them to log files. The client needs 1 % CPU for that, and the server needs 100 % CPU for that - with only the messages from this one client! Both machines are exactly the same.

The configuration on the server is quite complex, so the messages we test with have to be parsed by 550 rules before they match, get written and discarded.

Is this asymmetric resource usage "normal"? Or is it specific to v3, so that we would benefit from using v5? Does it depend on the number of rules to be evaluated, and would we benefit from using regular expressions (assuming this is possible)?


yes, it is very normal for the receiver to use much more CPU than the sender.

if you think about what's happening, all the sender needs to do is to read the text, add a bit of formatting, and then send it over the network.

the receiver needs to receive arbitrary text, parse it to decide what sort of message it is and how it is formatted, then process the rules to decide if each rule applies, and then if a rule does apply, assemble a new output message (potentially changing the text that it has) and write it out.

that being said, there are a lot of ways to improve this.

there is a fair amount of overhead in rsyslog when receiving messages as they get moved to and from the queue. the newer versions move multiple messages at once, which cuts this overhead down a lot. there have also been a lot of other performance improvements since version 3.

you can save 5-10% CPU by having predefined templates for writing the logs to a file instead of using the very flexible runtime-defined templates.

but the big cost (and therefore the big win) will be in working to optimize the rules that you have to evaluate.

why do you have so many rules?

can you say that once a rule has matched the log none of the other rules apply? (or if you can't say this as a blanket statement, are there cases where you can say this?)

do you have some rules that are much more common to match than others? (especially important in combination with the prior question)
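
to show why these two questions matter together, here is a rough sketch in Python (not rsyslog code) that just counts how many rules get evaluated for one message. the rule names, the "tagN:" message format and the evaluate() helper are made up for the illustration; only the figure of 550 rules comes from this thread.

```python
from typing import Callable, List, Tuple

# A rule is (name, predicate). Both are hypothetical stand-ins for real filters.
Rule = Tuple[str, Callable[[str], bool]]


def evaluate(rules: List[Rule], msg: str, stop_on_match: bool) -> Tuple[List[str], int]:
    """Return (names of the rules that matched, number of rules evaluated)."""
    matched: List[str] = []
    evaluated = 0
    for name, pred in rules:
        evaluated += 1
        if pred(msg):
            matched.append(name)
            if stop_on_match:
                break  # nothing after the first match is looked at
    return matched, evaluated


def make_rules(n: int) -> List[Rule]:
    # Assumption: rule i matches messages that contain the text "tag<i>:".
    return [(f"rule{i}", (lambda m, t=f"tag{i}:": t in m)) for i in range(n)]


rules = make_rules(550)
msg = "tag549: disk full"  # only the very last rule matches this message

# Every rule is evaluated, whether or not one has already matched.
all_rules = evaluate(rules, msg, stop_on_match=False)

# Stopping on the first match does not help while the matching rule is last.
first_match = evaluate(rules, msg, stop_on_match=True)

# Moving the frequently matching rule to the front is where the saving comes from.
common_first = evaluate([rules[549]] + rules[:549], msg, stop_on_match=True)

print(all_rules[1], first_match[1], common_first[1])
```

in this toy model, stopping after the first match only pays off once the rules that match most often sit near the top of the list.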

if you think of your rules logically, do they (or portions of them) form a tree where you can look for something and then branch into two different sets of rules to then evaluate after that? (if so, then the new rulesets feature may be the right thing for you)
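
the tree idea can be sketched the same way. this is again a Python toy model and not rsyslog configuration: the "appN:" prefixes, the "codeN" tokens and the 10 x 55 split are assumptions chosen so that the flat list has 550 rules, and it only counts evaluations.

```python
from typing import Callable, Dict, List, Optional, Tuple

Rule = Tuple[str, Callable[[str], bool]]  # (name, predicate), both hypothetical


def flat_eval(rules: List[Rule], msg: str) -> Tuple[Optional[str], int]:
    """Walk one flat list and stop at the first match. Returns (name, evaluations)."""
    evaluated = 0
    for name, pred in rules:
        evaluated += 1
        if pred(msg):
            return name, evaluated
    return None, evaluated


def tree_eval(branches: Dict[str, List[Rule]], msg: str) -> Tuple[Optional[str], int]:
    """Pick a branch by message prefix, then walk only that branch's rules."""
    evaluated = 0
    for prefix, sub_rules in branches.items():
        evaluated += 1  # one cheap check per branch
        if msg.startswith(prefix):
            for name, pred in sub_rules:
                evaluated += 1
                if pred(msg):
                    return name, evaluated
            return None, evaluated
    return None, evaluated


# Assumption: 10 applications with 55 rules each, 550 rules in total.
flat: List[Rule] = []
branches: Dict[str, List[Rule]] = {}
for i in range(10):
    prefix = f"app{i}:"
    branches[prefix] = []
    for j in range(55):
        code = f"code{j} "
        name = f"app{i}-code{j}"
        # Flat rule: has to test the prefix and the code every time.
        flat.append((name, lambda m, p=prefix, c=code: m.startswith(p) and c in m))
        # Branch rule: the prefix was already checked when the branch was chosen.
        branches[prefix].append((name, lambda m, c=code: c in m))

msg = "app9: code54 failure"  # worst case: last branch, last rule
flat_result = flat_eval(flat, msg)
tree_result = tree_eval(branches, msg)
print(flat_result, tree_result)
```

in this model the worst-case message costs 550 evaluations in the flat list but 65 (10 branch checks plus 55 rules) once the rules are grouped, which is the kind of saving that splitting rules into separate sets is meant to give.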

As part of this, the different types of matching rules have very different costs (an if (regex) then arrangement being the highest overhead). it may be worth trying to use different types of matching rules, especially for the most common cases.

once we can get an idea of what your rules look like, we may be able to suggest other optimizations.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
