On Tue, 15 Feb 2011, Dirk wrote:
On 15.02.11 10:32, Rainer Gerhards wrote:
I guess you are looking for performance counters. Unfortunately, they are not
present before the introduction of impstats in v5.
We will have a look at that, thanks. We are starting to try to get v5 running
on SLES10.
We stumbled upon a strange phenomenon: rsyslog reads messages from a log file
and sends them to rsyslog on a central log server which parses them and
writes them to log files. The client needs 1 % CPU for that, and the server
needs 100 % CPU for that - with only the messages from this one client! Both
machines are exactly the same.
The configuration on the server is quite complex, so the messages we test
with have to be parsed by 550 rules before they match, get written and
discarded.
Is this asymmetric resource usage "normal"? Or is it specific to v3, so that
we would benefit from using v5? Does it depend on the number of rules to be
evaluated? Would we benefit from using regular expressions (assuming this is
possible)?
Yes, it is very normal for the receiver to use much more CPU than the
sender.
If you think about what's happening, all the sender needs to do is read
the text, add a bit of formatting, and send it over the network.
The receiver needs to receive arbitrary text, parse it to decide what sort
of message it is and how it is formatted, then process the rules to
decide whether each rule applies, and, if a rule does apply, assemble a
new output message (potentially changing the text that it has) and write
it out.
That being said, there are a lot of ways to improve this.
There is a fair amount of overhead in rsyslog when receiving messages as
they get moved to and from the queue. The newer versions move
multiple messages at once, which cuts this overhead down a lot. There are
also a lot of other performance improvements since version 3.
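
To illustrate why moving several messages at once helps, here is a toy
sketch in Python (not rsyslog code). It assumes that every dequeue operation
carries a fixed cost, such as taking a lock, and simply counts how many such
operations are needed. The queue, the batch size of 64, and the function name
are all made up for the illustration.

```python
from collections import deque

def drain(queue, batch_size):
    """Empty the queue, returning how many dequeue operations were needed."""
    operations = 0
    while queue:
        operations += 1  # one fixed-cost operation per batch
        for _ in range(min(batch_size, len(queue))):
            queue.popleft()
    return operations

one_at_a_time = drain(deque(range(1000)), batch_size=1)  # 1000 operations
batched = drain(deque(range(1000)), batch_size=64)       # 16 operations
```

The per-message work is unchanged; only the fixed cost paid per dequeue is
spread over more messages.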
You can save 5-10% CPU by having predefined templates for writing the logs
to a file instead of using the very flexible runtime-defined templates,
but the big cost (and therefore the big win) will be in working to optimize
the rules that you have to evaluate.
Why do you have so many rules?
Can you say that once a rule has matched a log message, none of the other
rules apply? (Or if you can't say this as a blanket statement, are there
cases where you can say this?)
Do you have some rules that are much more likely to match than others?
(This is especially important in combination with the prior question.)
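
The effect of these two questions can be sketched with a small Python model
(not rsyslog configuration). It counts how many rules get evaluated for a
batch of messages in three cases: every rule is always evaluated, evaluation
stops at the first match, and evaluation stops at the first match with the
most frequently matching rules moved to the front. The 550 rules, the
"appN:" tags, and the 90/10 traffic mix are invented for the example.

```python
from collections import Counter

def evaluate(rules, message, stop_on_match=True):
    """Return (names of matching rules, number of rules evaluated)."""
    matched, evaluated = [], 0
    for name, predicate in rules:
        evaluated += 1
        if predicate(message):
            matched.append(name)
            if stop_on_match:
                break
    return matched, evaluated

# Hypothetical rule set: 550 rules, each matching one program tag.
rules = [(f"app{i}", (lambda m, tag=f"app{i}:": m.startswith(tag)))
         for i in range(550)]

# Hypothetical traffic: most messages match the rule that comes last.
messages = ["app549: disk ok"] * 90 + ["app0: login"] * 10

def total_cost(rules, messages, stop_on_match):
    """Total number of rule evaluations for all messages."""
    return sum(evaluate(rules, m, stop_on_match)[1] for m in messages)

def reorder_by_hits(rules, messages):
    """Put the rules that match most often first (stable for ties)."""
    hits = Counter()
    for m in messages:
        for name in evaluate(rules, m, stop_on_match=False)[0]:
            hits[name] += 1
    return sorted(rules, key=lambda rule: -hits[rule[0]])

all_rules = total_cost(rules, messages, stop_on_match=False)   # 55000
first_match = total_cost(rules, messages, stop_on_match=True)  # 49510
reordered = total_cost(reorder_by_hits(rules, messages), messages,
                       stop_on_match=True)                     # 110
```

In this made-up case, stopping at the first match helps very little on its
own because the common message matches the last rule; the large saving comes
from combining it with a frequency-based ordering.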
If you think of your rules logically, do they (or portions of them) form a
tree where you can look for something and then branch into two different
sets of rules to evaluate after that? (If so, then the new rulesets
feature may be the right thing for you.)
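
As a rough picture of the tree idea, the following Python sketch (again a
model, not rsyslog's ruleset syntax) compares a flat list of 550 rules with a
two-level arrangement in which one decision on the program name selects a
smaller set of rules. The program names, the "eventN" texts, and the cost
accounting are assumptions chosen only to make the counts easy to follow.

```python
def parse(message):
    """Split 'tag: text' into (tag, text)."""
    tag, text = message.split(": ", 1)
    return tag, text

# Hypothetical flat list: 5 programs x 110 rules each = 550 rules.
programs = ["sshd", "cron", "postfix", "named", "kernel"]
flat_rules = [(prog, f"event{n}") for prog in programs for n in range(110)]

def flat_cost(message):
    """Number of rules checked in the flat list until the first match."""
    tag, text = parse(message)
    for checked, (prog, word) in enumerate(flat_rules, start=1):
        if tag == prog and text == word:
            return checked
    return len(flat_rules)

# Tree: one lookup on the program name selects a much smaller rule set.
rulesets = {}
for prog, word in flat_rules:
    rulesets.setdefault(prog, []).append(word)

def tree_cost(message):
    """One branch decision plus the rules checked inside that branch."""
    tag, text = parse(message)
    branch = rulesets.get(tag, [])
    for checked, word in enumerate(branch, start=1):
        if text == word:
            return 1 + checked
    return 1 + len(branch)

msg = "kernel: event109"
flat = flat_cost(msg)  # 550: the matching rule is the very last one
tree = tree_cost(msg)  # 111: one branch decision plus 110 rules
```

Even in the worst case for this message, the tree evaluates about a fifth of
the rules, and messages from unknown programs cost a single decision.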
As part of this, the different types of matching rules have very different
costs (an if (regex) then arrangement being the highest overhead). It may
be worth trying to use different types of matching rules, especially for
the most common cases.
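
The general idea of replacing a regular expression with cheaper tests can be
tried out in Python. This says nothing about the actual costs inside rsyslog;
it only shows that, for a hypothetical set of sshd and cron messages, a prefix
test plus a substring test selects the same messages as a regex. The two
tests are not equivalent in general (the plain version does not insist on
digits between the brackets), so the equivalence is checked against the
sample data. The timings depend on the machine and are printed only for
comparison.

```python
import re
import timeit

# Hypothetical sample traffic.
messages = [f"sshd[{n}]: Failed password for user{n}" for n in range(1000)]
messages += [f"cron[{n}]: job {n} finished" for n in range(1000)]

pattern = re.compile(r"^sshd\[\d+\]: Failed password")

def by_regex(m):
    return pattern.search(m) is not None

def by_plain_tests(m):
    # Cheap prefix test first, then a substring test.
    return m.startswith("sshd[") and "]: Failed password" in m

# Check that both approaches select the same messages for this data.
same = all(by_regex(m) == by_plain_tests(m) for m in messages)
matches = sum(by_plain_tests(m) for m in messages)

t_regex = timeit.timeit(lambda: [by_regex(m) for m in messages], number=20)
t_plain = timeit.timeit(lambda: [by_plain_tests(m) for m in messages], number=20)
print(f"same results: {same}, regex {t_regex:.4f}s, plain {t_plain:.4f}s")
```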
Once we have an idea of what your rules look like, we may be able to
suggest other optimizations.
David Lang