I can see other uses for a sequence number, so thanks for creating this.
However:
The picture is not quite as bleak as you are making it sound. Rsyslog already
scales pretty well to large numbers of cores.
The key thing to remember is that you are almost always going to be doing more
than one thing, so while any one thing may end up being single threaded, you can
still have many threads operating at a time.
Most action modules have some point where they cannot be multi-threaded (think
writing to a file or TCP socket).
The key to doing a lot of things in parallel is the rsyslog queue parameters.
If you configure multiple queue workers, they may not be doing the same action
at the same time, but they can be working on different actions at the same time.
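A rough sketch of such a setup, for illustration only. The ruleset name
"remote" is made up and the numbers are placeholders, not tuned values:

    # "remote" is a hypothetical ruleset name; size the values for your load
    ruleset(
        name="remote"
        queue.type="LinkedList"
        queue.workerThreads="4"
        queue.workerThreadMinimumMessages="10000"
        queue.dequeueBatchSize="1024"
    ){
        # actions go here
    }

Note that queue.workerThreads is only an upper limit. Additional workers are
started as the queue fills, roughly one per queue.workerThreadMinimumMessages
queued messages.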
With some action modules, such as the ones that do database inserts, the module
does support having multiple threads, because the remote end is able to handle
parallel writes.
With file output, you can enable async writes, so that you have one thread
writing the output to disk (potentially with compression, signing, etc) while
another thread is crafting the strings to be written.
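For example, something along these lines (the file path is a placeholder and
the buffer size is only illustrative):

    action(
        type="omfile"
        file="/var/log/remote/all.log"
        asyncWriting="on"
        ioBufferSize="64k"
        flushOnTXEnd="off"
    )

With flushOnTXEnd="off", data is written when the buffer fills rather than
after every batch, so the most recent lines may sit in memory for a while.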
It's very common that the bottleneck ends up being in string generation (complex
template patterns for the file format or for the dynamic filename). Rsyslog
supports string modules, which can be significantly more efficient in creating
these strings than the template language. The built-in templates were
implemented this way, which resulted in a noticeable improvement in the peak
performance of rsyslog, even though they are relatively simple templates. With
more complex templates the gains can be substantially bigger.
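Binding a string generator to an action looks roughly like this. Treat it as
a sketch: "mystrgen" and "fastfmt" are placeholder names, and the generator
itself would have to be written in C and loaded as a module first.

    # "mystrgen" is a placeholder, not a module shipped with rsyslog; the
    # module that provides it must be loaded before this line
    template(name="fastfmt" type="plugin" plugin="mystrgen")
    action(type="omfile" file="/var/log/remote/all.log" template="fastfmt")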
What action are you doing that is running into a problem?
David Lang
On Sun, 20 Oct 2013, Pavel Levshin wrote:
Hello.
So, now I know that actions are not reentrant in rsyslog. Therefore, any
single action cannot consume more than one CPU core. Servers with 24 cores
are common nowadays, so this limits our ability to handle high load. Making
all modules thread-safe would be great, but it would take a huge amount of
effort. There is a much simpler solution.
We can define multiple identical or similar actions and divide the load
between them. Rsyslog has conditional statements to accomplish this.
But, unfortunately, rsyslog does not provide a variable or a function that
could be checked in such a statement. Or, possibly, I am just unable to find
it. Basically, all we need is a variable whose values are evenly
distributed in a known range.
An additional use of this method would be to balance load between two or more
different output actions, for example to use rsyslog as a network load balancer.
Let me introduce mmsequence. It is a message modification module, heavily
based on mmcount. Its purpose is to generate numbers and
store them in message properties. I'm not an rsyslog guru or a professional
programmer, so please review my code. But, at least, it seems to work here.
The patch is based on HEAD.
Description:
This module generates numeric sequences of different kinds. It can be used
to count messages up to a limit and to number them. It can generate random
numbers in a given range.
This module is implemented via the output module interface, so it is
called just like an action. The generated number is stored in a
"CEE/Lumberjack" property of the message.
Action Parameters:
- mode "random" or "instance" or "key"
    Specifies the mode of the action. The default mode is "random", which
    generates uniformly distributed integers in the range defined
    by "from" and "to".
    In "instance" mode, the action produces a counter in the range
    [from, to). This counter is specific to this action instance.
    In "key" mode, the counter can be shared between multiple instances.
    This counter is identified by a name, which is defined with the "key"
    parameter.
- from [non-negative integer], default "0"
    Starting value for counters and lower bound for the random generator.
- to [positive integer], default "2"
    Upper bound for all sequences. Note that this bound is not
    inclusive. When the next value for a counter is equal to or greater
    than this parameter, the counter resets to the starting value.
- step [non-negative integer], default "1"
    Increment for counters. If step is "0", the action fetches the
    current value without modifying it. This is useful in "key" mode or
    to get constant values in "instance" mode. Fetching with step "0" does
    not apply to "random" mode.
- key [word], default ""
    Name of the global counter which is used in this action.
- var [word], default "!mmsequence"
    Name of the message property where the number will be stored.
    Should start with "!".
Sample:
# load balance
Ruleset(
    name="logd"
    queue.workerthreads="5"
){
    Action(
        type="mmsequence"
        mode="instance"
        from="0"
        to="2"
        var="!seq"
    )
    if $!seq == "0" then {
        Action(
            type="mmnormalize"
            userawmsg="on"
            rulebase="/etc/rsyslog.d/rules.rb"
        )
    } else {
        Action(
            type="mmnormalize"
            userawmsg="on"
            rulebase="/etc/rsyslog.d/rules.rb"
        )
    }
    # output logic here
}
# generate random numbers
action(
    type="mmsequence"
    mode="random"
    to="100"
    var="!rndz"
)
# count from 0 to 99
action(
    type="mmsequence"
    mode="instance"
    to="100"
    var="!cnt1"
)
# the same as before but the counter is global
action(
    type="mmsequence"
    mode="key"
    key="key1"
    to="100"
    var="!cnt2"
)
# count specific messages but place the counter in every message
if $msg contains "txt" then
    action(
        type="mmsequence"
        mode="key"
        to="100"
        var="!cnt3"
    )
else
    action(
        type="mmsequence"
        mode="key"
        to="100"
        step="0"
        var="!cnt3"
        key=""
    )
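The network load balancer case mentioned earlier could be sketched the same
way as the first sample. This is an untested illustration: the host names
and the "!target" property name are placeholders, and the omfwd settings are
assumptions about a typical TCP forwarding setup.

# balance between two forwarding targets (placeholder host names)
action(
    type="mmsequence"
    mode="instance"
    from="0"
    to="2"
    var="!target"
)
if $!target == "0" then {
    action(type="omfwd" target="collector1.example.com" port="514" protocol="tcp")
} else {
    action(type="omfwd" target="collector2.example.com" port="514" protocol="tcp")
}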
Legacy Configuration Directives:
Not supported.
--
Pavel Levshin