On Wed, 9 Sep 2015, Risto Vaarandi wrote:
I am currently tuning one of my rsyslog+elasticsearch installations and
questions about optimal settings have emerged. On the web, there is a nice
guide with several recommendations
http://blog.sematext.com/2014/01/20/rsyslog-8-1-elasticsearch-output-performance/,
but it has one elasticsearch action, while my configuration has many. In a
nutshell, my current setup looks like this:
<snip>
Altogether, I have about 20 omelasticsearch actions in the above block of
statements. My question is -- should I use larger values for queue and batch
size than just 10000 and 500? The guide
http://blog.sematext.com/2014/01/20/rsyslog-8-1-elasticsearch-output-performance/
recommends much larger values, but these are used for only one action
statement which handles all writes to Elasticsearch. In contrast, my setup has
many actions, and although some actions are less busy, the most active 7-8
actions see roughly the same amount of traffic. This installation receives
4-5 thousand messages per second, but the workload will increase gradually.
Also, what about the queue sizes for the entire ruleset, do the current
settings look reasonable? (As I have understood, each ruleset uses its own
queue, and changing the size of the main queue does not influence the
ruleset.)
Are there any other settings I should consider, in order to increase
performance?
Max queue size is how many messages you want to be able to buffer if ES is
down. Once it is beyond ~2x the batch size, increasing it further won't have
any effect on performance.

Dequeue batch size is the max number of messages to pull from the queue and
attempt to send to ES at once. If it's too small, the per-send and per-batch
processing overhead of ES will waste resources. If it's too large, ES ends up
needing too much RAM to process the messages, so the optimum batch size
depends on the size of individual messages.
If you have 200B messages, you should send a lot more of them at once than
if you have 2MB messages. Sending 1000 200B messages will be just over 200KB of
data, but 1000 2MB messages will need 2GB of buffering to process them.
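As a rough sketch of where these two knobs live when a single action carries
its own queue: the numbers below are illustrative assumptions for small
(roughly 200B) messages, not values recommended anywhere in this thread, and
would need to shrink considerably for large messages.

```
# Sketch only; the queue.size and queue.dequeuebatchsize values are assumed.
# queue.size: how many messages to buffer while ES is unreachable.
# queue.dequeuebatchsize: max messages pulled per batch and sent at once.
action(type="omelasticsearch"
       server="localhost"
       bulkmode="on"
       queue.type="linkedlist"
       queue.size="100000"
       queue.dequeuebatchsize="2000"
       action.resumeretrycount="-1")
```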
From other comments, it sounds as if ES is limited in the number of inbound
connections it can handle, so you may want to do something along the lines of:
$template manual,"%$.custommessage%\n"
ruleset(name="es" queue.type="linkedlist" queue.size="10000"
        queue.dequeuebatchsize="500") {
    action(type="omelasticsearch" template="manual" dynSearchIndex="on"
           searchIndex="SyslogIndex" server="localhost" bulkmode="on"
           action.resumeretrycount="-1")
}
then do:
if $programname contains 'app1' then {
    set $.custommessage = exec_template("App1");
    call es
    stop
}
if $programname contains 'app2' then {
    set $.custommessage = exec_template("App2");
    call es
    stop
}
so that all your sending to ES is funneled through one queue and one connection
to ES rather than a separate one per filter.
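The snippets above refer to templates named App1, App2 and SyslogIndex that
are not shown in this thread. Purely as a hypothetical sketch of what two of
them might look like (the JSON field names and the daily index pattern are
assumptions, not taken from the original configuration):

```
# Hypothetical: builds the JSON document that ends up in $.custommessage.
template(name="App1" type="list") {
    constant(value="{\"app\":\"app1\",\"host\":\"")
    property(name="hostname" format="json")
    constant(value="\",\"message\":\"")
    property(name="msg" format="json")
    constant(value="\"}")
}

# Hypothetical: daily index name, resolved because dynSearchIndex="on".
template(name="SyslogIndex" type="string"
         string="syslog-%$year%.%$month%.%$day%")
```

With something like this in place, each filter only decides which template
formats the message, while the single "es" ruleset does all the sending.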
David Lang