Re: [rsyslog] Tuning / Optimizing rsyslog --> logstash Configuration

Aleksandar Lazic Mon, 30 Jun 2014 07:54:12 -0700

Dear Doug.

On 30-06-2014 15:05, Doug McClure wrote:

Thanks, I'll play with it this week and see what changes. The main flow is
rsyslog -> haproxy -> logstash farm (4) -> log index/search product.


What's in the haproxy log about the timing and error codes?

https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2

What does your haproxy config look like?
Are there any rsyslog errors in the syslog log?
What's in the logstash logs?

I'm very suspicious of the end product, as I don't yet have much visibility
into its scale/performance on the input side. I'm setting up
instrumentation all along the way to help 'see' what's happening and where.

What I think I'm seeing on the rsyslog side is that the DA files are
getting created but appear not to be processed, even though everything
downstream (haproxy/logstash) is up and passing messages from rsyslog. I
saw it a number of times this weekend: the file is created, holds the
queued-up messages from the burst period, and never flushes them out.
Normal message traffic continues to flow through with no problem, so
something isn't set up right to flush the DA file contents.

Any pointers on this? Searching Google returns lots of similar symptoms.
Current config:

$MaxMessageSize 64k
$MainMsgQueueSize 10000000 # how many messages (messages, not bytes!) to hold in memory
$MainMsgQueueDequeueBatchSize 1000
$MainMsgQueueWorkerThreads 3

action(type="omfwd"
       Target="10.x.x.x"                  # target end point where haproxy listens
       Port="10515"                       # port for haproxy listener
       Protocol="tcp"                     # use tcp
       Template="LogFormatDSV"            # use the output format above to create a CSV message format
       queue.filename="logstashqueue"     # set file name, also enables disk mode
       queue.size="10000000"              # how many messages (messages, not bytes!) to hold in queue
       queue.dequeuebatchsize="1000"      # number of messages in each batch removed from queues
       queue.maxdiskspace="25g"           # maximum disk space used for the disk assist queue
       queue.maxfilesize="100m"           # maximum file size of disk assist files
       queue.saveonshutdown="on"          # save the queue contents when stopping rsyslog
       queue.type="LinkedList"            # use asynchronous processing
       queue.workerthreads="5"            # threads to process the output action, queues
       action.resumeRetryCount="-1"       # rsyslog retries indefinitely if haproxy is down (builds queue); when haproxy is back, it will start inserting from the queue
       RebindInterval="5"                 # re-establish connections with haproxy after N messages (?)
       name="logstashforwarder"           # name of the action
       )

Tks!

Doug


On Sat, Jun 28, 2014 at 8:37 PM, David Lang wrote:
On Sat, 28 Jun 2014, Doug McClure wrote:
I just found the magical setting, it appears: rebindinterval! I set this
to 10 and have balanced traffic across each of the logstash instances behind
haproxy.

Any guidance on setting rebindinterval?
It depends on the system that's receiving the data.

Rebinding is an expensive operation (especially if you end up using
encryption), so you want to do it as little as possible, but you want to do
it frequently enough to keep all the receivers busy.

I like to set it to a level that has you rebinding a few times a second, on
the theory that the receiving system should be able to accept a second or
so worth of traffic ahead of what it can output, so rebinding that
infrequently will still keep both of the receivers busy all the time.

If you have a lot of receivers, or they have very little queueing, then
you will need to set it to a smaller value.
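
As a rough illustration of the sizing rule above (a sketch, not a tested setup): the omfwd RebindInterval is a message count, so "a few rebinds per second" works out to roughly the action's messages-per-second rate divided by the desired number of rebinds per second. The 5000 messages per second and the two rebinds per second below are assumed figures for illustration only, not measurements from this thread; the remaining parameters are copied from the configuration posted in the thread.

```
# Assumed load: about 5000 messages/s through this action (hypothetical figure).
# Assumed goal: about 2 rebinds per second.
# 5000 / 2 = 2500 messages between rebinds.
action(type="omfwd"
       Target="10.x.x.x"
       Port="10515"
       Protocol="tcp"
       Template="LogFormatDSV"
       RebindInterval="2500"              # reconnect after this many messages
       name="logstashforwarder"
       )
```

With more receivers behind haproxy, or receivers that buffer very little, the same arithmetic would be repeated with a higher rebinds-per-second target, which gives a smaller value.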

There will always be _some_ queueing available, even if the application
doesn't intend for there to be, because the TCP stack on the receiving
machine will ack the data from the network even if the software isn't ready
for it. You just have to make sure that when the connection is closed, the
software processes all the data the machine has received, or you will lose
some data.

I don't know how much queueing logstash will do; you will have to
experiment.

What is logstash doing with the data? Depending on where the bottleneck is
there, that may give us hints as to how much it can do.

David Lang
On Sat, Jun 28, 2014 at 11:00 AM, Doug McClure wrote:

Thanks for the reply, David - I've enjoyed reading your notes/prezos on
rsyslog recently!

I feel it's somewhere downstream as well. I've made these changes to try
and improve throughput.

I added haproxy between rsyslog and a logstash farm. I have two logstash
instances accepting connections from the haproxy frontend. I'm still
trying to figure out how to get balanced traffic across each, but I see
traffic hitting both at some point.

A spike came in again this AM and there were a few thousand DA files
created, so I'm not sure what else to try here. Is there a way to increase
the rsyslog output processing? Can multiple tcp output connections be
established to the haproxy input to increase output? Can the processing
of DA files be increased/accelerated somehow?


Thanks for any pointers!

Doug


On Fri, Jun 27, 2014 at 3:07 PM, David Lang wrote:

 On Fri, 27 Jun 2014, Doug McClure wrote:
I'm in the tuning & optimization phase for a large rsyslog - logstash
deployment and working through very close monitoring of system
performance, logging load and impstats.

This morning, a very large incoming log spike occurred on the rsyslog
imptcp input and was processed. That spike was routed to my omfwd action
to be shipped to a logstash tcp input.

What appears to be happening is that the very large input spike went to
disk-assisted queues and has created over 37K files on disk. They are being
processed and shipped, and I have a steady 60Kb/s output headed towards the
logstash tcp input.

My question - where should I begin to look to optimize the processing of
that output action queue, or to change the configuration to avoid queuing up
so much in the first place? How would I determine if this is due to the
rsyslog setup or the logstash input side?
This is probably logstash not being able to keep up with the flood of
traffic from rsyslog. Then once rsyslog spills to disk, it gets
significantly slower (one of the things that can use a revision/upgrade
is the disk queue code).

You can specify a larger queue to hold more in memory (also, FixedArray
seems to be a little more efficient than LinkedList, but that's not
your problem now).

If logstash can handle more inputs effectively, you can try adding more
workers, but that can actually slow things down (the locking between
threads can sometimes slow things down more than the added cores working
speed it up).

But start by looking on the logstash side.
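
For reference, the in-memory queue suggestions above might look something like the following. This is only a sketch, assuming the host has RAM to spare: the doubled queue.size is an illustrative value rather than a tuned recommendation, and the other parameters are copied from the action posted in the thread. Because queue.filename is still set, the queue remains disk-assisted and can spill to disk once the in-memory part fills.

```
action(type="omfwd"
       Target="10.x.x.x"
       Port="10515"
       Protocol="tcp"
       Template="LogFormatDSV"
       queue.filename="logstashqueue"     # still disk-assisted if memory fills up
       queue.size="2000000"               # assumed value: hold more messages in memory
       queue.type="FixedArray"            # in-memory queue type, in place of LinkedList
       queue.workerthreads="5"
       name="logstashforwarder"
       )
```

None of this helps if the receiver is the bottleneck, so it is a secondary step after checking the logstash side.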

David Lang
Looking at impstats in the analyzer for logstash output (omfwd) I see
significant processed/failed numbers, and then in the logstash output -DA-
it looks like things are as described above - large queues, with no
failures, being processed gradually.


rsyslog 7.4.4 output action:

action(type="omfwd"
      Target="10.x.x.x"
      Port="10515"
      Protocol="tcp"
      Template="LogFormatDSV"
      queue.filename="logstashqueue"     # set file name, also enables disk mode
      queue.size="1000000"
      queue.type="LinkedList"            # use asynchronous processing
      queue.workerthreads="5"
      name="logstashforwarder"
      )


Thanks for any pointers!

Doug
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.


Re: [rsyslog] Tuning / Optimizing rsyslog --> logstash Configuration
