logstash Configuration

Dan Finn Mon, 30 Jun 2014 11:38:43 -0700

How/where does it keep track of those?

Is there a better way to see if the log has been sent other than trying to 
track down that log entry on the destination server?



> On Jun 30, 2014, at 12:25 PM, "David Lang" <[email protected]> wrote:
> 
>> On Mon, 30 Jun 2014, Doug McClure wrote:
>> 
>> I most definitely swamped all four logstash instances during the processing
>> of over 100 end points we just activated so designing for
>> burst/spikes/backlogs is one part I need to do. The other is designing for
>> steady state.
>> 
>> What I'm seeing this AM is the number of disk cache files growing and then
>> they are promptly processed down to the last one. When their is only one
>> cache file left the contents within are not purged down to a file size of
>> zero or so.  I've read that one file may always exist, but what I'm seeing
>> is that one file holds onto messages and they are not flushed out.
>> 
>> Mon Jun 30 13:08:03 2014: imuxsock: submitted=413 ratelimit.discarded=0
>> ratelimit.numratelimiters=275
>> Mon Jun 30 13:08:03 2014: dynafile cache LocalFileOutput: requests=180936
>> level0=137528 missed=126 evicted=0 maxused=126
>> Mon Jun 30 13:08:03 2014: LocalFileOutput: processed=180936 failed=0
>> Mon Jun 30 13:08:03 2014: action 2: processed=3 failed=0
>> Mon Jun 30 13:08:03 2014: logstashforwarder: processed=180936 failed=0
>> Mon Jun 30 13:08:03 2014: imptcp(*/10514/IPv4): submitted=180520
>> Mon Jun 30 13:08:03 2014: imptcp(*/10514/IPv6): submitted=0
>> 
>> *Mon Jun 30 13:08:03 2014: logstashforwarder[DA]: size=0 enqueued=14919
>> full=0 discarded.full=0 discarded.nf <http://discarded.nf>=0 maxqsize=14631
>> Mon Jun 30 13:08:03 2014: logstashforwarder: size=0 enqueued=180936 full=0
>> discarded.full=0 discarded.nf <http://discarded.nf>=0 maxqsize=2699*
>> Mon Jun 30 13:08:03 2014: main Q: size=0 enqueued=180937 full=0
>> discarded.full=0 discarded.nf=0 maxqsize=620
>> 
>> cache file:
>> -rw------- 1 syslog syslog  *11M Jun 30 12:21 logstashqueue.00000001*
>> 
>> 
>> What settings should I increase to try and prevent triggering the DA
>> queues?  Can that DA file be triggered to flush?
> 
> the only way to prevent triggering the DA queues is to have enough space in 
> the memory queues
> 
> are you sure the logs in that file have not been sent? rsyslog doesn't 
> re-write the file after each log is sent, it just keeps track of what has 
> been sent and what hasn't. once the file is no longer needed it's deleted. As 
> I understand it, the last file is not deleted in case it's needed again (it's 
> far faster to work with an existing file than to create a new file, so in 
> cases where you are right on the border of using the DA portion of the queue, 
> hitting it then emptying it in rapid succession, deleting and recreating the 
> file would be _very_ expensive)
> 
> David Lang
> 
>> 
>> On Mon, Jun 30, 2014 at 10:53 AM, Aleksandar Lazic <[email protected]>
>> wrote:
>> 
>>> Dear Doug.
>>> 
>>> Am 30-06-2014 15:05, schrieb Doug McClure:
>>> 
>>> Thanks, I'll play with it this week and see what changes. The main flow is
>>>> 
>>>> rsyslog -> haproxy -> logstash farm (4) -> log index/search product.
>>> 
>>> What's in the haproxy log about the timing and error codes?
>>> 
>>> https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2
>>> 
>>> How looks your haproxy config?
>>> Are there any rsyslog errors in the syslog log?
>>> What's in the logstash  logs?
>>> 
>>> 
>>> I'm very suspect of the end product as I don't yet have much visibility
>>>> into it's scale/performance at the input side yet.  I'm setting up
>>>> instrumentation all along the way to help 'see' what's happening and
>>>> where.
>>>> 
>>>> What I think I'm seeing on the rsyslog side is that the DA files are
>>>> getting created but they appear to not be processed even though everything
>>>> downstream (haproxy/logstash) is up and passing messages from rsyslog.  I
>>>> saw it a number of times this weekend where the file is created and holds
>>>> queue'd up messages from the burst period and never flushes them out.
>>>> Normal message traffic continues to flow through with no problem so
>>>> something isn't set up right to flush those DA file content.
>>>> 
>>>> Any pointers on this?  Searching Google returns lots of similar symptoms.
>>>> 
>>>> Current config:
>>>> 
>>>> $MaxMessageSize 64k
>>>> $MainMsgQueueSize 10000000                # how many messages (messages,
>>>> not bytes!) to hold in memory
>>>> $MainMsgQueueDequeueBatchSize 1000
>>>> $MainMsgQueueWorkerThreads 3
>>>> 
>>>> action(type="omfwd"
>>>>       Target="10.x.x.x"                # target end point where haproxy
>>>> listens
>>>>       Port="10515"                        # port for haproxy listner
>>>>       Protocol="tcp"                    # use tcp
>>>>       Template="LogFormatDSV"            # use the output format above to
>>>> create a CSV message format
>>>>       queue.filename="logstashqueue"     # set file name, also enables
>>>> disk mode
>>>>       queue.size="10000000"             # how many messages (messages,
>>>> not
>>>> bytes!) to hold in queue
>>>>       queue.dequeuebatchsize="1000"    # number of messages in each batch
>>>> removed from queues
>>>>       queue.maxdiskspace="25g"            # maximum disk space used for
>>>> the disk assist queue
>>>>       queue.maxfilesize="100m"               # maximum file size of disk
>>>> assist files
>>>>       queue.saveonshutdown="on"        # save the queue contents when
>>>> stopping rsyslog
>>>>       queue.type="LinkedList"            # use asynchronous processing
>>>>       queue.workerthreads="5"            # threads to process the output
>>>> action, queues
>>>>       action.resumeRetryCount="-1"        # rsyslog retry indef if
>>>> haproxy
>>>> down, (builds queue) When haproxy back, it will start inserting from the
>>>> queue
>>>>       RebindInterval="5"                # restablish connections with
>>>> haproxy after N messages (?)
>>>>       name="logstashforwarder"            # name of the action
>>>>       )
>>>> 
>>>> Tks!
>>>> 
>>>> Doug
>>>> 
>>>> 
>>>> On Sat, Jun 28, 2014 at 8:37 PM, David Lang <[email protected]> wrote:
>>>> 
>>>> On Sat, 28 Jun 2014, Doug McClure wrote:
>>>>> 
>>>>> I just found the magical setting it appears - rebindinterval!  I set
>>>>> this
>>>>> 
>>>>>> to 10 and have balanced traffic across each logstash instances behind
>>>>>> haproxy.
>>>>>> 
>>>>>> Any guidance on setting rebindinterval?
>>>>> It depends on the system that's receiving the data.
>>>>> 
>>>>> rebinding is an expensive operation (especially if you end up using
>>>>> encryption), so you want to do it as little as possible, but you want to
>>>>> do
>>>>> it frequently enough to keep all the receivers busy.
>>>>> 
>>>>> I like to set it to a level that has you rebinding a few times a second
>>>>> on
>>>>> the theory that the receiving system should be able to accept a second or
>>>>> so worth of traffic ahead of what it can output, so rebinding
>>>>> infrequtently
>>>>> will keep both of the receivers busy all the time.
>>>>> 
>>>>> If you have a lot of receivers, or they have very little queueing, then
>>>>> you will need to set it to a smaller value.
>>>>> 
>>>>> there will always be _some_ queuing available, even if the application
>>>>> doesn't intend for there to be, because the TCP stack on the receiving
>>>>> machine will ack the data from the network even if the software isn't
>>>>> ready
>>>>> for it. you just have to make sure that when the connection is closed,
>>>>> the
>>>>> software processes all the data the machine has received or you will
>>>>> loose
>>>>> some data
>>>>> 
>>>>> I don't know how much queueing logstash will do, you will have to
>>>>> experiment.
>>>>> 
>>>>> what is logstash doing with the data? Depending on where the bottleneck
>>>>> is
>>>>> there, that may give us hints as to how much it can do.
>>>>> 
>>>>> David Lang
>>>>> 
>>>>> 
>>>>> 
>>>>> On Sat, Jun 28, 2014 at 11:00 AM, Doug McClure <[email protected]>
>>>>>> wrote:
>>>>>> 
>>>>>> Thanks for reply David - I've enjoyed reading your notes/prezos on
>>>>>> 
>>>>>>> rsyslog
>>>>>>> recently!
>>>>>>> 
>>>>>>> I feel it's somewhere downstream as well.  I've made these changes to
>>>>>>> try
>>>>>>> and improve throughput.
>>>>>>> 
>>>>>>> I added haproxy between rsyslog and a logstash farm. I have two
>>>>>>> logstash
>>>>>>> instances accepting connections from the haproxy frontend.  I'm still
>>>>>>> trying to figure out how to get balanced traffic across each, but I see
>>>>>>> traffic hitting both at some point.
>>>>>>> 
>>>>>>> A spike came in again this AM and there were a few thousand DA files
>>>>>>> created so I'm not sure what else to try here. Is there a way to
>>>>>>> increase
>>>>>>> the rsyslog output processing? Can multiple tcp output connections be
>>>>>>> established to the haproxy input to increase output? Can the processing
>>>>>>> of
>>>>>>> DA files be increased/accelerated somehow?
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks for any pointers!
>>>>>>> 
>>>>>>> Doug
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Jun 27, 2014 at 3:07 PM, David Lang <[email protected]> wrote:
>>>>>>> 
>>>>>>> On Fri, 27 Jun 2014, Doug McClure wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> I'm in the tuning & optimization phase for a large rsyslog - logstash
>>>>>>>> 
>>>>>>>> deployment and working through very close monitoring of system
>>>>>>>>> performance,
>>>>>>>>> logging load and imstats.
>>>>>>>>> 
>>>>>>>>> This morning, a very large incoming log spike occurred on the rsyslog
>>>>>>>>> imptcp input and was processed. That spike was routed to my omfwd
>>>>>>>>> action
>>>>>>>>> to
>>>>>>>>> be shipped to a logstash tcp input.
>>>>>>>>> 
>>>>>>>>> What it appears is happening is that very large input spike went to
>>>>>>>>> disk
>>>>>>>>> assisted queues and has created over 37K files on disk.  They are
>>>>>>>>> being
>>>>>>>>> processed and shipped and I have a steady 60Kb/s output headed
>>>>>>>>> towards
>>>>>>>>> logstash tcp input.
>>>>>>>>> 
>>>>>>>>> My question - where should I begin to look to optimize the processing
>>>>>>>>> of
>>>>>>>>> that output action queue or changing configuration to avoid queuing
>>>>>>>>> up
>>>>>>>>> so
>>>>>>>>> much in the first place?  How would I determine if this is due to
>>>>>>>>> rsyslog
>>>>>>>>> setup or on the logstash input side?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This is probably logstash not being able to keep up with the flood
>>>>>>>> of
>>>>>>>> traffic from rsyslog. Then once rsyslog spills to disk, it gets
>>>>>>>> significantly slower (one of the things that can use a
>>>>>>>> revision/upgrade
>>>>>>>> is
>>>>>>>> the disk queue code)
>>>>>>>> 
>>>>>>>> you can specify a larger queue to hold more im memory (also,
>>>>>>>> FixedArray
>>>>>>>> seems to be a little more efficient thatn LinkedList, but that's not
>>>>>>>> your
>>>>>>>> problem now)
>>>>>>>> 
>>>>>>>> If logstash can handle more inputs effectively, you can try adding
>>>>>>>> more
>>>>>>>> workers, but that can acutally slow things down (the locking between
>>>>>>>> threads can sometimes slow things down more than the added cores
>>>>>>>> working
>>>>>>>> speed it up)
>>>>>>>> 
>>>>>>>> But start by looking on the logstash side.
>>>>>>>> 
>>>>>>>> David Lang
>>>>>>>> 
>>>>>>>> Looking at imstats in the analyzer for logstash output (omfwd) I see
>>>>>>>> 
>>>>>>>> significant processed/failed numbers and then in the logstash ouput
>>>>>>>>> -DA-
>>>>>>>>> it
>>>>>>>>> looks like things are as described above - large queues, with no
>>>>>>>>> failures
>>>>>>>>> being processed gradually.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> rsyslog 7.4.4 output action:
>>>>>>>>> 
>>>>>>>>> action(type="omfwd"
>>>>>>>>>      Target="10.x.x.x"
>>>>>>>>>      Port="10515"
>>>>>>>>>      Protocol="tcp"
>>>>>>>>>      Template="LogFormatDSV"
>>>>>>>>>      queue.filename="logstashqueue"     # set file name, also
>>>>>>>>> enables
>>>>>>>>> disk mode
>>>>>>>>>      queue.size="1000000"
>>>>>>>>>      queue.type="LinkedList"            # use asynchronous
>>>>>>>>> processing
>>>>>>>>>      queue.workerthreads="5"
>>>>>>>>>      name="logstashforwarder"
>>>>>>>>>      )
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks for any pointers!
>>>>>>>>> 
>>>>>>>>> Doug
>>>>>>>>> _______________________________________________
>>>>>>>>> rsyslog mailing list
>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>>> myriad
>>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if
>>>>>>>>> you
>>>>>>>>> DON'T LIKE THAT.
>>>>>>>>> 
>>>>>>>>> _______________________________________________
>>>>>>>>> 
>>>>>>>>> rsyslog mailing list
>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>> myriad
>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>>>>> DON'T LIKE THAT.
>>>>>>> _______________________________________________
>>>>>> rsyslog mailing list
>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>> http://www.rsyslog.com/professional-services/
>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>>> DON'T LIKE THAT.
>>>>>> 
>>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>> DON'T LIKE THAT.
>>>>> 
>>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
>>>> if you DON'T LIKE THAT.
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T 
>> LIKE THAT.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T 
> LIKE THAT.
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Re: [rsyslog] Tuning / Optimizing rsyslog --> logstash Configuration

Reply via email to