Also see

http://blog.gerhards.net/2013/07/rsyslog-why-disk-assisted-queues-keep.html?m=1

Sent from phone, thus brief.
On 30.06.2014 21:11, "David Lang" <[email protected]> wrote:

> On Mon, 30 Jun 2014, Doug McClure wrote:
>
>> I *assumed* the file would deplete to zero. I spot checked the first
>> message in the cache file and did find it was in my end search tool. It
>> must not deplete that last file to near zero file size.
>>
>
> The problem is that you can truncate the end of the file, but you can't
> remove anything from the beginning of a file without re-writing the entire
> file, so it's just not worth the hassle.
>
>> As far as tweaking the output action queue size, does just increasing this to
>> handle steady state and "average bursts" sound reasonable?
>>
>> queue.size="10000000"             # how many messages (messages, not bytes!) to hold in queue
>>
>
> I think there is a limit to how large you can go, but this is the approach
> to take (watch the startup output to see if you get any complaints about
> the size)
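
A minimal sketch of the sizing approach discussed above, not a tested configuration. The target, port and queue file name come from Doug's action; the values for queue.size, queue.highwatermark and queue.lowwatermark are illustrative assumptions. With a disk-assisted queue, rsyslog begins spooling to disk when the in-memory queue reaches the high watermark and returns to in-memory mode once it has drained to the low watermark, so the burst you want to absorb in memory has to fit below the high watermark. Watermark defaults differ between rsyslog versions, so it is worth setting them explicitly and checking the documentation for the version in use.

```
action(type="omfwd"
       Target="10.x.x.x"
       Port="10515"
       Protocol="tcp"
       name="logstashforwarder"
       queue.type="LinkedList"
       queue.filename="logstashqueue"      # a file name enables disk-assisted mode
       queue.size="1000000"                # assumed value: messages, not bytes
       queue.highwatermark="800000"        # assumed value: start spooling to disk at this depth
       queue.lowwatermark="200000"         # assumed value: back to in-memory mode at this depth
       queue.maxdiskspace="25g"
       queue.saveonshutdown="on"
       )
```

Every queued message occupies memory, so the chosen queue.size also has to fit the RAM available on the host.
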
>
> David Lang
>
>> tks!
>>
>> Doug
>>
>>
>> On Mon, Jun 30, 2014 at 2:25 PM, David Lang <[email protected]> wrote:
>>
>>> On Mon, 30 Jun 2014, Doug McClure wrote:
>>>
>>>> I most definitely swamped all four logstash instances during the processing
>>>> of over 100 end points we just activated, so designing for
>>>> bursts/spikes/backlogs is one part I need to do. The other is designing for
>>>> steady state.
>>>>
>>>> What I'm seeing this AM is the number of disk cache files growing and then
>>>> they are promptly processed down to the last one. When there is only one
>>>> cache file left, the contents within are not purged down to a file size of
>>>> zero or so.  I've read that one file may always exist, but what I'm seeing
>>>> is that one file holds onto messages and they are not flushed out.
>>>>
>>>> Mon Jun 30 13:08:03 2014: imuxsock: submitted=413 ratelimit.discarded=0 ratelimit.numratelimiters=275
>>>> Mon Jun 30 13:08:03 2014: dynafile cache LocalFileOutput: requests=180936 level0=137528 missed=126 evicted=0 maxused=126
>>>> Mon Jun 30 13:08:03 2014: LocalFileOutput: processed=180936 failed=0
>>>> Mon Jun 30 13:08:03 2014: action 2: processed=3 failed=0
>>>> Mon Jun 30 13:08:03 2014: logstashforwarder: processed=180936 failed=0
>>>> Mon Jun 30 13:08:03 2014: imptcp(*/10514/IPv4): submitted=180520
>>>> Mon Jun 30 13:08:03 2014: imptcp(*/10514/IPv6): submitted=0
>>>> Mon Jun 30 13:08:03 2014: logstashforwarder[DA]: size=0 enqueued=14919 full=0 discarded.full=0 discarded.nf=0 maxqsize=14631
>>>> Mon Jun 30 13:08:03 2014: logstashforwarder: size=0 enqueued=180936 full=0 discarded.full=0 discarded.nf=0 maxqsize=2699
>>>> Mon Jun 30 13:08:03 2014: main Q: size=0 enqueued=180937 full=0 discarded.full=0 discarded.nf=0 maxqsize=620
>>>>
>>>> cache file:
>>>> -rw------- 1 syslog syslog  11M Jun 30 12:21 logstashqueue.00000001
>>>>
>>>>
>>>>
>>>> What settings should I increase to try and prevent triggering the DA
>>>> queues?  Can that DA file be triggered to flush?
>>>>
>>>>
>>> The only way to prevent triggering the DA queues is to have enough space
>>> in the memory queues.
>>>
>>> Are you sure the logs in that file have not been sent? rsyslog doesn't
>>> re-write the file after each log is sent, it just keeps track of what has
>>> been sent and what hasn't. Once the file is no longer needed it's deleted.
>>> As I understand it, the last file is not deleted in case it's needed again
>>> (it's far faster to work with an existing file than to create a new file,
>>> so in cases where you are right on the border of using the DA portion of
>>> the queue, hitting it then emptying it in rapid succession, deleting and
>>> recreating the file would be _very_ expensive).
>>>
>>> David Lang
>>>
>>>
>>>
>>>> On Mon, Jun 30, 2014 at 10:53 AM, Aleksandar Lazic <[email protected]> wrote:
>>>>
>>>>> Dear Doug.
>>>>>
>>>>> On 30-06-2014 15:05, Doug McClure wrote:
>>>>>
>>>>>> Thanks, I'll play with it this week and see what changes. The main flow is
>>>>>>
>>>>>> rsyslog -> haproxy -> logstash farm (4) -> log index/search product.
>>>>>
>>>>> What's in the haproxy log about the timing and error codes?
>>>>>
>>>>> https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2
>>>>>
>>>>> What does your haproxy config look like?
>>>>> Are there any rsyslog errors in the syslog log?
>>>>> What's in the logstash logs?
>>>>>
>>>>>> I'm very suspicious of the end product as I don't yet have much visibility
>>>>>> into its scale/performance at the input side.  I'm setting up
>>>>>> instrumentation all along the way to help 'see' what's happening and
>>>>>> where.
>>>>>>
>>>>>> What I think I'm seeing on the rsyslog side is that the DA files are
>>>>>> getting created but they appear to not be processed even though everything
>>>>>> downstream (haproxy/logstash) is up and passing messages from rsyslog.  I
>>>>>> saw it a number of times this weekend where the file is created and holds
>>>>>> queued-up messages from the burst period and never flushes them out.
>>>>>> Normal message traffic continues to flow through with no problem so
>>>>>> something isn't set up right to flush the DA file contents.
>>>>>>
>>>>>> Any pointers on this?  Searching Google returns lots of similar symptoms.
>>>>>>
>>>>>> Current config:
>>>>>>
>>>>>> $MaxMessageSize 64k
>>>>>> $MainMsgQueueSize 10000000                # how many messages (messages, not bytes!) to hold in memory
>>>>>> $MainMsgQueueDequeueBatchSize 1000
>>>>>> $MainMsgQueueWorkerThreads 3
>>>>>>
>>>>>> action(type="omfwd"
>>>>>>        Target="10.x.x.x"                # target end point where haproxy listens
>>>>>>        Port="10515"                     # port for haproxy listener
>>>>>>        Protocol="tcp"                   # use tcp
>>>>>>        Template="LogFormatDSV"          # use the output format above to create a CSV message format
>>>>>>        queue.filename="logstashqueue"   # set file name, also enables disk mode
>>>>>>        queue.size="10000000"            # how many messages (messages, not bytes!) to hold in queue
>>>>>>        queue.dequeuebatchsize="1000"    # number of messages in each batch removed from queues
>>>>>>        queue.maxdiskspace="25g"         # maximum disk space used for the disk assist queue
>>>>>>        queue.maxfilesize="100m"         # maximum file size of disk assist files
>>>>>>        queue.saveonshutdown="on"        # save the queue contents when stopping rsyslog
>>>>>>        queue.type="LinkedList"          # use asynchronous processing
>>>>>>        queue.workerthreads="5"          # threads to process the output action, queues
>>>>>>        action.resumeRetryCount="-1"     # rsyslog retry indef if haproxy down (builds queue); when haproxy is back, it will start inserting from the queue
>>>>>>        RebindInterval="5"               # reestablish connections with haproxy after N messages (?)
>>>>>>        name="logstashforwarder"         # name of the action
>>>>>>        )
>>>>>>
>>>>>> Tks!
>>>>>>
>>>>>> Doug
>>>>>>
>>>>>>
>>>>>> On Sat, Jun 28, 2014 at 8:37 PM, David Lang <[email protected]> wrote:
>>>>>>
>>>>>>> On Sat, 28 Jun 2014, Doug McClure wrote:
>>>>>>>
>>>>>>>> I just found the magical setting, it appears - rebindinterval!  I set this
>>>>>>>> to 10 and have balanced traffic across each of the logstash instances behind
>>>>>>>> haproxy.
>>>>>>>>
>>>>>>>> Any guidance on setting rebindinterval?
>>>>>>>
>>>>>>> It depends on the system that's receiving the data.
>>>>>>>
>>>>>>> Rebinding is an expensive operation (especially if you end up using
>>>>>>> encryption), so you want to do it as little as possible, but you want to do
>>>>>>> it frequently enough to keep all the receivers busy.
>>>>>>>
>>>>>>> I like to set it to a level that has you rebinding a few times a second, on
>>>>>>> the theory that the receiving system should be able to accept a second or
>>>>>>> so worth of traffic ahead of what it can output, so rebinding infrequently
>>>>>>> will keep both of the receivers busy all the time.
>>>>>>>
>>>>>>> If you have a lot of receivers, or they have very little queueing, then
>>>>>>> you will need to set it to a smaller value.
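
As a rough illustration of that rule of thumb, and only a sketch: RebindInterval counts messages sent on the connection, not seconds, so the value follows from the message rate. The rate of about 3,000 messages per second used here is a hypothetical figure, not one measured in this thread; take the real rate from the impstats counters of the action.

```
# Assumed rate:  ~3000 messages/second through this action (hypothetical)
# Goal:          ~3 rebinds per second
# Result:        3000 / 3 = 1000 messages per connection
action(type="omfwd"
       Target="10.x.x.x"
       Port="10515"
       Protocol="tcp"
       RebindInterval="1000"               # derived from the assumed rate above
       name="logstashforwarder"
       )
```

With many receivers behind haproxy, or receivers that buffer very little, the same arithmetic with more rebinds per second gives a smaller value.
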
>>>>>>>
>>>>>>> There will always be _some_ queuing available, even if the application
>>>>>>> doesn't intend for there to be, because the TCP stack on the receiving
>>>>>>> machine will ack the data from the network even if the software isn't ready
>>>>>>> for it. You just have to make sure that when the connection is closed, the
>>>>>>> software processes all the data the machine has received or you will lose
>>>>>>> some data.
>>>>>>>
>>>>>>> I don't know how much queueing logstash will do, you will have to
>>>>>>> experiment.
>>>>>>>
>>>>>>> What is logstash doing with the data? Depending on where the bottleneck is
>>>>>>> there, that may give us hints as to how much it can do.
>>>>>>>
>>>>>>> David Lang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Sat, Jun 28, 2014 at 11:00 AM, Doug McClure <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the reply David - I've enjoyed reading your notes/prezos on rsyslog
>>>>>>>>> recently!
>>>>>>>>>
>>>>>>>>> I feel it's somewhere downstream as well.  I've made these changes to try
>>>>>>>>> and improve throughput.
>>>>>>>>>
>>>>>>>>> I added haproxy between rsyslog and a logstash farm. I have two logstash
>>>>>>>>> instances accepting connections from the haproxy frontend.  I'm still
>>>>>>>>> trying to figure out how to get balanced traffic across each, but I see
>>>>>>>>> traffic hitting both at some point.
>>>>>>>>>
>>>>>>>>> A spike came in again this AM and there were a few thousand DA files
>>>>>>>>> created so I'm not sure what else to try here. Is there a way to increase
>>>>>>>>> the rsyslog output processing? Can multiple tcp output connections be
>>>>>>>>> established to the haproxy input to increase output? Can the processing of
>>>>>>>>> DA files be increased/accelerated somehow?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks for any pointers!
>>>>>>>>>
>>>>>>>>> Doug
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jun 27, 2014 at 3:07 PM, David Lang <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On Fri, 27 Jun 2014, Doug McClure wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm in the tuning & optimization phase for a large rsyslog - logstash
>>>>>>>>>>> deployment and working through very close monitoring of system performance,
>>>>>>>>>>> logging load and impstats.
>>>>>>>>>>>
>>>>>>>>>>> This morning, a very large incoming log spike occurred on the rsyslog
>>>>>>>>>>> imptcp input and was processed. That spike was routed to my omfwd action to
>>>>>>>>>>> be shipped to a logstash tcp input.
>>>>>>>>>>>
>>>>>>>>>>> What appears to be happening is that the very large input spike went to disk
>>>>>>>>>>> assisted queues and has created over 37K files on disk.  They are being
>>>>>>>>>>> processed and shipped and I have a steady 60Kb/s output headed towards
>>>>>>>>>>> the logstash tcp input.
>>>>>>>>>>>
>>>>>>>>>>> My question - where should I begin to look to optimize the processing of
>>>>>>>>>>> that output action queue, or to change configuration to avoid queuing up so
>>>>>>>>>>> much in the first place?  How would I determine if this is due to rsyslog
>>>>>>>>>>> setup or the logstash input side?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> This is probably logstash not being able to keep up with the flood of
>>>>>>>>>> traffic from rsyslog. Then once rsyslog spills to disk, it gets
>>>>>>>>>> significantly slower (one of the things that can use a revision/upgrade is
>>>>>>>>>> the disk queue code).
>>>>>>>>>>
>>>>>>>>>> You can specify a larger queue to hold more in memory (also, FixedArray
>>>>>>>>>> seems to be a little more efficient than LinkedList, but that's not your
>>>>>>>>>> problem now).
>>>>>>>>>>
>>>>>>>>>> If logstash can handle more inputs effectively, you can try adding more
>>>>>>>>>> workers, but that can actually slow things down (the locking between
>>>>>>>>>> threads can sometimes slow things down more than the added cores working
>>>>>>>>>> speed it up).
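
For illustration only, the two suggestions above could look like this when applied to the action from the original post. This is an untested sketch: the FixedArray queue type and a single worker thread are assumptions to measure against the existing setup, not recommended values. FixedArray pre-allocates its array of message pointers for the full queue.size, so a very large size costs memory up front.

```
action(type="omfwd"
       Target="10.x.x.x"
       Port="10515"
       Protocol="tcp"
       Template="LogFormatDSV"
       queue.type="FixedArray"             # instead of LinkedList
       queue.filename="logstashqueue"      # still disk-assisted
       queue.size="1000000"
       queue.workerthreads="1"             # assumed starting point; raise only if measurements show a gain
       name="logstashforwarder"
       )
```

Comparing the impstats counters for the action before and after a change like this is one way to tell whether it helps.
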
>>>>>>>>>>
>>>>>>>>>> But start by looking on the logstash side.
>>>>>>>>>>
>>>>>>>>>> David Lang
>>>>>>>>>>
>>>>>>>>>>> Looking at impstats in the analyzer for logstash output (omfwd) I see
>>>>>>>>>>> significant processed/failed numbers, and then in the logstash output -DA- it
>>>>>>>>>>> looks like things are as described above - large queues, with no failures,
>>>>>>>>>>> being processed gradually.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> rsyslog 7.4.4 output action:
>>>>>>>>>>>
>>>>>>>>>>> action(type="omfwd"
>>>>>>>>>>>       Target="10.x.x.x"
>>>>>>>>>>>       Port="10515"
>>>>>>>>>>>       Protocol="tcp"
>>>>>>>>>>>       Template="LogFormatDSV"
>>>>>>>>>>>       queue.filename="logstashqueue"     # set file name, also enables disk mode
>>>>>>>>>>>       queue.size="1000000"
>>>>>>>>>>>       queue.type="LinkedList"            # use asynchronous processing
>>>>>>>>>>>       queue.workerthreads="5"
>>>>>>>>>>>       name="logstashforwarder"
>>>>>>>>>>>       )
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks for any pointers!
>>>>>>>>>>>
>>>>>>>>>>> Doug
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> rsyslog mailing list
>>>>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>>>>>>> http://www.rsyslog.com/professional-services/
>>>>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>>>>>>>> myriad
>>>>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
>>>>>>>>>>> if
>>>>>>>>>>> you
>>>>>>>>>>> DON'T LIKE THAT.
>>>>>>>>>>>
