How/where does it keep track of those? Is there a better way to see if the log has been sent other than trying to track down that log entry on the destination server?
> On Jun 30, 2014, at 12:25 PM, "David Lang" <[email protected]> wrote: > >> On Mon, 30 Jun 2014, Doug McClure wrote: >> >> I most definitely swamped all four logstash instances during the processing >> of over 100 end points we just activated so designing for >> burst/spikes/backlogs is one part I need to do. The other is designing for >> steady state. >> >> What I'm seeing this AM is the number of disk cache files growing and then >> they are promptly processed down to the last one. When their is only one >> cache file left the contents within are not purged down to a file size of >> zero or so. I've read that one file may always exist, but what I'm seeing >> is that one file holds onto messages and they are not flushed out. >> >> Mon Jun 30 13:08:03 2014: imuxsock: submitted=413 ratelimit.discarded=0 >> ratelimit.numratelimiters=275 >> Mon Jun 30 13:08:03 2014: dynafile cache LocalFileOutput: requests=180936 >> level0=137528 missed=126 evicted=0 maxused=126 >> Mon Jun 30 13:08:03 2014: LocalFileOutput: processed=180936 failed=0 >> Mon Jun 30 13:08:03 2014: action 2: processed=3 failed=0 >> Mon Jun 30 13:08:03 2014: logstashforwarder: processed=180936 failed=0 >> Mon Jun 30 13:08:03 2014: imptcp(*/10514/IPv4): submitted=180520 >> Mon Jun 30 13:08:03 2014: imptcp(*/10514/IPv6): submitted=0 >> >> *Mon Jun 30 13:08:03 2014: logstashforwarder[DA]: size=0 enqueued=14919 >> full=0 discarded.full=0 discarded.nf <http://discarded.nf>=0 maxqsize=14631 >> Mon Jun 30 13:08:03 2014: logstashforwarder: size=0 enqueued=180936 full=0 >> discarded.full=0 discarded.nf <http://discarded.nf>=0 maxqsize=2699* >> Mon Jun 30 13:08:03 2014: main Q: size=0 enqueued=180937 full=0 >> discarded.full=0 discarded.nf=0 maxqsize=620 >> >> cache file: >> -rw------- 1 syslog syslog *11M Jun 30 12:21 logstashqueue.00000001* >> >> >> What settings should I increase to try and prevent triggering the DA >> queues? Can that DA file be triggered to flush? > > the only way to prevent triggering the DA queues is to have enough space in > the memory queues > > are you sure the logs in that file have not been sent? rsyslog doesn't > re-write the file after each log is sent, it just keeps track of what has > been sent and what hasn't. once the file is no longer needed it's deleted. As > I understand it, the last file is not deleted in case it's needed again (it's > far faster to work with an existing file than to create a new file, so in > cases where you are right on the border of using the DA portion of the queue, > hitting it then emptying it in rapid succession, deleting and recreating the > file would be _very_ expensive) > > David Lang > >> >> On Mon, Jun 30, 2014 at 10:53 AM, Aleksandar Lazic <[email protected]> >> wrote: >> >>> Dear Doug. >>> >>> Am 30-06-2014 15:05, schrieb Doug McClure: >>> >>> Thanks, I'll play with it this week and see what changes. The main flow is >>>> >>>> rsyslog -> haproxy -> logstash farm (4) -> log index/search product. >>> >>> What's in the haproxy log about the timing and error codes? >>> >>> https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2 >>> >>> How looks your haproxy config? >>> Are there any rsyslog errors in the syslog log? >>> What's in the logstash logs? >>> >>> >>> I'm very suspect of the end product as I don't yet have much visibility >>>> into it's scale/performance at the input side yet. I'm setting up >>>> instrumentation all along the way to help 'see' what's happening and >>>> where. >>>> >>>> What I think I'm seeing on the rsyslog side is that the DA files are >>>> getting created but they appear to not be processed even though everything >>>> downstream (haproxy/logstash) is up and passing messages from rsyslog. I >>>> saw it a number of times this weekend where the file is created and holds >>>> queue'd up messages from the burst period and never flushes them out. >>>> Normal message traffic continues to flow through with no problem so >>>> something isn't set up right to flush those DA file content. >>>> >>>> Any pointers on this? Searching Google returns lots of similar symptoms. >>>> >>>> Current config: >>>> >>>> $MaxMessageSize 64k >>>> $MainMsgQueueSize 10000000 # how many messages (messages, >>>> not bytes!) to hold in memory >>>> $MainMsgQueueDequeueBatchSize 1000 >>>> $MainMsgQueueWorkerThreads 3 >>>> >>>> action(type="omfwd" >>>> Target="10.x.x.x" # target end point where haproxy >>>> listens >>>> Port="10515" # port for haproxy listner >>>> Protocol="tcp" # use tcp >>>> Template="LogFormatDSV" # use the output format above to >>>> create a CSV message format >>>> queue.filename="logstashqueue" # set file name, also enables >>>> disk mode >>>> queue.size="10000000" # how many messages (messages, >>>> not >>>> bytes!) to hold in queue >>>> queue.dequeuebatchsize="1000" # number of messages in each batch >>>> removed from queues >>>> queue.maxdiskspace="25g" # maximum disk space used for >>>> the disk assist queue >>>> queue.maxfilesize="100m" # maximum file size of disk >>>> assist files >>>> queue.saveonshutdown="on" # save the queue contents when >>>> stopping rsyslog >>>> queue.type="LinkedList" # use asynchronous processing >>>> queue.workerthreads="5" # threads to process the output >>>> action, queues >>>> action.resumeRetryCount="-1" # rsyslog retry indef if >>>> haproxy >>>> down, (builds queue) When haproxy back, it will start inserting from the >>>> queue >>>> RebindInterval="5" # restablish connections with >>>> haproxy after N messages (?) >>>> name="logstashforwarder" # name of the action >>>> ) >>>> >>>> Tks! >>>> >>>> Doug >>>> >>>> >>>> On Sat, Jun 28, 2014 at 8:37 PM, David Lang <[email protected]> wrote: >>>> >>>> On Sat, 28 Jun 2014, Doug McClure wrote: >>>>> >>>>> I just found the magical setting it appears - rebindinterval! I set >>>>> this >>>>> >>>>>> to 10 and have balanced traffic across each logstash instances behind >>>>>> haproxy. >>>>>> >>>>>> Any guidance on setting rebindinterval? >>>>> It depends on the system that's receiving the data. >>>>> >>>>> rebinding is an expensive operation (especially if you end up using >>>>> encryption), so you want to do it as little as possible, but you want to >>>>> do >>>>> it frequently enough to keep all the receivers busy. >>>>> >>>>> I like to set it to a level that has you rebinding a few times a second >>>>> on >>>>> the theory that the receiving system should be able to accept a second or >>>>> so worth of traffic ahead of what it can output, so rebinding >>>>> infrequtently >>>>> will keep both of the receivers busy all the time. >>>>> >>>>> If you have a lot of receivers, or they have very little queueing, then >>>>> you will need to set it to a smaller value. >>>>> >>>>> there will always be _some_ queuing available, even if the application >>>>> doesn't intend for there to be, because the TCP stack on the receiving >>>>> machine will ack the data from the network even if the software isn't >>>>> ready >>>>> for it. you just have to make sure that when the connection is closed, >>>>> the >>>>> software processes all the data the machine has received or you will >>>>> loose >>>>> some data >>>>> >>>>> I don't know how much queueing logstash will do, you will have to >>>>> experiment. >>>>> >>>>> what is logstash doing with the data? Depending on where the bottleneck >>>>> is >>>>> there, that may give us hints as to how much it can do. >>>>> >>>>> David Lang >>>>> >>>>> >>>>> >>>>> On Sat, Jun 28, 2014 at 11:00 AM, Doug McClure <[email protected]> >>>>>> wrote: >>>>>> >>>>>> Thanks for reply David - I've enjoyed reading your notes/prezos on >>>>>> >>>>>>> rsyslog >>>>>>> recently! >>>>>>> >>>>>>> I feel it's somewhere downstream as well. I've made these changes to >>>>>>> try >>>>>>> and improve throughput. >>>>>>> >>>>>>> I added haproxy between rsyslog and a logstash farm. I have two >>>>>>> logstash >>>>>>> instances accepting connections from the haproxy frontend. I'm still >>>>>>> trying to figure out how to get balanced traffic across each, but I see >>>>>>> traffic hitting both at some point. >>>>>>> >>>>>>> A spike came in again this AM and there were a few thousand DA files >>>>>>> created so I'm not sure what else to try here. Is there a way to >>>>>>> increase >>>>>>> the rsyslog output processing? Can multiple tcp output connections be >>>>>>> established to the haproxy input to increase output? Can the processing >>>>>>> of >>>>>>> DA files be increased/accelerated somehow? >>>>>>> >>>>>>> >>>>>>> Thanks for any pointers! >>>>>>> >>>>>>> Doug >>>>>>> >>>>>>> >>>>>>> On Fri, Jun 27, 2014 at 3:07 PM, David Lang <[email protected]> wrote: >>>>>>> >>>>>>> On Fri, 27 Jun 2014, Doug McClure wrote: >>>>>>> >>>>>>>> >>>>>>>> I'm in the tuning & optimization phase for a large rsyslog - logstash >>>>>>>> >>>>>>>> deployment and working through very close monitoring of system >>>>>>>>> performance, >>>>>>>>> logging load and imstats. >>>>>>>>> >>>>>>>>> This morning, a very large incoming log spike occurred on the rsyslog >>>>>>>>> imptcp input and was processed. That spike was routed to my omfwd >>>>>>>>> action >>>>>>>>> to >>>>>>>>> be shipped to a logstash tcp input. >>>>>>>>> >>>>>>>>> What it appears is happening is that very large input spike went to >>>>>>>>> disk >>>>>>>>> assisted queues and has created over 37K files on disk. They are >>>>>>>>> being >>>>>>>>> processed and shipped and I have a steady 60Kb/s output headed >>>>>>>>> towards >>>>>>>>> logstash tcp input. >>>>>>>>> >>>>>>>>> My question - where should I begin to look to optimize the processing >>>>>>>>> of >>>>>>>>> that output action queue or changing configuration to avoid queuing >>>>>>>>> up >>>>>>>>> so >>>>>>>>> much in the first place? How would I determine if this is due to >>>>>>>>> rsyslog >>>>>>>>> setup or on the logstash input side? >>>>>>>>> >>>>>>>>> >>>>>>>>> This is probably logstash not being able to keep up with the flood >>>>>>>> of >>>>>>>> traffic from rsyslog. Then once rsyslog spills to disk, it gets >>>>>>>> significantly slower (one of the things that can use a >>>>>>>> revision/upgrade >>>>>>>> is >>>>>>>> the disk queue code) >>>>>>>> >>>>>>>> you can specify a larger queue to hold more im memory (also, >>>>>>>> FixedArray >>>>>>>> seems to be a little more efficient thatn LinkedList, but that's not >>>>>>>> your >>>>>>>> problem now) >>>>>>>> >>>>>>>> If logstash can handle more inputs effectively, you can try adding >>>>>>>> more >>>>>>>> workers, but that can acutally slow things down (the locking between >>>>>>>> threads can sometimes slow things down more than the added cores >>>>>>>> working >>>>>>>> speed it up) >>>>>>>> >>>>>>>> But start by looking on the logstash side. >>>>>>>> >>>>>>>> David Lang >>>>>>>> >>>>>>>> Looking at imstats in the analyzer for logstash output (omfwd) I see >>>>>>>> >>>>>>>> significant processed/failed numbers and then in the logstash ouput >>>>>>>>> -DA- >>>>>>>>> it >>>>>>>>> looks like things are as described above - large queues, with no >>>>>>>>> failures >>>>>>>>> being processed gradually. >>>>>>>>> >>>>>>>>> >>>>>>>>> rsyslog 7.4.4 output action: >>>>>>>>> >>>>>>>>> action(type="omfwd" >>>>>>>>> Target="10.x.x.x" >>>>>>>>> Port="10515" >>>>>>>>> Protocol="tcp" >>>>>>>>> Template="LogFormatDSV" >>>>>>>>> queue.filename="logstashqueue" # set file name, also >>>>>>>>> enables >>>>>>>>> disk mode >>>>>>>>> queue.size="1000000" >>>>>>>>> queue.type="LinkedList" # use asynchronous >>>>>>>>> processing >>>>>>>>> queue.workerthreads="5" >>>>>>>>> name="logstashforwarder" >>>>>>>>> ) >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks for any pointers! >>>>>>>>> >>>>>>>>> Doug >>>>>>>>> _______________________________________________ >>>>>>>>> rsyslog mailing list >>>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>>>>>>> http://www.rsyslog.com/professional-services/ >>>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >>>>>>>>> myriad >>>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if >>>>>>>>> you >>>>>>>>> DON'T LIKE THAT. >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> >>>>>>>>> rsyslog mailing list >>>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>>>>>> http://www.rsyslog.com/professional-services/ >>>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >>>>>>>> myriad >>>>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >>>>>>>> DON'T LIKE THAT. >>>>>>> _______________________________________________ >>>>>> rsyslog mailing list >>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>>>> http://www.rsyslog.com/professional-services/ >>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >>>>>> DON'T LIKE THAT. >>>>>> >>>>>> _______________________________________________ >>>>> rsyslog mailing list >>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>>> http://www.rsyslog.com/professional-services/ >>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad >>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you >>>>> DON'T LIKE THAT. >>>>> >>>>> _______________________________________________ >>>> rsyslog mailing list >>>> http://lists.adiscon.net/mailman/listinfo/rsyslog >>>> http://www.rsyslog.com/professional-services/ >>>> What's up with rsyslog? Follow https://twitter.com/rgerhards >>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a >>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST >>>> if you DON'T LIKE THAT. >> _______________________________________________ >> rsyslog mailing list >> http://lists.adiscon.net/mailman/listinfo/rsyslog >> http://www.rsyslog.com/professional-services/ >> What's up with rsyslog? Follow https://twitter.com/rgerhards >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of >> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T >> LIKE THAT. > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com/professional-services/ > What's up with rsyslog? Follow https://twitter.com/rgerhards > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T > LIKE THAT. _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

