2015-11-04 17:24 GMT+01:00 Joe Blow <[email protected]>:
> Thanks for the input, Rainer! It definitely helps, and I love hearing
> some of this from the horse's mouth. Let me start this post by saying I'm
> extremely grateful for all the help that rsyslog has provided me
> throughout my career. The support on this mailing list is arguably better
> than any of the paid vendors I've used for logging/SIEM.
>
> As much as I'd love to just give up on this, I'm far too confident in the
> rsyslog tool to admit defeat. Rsyslog is a beast, but a beast with many
> knobs :). I'm interested in potentially using the failover option, but
> the ease of DA queue configuration might keep me using that for the time
> being.
>
> What about getting creative and moving the files to another rsyslog
> instance (on the same box) that doesn't have any input modules? Here are
> my thoughts:
>
> Stop rsyslog.
> Move the rsyslog DA files and .qi file to another directory which a
> secondary instance of rsyslog knows about (but which has no input modules
> running).
> Start rsyslog with input modules to get the realtime data flowing back
> in, with an empty DA queue.
> Turn on the second rsyslog instance, which only knows about the backlog
> files and has no input modules.
>
> My thought is that this would give at least one dedicated worker (per
> queue) a full core of resources to chug through the backlog, and only the
> backlog. Is my logic sound?
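A rough sketch of the handover step in the plan above. Every path and name here (the spool directory, the drain directory, the queue file names, the drain config) is hypothetical; the sketch creates stand-in files under /tmp so it can be run safely as-is, with the real service commands left as comments:

```shell
#!/bin/sh
# Sketch of handing DA queue files to a second "drain" instance.
# All paths/names are hypothetical; stand-ins are created so this runs as-is.
SPOOL=/tmp/rsyslog-demo/spool   # main instance's work directory
DRAIN=/tmp/rsyslog-demo/drain   # drain instance's work directory
mkdir -p "$SPOOL" "$DRAIN"
: > "$SPOOL/HugeQ.rsysq.00000001"   # stand-in for a real queue segment
: > "$SPOOL/HugeQ.rsysq.qi"         # stand-in for the queue state (.qi) file

# Step 1: stop rsyslog before touching its queue files, e.g.:
#   systemctl stop rsyslog

# Step 2: move queue segments and the .qi file to the drain instance's dir.
mv "$SPOOL"/HugeQ.rsysq* "$DRAIN"/

# Steps 3-4: restart the main instance (now with an empty DA queue), then
# start the input-less drain instance with its own config and pid file, e.g.:
#   systemctl start rsyslog
#   rsyslogd -f /etc/rsyslog-drain.conf -i /run/rsyslogd-drain.pid
ls "$DRAIN"
```

The key invariant is that no rsyslog instance may be running while the queue files are moved, and that the drain instance's queue.filename and work directory must match the moved files.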
Not sure. Let's find the bottleneck. Is it i/o or CPU? What hard facts tell
you which one it is? (You already commented partly on i/o; this is the more
solid question.) IMO, the disk queue should primarily be i/o-intense and
not put a lot of stress on the CPU. If so, the logic wouldn't work.

Rainer

> I've run multiple rsyslog instances on the same box for some other
> 'creative' logging projects I've done previously, without too much issue.
>
> Thoughts?
>
> Cheers,
>
> JB
>
> On Wed, Nov 4, 2015 at 11:14 AM, Rainer Gerhards
> <[email protected]> wrote:
>
>> 2015-11-04 17:12 GMT+01:00 Rainer Gerhards <[email protected]>:
>> > 2015-11-04 17:08 GMT+01:00 Joe Blow <[email protected]>:
>> >> I think I've spoken too soon. The in-memory queues are clearing
>> >> extremely well with these settings, but the DA stuff is still pretty
>> >> sluggish (slowed down to 50-100 EPS again). I've looked at the box,
>> >> and the IO is around 10% (12-disk array, which performs quite
>> >> snappily), so I'm sincerely doubting this is an IO issue.
>> >>
>> >> The huge feed in question uses around 4 workers before it has enough
>> >> workers to clear the queue as fast as it comes in (400k avg in the
>> >> queue). From my understanding that means I've got 4 workers at 50k
>> >> bulk size each, and at 200k EPS out (4 workers x 50k EPS) the
>> >> in-memory queue gets no bigger. Now this is where my knowledge ends.
>> >> I've set the low watermark to 750k and the high watermark to 1
>> >> million, with the thought that the low watermark is below having all
>> >> 8 workers at full bore (8 x 100k) and the high watermark is 250k
>> >> higher than that (slightly above all workers going full bore). If I'm
>> >> staying below the low watermark, and still have "free" workers, would
>> >> those workers not try to empty the DA queue? What would help allocate
>> >> more resources to clearing the DA queue?
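One quick way to put a number on Rainer's i/o-vs-CPU question (Linux-only sketch; the process name and sample window are assumptions) is to sample /proc/stat twice and compare busy CPU time against iowait. High iowait points at the disk; a pegged core with little iowait points at the single DA worker being CPU-bound:

```shell
#!/bin/sh
# Sample aggregate CPU counters twice and compare busy time vs. iowait.
# Fields on the "cpu" line of /proc/stat: user nice system idle iowait ...
read -r _ u1 n1 s1 i1 w1 _ < /proc/stat
sleep 2
read -r _ u2 n2 s2 i2 w2 _ < /proc/stat
busy=$(( (u2 - u1) + (n2 - n1) + (s2 - s1) ))
wait=$(( w2 - w1 ))
echo "busy=${busy} iowait=${wait}"   # jiffies over the 2-second window

# Also worth watching (not run here): 'iostat -x 1' for %util on the
# queue's disks, and per-thread state of rsyslogd, e.g.
# 'ps -L -o tid,stat,pcpu -C rsyslogd' -- a worker stuck in D state is
# waiting on i/o.
```

If iowait dominates, the compressed-directory or file-size experiments discussed later in the thread are the more promising direction; if one core is saturated, extra disk tuning won't help the single DA worker.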
>> >
>> > The DA queue always runs on one worker, because you can't use more
>> > than one worker with purely sequential files.
>> >
>> > TBH I think your needs simply go beyond what the current system can
>> > provide. As David said, the queue subsystem could well deserve an
>> > overhaul, but this is too big a task right now given what else is
>> > going on, and there has also been no sponsor for any of that disk
>> > queue work in the past years, so it doesn't seem to have a very high
>> > priority either.
>>
>> mhhh, I should mention a potential work-around: forget the DA queue.
>> Use failover actions. If the action fails, write log lines in native
>> format to a file. Then use imfile to monitor that file. Together with
>> a smart design of the rulesets, you can probably get all you need out
>> of such a system. Unfortunately, I am even more swamped than usual, so
>> I cannot provide detailed advice here beyond pointing you to the idea.
>>
>> HTH
>> Rainer
>> >
>> > Rainer
>> >>
>> >> Thanks for the prompt responses.
>> >>
>> >> Cheers,
>> >>
>> >> JB
>> >>
>> >> On Wed, Nov 4, 2015 at 10:53 AM, Rainer Gerhards
>> >> <[email protected]> wrote:
>> >>
>> >>> 2015-11-04 16:44 GMT+01:00 Joe Blow <[email protected]>:
>> >>> > OK, I've played with some numbers... this is what one of the
>> >>> > massive queues looks like now, and it *IS* dequeuing much faster
>> >>> > (500 EPS from DA, 25k EPS from the in-memory queue).
>> >>> >
>> >>> This may sound a bit strange, and I never tried it, but... I
>> >>> wouldn't be surprised if it is actually faster if you put the queue
>> >>> files on a compressed directory. The idea behind that is that while
>> >>> this obviously eats CPU, it will probably save you a lot of real i/o
>> >>> because the data written to the disk queue can be greatly
>> >>> compressed.
>> >>>
>> >>> If you give it a try, please let us know the outcome.
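A minimal sketch of Rainer's failover-plus-imfile idea. This is untested, and every name in it (the ruleset, the template, the backlog path) is invented for illustration; it is not a recommended configuration, just the shape of the approach:

```
# Hypothetical sketch of the failover work-around; names/paths invented.
module(load="imfile")
template(name="RawLine" type="string" string="%rawmsg%\n")

ruleset(name="toES") {
    # primary: ship to Elasticsearch
    action(type="omelasticsearch" server="10.10.10.10" serverport="9200"
           bulkmode="on")
    # failover: runs only while the previous action is suspended; writes
    # the raw line to a backlog file instead of using a DA queue
    action(type="omfile" file="/var/spool/rsyslog/es-backlog.log"
           template="RawLine"
           action.execOnlyWhenPreviousIsSuspended="on")
}

# feed the backlog back through the same ruleset once ES recovers
input(type="imfile" file="/var/spool/rsyslog/es-backlog.log"
      tag="es-backlog" ruleset="toES")
```

The "smart ruleset design" Rainer alludes to matters here: routing the backlog through the same ruleset means a still-failing action simply re-appends lines to the backlog file, so nothing is lost while ES is down.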
>> >>>
>> >>> Rainer
>> >>>
>> >>> > Hopefully this helps some other people who have very massive,
>> >>> > disk-backed queues... Please feel free to comment on these values.
>> >>> >
>> >>> > action(type="omelasticsearch"
>> >>> >        name="rsys_HugeQ"
>> >>> >        server="10.10.10.10"
>> >>> >        serverport="9200"
>> >>> >        template="HugeQTemplate"
>> >>> >        asyncrepl="on"
>> >>> >        searchType="HugeType"
>> >>> >        searchIndex="HugeQindex"
>> >>> >        timeout="3m"
>> >>> >        dynSearchIndex="on"
>> >>> >        bulkmode="on"
>> >>> >        errorfile="HugeQ_err.log"
>> >>> >        queue.type="linkedlist"
>> >>> >        queue.filename="HugeQ.rsysq"
>> >>> >        queue.maxfilesize="2048m"
>> >>> >        queue.highwatermark="1000000"
>> >>> >        queue.lowwatermark="750000"
>> >>> >        queue.discardmark="499999999"
>> >>> >        queue.dequeueslowdown="100"
>> >>> >        queue.size="500000000"
>> >>> >        queue.saveonshutdown="on"
>> >>> >        queue.maxdiskspace="1000g"
>> >>> >        queue.dequeuebatchsize="50000"
>> >>> >        queue.workerthreads="8"
>> >>> >        queue.workerthreadminimummessages="100000"
>> >>> >        action.resumeretrycount="-1")
>> >>> > stop}
>> >>> >
>> >>> > I'd love some feedback, but these numbers are working pretty well
>> >>> > for these massive feeds.
>> >>> >
>> >>> > Cheers,
>> >>> >
>> >>> > JB
>> >>> >
>> >>> > On Wed, Nov 4, 2015 at 10:26 AM, Radu Gheorghe
>> >>> > <[email protected]> wrote:
>> >>> >
>> >>> >> On Wed, Nov 4, 2015 at 5:19 PM, Joe Blow <[email protected]>
>> >>> >> wrote:
>> >>> >> > Radu - My checkpoint interval is set at 100k. Are you
>> >>> >> > suggesting this be lowered? Raised?
>> >>> >>
>> >>> >> It sounds like the higher the better, but if your problem is how
>> >>> >> fast it can read... I think there's not much you can do - that
>> >>> >> seems to be a setting for writes. Also note David's comment on
>> >>> >> how it might only apply if syncing is enabled.
>> >>> >>
>> >>> >> On the read side I don't know what optimization you can do in
>> >>> >> the conf.
>> >>> >> Maybe you can test with various file sizes? (queue.maxfilesize -
>> >>> >> the default is 1MB, so that might be too small.) Though I
>> >>> >> wouldn't have high hopes; it sounds like recovery is much too
>> >>> >> slow even for reading 1MB files.
>> >>> >>
>> >>> >> Best regards,
>> >>> >> Radu
>> >>> >> --
>> >>> >> Performance Monitoring * Log Analytics * Search Analytics
>> >>> >> Solr & Elasticsearch Support * http://sematext.com/
>> >>> >> _______________________________________________
>> >>> >> rsyslog mailing list
>> >>> >> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> >>> >> http://www.rsyslog.com/professional-services/
>> >>> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> >>> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>> >>> >> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>> >>> >> POST if you DON'T LIKE THAT.
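For anyone experimenting along the lines Radu suggests, the knobs discussed in this thread would sit in the action's queue configuration roughly as follows. The values are purely illustrative placeholders, not recommendations; measure before and after changing each one:

```
# Illustrative values only -- tune and measure, per Radu's suggestion.
queue.maxfilesize="128m"            # default is 1m, likely too small here
queue.checkpointinterval="100000"   # JB's 100k; a write-side setting, and
                                    # per David mainly relevant with syncing
queue.syncqueuefiles="off"          # avoid an fsync at every checkpoint
```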

