2015-01-16 23:39 GMT+01:00 Dave Caplinger <[email protected]>:
> On Jan 16, 2015, at 2:51 PM, David Lang <[email protected]> wrote:
> >
> > On Fri, 16 Jan 2015, Dave Caplinger wrote:
> >
> >> ... filesystem buffer can help speed reading data previously written to
> >> disk if your outage was short enough to not get "too far" behind, because the
> >> data is still actually in RAM so you don't actually have to pay physical IOPS
> >> to touch the disk to retrieve it.
> >
> > the filesystem actions are the super expensive parts, even if things are cached
> > to ram. There are also fsyncs that take place to make the data safe, and they
> > force disk IOPS
>
> I agree the write path is certainly expensive (and more so by frequent
> fsyncs), but when you come back 'n' minutes later to read it (and it's
> still in the filesystem buffer), I only meant that it's much quicker than
> having to actually seek and read from disk again. So you're not paying the
> penalty twice in this case.
>
> >> ... time windows ...
> >
> > something to think about here, what do you use as a time reference (both for
> > 'now' and for the log message you are processing), do you use the current time
> > on the system doing the processing, or the timestamps in the messages.
>
> A combination of receive time at the collector closest to the source
> (which we can control the clocks on) along with current time at the system
> doing the processing. Lies the source device told about its time are kept
> as-is but not believed...

A couple of things: in some ancient version of rsyslog (v2? v3? - I don't
really remember), we had DA queues work the way proposed here: when DA mode
was entered, the in-memory queue was shut down and queuing went over the
disk only, until the disk queue was empty again. Then the disk queue was
shut down and the memory queue was used again.

The typical experience in practice was: as soon as the disk queue was used,
the system rarely returned to in-memory mode, because the disk queue is so
much slower.
Actually, the system more or less became unusably slow (at least too slow
for the intended purpose). This happened especially when going to disk was
caused by traffic spikes. For low-volume traffic, though, this was not a
real issue - but then, you could have used a disk queue in the first place.

The system was also very complex and had a number of robustness problems.
Races when switching between disk and memory mode were an especial can of
worms. Just consider that you need to handle the case where a message
arrives at exactly the instant you originally thought you could shut down
disk mode -- there are many subtle issues along this and similar paths.

With the first performance enhancement project, we realized that strict
message order wasn't achievable in any case, so it made only very limited
sense to try very hard to "preserve" ordering. See section 7 of [1] for
more details. With that, we changed DA queue operation to what it is today.
The end result was much better performance (even when just using the disk,
due to fewer mutex locks and checks), greatly reduced complexity (IIRC
roughly 40% of the queue code, the most complex part, was used for handling
mode transitions), greater robustness, and greatly enhanced practical
usability of DA queues. So it doesn't make any sense at all to change the
current system back to the pre-2009 state of affairs.

Does that mean it is impossible to change the way queues work? Of course
not. In fact, I have wanted to do that for quite some time. But it is a
*very* big project. At the least, it requires a full redesign and rewrite
of the queue subsystem. It would need to work much more like OS virtual
memory, which would also remove mode transitions because those would simply
become cache misses. IMO that would also dramatically speed up disk queues.
But please let's not discuss how exactly such a system would need to be
designed, because I know I can't implement it in the foreseeable future,
so that would just be a waste of time.
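To illustrate why the old mode transitions were such a can of worms, here is a minimal Python sketch (not rsyslog code; all names are illustrative) of the pre-2009 "switch back to memory when the disk queue drains" logic. The key point is that the emptiness check and the mode flip must happen as one atomic step under the queue lock; a naive check-then-flip leaves a window in which a freshly enqueued message is stranded on the now-inactive disk queue:

```python
import threading

class OldStyleDAQueue:
    """Toy model of the old disk-assisted queue mode switching."""

    def __init__(self):
        self.lock = threading.Lock()
        self.mem = []          # fast in-memory queue
        self.disk = []         # stand-in for the (slow) disk queue
        self.da_mode = False   # True => all traffic goes via disk

    def enqueue(self, msg):
        with self.lock:
            (self.disk if self.da_mode else self.mem).append(msg)

    def dequeue(self):
        with self.lock:
            src = self.disk if self.da_mode else self.mem
            return src.pop(0) if src else None

    def try_leave_da_mode(self):
        # The subtle part: the emptiness check and the mode flip are done
        # under one lock. If they were separate steps, a message enqueued
        # in between would land on the disk queue just as it is being
        # deactivated -- one of the races described above.
        with self.lock:
            if self.da_mode and not self.disk:
                self.da_mode = False
```

Even with the check made atomic, the real implementation had to coordinate this with worker shutdown, cancellation, and persistence, which is where the bulk of the transition-handling code went.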
The changes to the queue system would also require changes to the way
batching works. All in all, I would expect that roughly 50% of the core
engine would need to be redesigned and rewritten. My gut estimate is that
this is a three to six month full-time job (the original queue system took
roughly 2+ months of extremely hard work).

Every now and then we tried to find a sponsor (anyone up for it?), but this
didn't work out. Adiscon is not willing to fund that work. We even thought
about doing a commercial queue extension, so that we could spread the cost
over multiple customers, but that's at least currently a no-go due to
licensing. End result: I don't see it happening any time soon. The core
issue is still on my internal todo list, and I try to sneak in parts of the
rewrite at every opportunity. But that's a very slow process, and it still
means we need one big time slot to rewrite the core queue system.

If one really thinks this through, one may also come to the conclusion that
this is not well-spent time. Most of it can be achieved with current
rsyslog just by configuring the system with gigantic swap space and letting
rsyslog use "insanely" large amounts of "main" memory for its in-memory
queue. The OS would then do exactly what a new queue system would do
otherwise. The only potential problem would be system shutdown, where
persisting unprocessed items to disk could take quite (too) long.

Hope that clarifies,
Rainer

[1] http://www.gerhards.net/download/LinuxKongress2010rsyslog.pdf
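PS: the "gigantic swap plus huge in-memory queue" workaround above could be approximated with current RainerScript roughly along these lines (a sketch only; the sizes are illustrative, not recommendations, and must be tuned to the machine):

```
main_queue(
    queue.type="LinkedList"         # pure in-memory queue
    queue.size="50000000"           # "insanely" large; relies on OS swap
    queue.saveonshutdown="on"       # persist unprocessed items at shutdown
    queue.timeoutshutdown="100000"  # ms; shutdown can take long with a full queue
)
```

Note the shutdown caveat from above applies directly here: with a very large, partly swapped-out queue, `queue.saveonshutdown` may need a generous timeout to finish persisting.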

