Re: [rsyslog] message order was: Re: plans for rsyslog 8.8

David Lang Fri, 16 Jan 2015 12:51:51 -0800

On Fri, 16 Jan 2015, Dave Caplinger wrote:

On Jan 16, 2015, at 12:25 PM, David Lang <[email protected]> wrote:
On Fri, 16 Jan 2015, Dave Caplinger wrote:
...  It would be handy if we could optionally turn off [out-of-order delivery]
for an output queue in order to deliver the queued messages in-order
even if there is an additional disk write penalty to pay (for longer).
One issue is that disk queues are very slow compared to memory queues, so it's
possible that if you force all messages to be written to the queue while you are
also pulling messages from the queue that this will slow you down so much that
you will never catch up. I think there is room for improvement here, but that
would be pretty major surgery.
I understand; I would want to test things to really understand the performance
penalty, but there are mitigating factors for some common cases as well. For
example: filesystem buffer can help speed reading data previously written to
disk if your outage was short enough to not get "too far" behind, because the
data is still actually in RAM so you don't actually have to pay physical IOPS
to touch the disk to retrieve it.

The filesystem actions are the super expensive parts, even if things are cached
in RAM. There are also fsyncs that take place to make the data safe, and they
force disk IOPS.

These factors are why I was thinking maybe the penalty isn't really as large
as I initially thought, for some cases at least. However, the fact that you
indicate having this option would be "major surgery" to Rsyslog is dissuading
me from wanting to bother going down this path.

Having an option to change the order probably isn't that bad (Rainer will have
to weigh in), but changing the disk queue itself to be more efficient would be
pretty large, and it would involve a lot of care to avoid reliability problems.

To clarify, I'm not looking for *guaranteed* delivery order, just "generally
in order." We do perform event correlation, but in some cases it's within
time windows. So as you described: A followed by B followed by C, all within
T time. Having some variation around a moving "now" pointer in time is fine;
the events still wind up within the same (T +/- some small variation)-width
window. It's when logs arrive *significantly* out of sequence that you wind
up having to manage state for multiple T-width windows for the same scenario,
and it means you can't really be confident that you're done with a certain
time window (you can be perpetually waiting for the last event in the chain).
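
The "generally in order" tolerance described above could be sketched roughly as
follows. This is only an illustration, not anything rsyslog or a particular
correlation engine provides: the ChainTracker class, its window and slack
parameters, and the event names A/B/C are all hypothetical. It keeps a moving
"now" pointer (the newest timestamp seen) and gives up on a chain once it falls
more than window + slack behind that pointer.

```python
from dataclasses import dataclass, field


@dataclass
class ChainTracker:
    """Hypothetical tracker for A -> B -> C chains, tolerating mild reordering."""
    window: float  # T: max seconds between A and C for a valid chain
    slack: float   # extra lateness tolerated before a partial chain is dropped
    pending: dict = field(default_factory=dict)  # key -> {event name: timestamp}
    newest: float = float("-inf")                # moving "now" pointer

    def feed(self, key, name, ts):
        """Record one event and return a list of ("match" | "expired", key)."""
        self.newest = max(self.newest, ts)
        seen = self.pending.setdefault(key, {})
        seen[name] = ts
        results = []
        if set(seen) >= {"A", "B", "C"}:
            # Arrival order does not matter here, only the message timestamps.
            in_order = seen["A"] <= seen["B"] <= seen["C"]
            if in_order and seen["C"] - seen["A"] <= self.window:
                results.append(("match", key))
            del self.pending[key]
        # Drop partial chains that are too far behind the newest timestamp.
        cutoff = self.newest - self.window - self.slack
        stale = [k for k, s in self.pending.items() if min(s.values()) < cutoff]
        for k in stale:
            del self.pending[k]
            results.append(("expired", k))
        return results


tracker = ChainTracker(window=10, slack=5)
tracker.feed("host1", "A", 100)
tracker.feed("host1", "C", 104)              # C arrives before B
late_b = tracker.feed("host1", "B", 102)     # still completes the chain
```

With a small slack this absorbs minor reordering, but a backlog that is hours
behind would need a slack of hours, which is exactly the state-management cost
described above.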

Something to think about here: what do you use as a time reference (both for
'now' and for the log message you are processing)? Do you use the current time
on the system doing the processing, or the timestamps in the messages?

Using the system time can cause some false positive alerts when logs are
catching up (as you have events that happened over a wide timeframe delivered
over a short timeframe), but you don't have to deal (much) with time going
backwards.

Using the timestamp in the log message gets interesting as you deal with
machines' local times drifting, being in different timezones, or just plain being
wrong. And as you say, how do you know when an event is really 'too old' and you
can stop tracking it? (What if a redundant box goes down over a long weekend? Do
you really want to keep the correlations open for days in case it has
'interesting' combinations of events that it will finish delivering when it's
fixed?)

I tend to favor using the log processing system time. It's much easier to watch
that box and make sure its times are correct than it is to make sure everything
is correct.
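
To make the tradeoff between the two time references concrete, here is a small
sketch with made-up data. The count_in_window helper and the event tuples are
assumptions for illustration only. Five events spread over an hour at the
source are delivered within two seconds after an outage; counting by arrival
time makes them look like a burst, while counting by message timestamp does not.

```python
def count_in_window(events, now, width, use_arrival_time):
    """Count events whose chosen timestamp lies within [now - width, now].

    Each event is a (message_timestamp, arrival_time) pair, in seconds.
    """
    idx = 1 if use_arrival_time else 0
    return sum(1 for e in events if now - width <= e[idx] <= now)


# Hypothetical backlog: events 15 minutes apart at the source,
# all delivered within about two seconds once the link came back.
events = [(0, 3600), (900, 3600), (1800, 3601), (2700, 3601), (3600, 3602)]

by_arrival = count_in_window(events, now=3602, width=60, use_arrival_time=True)
by_message = count_in_window(events, now=3602, width=60, use_arrival_time=False)
print(by_arrival, by_message)  # 5 1
```

A "more than 3 events in 60 seconds" rule would fire on the arrival-time count
(a false positive during catch-up) but not on the message-timestamp count. The
message-timestamp variant, on the other hand, depends on every sender's clock
being right.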


David Lang

It's certainly an edge case; normally connectivity interruptions are either
"very brief" (absorbed by in-memory queue), or "short" (absorbed by DAQ for a
few minutes/hours depending on log volume). But if they are very long, the
time difference between the oldest and newest logs (which are being delivered
in roughly alternating batches during the DAQ burn-down) can be quite large,
like "yesterday, now, yesterday, now, yesterday..."
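
The "yesterday, now, yesterday, now" pattern can be illustrated with a toy
model. This is not rsyslog's actual queue scheduling, only an assumed strict
alternation of fixed-size batches from a backlog and from live traffic; the
function names and the batch size are made up. It measures the largest
backward jump in timestamps that a downstream consumer would see.

```python
from itertools import chain, zip_longest


def interleave_batches(backlog, live, batch_size):
    """Alternate fixed-size batches from two timestamp lists (toy model)."""
    def batches(seq):
        return [seq[i:i + batch_size] for i in range(0, len(seq), batch_size)]

    pairs = zip_longest(batches(backlog), batches(live), fillvalue=[])
    return list(chain.from_iterable(old + new for old, new in pairs))


def max_backward_jump(timestamps):
    """Largest drop between consecutive timestamps; 0 if already in order."""
    drops = [a - b for a, b in zip(timestamps, timestamps[1:])]
    return max([0] + drops)


backlog = [0, 1, 2, 3]              # "yesterday"
live = [1000, 1001, 1002, 1003]     # "now"
stream = interleave_batches(backlog, live, batch_size=2)
print(stream)                       # [0, 1, 1000, 1001, 2, 3, 1002, 1003]
print(max_backward_jump(stream))    # 999
```

In this model the backward jump stays close to the full outage length until
the backlog is drained, whereas delivering the backlog first would give a jump
of 0 at the cost of delaying live messages.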

