On Tue, 9 Sep 2014, Gareth Bult wrote:
> Hi Rainer,
> Many thanks for looking, I appreciate you're busy.
> If it looked trivial I might've tried to patch it, but it "looks" like
> it's pulling from the queue and then running the send plugins, so my initial
> impression is that various bits of code need reordering - which is too much
> for me. I would guess it needs to be peeking the queue and only de-queueing
> once all the output modules have been satisfied ..
> It's interesting how things develop, back in "the good-old-days" central logging
> was useful for spotting problems without sshing to lots of boxes, and some data
> loss / the use of udp was quite acceptable. Today however, people seem to be
> using it for collecting 'important' information where 100% accuracy and log
> signing are critical .. a paradigm shift in "use-case" really ...
been there, done that, and found that people didn't really want what they
claimed they wanted :-)
Whatever you do with your logging system, at some point it is going to break
(disk fills up, fails, etc.).
A question that you have to ask your users/management is "what do you want to
happen when a log cannot be written?" If the answer is that they would rather
have the application fail and present the user with an error than to take an
action that's not logged, then they are potentially a candidate for what I call
"Audit grade logging". Keep in mind that "the application" includes login and
ssh if you do this to all logs, so a logging failure would also block logins.
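To make the "fail rather than act unlogged" policy concrete, here is a minimal
sketch in Python. It is only an illustration, not how rsyslog or any particular
application does it. The names audit_log, perform_audited and AuditLogError are
made up for this example, and using fsync on a local file as the definition of
"written" is my assumption.

```python
import os


class AuditLogError(RuntimeError):
    """Raised when a record could not be durably written (hypothetical name)."""


def audit_log(path, message):
    """Append one record and force it to stable storage, or raise."""
    try:
        with open(path, "ab") as f:
            f.write((message + "\n").encode("utf-8"))
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # assumption: "written" means fsync returned
    except OSError as exc:
        raise AuditLogError(f"cannot log {message!r}") from exc


def perform_audited(path, description, action):
    """Run action() only after its log record is safely on disk."""
    audit_log(path, description)  # raises before the action if logging fails
    return action()
```

The per-record fsync is where most of the cost of this style of logging comes
from, since every record has to wait for the storage device.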
When you shift to using Audit Grade logging, things slow down a LOT, something
on the order of 1000x. I was doing benchmarking of this a few years ago, and
with a high end PCI SSD drive, I was able to get between 2K and 8K logs/sec
(depending on filesystem, ext3 being 2K) compared to 400K logs/sec on the same
system with a simple SATA drive for normal logging.
Also think about failure modes of the application. If it logs before it takes
the action, then it may die before the action is taken, and the log is
telling you that something happened that didn't.
If the application takes an action and then logs it, it may take the action and
then die before sending out the log.
As a result, the application really needs to double log.
First, log "I intend to take action X", then take the action and log "I
succeeded/failed to take action X". You then need to watch for the first message
without the second and investigate whether the action did or did not take place
in those cases.
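As a rough sketch of the double logging described above, in Python: the record
layout (JSON with "id", "phase", "action", "status") and the names run_logged
and unresolved_intents are assumptions made up for this example, and log is
assumed to be any callable that durably writes one line.

```python
import json
import uuid


def run_logged(log, name, action):
    """Log the intent, take the action, then log the outcome."""
    action_id = uuid.uuid4().hex  # ties the two records together
    log(json.dumps({"id": action_id, "phase": "intent", "action": name}))
    try:
        result = action()
    except Exception as exc:
        log(json.dumps({"id": action_id, "phase": "result", "action": name,
                        "status": "failed", "error": str(exc)}))
        raise
    log(json.dumps({"id": action_id, "phase": "result", "action": name,
                    "status": "succeeded"}))
    return result


def unresolved_intents(lines):
    """Return intent records that have no matching result record."""
    pending = {}
    for line in lines:
        rec = json.loads(line)
        if rec["phase"] == "intent":
            pending[rec["id"]] = rec
        elif rec["phase"] == "result":
            pending.pop(rec["id"], None)
    return list(pending.values())
```

Anything unresolved_intents returns is a case where the application announced
an action and then went silent, so somebody has to go and check whether the
action actually took place.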
If you are still wanting to pursue this, we can talk more and get into more
details about what this requires.
David Lang