in our project, we have our own logging system, analogous to syslog.
it is udp based, and the writer (after a broadcast) waits for at least
3 acks from other nodes before going on. so across all the nodes, we
have a very reliable logging scheme. the question is: how does any one
node's log differ from the 'correct' log?

although in general they are the same, enough entries are missing from
individual nodes' logs that we built a subsystem to aggregate and search
the whole cluster's log files. in quantitative terms, this means that
for a cluster of 20 nodes, we might lose a few tens of entries per year
(out of 150-200M entries).
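
for anyone who wants to picture the writer side, here is a minimal sketch in python. it is not our actual code: the function name `log_with_quorum`, the "ACK <id>" reply format, the entry-id scheme, and the timeout are all assumptions made up for illustration. the only parts taken from the description above are udp, the send to the other nodes, and waiting for at least 3 acks before going on.

```python
import socket
import time


def log_with_quorum(sock, entry_id, message, dests, min_acks=3, timeout=0.5):
    """Send one log entry over UDP and wait for acks from distinct peers.

    Hypothetical wire format: the entry is sent as "<entry_id> <message>"
    and each peer is assumed to reply with "ACK <entry_id>".
    `dests` is a list of (host, port) pairs; for a real broadcast it would
    hold a single broadcast address and `sock` would need SO_BROADCAST set.
    Returns True once `min_acks` distinct peers have acknowledged,
    False if the timeout expires first.
    """
    packet = f"{entry_id} {message}".encode()
    for dest in dests:
        sock.sendto(packet, dest)

    expected = f"ACK {entry_id}".encode()
    acked = set()  # peer addresses that have acknowledged this entry
    deadline = time.monotonic() + timeout
    while len(acked) < min_acks:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            data, peer = sock.recvfrom(1024)
        except socket.timeout:
            return False
        if data == expected:
            acked.add(peer)  # duplicate acks from one peer count once
    return True
```

a caller would presumably retry or raise an alarm when this returns False. note that success only says at least 3 peers stored the entry, not that every node did, which is why any single node's log can still have gaps.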
On Dec 24, 2009, at 8:51 AM, Leon Towns-von Stauber wrote:
On Dec 23, 2009, at 7:27 PM, Mark McCullough wrote:
I manage the central log server framework for a large set of
servers. We use UDP. There is no evidence of significant packet
loss anywhere. Yes, older networks will have packet loss, be it TCP
or UDP. But my experience managing a hefty volume of log data is we
just don't see evidence of loss on the network.
[...]
Yes, we periodically review the logs that would be very obvious if
any events are missing, and have yet to find a missing log event.

My experience matches yours. We've had network congestion issues affect
other things, but rarely, if ever, have we lost UDP syslog messages due
to network issues. I haven't conducted a thorough audit to say that it's
never happened, just occasional spot checks, but if it were a problem at
all, the missing messages would occasionally have blown one of numerous
multi-message correlations and been noticed, and that has never
happened.
--------------------------------------------------------------------
Leon Towns-von Stauber http://www.occam.com/leonvs/
"We have not come to save you, but you will not die in vain!"
_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
------------------
Andrew Hume (best -> Telework) +1 732-886-1886
[email protected] (Work) +1 973-360-8651
AT&T Labs - Research; member of USENIX and LOPSA