I've been setting up mon to watch a group of servers.  As part of this
setup, I've got a perl log-watching program monitoring the system logs and
sending traps to mon when relevant lines appear.  Everything seems to work
beautifully until I try to send a lot of traps to mon quickly, at which
point mon starts simply dropping them en masse.  To verify this behavior, I
used Mon::Client to write a simple perl script to send 100 traps to a mon
service with a test.alert.  In repeated runs, the resulting test.alert.log
ended up with anywhere from 52 to 96 entries, meaning that mon dropped as
many as 48 of the 100 traps.  Single runs with 1,000 and 10,000 traps
resulted in 120 and 495 entries in the mon log, respectively (88% and about
95% dropped).  I also tried running
these tests using localhost as the logging host to rule out network problems
and got similar results.
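
For reference, the drop rates implied by the numbers above can be worked out with a few lines of Python.  This is only arithmetic on the counts I reported; the function name drop_rate is made up for illustration.

```python
def drop_rate(sent, logged):
    """Fraction of traps that never showed up in the alert log."""
    return (sent - logged) / sent

# (traps sent, entries found in test.alert.log), taken from the runs above
runs = [(100, 52), (100, 96), (1000, 120), (10000, 495)]

for sent, logged in runs:
    print(f"{sent:>6} sent, {logged:>4} logged, "
          f"{drop_rate(sent, logged):.2%} dropped")
```

The loss gets worse as the burst gets longer, which is what you would expect if something on the receiving side cannot keep up.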

I realize that UDP is an unreliable protocol and that logging services
generally use UDP so that the sender never hangs and misses something
important, but the above behavior totally rules out using the mon trap stuff
in any situation where missing an alert would be a Bad Thing (which I would
think would be almost any situation in which mon traps are used).
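
To show the kind of UDP loss I mean without involving mon at all, here is a rough Python sketch that fires a burst of datagrams at a local socket which is not reading yet, then counts how many arrive.  It is not mon's code and it does not prove this is what happens inside mon; the function name burst_loss, the payload size, and the deliberately small receive buffer are all my own assumptions.

```python
import socket

def burst_loss(count=2000, payload=b"x" * 512, rcvbuf=4096):
    """Send `count` UDP datagrams to a local socket that is not reading
    yet, then drain the socket and return how many datagrams arrived."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask for a small receive buffer so the overflow is easy to see.
    # The kernel may round this value up; it is a request, not a guarantee.
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    rx.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for _ in range(count):
            try:
                tx.sendto(payload, rx.getsockname())
            except OSError:
                pass                   # some platforms report a full buffer
        rx.settimeout(0.2)             # stop once nothing more is queued
        received = 0
        while True:
            try:
                rx.recvfrom(65535)
            except socket.timeout:
                break
            received += 1
        return received
    finally:
        tx.close()
        rx.close()

if __name__ == "__main__":
    sent = 2000
    got = burst_loss(count=sent)
    print(f"sent {sent}, received {got}, lost {sent - got}")
```

On the machines I would expect to run this on, the receiver sees only as many datagrams as fit in its buffer, even over localhost, and the sender gets no error for the rest.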

Is there something I'm missing here?  Is there some way to verify that
traps are really getting through to the server?

-hal
