Hi David:

In message <alpine.deb.2.02.1601041043310.9...@nftneq.ynat.uz>,
David Lang writes:
>has anyone put together the code that would be needed to detect if sec or log 
>delivery is falling behind? something along the order of 'if the timestamp in 
>the logs is > X min behind current, alert'?

To detect a slowdown in sec's processing, I didn't look at the
timestamps. Instead, I compared the last lines in the input file to
the last processed lines in SEC's buffer.

The basic idea was discussed at

   http://sourceforge.net/p/simple-evcorr/mailman/message/30277509/

The command is split across lines for readability:

compare=50
sudo sh -c "tail -n $compare /data/log/system/messages > /tmp/a;
            /etc/init.d/sec_rsyslog_master_mon dump | 
                sed -ne '/Content of input buffer/,+101p' | 
                tail -n $compare | 
            comm -13 /tmp/a - | wc -l"

where /data/log/system/messages is the input file that syslog writes
and sec follows. Grab the last 50 lines of that file, then dump sec's
internal state. Use sed to extract the input buffer lines from the
dump. Use tail to cut the 100 lines in the buffer to 50. Then use
comm -13 to keep only the lines that are in sec's buffer but not in
the tail of the input file. The number of lines only in sec's buffer
is the number of lines sec is behind the current input.
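
To turn that count into an alert, something like the following could
wrap it. This is only a sketch: the function names count_behind and
check_lag, the file arguments, and the threshold are my placeholders,
not part of sec. It also sorts both inputs before calling comm (comm
expects sorted input), which the one-liner above does not do.

```sh
#!/bin/sh
# Hypothetical helpers; names, arguments and threshold are assumptions.

# count_behind LOG_TAIL_FILE BUFFER_FILE
# Print how many lines occur only in BUFFER_FILE (sec's buffer).
count_behind() {
    sorted_tail=$(mktemp) || return 1
    sort "$1" > "$sorted_tail"
    # comm -13 drops lines unique to the first file and lines common
    # to both, leaving only the lines unique to the second input.
    sort "$2" | comm -13 "$sorted_tail" - | wc -l | tr -d ' '
    rm -f "$sorted_tail"
}

# check_lag BEHIND THRESHOLD
# Print a status line; return non-zero when BEHIND exceeds THRESHOLD.
check_lag() {
    if [ "$1" -gt "$2" ]; then
        echo "ALERT: sec is $1 lines behind (threshold $2)"
        return 1
    fi
    echo "OK: sec is $1 lines behind"
}
```

With the tail of the log saved in one file and the last lines of the
buffer in another (say /tmp/a and /tmp/b, both placeholder names),
the check would be: check_lag "$(count_behind /tmp/a /tmp/b)" 10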

Then to detect a delay in the transport, I have a job on each host
that sends a heartbeat every 10 minutes and I include the time_t value
in the heartbeat.
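
A heartbeat job along these lines would produce messages in the
format shown in the sample input further down. This is a sketch, not
the job I actually run: the function names are placeholders, and it
assumes a date command that supports %s (GNU and BSD date do) and
the standard logger utility.

```sh
#!/bin/sh
# Hypothetical heartbeat sender; function names are assumptions.

# Build the message body: human-readable time, then time_t in parens.
heartbeat_message() {
    date '+%Y/%m/%d %H:%M:%S (%s) -- HEARTBEAT --'
}

# Hand the message to syslog with the tag "heartbeat".
send_heartbeat() {
    logger -t heartbeat "$(heartbeat_message)"
}
```

Calling send_heartbeat from a cron entry scheduled as */10 * * * *
would give one heartbeat every 10 minutes.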

Then process it with this rule:

#****f* 10timestamp.sr/detect_process_delay
# SYNOPSIS
# Compare the time_t timestamp in the messages to current time to detect delays
# DESCRIPTION
# The current heartbeat messages include a timestamp in time_t
# (seconds since Jan 1 1970). Compare that time to the current
# time when this rule processes the event. If the event is more
# than 10 seconds old, add it to a context that is reported every
# 10 minutes. This rule passes the event through to any additional
# rules in this ruleset for consumption.
# INPUTS
# Sample input:
#    May  8 16:50:01 example02 heartbeat: 2010/05/08 16:50:01 \
#        (1273337401) -- HEARTBEAT --
# NOTES
# None.
#******
type= single
desc= Detect delays in syslog event transport/process
continue= takenext
ptype= regexp
rem = $1 is hostname $2 is time_t on host where heartbeat was generated
pattern= ([A-Za-z0-9._-]+) heartbeat:.*\((\d+)\) -- HEARTBEAT --$
action= add delayed_heartbeat_events $0 %u
rem = use perl function here to get time. %u doesn't work.
context = !seeding_timestamps && !ignore_delay_$1 && =( $2 + 10 < time() )

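This rule detects when transport is delayed. It only accumulates the
delayed events in the delayed_heartbeat_events context; that context
is reported every 10 minutes (the reporting is not shown here).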
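
To sanity-check the delay condition outside of sec, the same
comparison can be done from the shell. This is a sketch only: the
function names are placeholders, the optional NOW argument exists
just to make testing repeatable, and date +%s is assumed to be
available.

```sh
#!/bin/sh
# Hypothetical helpers mirroring the rule's "$2 + 10 < time()" test.

# heartbeat_delay LINE [NOW]
# Print NOW minus the time_t found in the heartbeat LINE.
heartbeat_delay() {
    sent=$(printf '%s\n' "$1" |
        sed -n 's/.* heartbeat:.*(\([0-9][0-9]*\)) -- HEARTBEAT --$/\1/p')
    [ -n "$sent" ] || return 1          # not a heartbeat line
    now=${2:-$(date +%s)}
    echo $(( now - sent ))
}

# heartbeat_is_delayed LINE [NOW] [MAX_SECONDS]
# Succeed when the delay is greater than MAX_SECONDS (default 10).
heartbeat_is_delayed() {
    delay=$(heartbeat_delay "$1" "$2") || return 2
    [ "$delay" -gt "${3:-10}" ]
}
```

Feeding it a line from the log, for example
heartbeat_is_delayed "$(tail -n 1 /data/log/system/messages)",
succeeds only if that line is a heartbeat more than 10 seconds old.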

--
                                -- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.

