I haven't played with this, but wouldn't setting the delay in the "Notifiers Window" to something greater than twice the polling time do this for you?
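To put rough numbers on that, here is a small Python sketch. It is not InterMapper code, and it rests on an assumption I have not verified: that a delayed notification is dropped if the device recovers before the delay runs out. The function names are made up for illustration.

```python
def rechecks_within_delay(poll_interval_s: int, delay_s: int) -> int:
    """Follow-up polls that run strictly before the notifier delay expires.

    The poll at t=0 is the one that first sees the problem; later polls
    run at t=poll_interval_s, 2*poll_interval_s, and so on.
    """
    return (delay_s - 1) // poll_interval_s


def alert_fires(results: list[bool], poll_interval_s: int, delay_s: int) -> bool:
    """Return True if a delayed notification would go out.

    results holds one entry per poll, True meaning the probe was OK.
    ASSUMPTION: a recovery before the delay expires cancels the pending
    notification, so the failure has to persist across every recheck.
    """
    needed = rechecks_within_delay(poll_interval_s, delay_s) + 1
    streak = 0
    for ok in results:
        streak = 0 if ok else streak + 1
        if streak >= needed:
            return True
    return False


# 30 second poll, delay just over twice the poll interval.
print(rechecks_within_delay(30, 61))           # rechecks before the alert
print(alert_fires([True, False, True], 30, 61))    # one-off blip
print(alert_fires([False, False, False], 30, 61))  # persistent failure
```

Under that assumption, a 61 second delay with a 30 second poll means the failure has to survive two more polls before anyone gets paged, while a delay of exactly 60 seconds only guarantees one.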

On Wed, September 2, 2009 6:07 am, Andrey Gordon wrote:
> Thanks for this info Bill.
> Not exactly what I was looking for. For example, in case of HTTP
> probes, according to the logic below, I'll get alerted the second IM
> detects that the content does not match. I was interested in IM double
> checking and triple checking (or whatever the number is) that that's in
> fact true before alerting and doing so before the next poll cycle. In
> other words, immediately on detecting a condition that is different
> from the current one.
>
> A good example is getting a 500 error instead of 200 OK response on a
> web server late at night, when a DB maintenance job runs and
> occasionally the DB can't respond to the web server on time. This
> error 99% of the time clears out on next run of the probe, so alerting
> on it at 3am is not very desirable.
> But if in fact it stays like that for 3-4 runs of the probe I'd like
> to know even at 3am. Note that 3-4 runs of the probe does not equal 2
> min (30 sec poll). If my page is out for 2 min I'm in bigger
> trouble, much bigger than if it throws a 500 error for a fraction of a
> second....
>
> Hopefully this makes a bit of sense :)
> __________________________________________________________
> Andrey Gordon | Integrity Interactive | Network Engineer |
> +1.781.398.3518
>
> On Sep 1, 2009, at 3:51 PM, Bob Merrill wrote:
>
>> Hi Andrey,
>>
>> I don't know if these answer your question specifically, but I found
>> them in the User Guide.
>>
>> Packet-based Test Procedure
>> Whenever InterMapper tests a packet-based device, it uses the
>> following procedure:
>>
>>   1. InterMapper sends the appropriate probe packet (ping, SNMP
>>      get-request, DNS query, etc.)
>>   2. InterMapper waits the timeout interval specified for the
>>      particular device.
>>   3. If a response arrives, InterMapper examines its contents and
>>      sets the device status based on that response.
>>   4. However, if no response arrives, InterMapper sends another probe
>>      packet.
>>   5. The above procedure is repeated until a response arrives or the
>>      specified number of probes has been sent.
>>   6. If no response has arrived after the final timeout, InterMapper
>>      sets the device status to Down.
>>   7. In any event, the device is scheduled to be tested again at a
>>      time set by the map's (or the device's) poll interval.
>>
>> The default timeout is three seconds, with a default probe count of
>> three. Consequently, InterMapper will take nine seconds to
>> declare a device is down (three probes, waiting three seconds each).
>> Both the timeout and the number of probes can be set for each device.
>>
>> This often gives rise to 21 second or 51 second outages. What's
>> happening here is that the device fails to respond to one set of
>> probes (for example, after nine seconds), but responds immediately
>> at the next poll 30 or 60 seconds later. This gives an outage
>> duration of (30-9=21) seconds or (60-9=51) seconds.
>>
>> TCP-based probes
>> Probes like "HTTP", "SMTP", "LDAP" and others test the ability
>> of a server to accept a TCP connection on a specific listening port,
>> and to respond to a scripted interchange.
>>
>>   a. For more information on TCP-based probes see Server Probes -
>>      Standard.
>>
>> What Happens in a TCP-based interchange
>>
>>   1. InterMapper first attempts to connect to the specified port at
>>      the device's address.
>>   2. If this connection attempt fails, InterMapper shows the device
>>      in the DOWN state.
>>
>>      If InterMapper successfully connects to the listening port,
>>      InterMapper sends protocol-specific commands through the TCP
>>      connection to test the server's responses and compare them to
>>      expected values.
>>   3. InterMapper changes the status of the device (e.g. ALARM,
>>      WARNING, OKAY, DOWN) if an error condition is detected, or if the
>>      execution of InterMapper's probe is interrupted for any reason.
>>   4. If InterMapper doesn't receive a proper response for 60 seconds,
>>      or if the TCP connection is lost while waiting for a response, the
>>      InterMapper probe will set status of the device to the proper
>>      condition.
>>
>> Bob Merrill (Documentation Guy)
>>
>> ----- Original Message -----
>> From: "Andrey Gordon" <[email protected]>
>> To: "InterMapper Discussion" <[email protected]>
>> Sent: Tuesday, September 01, 2009 3:16 PM
>> Subject: [IM-Talk] False positives protection
>>
>>> You can probably tell I got back to tweaking IM again.....
>>>
>>> Anyway, I was curious if there is a mechanism built
>>> into IM server to double check the failures. For example, if a
>>> probe detects a condition that normally triggers an alert (let's
>>> say Alarm), does InterMapper double check that the condition in
>>> fact exists, or that it was just a blip, glitch, etc., before alerting?
>>>
>>> I'd be interested to see a feature that allows control over that
>>> kind of behavior. Let's say, just like under Notifiers one can
>>> control the number of repeating alerts, it would be nice to
>>> control the number of times the condition has to be triggered
>>> before an alert goes out.
>>>
>>> The reason this is important is so we could make sure that the
>>> condition is there before sending an alert to a pager at 3am. I'd
>>> like to be able to set up alerts going to email after one
>>> occurrence and to pager after, let's say, 3. Many times, by the
>>> time I crawl out of bed to see what it is, it has already cleared.
>>>
>>> Setting it up with delayed notification kind of does the job, but
>>> not really. All it does is delay the notification for a minimum of
>>> a minute, which with default probe timers only gives it one extra
>>> chance to probe it.
>>>
>>> I'd like to see InterMapper, after detecting a condition, being able
>>> to verify it a controllable number of times before sending an alert
>>> to a given notifier.
>>>
>>> Consider this a feature request, but I'd still want to understand
>>> what the logic of this behavior is today as implemented. I'm
>>> suspecting many on this list would be interested to hear how IM
>>> behaves on condition detection....
>>>
>>> Thank you,
>>> Andrey
>>> __________________________________________________________
>>> Andrey Gordon | Integrity Interactive | Network Engineer |
>>> +1.781.398.3518

--
Paul Carlson
VP Network Administration
Guam Cablevision, LLC.
Phone: 671-969-5012
____________________________________________________________________
List archives:
http://www.mail-archive.com/intermapper-talk%40list.dartware.com/
To unsubscribe: send email to: [email protected]
