Putting it back on-list. [you emailed me directly - no offense taken - but it's 
better to be on-list...]

Ok...

Heavy load on a pipe, at least in my experience, doesn't cause much packet 
loss. It will, however, increase latency. So, I think a test like 
>10%,*5*,>10%,*5*,>10%,*5* [hopefully there are no syntax errors there...] 
would work.
[Meaning any three losses of >10% over 3-30m would trigger things.]

I use: 
>15%, *3* ,>15%, *3* ,>15%
>30%, *10* ,>30%, *10*, >30%, *10*, >30%
>50%, *10* ,>50%, *10*, >50%, *10*, >50%
...for a first, second and third level trigger.
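
For a rough sense of how many samples each of those patterns can cover, here 
is a small Python sketch. It assumes the reading discussed further down the 
thread, namely that *N* stands for "up to N samples". The function name 
pattern_span and the 300-second step in the usage loop are illustrative 
assumptions, not anything taken from Smokeping or from the setup described 
here.

```python
import re


def pattern_span(pattern):
    """Return (min_samples, max_samples) that a loss pattern can cover.

    Assumption: a token like *5* is a gap of 0 to 5 samples, and every
    other token is a comparison that consumes exactly one sample.
    """
    comparisons = 0
    gap_total = 0
    for token in pattern.split(","):
        token = token.strip()
        if not token:
            continue
        gap = re.fullmatch(r"\*(\d+)\*", token)
        if gap:
            gap_total += int(gap.group(1))
        else:
            comparisons += 1
    return comparisons, comparisons + gap_total


if __name__ == "__main__":
    step_seconds = 300  # hypothetical step; substitute your own
    tiers = [
        ">15%, *3* ,>15%, *3* ,>15%",
        ">30%, *10* ,>30%, *10*, >30%, *10*, >30%",
        ">50%, *10* ,>50%, *10*, >50%, *10*, >50%",
    ]
    for tier in tiers:
        lo, hi = pattern_span(tier)
        print(tier, "->", lo, "to", hi, "samples,",
              lo * step_seconds / 60, "to", hi * step_seconds / 60, "minutes")
```

Multiplying the sample counts by the step gives only an approximate 
duration, but it is enough to see that the shortest match is just one 
sample per comparison.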

But - I only use the triggers to generate an MTR - the MTR comes in very handy 
in arguments with providers [Hello Comcast] when they claim the problem must be 
somewhere else other than their network. (Though to be fair, the tendency to 
blame someone else is a *very* strong one in most help-desk/support situations. 
And it so pisses me off!) The MTR script runs an MTR trace of that path, and 
emails me the result.

I do all my *alerts* with Nagios - using the smokeping plugin.
In those cases, I use something like a warning for >10% loss for 5m, 
and critical for >40% loss for 5m. [Nagios doesn't use an elaborate trigger 
system like Smokeping. But I don't get many false-positives with either setup. 
YMMV.]
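
As an illustration of that kind of two-level threshold, here is a minimal 
Python sketch. It is not the logic of the actual Nagios smokeping plugin; 
the function name loss_state and the choice to average the loss over the 
window are assumptions. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 
3 UNKNOWN) follow the standard Nagios plugin convention.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def loss_state(loss_samples, warn_pct=10.0, crit_pct=40.0):
    """Classify the average packet loss (in percent) over a window.

    loss_samples: loss percentages collected over the window, e.g. the
    last 5 minutes. Averaging them is an assumption for this sketch.
    """
    if not loss_samples:
        return UNKNOWN  # no data to judge
    average = sum(loss_samples) / len(loss_samples)
    if average > crit_pct:
        return CRITICAL
    if average > warn_pct:
        return WARNING
    return OK


if __name__ == "__main__":
    print(loss_state([0.0, 5.0]))
    print(loss_state([20.0, 10.0]))
    print(loss_state([50.0, 60.0]))
```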

Using Nagios allows me to more carefully manage alerts and when/how/where 
they're delivered - which makes life a lot easier. For example - no need to 
buzz me about a non-critical backup link at 2 a.m. But when the smelly stuff does 
hit the fan, I'll get the alerts I need. So, I abandoned using Smokeping for 
alerts quite a number of years ago.

I could dig up specifics if you really need them, but that's what I recall off 
the top of my head.

Cheers,
Greg


Hi Greg 

Thanks for your email. I agree on making it simpler.

The goal is to catch a serious problem ASAP, but to avoid false 
positives generated by people congesting the line with huge downloads.

How would you solve this? Do you monitor for this, and would you be able to 
share your configuration?

Thanks, Rob


On 3 Jul 2015, at 18:08, Gregory Sloop <[email protected]> wrote:

Re: [smokeping-users] pocketless alarm question, sending to soon the alarm 
emails. 
A few thoughts - though not exactly an answer to your question:

1) I'm often too dense to grok out why more elaborate triggers don't work the 
way I want, and yours falls into that category. But this seems, IMO, to be a 
very common problem. So, my solution: Make them simpler. Way simpler. [It's 
kind of like fancy regexes - I think I'm *so* clever and pat myself on the back. 
But then I actually run that regex against more real-world data, and well, it 
doesn't end prettily... So, I usually go back to - simpler is better, unless 
it's impossible to solve otherwise.] 

2) In your case - do you really want to wait 75 minutes to find out a 
connection is completely down? [Or perhaps you have another trigger that does 
that.] But even if you do - this is just my opinion - but loss greater than 10% 
over more than 10-15 minutes is a sign of a _serious_ problem. So, I have 
simple triggers that let me know if I have even modest loss over fairly short 
periods of time. Yes, that can increase the number of alerts you see - but if 
you've got a connection with that many problems, you need to address the 
underlying problem, not just have Smokeping chirp at you about it less often.

3) Yes, at first glance, your pattern appears right. However, I think the *25* 
means *up to* 25 samples. [0-25]
So, a >10% loss followed by another >10% loss in the very next sample will 
match a pattern of >10%,*25*,>10%. The same pattern will also match a >10% 
loss followed by another >10% loss with 1-25 samples of <10% loss between 
them.

So your example will also trigger with 4 consecutive samples of >10% loss each 
[i.e. over 4 sample periods]. It would also trigger in the conditions you 
envision - one >10% loss sample, then another >10% loss 25 samples later, 
and another 25 samples later etc... 
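
To make that concrete, here is a minimal Python sketch of the "up to N 
samples" reading. It is not Smokeping's own matcher. The names parse, 
matches_at and fires are made up for illustration, and the sketch assumes 
the pattern must end on the most recent sample.

```python
import operator
import re

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}


def parse(pattern):
    """Turn '>10%,*25*,>10%' into a list of comparison and gap tokens."""
    tokens = []
    for raw in pattern.split(","):
        tok = raw.strip()
        gap = re.fullmatch(r"\*(\d+)\*", tok)
        if gap:
            tokens.append(("gap", int(gap.group(1))))
            continue
        cmp_ = re.fullmatch(r"(>=|<=|==|>|<)(\d+(?:\.\d+)?)%?", tok)
        if not cmp_:
            raise ValueError("unsupported token: " + repr(tok))
        tokens.append(("cmp", cmp_.group(1), float(cmp_.group(2))))
    return tokens


def matches_at(tokens, samples, ti, si):
    """True if tokens[ti:] consume samples[si:] exactly up to the end."""
    if ti == len(tokens):
        return si == len(samples)
    tok = tokens[ti]
    if tok[0] == "gap":
        # Assumed reading: a gap skips anywhere from 0 to N samples.
        return any(matches_at(tokens, samples, ti + 1, si + k)
                   for k in range(tok[1] + 1) if si + k <= len(samples))
    if si >= len(samples):
        return False
    _, op, value = tok
    return OPS[op](samples[si], value) and matches_at(tokens, samples, ti + 1, si + 1)


def fires(pattern, samples):
    """True if the pattern matches some run of samples ending at the newest."""
    tokens = parse(pattern)
    return any(matches_at(tokens, samples, 0, start)
               for start in range(len(samples) + 1))


if __name__ == "__main__":
    rob = ">10%,*25*,>10%,*25*,>10%,*25*,>10%"
    print(fires(rob, [0, 0, 20, 20, 20, 20]))   # four lossy samples in a row
    print(fires(rob, [0, 20, 20, 20]))          # only three lossy samples
```

Under these assumptions the first call matches after only four sample 
periods, which is consistent with the alert arriving within minutes rather 
than after 75 minutes.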

Again, I think less fancy is more likely to produce a result that's still 
useful and a lot less trouble to test and verify it works in the conditions you 
envision.

HTH

-Greg


RdH> Hi team

RdH> I hope you are doing great today?

RdH> What a great tool! I love running it on my RBp2 to monitor the
RdH> internet connection! I have a small question that I can't solve
RdH> myself. Would you try to help me?

RdH> The goal is to have an alert when the internet line is
RdH> experiencing packet loss for a longer time, not on every
RdH> glitch. I'm using FPingnormal with a step of 180 (meaning 3 minutes).

RdH> The loss pattern I defined is:
RdH> >10%,*25*,>10%,*25*,>10%,*25*,>10% which, based on the how-tos
RdH> provided on the website, means: take 25 samples (which is 25*3
RdH> minutes), so if the packet loss exists at the start, is still >10%
RdH> 75 minutes later, still >10% 75 minutes after that, and still >10%
RdH> another 75 minutes later, send out an email.

RdH> But it sends out an email almost immediately. Please see the
RdH> screenshots showing where the packet loss starts and when I received
RdH> the email; that timeframe is not even close to 75 minutes, more like
RdH> a few minutes.

RdH> Could you advise how to make the packet loss alarm more reliable,
RdH> so that the loss has to last at least 1 hour before the email is sent?

RdH> Many thanks! Cheers Rob

RdH> How the system works now (the screenshots were too big):

RdH> - packet loss started about 07:40, was stable around 07:55, and started again at 08:03
RdH> - alarm email received around 08:07, after the second block of packet loss
RdH> - "cleared" email received 3 minutes later, around 08:10

RdH> _______________________________________________
RdH> smokeping-users mailing list
RdH> [email protected]
RdH> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users

-- 
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x82
EMail: [email protected]
http://www.sloop.net
---
_______________________________________________
smokeping-users mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
