Hi Greg

Thanks for answering. I configured your three-level steps and will see what
happens :)

Running Nagios on the RB-pi2 as well is too heavy, I guess.

Let's monitor and see what the alarms provide.

Thanks for your help


Regards, Rob de Hoog
Sent from my iPhone

> On 3 Jul 2015, at 22:04, Gregory Sloop <[email protected]> wrote:
> 
> Putting it back on-list. [you emailed me directly - no offense taken - but 
> it's better to be on-list...]
> 
> Ok...
> 
> Heavy load on a pipe, at least in my experience, doesn't cause much packet 
> loss. It will, however, increase latency. So, I think a test like 
> >10%,*5*,>10%,*5*,>10%,*5* [hopefully there are no syntax errors there...]
> would do [meaning any three losses of >10% over 3-30m would trigger things].
> 
> I use: 
> >15%, *3* ,>15%, *3* ,>15%
> >30%, *10* ,>30%, *10*, >30%, *10*, >30%
> >50%, *10* ,>50%, *10*, >50%, *10*, >50%
> ...for a first, second and third level trigger.
> 
> But - I only use the triggers to generate an MTR - the MTR comes in very 
> handy in arguments with providers [Hello Comcast] when they claim the problem 
> must be somewhere else other than their network. (Though to be fair, the 
> tendency to blame someone else is a *very* strong one in most 
> help-desk/support situations. And it so pisses me off!) The MTR script runs 
> an MTR trace of that path, and emails me the result.
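
A rough sketch of what such an MTR script could look like (this is not Greg's actual script). It assumes Smokeping pipes the alert to a program and passes the alert name, target, loss pattern, RTT pattern and hostname as arguments in that order; the mail addresses, the local SMTP relay and the helper names are hypothetical placeholders.

```python
import smtplib
import subprocess
import sys
from email.message import EmailMessage

MAIL_TO = "admin@example.com"        # hypothetical recipient
MAIL_FROM = "smokeping@example.com"  # hypothetical sender


def build_mtr_command(host, cycles=10):
    """Return an mtr invocation that prints a plain-text report and exits."""
    return ["mtr", "--report", "--report-cycles", str(cycles), host]


def build_message(alert_name, target, host, report):
    """Wrap the MTR report in an email message."""
    msg = EmailMessage()
    msg["Subject"] = f"[smokeping] {alert_name} on {target}: MTR to {host}"
    msg["From"] = MAIL_FROM
    msg["To"] = MAIL_TO
    msg.set_content(report)
    return msg


def main(argv):
    # Assumed argument order: alert name, target, loss pattern,
    # RTT pattern, hostname. Check it against your Smokeping version.
    alert_name, target, _loss, _rtt, host = argv[1:6]
    result = subprocess.run(
        build_mtr_command(host), capture_output=True, text=True, timeout=300
    )
    report = result.stdout or result.stderr
    # Assumes a mail relay is listening on localhost.
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(build_message(alert_name, target, host, report))


if __name__ == "__main__" and len(sys.argv) >= 6:
    main(sys.argv)
```

The trace is captured at the moment the trigger fires, which is what makes it useful as evidence later.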
> 
> I do all my *alerts* with Nagios - using the smokeping plugin.
> In those cases, I use something like a warning for >10% loss for 5m, 
> and critical for >40% loss for 5m. [Nagios doesn't use an elaborate trigger 
> system like Smokeping. But I don't get many false-positives with either 
> setup. YMMV.]
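
To make the two thresholds concrete, here is a minimal sketch of the warning/critical decision. It is not the smokeping plugin itself; it only assumes the plugin's input boils down to an average loss percentage over the last 5 minutes, and it uses the standard Nagios plugin exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL). The function name and defaults are illustrative.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def loss_state(loss_percent, warn=10.0, crit=40.0):
    """Map an average loss percentage to a Nagios state.

    The thresholds mirror the >10% / >40% values mentioned above.
    """
    if loss_percent > crit:
        return CRITICAL
    if loss_percent > warn:
        return WARNING
    return OK
```

For example, `loss_state(25.0)` yields WARNING and `loss_state(60.0)` yields CRITICAL; how often and to whom those states are delivered is then left to Nagios.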
> 
> Using Nagios allows me to more carefully manage alerts and when/how/where 
> they're delivered - which makes life a lot easier. For example - no need to 
> buzz me about a non-critical backup link at 2 a.m. But when the smelly stuff 
> does hit the fan, I'll get the alerts I need. So, I abandoned using Smokeping 
> for alerts quite a number of years ago.
> 
> I could dig up specifics if you really need them, but that's what I recall 
> off the top of my head.
> 
> Cheers,
> Greg
> 
> 
> Hi Greg 
> 
> Thanks for your email; I agree on making it simpler.
> 
> My goal is to detect a serious problem as soon as possible, while avoiding 
> false positives caused by people congesting the line with huge downloads.
> 
> How would you solve this? Do you monitor for this yourself, and would you be 
> able to share the configuration?
> 
> thanks Rob
> 
> 
> On 3 Jul 2015, at 18:08, Gregory Sloop <[email protected]> wrote:
> 
> Re: [smokeping-users] pocketless alarm question, sending to soon the alarm 
> emails. 
> A few thoughts - though not exactly an answer to your question:
> 
> 1) I'm often too dense to grok out why more elaborate triggers don't work the 
> way I want, and yours falls into that category. But this seems, IMO, to be a 
> very common problem. So, my solution: Make them simpler. Way simpler. [It's 
> kind of like fancy regexes - I think I'm *so* clever and pat myself on the 
> back. But then I actually run that regex against more real-world data, and 
> well, it doesn't end pretty... So, I usually go back to - simpler is better, 
> unless it's impossible to solve otherwise.] 
> 
> 2) In your case - do you really want to wait 75 minutes to find out a 
> connection is completely down? [Or perhaps you have another trigger that does 
> that.] But even if you do - this is just my opinion - but loss greater than 
> 10% over more than 10-15 minutes is a sign of a _serious_ problem. So, I have 
> simple triggers that let me know if I have even modest loss over fairly short 
> periods of time. Yes, that can increase the number of alerts you see - but if 
> you've got a connection with that many problems, you need to address the 
> underlying problem, not just have it chirp at you less often.
> 
> 3) Yes, at first glance, your pattern appears right. However, I think the 
> *25* means *up to* 25 samples. [0-25]
> So, a >10% loss followed by another >10% loss in the very next sample will 
> match a pattern of >10%,*25*,>10%. The same pattern will also match a >10% 
> loss followed by another >10% loss with 1-25 samples of <10% loss between 
> them.
> 
> So your example will also trigger on 4 consecutive samples of >10% loss each 
> [i.e. over 4 sample periods]. It would also trigger in the conditions you 
> envision - one >10% loss sample, then another >10% loss 25 samples later, 
> and another 25 samples later, etc... 
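
The reading of `*25*` described above can be checked with a small simulation. This is only a model of that interpretation (a `*N*` element absorbs anywhere from 0 to N samples of any value, and the pattern is matched against a run of samples ending at the most recent one). It is not Smokeping's actual matcher, it only understands `>X%` and `*N*` elements, and the function names are made up for illustration.

```python
def parse(pattern):
    """Turn '>10%,*25*,>10%' into tokens like ('gt', 10.0) and ('upto', 25)."""
    tokens = []
    for part in pattern.split(","):
        part = part.strip()
        if part.startswith("*") and part.endswith("*") and len(part) > 2:
            tokens.append(("upto", int(part[1:-1])))
        elif part.startswith(">"):
            tokens.append(("gt", float(part[1:].rstrip("%"))))
        else:
            raise ValueError(f"unsupported pattern element: {part!r}")
    return tokens


def matches(tokens, samples):
    """True if the tokens consume exactly all of the given samples."""
    if not tokens:
        return not samples
    kind, value = tokens[0]
    if kind == "gt":
        return (bool(samples) and samples[0] > value
                and matches(tokens[1:], samples[1:]))
    # 'upto': let the wildcard absorb 0..value samples of any value.
    limit = min(value, len(samples))
    return any(matches(tokens[1:], samples[k:]) for k in range(limit + 1))


def triggers(pattern, history):
    """True if the pattern matches some run of samples ending at the newest."""
    tokens = parse(pattern)
    return any(matches(tokens, history[i:]) for i in range(len(history)))


pattern = ">10%,*25*,>10%,*25*,>10%,*25*,>10%"
# Four consecutive lossy samples (about 12 minutes at a 180 s step):
print(triggers(pattern, [0, 0, 20, 30, 15, 40]))
# Four lossy samples spaced 25 clean samples apart:
print(triggers(pattern, [20] + [0] * 25 + [20] + [0] * 25 + [20] + [0] * 25 + [20]))
```

Under this model both calls print `True`, which is consistent with an alert arriving within minutes rather than after several 75-minute intervals.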
> 
> Again, I think less fancy is more likely to produce a result that's still 
> useful and a lot less trouble to test and verify it works in the conditions 
> you envision.
> 
> HTH
> 
> -Greg
> 
> 
> RdH> Hi team
> 
> RdH> I hope you are doing great today?
> 
> RdH> What a great tool! I love running it on my RB-pi2 to monitor the
> RdH> internet connection! I have a small question that I can’t solve
> RdH> myself; would you try to help me?
> 
> RdH> The goal is to get an alert when the internet line is
> RdH> experiencing packet loss, but only when it lasts for a longer
> RdH> time, not on every glitch. I'm using FPingnormal with a step of
> RdH> 180 (i.e. 3 minutes).
> 
> RdH> The loss pattern I defined is:
> RdH> 10%,*25*,>10%,*25*,>10%,*25*,>10% which, based on the how-tos
> RdH> provided on the website, I take to mean: take 25 samples (which
> RdH> is 25*3 minutes), so if the packet loss is >10% at the start,
> RdH> still >10% 75 minutes later, still >10% another 75 minutes later,
> RdH> and still >10% after yet another 75 minutes, send out an email.
> 
> RdH> But it sends out an email almost immediately. Please see the
> RdH> screenshots showing when the packet loss starts and when I
> RdH> received the email; that timeframe is not even close to 75
> RdH> minutes, more like a few minutes.
> 
> RdH> Could you advise how to make the packet loss alarm more reliable,
> RdH> so that the loss has to last at least 1 hour before the email is
> RdH> sent?
> 
> RdH> Many thanks! Cheers Rob
> 
> RdH> How the system works now (the screenshots were too big):
> 
> RdH> - packet loss started at about 07:40, was stable around 07:55 and
> RdH>   started again at 08:03
> RdH> - alarm email received around 08:07, after the second block of
> RdH>   packet loss
> RdH> - "cleared" email received 3 minutes later, around 08:10
> 
> RdH> _______________________________________________
> RdH> smokeping-users mailing list
> RdH> [email protected]
> RdH> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
> 
> -- 
> Gregory Sloop, Principal: Sloop Network & Computer Consulting
> Voice: 503.251.0452 x82
> EMail: [email protected]
> http://www.sloop.net
> ---