Hi Greg,

Thanks for answering. I configured your three-level steps and will see what happens :)
Running Nagios on the RB-pi2 as well is too heavy, I guess. Let's monitor and see what the alarms provide.

Thanks for your help.

Regards,
Rob de Hoog

Sent from my iPhone

> On 3 Jul 2015, at 22:04, Gregory Sloop <[email protected]> wrote:
>
> Putting it back on-list. [You emailed me directly - no offense taken - but
> it's better to be on-list...]
>
> Ok...
>
> Heavy load on a pipe, at least in my experience, doesn't cause much packet
> loss. It will, however, increase latency. So, I think a test like
> >10%,*5*,>10%,*5*,>10%,*5* [hopefully there are no syntax errors there...]
> [meaning any three losses of >10% over 3-30m would trigger things.]
>
> I use:
> >15%, *3* ,>15%, *3* ,>15%
> >30%, *10* ,>30%, *10*, >30%, *10*, >30%
> >50%, *10* ,>50%, *10*, >50%, *10*, >50%
> ...for a first, second and third level trigger.
>
> But - I only use the triggers to generate an MTR - the MTR comes in very
> handy in arguments with providers [Hello Comcast] when they claim the problem
> must be somewhere else other than their network. (Though to be fair, the
> tendency to blame someone else is a *very* strong one in most
> help-desk/support situations. And it so pisses me off!) The MTR script runs
> an MTR trace of that path, and emails me the result.
>
> I do all my *alerts* with Nagios - using the smokeping plugin.
> In those cases, I use something like a warning for >10% loss or more for 5m,
> and critical with >40% for 5m. [Nagios doesn't use an elaborate trigger
> system like Smokeping. But I don't get many false-positives with either
> setup. YMMV.]
>
> Using Nagios allows me to more carefully manage alerts and when/how/where
> they're delivered - which makes life a lot easier. For example - no need to
> buzz me about a non-critical backup link at 2a. But when the smelly stuff
> does hit the fan, I'll get the alerts I need. So, I abandoned using Smokeping
> for alerts quite a number of years ago.
>
> I could dig up specifics if you really need them, but that's what I recall
> off the top of my head.
>
> Cheers,
> Greg
>
>
> Hi Greg
>
> Thanks for your email; I agree on making it simpler.
>
> My goal is to detect a serious problem ASAP, but to avoid false
> positives generated by people congesting the line with huge downloads.
>
> How would you solve this? Do you monitor something like this, and would you
> be able to share the configuration?
>
> Thanks, Rob
>
>
> On 3 Jul 2015, at 18:08, Gregory Sloop <[email protected]> wrote:
>
> Re: [smokeping-users] pocketless alarm question, sending to soon the alarm
> emails.
> A few thoughts - though not exactly an answer to your question:
>
> 1) I'm often too dense to grok out why more elaborate triggers don't work the
> way I want, and yours falls into that category. But this seems, IMO, to be a
> very common problem. So, my solution: Make them simpler. Way simpler. [It's
> kind of like fancy regexs - I think I'm *so* clever and pat myself on the
> back. But then I actually run that regex against more real-world data, and
> well, it doesn't end pretty... So, I usually go back to - simpler is better,
> unless it's impossible to solve otherwise.]
>
> 2) In your case - do you really want to wait 75 minutes to find out a
> connection is completely down? [Or perhaps you have another trigger that does
> that.] But even if you do - this is just my opinion - but loss greater than
> 10% over more than 10-15 minutes is a sign of a _serious_ problem. So, I have
> simple triggers that let me know if I have even modest loss over fairly short
> periods of time. Yes, that can increase the number of alerts you see - but if
> you've got a connection with that many problems, you need to address the
> underlying problem, not just chirp at you about it less often.
>
> 3) Yes, at first glance, your pattern appears right. However, I think the
> *25* means *up to* 25 samples.
> [0-25]
> So, a 10% loss, followed by another 10% loss the very next sample will match
> a pattern of: >10%,*25*,>10% or it will also match
> A 10% loss, followed by another 10% loss with 1-25 samples between them with
> <10% loss.
>
> So your example will also trigger with 4 consecutive samples of >10% loss each
> [i.e. over 4 sample periods]. It would also trigger in the conditions you
> envision - (1) >10% loss sample, and then another >10% loss, 25 samples
> later, and another 25 samples later etc...
>
> Again, I think less fancy is more likely to produce a result that's still
> useful and a lot less trouble to test and verify it works in the conditions
> you envision.
>
> HTH
>
> -Greg
>
>
> RdH> Hi team
>
> RdH> I hope you are doing great today?
>
> RdH> What a great tool! I love running it on my RBp2 to monitor the
> RdH> internet connection! I have a small question that I can't solve
> RdH> myself; would you try to help me?
>
> RdH> The goal is to have an alert when the internet line is
> RdH> experiencing packet loss, but only for a longer period, not on every
> RdH> glitch. I'm using FPingnormal with a step of 180 (i.e. 3 minutes).
>
> RdH> The loss pattern I defined is:
> RdH> >10%,*25*,>10%,*25*,>10%,*25*,>10%
> RdH> Based on the how-tos provided on the website, I take this to mean:
> RdH> take 25 samples (which is 25*3 minutes), so if the packet loss exists
> RdH> at the start, is still >10% 75 minutes later, still >10% another 75
> RdH> minutes later, and still >10% a further 75 minutes later, send out
> RdH> an email.
>
> RdH> But it sends out an email almost immediately. Please see the
> RdH> screenshots showing where the packet loss starts and when I received
> RdH> the email; that timeframe is not even close to 75 minutes, more like
> RdH> a few minutes.
>
> RdH> Could you advise how to make the packet loss alarm more reliable,
> RdH> so that the loss has to last for at least 1 hour before the email
> RdH> is sent?
>
> RdH> Many thanks!
> RdH> Cheers Rob
>
> RdH> How the system works now (the screenshots were too big):
>
> RdH> - packet loss started about 07:40, was stable around 07:55 and started
> RdH>   again at 08:03
> RdH> - alarm email received around 08:07, after the second block of packet
> RdH>   loss
> RdH> - "cleared" email received 3 minutes later, around 08:10
>
> RdH> _______________________________________________
> RdH> smokeping-users mailing list
> RdH> [email protected]
> RdH> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
>
> --
> Gregory Sloop, Principal: Sloop Network & Computer Consulting
> Voice: 503.251.0452 x82
> EMail: [email protected]
> http://www.sloop.net
> ---
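
Greg's point 3 in the quoted thread (that *25* means "up to 25 samples") can be sketched in Python. This is only a toy model for reasoning about patterns, not Smokeping's actual matcher: the names `parse_pattern` and `triggers` are made up for illustration, and the sketch assumes that `*N*` stands for 0 to N samples of any value and that a pattern is checked against the most recent samples.

```python
import operator
import re

# Comparison operators this toy model understands (an assumed subset).
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}


def parse_pattern(pattern):
    """Turn '>10%,*25*,>10%' into [('>', 10.0), 25, ('>', 10.0)]."""
    tokens = []
    for raw in pattern.split(","):
        part = raw.strip()
        wildcard = re.fullmatch(r"\*(\d+)\*", part)
        if wildcard:
            # An int token means "up to N samples of any value".
            tokens.append(int(wildcard.group(1)))
            continue
        comparison = re.fullmatch(r"(>=|<=|==|>|<)\s*(\d+(?:\.\d+)?)\s*%?", part)
        if comparison is None:
            raise ValueError(f"unsupported pattern element: {part!r}")
        tokens.append((comparison.group(1), float(comparison.group(2))))
    return tokens


def _consumes(tokens, samples):
    """True if the tokens can consume the given samples exactly."""
    if not tokens:
        return not samples
    head, rest = tokens[0], tokens[1:]
    if isinstance(head, int):
        # Wildcard: try skipping 0, 1, ..., N samples.
        limit = min(head, len(samples))
        return any(_consumes(rest, samples[k:]) for k in range(limit + 1))
    if not samples:
        return False
    op, threshold = head
    return OPS[op](samples[0], threshold) and _consumes(rest, samples[1:])


def triggers(pattern, loss_samples):
    """True if the pattern matches some run of samples ending at the newest one."""
    tokens = parse_pattern(pattern)
    return any(_consumes(tokens, loss_samples[start:])
               for start in range(len(loss_samples)))


if __name__ == "__main__":
    rob = ">10%,*25*,>10%,*25*,>10%,*25*,>10%"
    # Four lossy samples in a row already match, because each *25* may be empty.
    print(triggers(rob, [0, 0, 20, 20, 20, 20]))       # True
    # The case Rob had in mind (25 clean samples between losses) matches too.
    print(triggers(rob, ([20] + [0] * 25) * 3 + [20]))  # True
    # Only three lossy samples: not enough for four comparisons.
    print(triggers(rob, [20, 20, 20]))                  # False
```

Under these assumptions, with a step of 180 seconds the first call corresponds to roughly 12 minutes of sustained loss rather than 75, which is consistent with Greg's explanation of why the alarm email arrived so early.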
