As someone who recently had to implement monitoring for non-smokeping processes, might I suggest "monit"? It is a fairly mainstream package that is readily available in the yum and apt repos.
Monit is a locally installed (i.e. per-slave) daemon that can monitor files (by timestamp or checksum), processes (by PID), programs (by exit code), and the system (by resource consumption). It has a flexible config language that can alert/start/stop/exec based on those monitor conditions. I could see monit being used to watch each slave and alert and/or auto-restart the data collection.

-bill

> On Apr 22, 2018, at 11:29 AM, Gregory Sloop <[email protected]> wrote:
>
> This is an awesome idea - and one I've wished for in the past - but never got
> around to working on.
> Checking the slave data files' modification times seems plausible as a way to
> check updates - but you'd have to test to be sure. [IIRC that will work
> though.]
>
> Personally, I'd probably try to write it in bash - or something completely
> external to smokeping. [Bash because of few dependencies - though you'll
> probably want/need something like sendemail for email notifications...]
>
> If slaves are behind NAT or something similar, you'll have to have a way to
> get to the slave for handling a restart, but that's really outside the scope
> of what you're doing.
>
> Honestly, simply getting notification that a slave is not pushing updates
> would be more than enough - even without the restart.
>
> Sounds fab to me. And I can't think of a better way, offhand.
>
> -Greg
>
>
> Hello,
>
> I have a Debian Jessie box with Smokeping 2.6 installed on it.
>
> It receives data from Slaves over the Internet (10 slaves or so).
> Each Slave roughly monitors xDSL or fiber links.
>
> Every Monday, I can see that data from one or two slaves is missing.
> Then I remotely restart the smokeping service on the slave where data is missing.
>
> I would like to implement something like:
>
> - if no data at all from a Slave for a given period of time, then restart
> the Slave's smokeping service and send a Notice email
>
> - if no data at all from a Slave for a longer period of time and the Slave's
> restart has already been attempted, then send a Warning email
>
> As Slaves' data is stored in a known directory in the Master's filesystem, I
> think I can detect when data from a slave has not been modified lately by
> reading the files' modification times in those directories.
>
> Is there a better way to do so? The alert settings seem better suited to
> detecting when WAN links, in my case, are slower.
>
> Best regards
>
>
> --
> Gregory Sloop, Principal: Sloop Network & Computer Consulting
> Voice: 503.251.0452 x82
> EMail: [email protected]
> http://www.sloop.net
> ---
> _______________________________________________
> smokeping-users mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
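
For anyone who would rather script the master-side check discussed in the quoted thread, here is a minimal sketch in Python of the "has this slave's data been modified lately" test. It is only an illustration under assumptions: the per-slave directory mapping, the "*.rrd" file pattern, the threshold values, and the names newest_mtime, classify and check_slaves are all hypothetical, and treating a slave with no data files at all as "warning" is my own choice. The restart and the email are deliberately left out.

```python
import time
from pathlib import Path
from typing import Dict, Optional


def newest_mtime(data_dir: Path, pattern: str = "*.rrd") -> Optional[float]:
    """Return the most recent mtime among matching files, or None if none exist.

    The "*.rrd" pattern is an assumption about how the data files are named.
    """
    mtimes = [p.stat().st_mtime for p in data_dir.rglob(pattern) if p.is_file()]
    return max(mtimes) if mtimes else None


def classify(age_seconds: Optional[float],
             notice_after: float,
             warning_after: float) -> str:
    """Map the age of the newest data to 'ok', 'notice' or 'warning'.

    A slave with no data files at all (age is None) is treated as 'warning';
    that is a choice made for this sketch, not something from the thread.
    """
    if age_seconds is None or age_seconds >= warning_after:
        return "warning"
    if age_seconds >= notice_after:
        return "notice"
    return "ok"


def check_slaves(slave_dirs: Dict[str, Path],
                 notice_after: float,
                 warning_after: float,
                 now: Optional[float] = None) -> Dict[str, str]:
    """Classify every slave by how long ago its newest data file was written."""
    now = time.time() if now is None else now
    result = {}
    for name, data_dir in slave_dirs.items():
        newest = newest_mtime(Path(data_dir))
        age = None if newest is None else now - newest
        result[name] = classify(age, notice_after, warning_after)
    return result
```

Run from cron, something like check_slaves({"slave1": Path("/some/data/dir")}, notice_after=900, warning_after=3600) would return a status per slave (the path and thresholds here are placeholders). What to do on "notice" versus "warning" (restart over SSH, send mail) would still need to be wired up separately.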
