Hi,

We use smokeping to monitor a number of hosts on various networks. We have a master with a handful of slaves which monitor various sites.
This morning we had an outage which affected one of those sites, but the slaves which were monitoring the site that went down failed to report any data at all for any networks, even those that were still reachable from that network. Communications between the master and the slaves were not affected.

The affected slaves were reporting this message:

WARNING Master said 500 read timeout

While the master had messages like:

RRDs::update ERROR: /var/lib/smokeping/rrd/slave/slave~site1.rrd: illegal attempt to update using time 1253201797 when last update time is 1253201797 (minimum one second step)

All machines are running smokeping 2.4.2.

Any ideas? The only thing I can think of is that DNS for the site that went down was also down, so the master timed out trying to look up the site's IP address?

Thanks
Dave
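For reference, the RRD error quoted above appears to mean that a second update arrived for the same file with the same timestamp as the previous one. The Python sketch below only illustrates that rule as the error message states it; it is not smokeping's or rrdtool's actual code, and `accept_update` is a hypothetical name used for illustration.

```python
def accept_update(last_update: int, new_time: int) -> bool:
    """Return True if an update at new_time would be accepted.

    Mirrors the rule quoted in the error message: each update must be
    at least one second later than the previous one. This is an
    illustrative assumption, not rrdtool's real implementation.
    """
    return new_time - last_update >= 1


# Timestamp taken from the error message in the post.
last = 1253201797
print(accept_update(last, 1253201797))  # same second as the last update
print(accept_update(last, 1253201798))  # one second later
```

If that reading is right, the error would be consistent with the same slave data being submitted twice, for example a retry after the read timeout, though that is only a guess.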
