Rather than jumping on the "Me too" bandwagon (mine was a little shy of 200 ms, it looks like)...
> If this gets bad enough such that all servers get kicked out of the
> pool and there's nobody left, then there's a problem to be fixed.

This got me thinking... As I understand the way the monitor works, if whatever happened this morning had lasted longer, the scenario you mentioned could happen. Even the best providers (like Internap) experience random problems from time to time, which, if they lasted long enough, could bring everyone's score down below the threshold, leaving the monitor to merrily purge the pool of all its members. Do safeguards exist right now to prevent that?

Multiple monitoring pools, taking the lowest offset from each (an average would still allow this problem if one host was having problems putting the offset higher than threshold * number_of_hosts), could solve this problem, but a less-involved way might be to simply have a little code watching total trends. You could calculate the average offset of all hosts per monitoring cycle (which would be an interesting graph anyway, actually), which, with the number of hosts involved, ought to be fairly low and, more importantly, fairly consistent. An even lazier solution might be to monitor the number of hosts in the pool, and watch for drops in that.

In either case (a spike in average offsets, or a precipitous drop in pool members), the system should suspend purging of pool members until someone can intervene and straighten out whatever's happened. In theory the average offset could be subtracted from individual offsets when computing score, but that might cause problems of its own... I just have in mind a quick little sanity check to prevent freak network failures from emptying the pool.

_______________________________________________
timekeepers mailing list
https://fortytwo.ch/mailman/cgi-bin/listinfo/timekeepers
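
P.S. To make the sanity check a bit more concrete, here is a rough Python sketch of what I have in mind. I don't know how the monitor is actually implemented, so everything here is an assumption for illustration: the function name purge_allowed, both threshold constants and their values, and the idea that the monitor can hand over one offset per host per cycle. It only decides whether a purge should go ahead; it does not compute scores.

```python
from statistics import mean

# Hypothetical thresholds, picked only for illustration; the real
# monitor's values are not known to me.
MAX_MEAN_OFFSET_MS = 50.0  # pool-wide average offset above this looks like a monitor-side problem
MAX_POOL_DROP = 0.25       # refuse to purge more than this fraction of the pool in one cycle


def purge_allowed(offsets_ms, prev_pool_size, candidates_to_purge):
    """Return True if this cycle's purge looks safe to carry out.

    offsets_ms: dict mapping host -> offset measured this cycle (ms)
    prev_pool_size: number of hosts in the pool before this cycle
    candidates_to_purge: hosts whose score fell below the threshold
    """
    if not offsets_ms or prev_pool_size <= 0:
        # Nothing was measured at all: don't purge blindly.
        return False

    # Check 1: a spike in the average offset across all hosts.
    average_offset = mean(abs(offset) for offset in offsets_ms.values())
    if average_offset > MAX_MEAN_OFFSET_MS:
        return False

    # Check 2: a precipitous drop in pool membership.
    if len(candidates_to_purge) / prev_pool_size > MAX_POOL_DROP:
        return False

    return True


# One bad host out of five: the purge goes ahead.
normal_cycle = {"a": 3.0, "b": -5.0, "c": 2.0, "d": 4.0, "e": 150.0}
ok_to_purge = purge_allowed(normal_cycle, 5, ["e"])

# Every host looks ~190 ms off at once: purging is suspended.
outage_cycle = {host: 190.0 for host in "abcde"}
suspended = not purge_allowed(outage_cycle, 5, list(outage_cycle))
```

When purge_allowed returns False the monitor would leave the pool alone and flag the cycle for a human to look at, rather than trying to correct the scores itself.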
