Yo,

Something I've been wondering about for a while is a class of bugs we have with clock skew.
Memcached handles the expiration timer via:

- "process_started": a timestamp in seconds, initialized at startup
- "current_time": set once per second to the delta between the current time and "process_started"

If your clock swings around wildly, there are a few situations where items could end up expiring immediately or never, such as current_time underflowing.

A couple of easy ideas off the top of my head that would drop some accuracy in exchange for avoiding timers (and any cross-platform timer idiocy):

- Ditch "process_started" and start a counter at 0. Every second, current_time would be incremented by 1. A relative timeout of "60 seconds from now" would be set to "current_time + 60", as it is presently. We'd have to do something special for date-formatted expirations, potentially by noting the exact time once on startup and using that to delta against a provided date to get the delta in seconds. The latter can still be influenced by a bad clock, but maybe not as noticeably, and the feature is less used.
- Add some sanity checks to the clock update function, which would fall back to incrementing by 1 if it detects a significant clock correction forward, or if the clock has gone back in time. It still uses gettimeofday() unless something goes wrong, and keeps plodding forward less accurately when something does go wrong.
- Use some anti-clock-skew magic that maybe libevent uses.

Need to research more options :P

Anyone care? The increase in the number of these types of reports is getting obnoxious, and cloud computing's god-awful-ness can only make it worse.

-Dormando
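
P.S. To make the sanity-check idea a bit more concrete, here's a rough sketch. It is not the current memcached code: the names skew_clock, skew_clock_init, skew_clock_tick and the MAX_FORWARD_JUMP threshold are all made up for illustration, and I'm assuming we accumulate per-tick deltas instead of subtracting "process_started" each time. The caller would pass in whatever gettimeofday() reports, once per second.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical threshold (seconds): a bigger jump between two ticks is
 * treated as a clock correction rather than real elapsed time. */
#define MAX_FORWARD_JUMP 5

/* Hypothetical state for illustration, not memcached's actual globals. */
struct skew_clock {
    time_t last_wall;           /* wall-clock reading at the previous tick */
    unsigned int current_time;  /* seconds we believe have elapsed */
};

static void skew_clock_init(struct skew_clock *c, time_t now) {
    assert(c != NULL);
    c->last_wall = now;
    c->current_time = 0;
}

/* Meant to be called about once per second with a fresh wall-clock
 * reading. Returns the updated relative time. */
static unsigned int skew_clock_tick(struct skew_clock *c, time_t now) {
    assert(c != NULL);
    long long delta = (long long)now - (long long)c->last_wall;

    if (delta < 0 || delta > MAX_FORWARD_JUMP) {
        /* Clock went backwards or leapt forward: assume one tick passed. */
        c->current_time += 1;
    } else {
        /* Clock looks sane: trust the measured delta. */
        c->current_time += (unsigned int)delta;
    }
    c->last_wall = now;
    return c->current_time;
}
```

Starting at a wall-clock reading of 1000 and then feeding in 1001, 5000, 5001 gives a current_time of 1, 2, 3: the big jump is counted as a single second instead of expiring everything at once. The cost is that a genuine stall longer than the threshold gets under-counted, which is the accuracy trade-off mentioned above.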
