Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 389 by [email protected]: internal clock drift from system time
https://code.google.com/p/memcached/issues/detail?id=389

Memcached's internal monotonic clock keeps drifting away from system time.

Worse, different memcached machines in the same cluster also drift away from each other. As a result, the effective validity of a key varies across the cluster, depending on which machine stores it.

What steps will reproduce the problem?
Run memcached until its internal clock has drifted significantly into the future (say +60s). Store a key with 40s validity, giving the expiration as a timestamp computed from system time.

What is the expected output? What do you see instead?
Expected: Key is stored and will be valid for 40s
Instead: Key is not stored and silently discarded. Caching fails.
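
The failure can be illustrated with a simplified model. This is a sketch of the behaviour described in this report, not memcached's actual code, and the function name and numbers are made up for illustration. It assumes the server compares the client's absolute expiration timestamp against its own clock, so a server clock that runs ahead shortens the lifetime by the amount of drift:

```python
def effective_ttl(system_now, validity, server_drift):
    """Seconds an item stays valid on a server whose internal clock runs
    `server_drift` seconds ahead of system time (simplified model)."""
    expires_at = system_now + validity       # absolute timestamp chosen by the client
    server_now = system_now + server_drift   # the server's idea of "now"
    return max(0, expires_at - server_now)   # 0 means: expired on arrival


now = 1418899819
print(effective_ttl(now, 40, 0))    # 40: no drift, full validity
print(effective_ttl(now, 40, 6))    # 34: validity shortened by the drift
print(effective_ttl(now, 40, 60))   # 0: drift exceeds validity, key is useless
```

Under this model, any drift larger than the requested validity makes the key unusable from the moment it is stored, which matches the "silently discarded" symptom.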

My suggestion: implement
A. smooth drift correction (like the one used by ntp), so that no time jumps occur
B. a memcached parameter that forces memcached to use system time, with a warning to users about the consequences of switching it on

I could live with B, as we use ntp with smooth correction.
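
To make suggestion A concrete, here is a minimal sketch of what smooth correction could look like. It is a hypothetical illustration, not memcached or ntp code; `slewed_tick` and `max_slew` are names invented for this example. On every tick the internal clock advances as usual, plus a bounded correction towards system time, so it never jumps and never runs backwards as long as `max_slew` is below 1:

```python
def slewed_tick(internal, system, tick=1.0, max_slew=0.5):
    """Advance the internal clock by one tick, corrected towards the
    system clock by at most max_slew * tick (no jumps, no going back)."""
    error = system - internal                 # negative: internal clock is ahead
    limit = max_slew * tick
    correction = max(-limit, min(limit, error))
    return internal + tick + correction


system = 1000.0
internal = 1060.0                             # internal clock is 60s ahead
for _ in range(120):
    internal = slewed_tick(internal, system)
    system += 1.0                             # system time advances one tick

print(internal - system)                      # 0.0: the 60s offset has been absorbed
```

With these example values the internal clock runs at half speed until the 60s offset is gone, then tracks system time again.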

What version of the product are you using? On what operating system?
SLES11 + memcached 1.4.15

One of our memcached machines in the cluster has a rather big time drift compared to all others (which is corrected by ntp): +5s into the future per 24h. This leads to a steady decrease in the hit rate on that memcached instance. After just one week, caching fails completely for a certain key type that needs 30s validity and is stored from other hosts (which generate expiration timestamps from their ntp-synchronized system time).

As a result we currently could
A. restart all memcached instances across the cluster every 4 days (not feasible)
B. increase the validity period for that particular key (not possible)
C. set up a single alternate memcached just for that key type, one that is safe to restart (ugly and not failsafe)
D. not use that server

We'd probably need to implement C, because A will insta-kill our app at particular times (cold cache = bad hit-rate). However, none of these solutions fixes the problem at its core.

# memcached-tool localhost:11211 stats | grep \ time | awk '{print $2}'; date +%s
1418899825
1418899819
# memcached-tool localhost:11211 stats | grep uptime
    uptime      106134
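
The numbers above can be turned into a drift rate. The following is a back-of-the-envelope sketch, assuming the internal clock matched system time when memcached started and that both commands ran at practically the same moment; the function names are made up for this example:

```python
SECONDS_PER_DAY = 86400


def drift_per_day(memcached_time, system_time, uptime):
    """Average drift in seconds per day, assuming zero drift at startup."""
    return (memcached_time - system_time) * SECONDS_PER_DAY / uptime


def days_until_expired_on_arrival(validity, rate_per_day):
    """Days of uptime after which a key with `validity` seconds, timed
    against system time, is already expired when it is stored."""
    return validity / rate_per_day


rate = drift_per_day(1418899825, 1418899819, 106134)
print(round(rate, 2))                                      # a bit under 5 s/day
print(round(days_until_expired_on_arrival(30, rate), 2))   # a bit over 6 days
```

A drift of 6s after about 1.2 days of uptime is consistent with the +5s per 24h stated above, and with the 30s key failing completely within roughly a week.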
