We've been battling an issue for a while now (we're not sure exactly how
long it's been going on). Keys appear to be evicted before their
expiration time, yet when we check stats we see "evicted: 0" on all slabs.
To track this down we've added logging (see below) that records activity
for one particular key.
Server       Date         Time      Request  Key                                                      Message
10.16.18.35  2010-Jun-23  1:00:46   15463    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Get failed
10.16.18.35  2010-Jun-23  1:00:47   15463    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Set to expire in 86400 seconds
10.16.18.24  2010-Jun-23  9:58:29   55378    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Get failed
10.16.18.24  2010-Jun-23  9:58:43   55378    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Set to expire in 86400 seconds
10.16.18.19  2010-Jun-23  10:10:54  87555    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Get failed
10.16.18.19  2010-Jun-23  10:11:08  87555    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Set to expire in 86400 seconds
10.16.18.14  2010-Jun-23  10:40:57  55731    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Get failed
10.16.18.14  2010-Jun-23  10:41:15  55731    popularProducts:2010-06-21:collegiate:en:7:image_wall:0  Set to expire in 86400 seconds
This is a small snippet for just one key, but it is representative of what
we're seeing. As you can see, we're setting the key to expire in one day,
yet subsequent GETs fail well before the one-day mark. We thought that
perhaps we were reaching memory limits and hitting LRU, but that doesn't
appear to be the case either: we're using ~18% of allocated memory, and a
"stats items" call shows evicted and evicted_time are 0 for all slabs.
We've tried switching pools around to rule out bad RAM, and we've even run
a pool on a local machine to rule out network-related issues, and we see
the same symptoms.
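
To put numbers on those gaps, here is a rough Python sketch (not part of our
actual PHP logging) that takes the timestamps from the log above and computes
how long each set survived before the next failed get. It assumes the times
are 24-hour, all fall on the same day, and that the clocks on the logging
servers are reasonably in sync; the function name gaps_after_set is just an
illustrative choice.

```python
from datetime import datetime

# (time of day on 2010-Jun-23, event) pairs copied from the log, in order.
EVENTS = [
    ("1:00:46", "Get failed"),
    ("1:00:47", "Set"),
    ("9:58:29", "Get failed"),
    ("9:58:43", "Set"),
    ("10:10:54", "Get failed"),
    ("10:11:08", "Set"),
    ("10:40:57", "Get failed"),
    ("10:41:15", "Set"),
]
TTL_SECONDS = 86400  # the expiry we request on every set


def gaps_after_set(events):
    """Return the seconds between each set and the next failed get."""
    gaps = []
    last_set = None
    for time_of_day, event in events:
        when = datetime.strptime(time_of_day, "%H:%M:%S")
        if event == "Set":
            last_set = when
        elif event == "Get failed" and last_set is not None:
            gaps.append(int((when - last_set).total_seconds()))
            last_set = None
    return gaps


if __name__ == "__main__":
    for gap in gaps_after_set(EVENTS):
        print(f"key missing {gap} s after set "
              f"({gap / TTL_SECONDS:.1%} of the requested TTL)")
```

For this key the gaps come out to 32262, 731 and 1789 seconds, all far short
of the 86400 seconds we asked for.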
Any thoughts on what might be going on here?
As for vitals/system config:
Here's a recent stats dump:
STAT pid 19986
STAT uptime 8526687
STAT time 1277416425
STAT version 1.2.8
STAT pointer_size 64
STAT rusage_user 1510.430396
STAT rusage_system 3328.800037
STAT curr_items 313957
STAT total_items 0
STAT bytes 449831564
STAT curr_connections 1025
STAT total_connections 1
STAT connection_structures 1276
STAT cmd_flush 0
STAT cmd_get 14
STAT cmd_set 0
STAT get_hits 14
STAT get_misses 0
STAT evictions 0
STAT bytes_read 491
STAT bytes_written 18449
STAT limit_maxbytes 2147483648
STAT threads 17
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT delete 16755
STAT replace 6515615
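
As a sanity check on the memory question, a small Python sketch like the one
below can turn a dump like this into a usage figure. It only assumes the
"STAT name value" line format shown above; parse_stats and summarize are
hypothetical helper names, and only a few of the lines are repeated here.

```python
# A few lines copied from the stats dump above.
STATS_DUMP = """\
STAT uptime 8526687
STAT curr_items 313957
STAT bytes 449831564
STAT limit_maxbytes 2147483648
STAT evictions 0
"""


def parse_stats(text):
    """Parse 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def summarize(stats):
    """Derive memory usage, uptime in days, and the eviction count."""
    return {
        "memory_used_fraction": int(stats["bytes"]) / int(stats["limit_maxbytes"]),
        "uptime_days": int(stats["uptime"]) / 86400,
        "evictions": int(stats["evictions"]),
    }


if __name__ == "__main__":
    summary = summarize(parse_stats(STATS_DUMP))
    print(f"memory used: {summary['memory_used_fraction']:.1%}")
    print(f"uptime:      {summary['uptime_days']:.1f} days")
    print(f"evictions:   {summary['evictions']}")
```

For this particular dump that works out to roughly 21% of limit_maxbytes in
use after about 98.7 days of uptime, with zero evictions reported, so the
instance is nowhere near its memory limit.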
We're connecting to memcached through PHP 4.4.4 using the PECL memcache
extension configured thusly:
memcache support => enabled
memcache.allow_failover => 1 => 1
memcache.chunk_size => 8192 => 8192
memcache.default_port => 11211 => 11211
memcache.hash_function => crc32 => crc32
memcache.hash_strategy => consistent => consistent
memcache.max_failover_attempts => 20 => 20