Can you explain what you mean by "expire"? You've gone into great detail about needing it, but I don't see how explicitly expiring items does you any good over letting them be evicted, or over setting saner expiration times.
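
"Setting more sane expiration times" here just means the per-item exptime that every store command already accepts. A minimal sketch over the ASCII protocol, assuming Python 3, a local memcached on port 11211, and a made-up key name:

    import socket

    # Store an item that lapses on its own after 600 seconds, so it never
    # needs an explicit delete. Host, port and key are assumptions.
    s = socket.create_connection(("127.0.0.1", 11211))
    value = b"cached payload"
    # set <key> <flags> <exptime> <bytes>
    s.sendall(b"set session:42 0 600 %d\r\n%s\r\n" % (len(value), value))
    print(s.recv(1024))  # expect b"STORED\r\n"
    s.close()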
There is expire in the protocol already? Just not... mass expire? Automatic (and possibly manual) slab reassignment should happen in the coming months as we get the storage engine interface pushed out and some SEs in use. For now you should use 1.4.0, monitor each slab for higher eviction rates and/or low hit rates, and if it becomes a problem, slow-roll restarts of your memcacheds occasionally. Most folks almost never have to do this.

Are you perhaps talking about something like needing namespace-level expiration?
http://code.google.com/p/memcached/wiki/FAQ#Namespaces

-Dormando

On Mon, 20 Jul 2009, dl4ner wrote:

> Hi there,
>
> Background: we are running memcached on some clusters with about 30-40
> servers and lots of RAM dedicated to the memcache.
>
> Short version:
> 1) I know there is no such thing as an expire in memcached.
> 2) I know memcached is not a database and never will be.
>
> But it would be good if there were an expire, because it would make
> memcached's behaviour more predictable. Let me explain.
>
> As the cache grows, memcached always allocates new 1 MB pages to its
> slabs. These pages are dedicated to that slab forever, until power
> fails or memcached is restarted (at least unless ENABLE_SLABS_REASSIGN
> is used). Over time, memcached's memory layout should be a good
> footprint of the needed sizes: the more objects of size X arrive, the
> more RAM is used for the corresponding slab.
>
> Now here's the problem: if the environment changes and the
> distribution of object counts/sizes changes, the memcache can no
> longer handle all store requests correctly, because some slabs have
> only a few pages and there may be no memory left in the configuration
> to allocate a new 1 MB page to them.
>
> Reassigning slabs is not really usable either when talking about some
> GB of RAM and the possible need to reassign 100+ or 1000+ pages, as
> you can only reassign one page after another, and only when it's
> full...
>
> I'm not speaking of small memcaches; I'm talking about e.g. 2-4 GB of
> RAM for one memcache instance. Let me give an example: 3 GB RAM for
> the memcache. memcached-tool gives this output after about 20 days of
> uptime:
>
>   #  Item_Size    Max_age  1MB_pages   Count  Full?  evicted  outofmem
> [...]
>  13     1.7 kB   721372 s          1      12  no           0         0
>  14     2.1 kB   822425 s          6    2367  no           0         0
>  15     2.6 kB   746706 s         19    7364  yes          0         0
>  16     3.3 kB   750372 s         29    8985  yes          0         0
>  17     4.1 kB   832983 s        313   77622  yes          0         0
>  18     5.2 kB   635906 s       1941  384318  yes          0         0
>  19     6.4 kB   656661 s        663  104753  yes          0         0
>  20     8.1 kB   607873 s         70    8887  yes          0         0
>  21    10.1 kB   654811 s          9     907  yes          0         0
>  22    12.6 kB   833110 s          5     377  no           0         0
>  23    15.8 kB   798629 s          4     254  yes          0         0
>  24    19.7 kB   684625 s          7     356  yes          0         0
>  25    24.6 kB   732865 s          1      23  no           0         0
>  26    30.8 kB   821786 s          1       5  no           0         0
>  27    38.5 kB   602869 s          1       9  no           0         0
>  32   117.5 kB     8119 s          1       1  no           0         0
>
> You see that nearly all RAM is thrown into slabs 17, 18 and 19, while
> lots of slabs have only 1 page (= 1 MB) and some slabs have no page at
> all.
>
> If more items for those rarely used slabs were to arrive now, they
> could not be stored (or not stored for long enough); they would get
> evicted.
>
> Using "-M" to return errors is not an ideal solution either. Changing
> the chunk sizes would only move the problem to another slab/slab
> border, and we see different cache statistics on different clusters,
> so one size fits all does not work :)
>
> To circumvent the evictions, I wrote a memcache-expire-script.
> To achieve this, I use a client-side program that does a
> "stats cachedump <slab> 20000", compares each timestamp, and sends an
> explicit "delete <key>" to the cache for stale items.
>
> As I can only fetch the first 20,000 items (due to the 2 MB limit on
> the cachedump return buffer), I mostly get the active items in the
> dump, so I can only delete "some" of the old items.
>
> The script uses usleeps and runs roughly 40 seconds per slab that
> contains more than 20,000 items, deleting e.g. 12,000 items.
>
> After some runs the cache gets really cleared out, so now I know how
> much RAM it would actually need: only a few percent of the total.
>
> Same cache, still up, but expired:
>
>   #  Item_Size    Max_age  1MB_pages   Count  Full?  evicted  outofmem
> [...]
>  13     1.7 kB   296634 s          1       7  no           0         0
>  14     2.1 kB   603126 s          6    2041  no           0         0
>  15     2.6 kB   603129 s         19    4483  yes          0         0
>  16     3.3 kB   603127 s         29    6195  yes          0         0
>  17     4.1 kB   603129 s        313    7191  yes          0         0
>  18     5.2 kB   602231 s       1941    9129  yes          0         0
>  19     6.4 kB   603124 s        663    2550  yes          0         0
>  20     8.1 kB   515769 s         70     586  yes          5         0
>  21    10.1 kB   304993 s          9     184  yes          2         0
>  22    12.6 kB   515764 s          5      59  yes          0         0
>  23    15.8 kB    66440 s          4      29  yes          0         0
>  24    19.7 kB   327439 s          7      57  yes          0         0
>  25    24.6 kB    63584 s          1       6  no           0         0
>  26    30.8 kB     1564 s          1       2  no           0         0
>  27    38.5 kB   303377 s          1       6  no           0         0
>  32   117.5 kB    37077 s          1       1  no           0         0
>
> The memory footprint is unchanged, but you can imagine the waste of
> memory: e.g. slab 18 now holds fewer than 10,000 objects x 5.2 kB, so
> only some 50 MB of live data inside nearly 2 GB of dedicated pages.
>
> At the same time it shows some evicted items on slabs 20/21. (We also
> have other memcaches where evicted/total is around 2%, which would be
> far too much if you e.g. use memcache as a store for sessions. Again:
> I know memcache is no database and never will be.)
>
> By now I have started some tests that send my expire regularly (every
> 30 min) to a freshly started cache. It does not even grow large: it is
> allowed to use 3 GB of RAM and stays way below 300 MB. That way it has
> lots of spare pages to allocate for rarely used slabs.
>
> But it's quite a waste of CPU cycles to expire such a tremendous cache
> client-side; this should be done server-side. It's just wrong to have
> the CPUs extract the keys, prepare the cachedump, send it over the
> net, parse the cachedump client-side, compare the timestamps (which
> could be done server-side instead of preparing the cachedump), and
> then send thousands of delete requests back to the server, wasting
> bandwidth a second time.
>
> If expiration is not triggered automatically (that seems to be part of
> the philosophy), then it should at least be possible to trigger it
> from the client side. I imagine another memcache command; in the ASCII
> protocol it could look like this:
>
>   expire <slab> <limit>
>
> where <slab> is the slab number and <limit> is an optional cap on the
> number of deletions, e.g. expire at most 1000 entries. (It would
> probably be very fast for memcached to delete some 100,000 entries or
> even millions, as it is only RAM, but I feel safer with a limit.)
>
> The answer could look like this:
>
>   EXPIRED SLAB #19
>   1843 ITEMS DELETED
>   2242 ITEMS ACTIVE
>   END
>
> So, now the questions to the maintainer folks:
> a) Any comments on this?
> b) Does anyone besides me need this?
> c) I could imagine trying to implement it myself (my knowledge of C is
>    a bit dusty, but at least I have been able to read the code so
>    far), but this clearly only makes sense if such a patch would make
>    its way into the sources and stay there.
> d) I don't know, but if implemented, it could be added to the binary
>    protocol as well; that would go beyond my scope, though.
>
> best regards
>
> Werner Maier
>
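
For readers who want to see concretely what the client-side pass described above amounts to, here is a rough reconstruction (not Werner's actual script): dump up to 20,000 keys from one slab with the unofficial "stats cachedump" command, compare the reported timestamp against the current time, and delete stale keys with a little pacing. Host, port, slab number, pacing and the stale test are placeholders, and the timestamp semantics of cachedump differ between memcached versions, so treat this as a sketch only.

    import re
    import socket
    import time

    HOST, PORT, SLAB, LIMIT = "127.0.0.1", 11211, 18, 20000   # placeholders
    ITEM_RE = re.compile(r"^ITEM (\S+) \[(\d+) b; (\d+) s\]")

    def read_until_end(sock):
        # the dump is capped at roughly 2 MB server-side and ends with END
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(65536)
            if not chunk:
                break
            data += chunk
        return data.decode("ascii", "replace")

    s = socket.create_connection((HOST, PORT))
    s.sendall(b"stats cachedump %d %d\r\n" % (SLAB, LIMIT))
    now = time.time()
    stale = []
    for line in read_until_end(s).splitlines():
        m = ITEM_RE.match(line)
        # a timestamp of 0 is treated here as "never expires" and skipped
        if m and int(m.group(3)) and int(m.group(3)) < now:
            stale.append(m.group(1))

    for key in stale:
        s.sendall(b"delete %s\r\n" % key.encode("ascii"))
        s.recv(1024)          # DELETED or NOT_FOUND
        time.sleep(0.002)     # usleep-style pacing, as in the original script

    print("deleted %d stale keys from slab %d" % (len(stale), SLAB))
    s.close()

As the thread points out, this burns CPU and bandwidth on both ends and only ever sees the first chunk of each slab, which is exactly the motivation for wanting it server-side.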

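The namespace trick Dormando links to above needs no server changes at all: keep a version counter per namespace, prefix keys with it, and "mass expire" by bumping the counter so the old keys simply stop being referenced and fall out through the normal LRU. A minimal sketch, assuming the python-memcached client and an invented "userdata" namespace:

    import memcache  # assumes the python-memcached client; any client works

    mc = memcache.Client(["127.0.0.1:11211"])
    NS = "userdata"  # invented namespace name

    def ns_key(key):
        # every real key is prefixed with the namespace's current version
        version = mc.get(NS + ":version")
        if version is None:
            mc.add(NS + ":version", 1)
            version = mc.get(NS + ":version") or 1
        return "%s:%s:%s" % (NS, version, key)

    def flush_namespace():
        # "mass expire": bump the version so old keys are never read again
        # and get pushed out by the LRU over time
        mc.incr(NS + ":version")

    mc.set(ns_key("user:1234"), "some cached blob")
    print(mc.get(ns_key("user:1234")))   # "some cached blob"
    flush_namespace()
    print(mc.get(ns_key("user:1234")))   # None: the old version's keys are orphaned

The cost is one extra get per lookup (the version key), which callers usually amortize by caching the version locally for a short time.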