On 21 July 2009, at 13:47, dl4ner wrote:


Hello Trond,

The biggest problem with expiration is that in order to locate the
item to expire we would have to look at _all_ items in the cache

Yes, I know. That's the reason why
- I suggested an additional command, so that expiry is triggered
  externally and only when needed (most users should not need it anyway)
- I suggested a deletion limit, so that a run can be stopped after
  deleting a given number of expired keys (let's say 2,000). The script
  can simply run more often.

That way it should be reasonably fair and not block a thread for too long.
BTW, that is exactly what I'm doing right now: get a list of 20,000 items
via stats cachedump, and for each item, delete it if it has expired.
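
The client-side approach described above (list items, then delete the expired ones up to a limit) could be sketched roughly as follows. This is only an illustration, not Werner's actual PHP script: the function name `expired_keys` is hypothetical, and the line shape `ITEM <key> [<size> b; <expiry> s]` as well as the reading of the timestamp as an absolute expiry time are assumptions about the `stats cachedump` output that may differ between memcached versions.

```python
import re

# Assumed shape of one `stats cachedump` line (may vary by memcached version):
#   ITEM <key> [<size> b; <expiry> s]
ITEM_RE = re.compile(r"^ITEM (\S+) \[(\d+) b; (\d+) s\]$")


def expired_keys(dump_lines, now, limit):
    """Return up to `limit` keys whose expiry timestamp is not in the future."""
    keys = []
    for line in dump_lines:
        if len(keys) >= limit:
            break  # deletion limit reached; the next run picks up the rest
        match = ITEM_RE.match(line.strip())
        if match is None:
            continue  # skip "END" and anything this sketch does not recognize
        key, expiry = match.group(1), int(match.group(3))
        # Assumption of this sketch: an expiry of 0 means "never expires".
        if 0 < expiry <= now:
            keys.append(key)
    return keys
```

The returned keys would then be passed to whatever delete call the client library offers; keeping the selection separate from the network calls makes the limit easy to check in isolation.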


I actually thought about creating a slabs maintenance thread a while back. It could inspect the items in the different slabs without locking them, and if it finds an expired item it would then lock the cache, re-evaluate, and possibly free the item. This code could also move items between different slab pages and do slab reassignment. The problem is that this isn't free:

1) CPU: it will of course consume CPU to traverse the cache pool and inspect all the items.
2) CPU cache pollution: this sweep over the cache will basically flush the other caches on the CPU.
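
The "inspect without locking, then lock and re-evaluate" idea can be illustrated with a toy model. This is a minimal sketch under loud assumptions: `ToyCache`, `is_expired`, and `sweep` are hypothetical names, a Python dict stands in for the slabs, and nothing here reflects memcached's real data structures or any existing patch.

```python
import threading


class ToyCache:
    """Toy stand-in for the item store; not memcached's real structures."""

    def __init__(self):
        self.lock = threading.Lock()
        self.items = {}  # key -> (value, expiry timestamp; 0 = never expires)

    def set(self, key, value, expiry=0):
        with self.lock:
            self.items[key] = (value, expiry)


def is_expired(entry, now):
    """True if the entry exists and its expiry time has passed."""
    return entry is not None and 0 < entry[1] <= now


def sweep(cache, now):
    """Free expired items: cheap unlocked look first, locked re-check after."""
    freed = 0
    for key in list(cache.items):  # snapshot of the keys, taken without the lock
        if not is_expired(cache.items.get(key), now):
            continue  # unlocked check; the result may be stale
        with cache.lock:
            # Re-evaluate under the lock: the item may have been replaced
            # or removed since the unlocked look.
            if is_expired(cache.items.get(key), now):
                del cache.items[key]
                freed += 1
    return freed
```

The point of the two-step check is that the lock is only taken for items that already look expired, so the sweep holds it briefly and rarely. The CPU and cache-pollution costs listed above still apply, since every item is touched.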

Guess I could write a patch to do this to see how it works... What else should I use my vacation for ;-)

Cheers,

Trond


(we don't link the items into an expiry list).

I understand, and this would not be a problem for me. I also see
that my proposal is only a 90%-solution, but with only 10% of work
needed compared to a perfect solution.
But as 99% of all users would not need that feature anyway,
there should be no problem. But I don't want to patch the source
code with every new version of memcached coming out :)

Currently we work on the assumption that the future memory
allocation pattern will match the current pattern, so that your item
sizes wouldn't change that much (and hence the slab distribution
wouldn't be that far off...)

That is exactly my problem, and we discovered it only after using
memcached for about 18 months...
OK, a workaround would be to restart memcached regularly so that
it can adapt to the new pattern, but this would also mean losing
sessions, which I'd like to avoid :)

That being said, freeing the item from the slab when it expires would
not solve your problems completely, it would just postpone the
problem...

Yes, completely agreed. But if I have 4 GB of RAM and would normally
need only about 300 MB, I guess there would easily be enough
additional pages that CAN be allocated if needed.
Then we can easily check in our network management whether the
total allocated memory of the memcached process is near the limit
(for example, over 3.5 GB) and step in before it is too late.
I guess it would take some months instead of some days until the
allocated size (pages assigned to slabs) gets that big.
I don't have a problem assigning 4 GB of RAM even when 500 MB would do,
if I can avoid the problem this way. 4 GB of RAM is a lot cheaper than
having to invent completely new ways inside our application :)
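
The monitoring check mentioned above comes down to a threshold comparison. A minimal sketch, assuming the monitoring system already supplies the two byte counts; the function name `memory_alert` and the parameter `warn_fraction` are hypothetical, and 0.875 is simply 3.5 GB out of 4 GB:

```python
GIB = 1024 ** 3


def memory_alert(allocated_bytes, limit_bytes, warn_fraction=0.875):
    """Return True once allocated memory reaches the warning threshold."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return allocated_bytes >= warn_fraction * limit_bytes
```

With a 4 GB limit this fires at 3.5 GB, which leaves time to step in before the remaining pages are used up.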

You don't know when items expire so a slab page could
contain items with multiple expiration times, so moving the page to
another slabclass wouldn't be that easy to do (well, you could move
items between slab pages etc, but this is not a small patch ;-) )

Yes. With all that in mind, I guess a simple, hardcoded, non-automatic
expire (triggered externally) that goes through the linked item list
and deletes every expired object
+ would be a relatively simple patch
+ would not affect the normal default behaviour at all
+ would not be noticed by any other user at all
+ could help to solve the problem for some users (=> at least for me)

even if it had some drawbacks (ugly, relatively slow, maybe a short
block during the expire run), but those drawbacks would only be
visible to those who need the feature. I for my part would not mind
them at all.

My client-side expire has been running on one cluster since last Friday.
It is currently started every 30 minutes and helps a lot: the cache size
dropped from about 3 GB to 130 MB, so there is still enough room.
Runtime is about 2 minutes for expiring two servers, doing a usleep(5)
after every 10 deletes and between slabs.
Another cluster has two servers with 1 GB of RAM each and is now using
about 30-70 MB of RAM (according to the memcached statistics) with about
350 pages allocated.
So it seems to me that it does work (as long as you have enough memory
left to be allocated when needed).
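
The throttling described above (a short pause after every 10 deletes) might look like this in outline. It is a sketch, not the actual PHP script: `throttled_delete` is a hypothetical name, the `delete` callable stands in for the client library's delete call, and the default pause of 5 microseconds assumes that `usleep(5)` means PHP's microsecond-based `usleep`.

```python
import time


def throttled_delete(keys, delete, batch_size=10, pause_seconds=5e-6,
                     sleep=time.sleep):
    """Call delete(key) for each key, pausing after every batch_size deletes."""
    deleted = 0
    for key in keys:
        delete(key)
        deleted += 1
        if deleted % batch_size == 0:
            sleep(pause_seconds)  # give other clients a chance between batches
    return deleted
```

Passing `sleep` in as a parameter keeps the loop testable without real delays; in production the default `time.sleep` is used.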

The performance impact of my externally triggered expire (PHP script)
is not really noticeable on the memcached server (load). I guess an
internal C function would be a lot faster but could impact the server
more, so it should have the deletion limit (to be able to bound that
impact).
On the other hand, going through a list of some 100,000 items should
be quite fast, as everything is in RAM.
Even if it blocked memcached for, let's say, 200 ms...
who would notice if a delivered web page is 200 ms late twice an
hour...?
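
To put the 200 ms figure in proportion: two such pauses per hour add up to 0.4 s out of 3600 s, or about 0.011% of the time. The hypothetical helper below just spells out that arithmetic; the 200 ms value is the guess from the text, not a measurement.

```python
def blocked_fraction(pause_ms, runs_per_hour):
    """Fraction of each hour during which the server would be blocked."""
    return (pause_ms / 1000.0) * runs_per_hour / 3600.0
```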

regards,

Werner.

--
Trond Norbye

Web Scale Infrastructure                 E-mail: [email protected]
SUN Microsystems                         Phone:  +47 73842100
Haakon VII's gt. 7B                      Fax:    +47 73842101
7485 Trondheim, Norway
