Issue 202 in memcached: TOP_KEYS feature fixes

memcached Fri, 15 Apr 2011 00:23:46 -0700

Status: Accepted
Owner: [email protected]
Labels: Type-Defect Priority-High


New issue 202 by [email protected]: TOP_KEYS feature fixes
http://code.google.com/p/memcached/issues/detail?id=202

I did some real basic bench testing against the TOP_KEYS feature while ...working on the benchmark stuff.


The basic results:

Normal:   400,000 gets/sec
TOP_KEYS: 200,000 gets/sec

8 threads were used. Increasing the thread count didn't increase performance. It appears to be stacking up on locks and spending a lot of time in sprintf.


Arguments I've heard about this being okay:

- memcached is so fast that it doesn't matter if you cut the capacity in half

- users desperately need this so it's worth cutting capacity in half

Arguments I have against it:

- there will be things we do in the future that will slow it down, and ideally we won't be putting ourselves in a position where you enable all of these features at once and end up halving the capacity several times. Performance reduction via in-line features should be strictly limited to features that cannot be implemented outside of the daemon (i.e., via tcpdump and a script/program).

That said, I'm all for shipping a script or C app with memcached for doing "topkeys-like" quick analysis.


- I dunno, that's basically it.

Approaches for fixing the issue:

- I'd prefer to tear it out and develop it in a branch, then add it back in during 1.6.1 if it's fixable, or relegate it to a module or engine extension and leave it out.

- The feature samples all keys, and is enabled at start time via an environment variable. The only user-friendly bit about the whole thing is the information it eventually gives you. Math and usability rules don't back up why this feature is the way it is; it should be a sampling set (defaulting to 0, changeable at runtime). One sample every 1,000+ commands on a busy server is probably 10x as frequent as it needs to be to find the "top keys".

- I *hate* this thing. This is maybe 1/3rd of the useful information you can get out of a key stream, and it's not in any form that could be extensible. If we distribute another fast C-based app (possibly start from perl or whatever), you can find "top keys" via snapshots (the data only matters when you're looking at it), and you can discern patterns from top related keys. The tools I use to track down keys are often customized to look for common sections of keys to see if particular features are going off-kilter.
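
For the sampling idea in the list above, a rough sketch of what "defaulting to 0, changeable at runtime" could look like. Every name here (sample_interval, key_sampler, key_sampler_tick, key_sampler_set_interval) is made up for illustration and is not memcached's actual API. It assumes each worker thread owns its own counter, so the hot path takes no lock and never touches sprintf.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not memcached's real API. */

/* 0 disables sampling; N means record one command in every N per thread. */
static _Atomic uint32_t sample_interval = 0;

/* Per-thread state: each worker owns one, so no lock is needed to update it. */
typedef struct {
    uint32_t commands_seen;  /* commands since this thread's last sample */
} key_sampler;

/* Change the rate at runtime, e.g. from an admin command. */
void key_sampler_set_interval(uint32_t n) {
    atomic_store_explicit(&sample_interval, n, memory_order_relaxed);
}

/* Call once per command. Returns true when this command's key should be
 * recorded. The common case costs one relaxed load and one increment. */
bool key_sampler_tick(key_sampler *s) {
    assert(s != NULL);
    uint32_t n = atomic_load_explicit(&sample_interval, memory_order_relaxed);
    if (n == 0) {
        s->commands_seen = 0;  /* disabled: keep the counter from going stale */
        return false;
    }
    if (++s->commands_seen < n) {
        return false;
    }
    s->commands_seen = 0;
    return true;
}
```

With the interval at 1000, a thread that handles 5,000 commands records 5 keys, and only those 5 would pay for any formatting or shared-structure update.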

So in short, either way you need a key-stream analysis method to actually get useful information out of a running instance. Providing anything else without a method of getting the full picture is just flatly misleading.
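
As one illustration of the "common sections of keys" kind of analysis, here is a minimal sketch of an out-of-daemon counter that groups keys by prefix. It assumes the keys have already been pulled out of a capture (that step is not shown) and that keys use a delimiter such as ':'. The type and function names and the fixed table sizes are hypothetical, and the linear search is only there to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PREFIXES   1024  /* arbitrary cap for this sketch */
#define MAX_PREFIX_LEN 64    /* longer prefixes are truncated */

typedef struct {
    char prefix[MAX_PREFIX_LEN];
    unsigned long count;
} prefix_count;

typedef struct {
    prefix_count entries[MAX_PREFIXES];
    size_t used;
} prefix_table;

/* Count one key under its prefix: everything before the first delimiter,
 * or the whole key when the delimiter is absent.
 * Returns 0 on success, -1 when the table is full. */
int prefix_table_add(prefix_table *t, const char *key, char delim) {
    assert(t != NULL && key != NULL);
    const char *end = strchr(key, delim);
    size_t len = end ? (size_t)(end - key) : strlen(key);
    if (len >= MAX_PREFIX_LEN) {
        len = MAX_PREFIX_LEN - 1;
    }
    for (size_t i = 0; i < t->used; i++) {
        if (strlen(t->entries[i].prefix) == len &&
            memcmp(t->entries[i].prefix, key, len) == 0) {
            t->entries[i].count++;
            return 0;
        }
    }
    if (t->used == MAX_PREFIXES) {
        return -1;
    }
    memcpy(t->entries[t->used].prefix, key, len);
    t->entries[t->used].prefix[len] = '\0';
    t->entries[t->used].count = 1;
    t->used++;
    return 0;
}

/* Return the busiest prefix, or NULL when the table is empty. */
const prefix_count *prefix_table_top(const prefix_table *t) {
    assert(t != NULL);
    const prefix_count *best = NULL;
    for (size_t i = 0; i < t->used; i++) {
        if (best == NULL || t->entries[i].count > best->count) {
            best = &t->entries[i];
        }
    }
    return best;
}
```

Feeding it a snapshot of keys and then asking for the top prefix gives a quick read on which feature is generating the traffic, without touching the daemon's request path.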
