On 11/2/07, Dustin Sallings <[EMAIL PROTECTED]> wrote:
> I'll admit I don't know a lot of PHP, but I'd imagine a function that
> looked something like this (I typed this python in my mail client, so
> I don't know that it actually works):
>
> def get_cached(keys, cache_miss_func, timeout=300):
>     found=memcache.get(keys)
>     missing=[k for k in keys if k not in found]
>     if missing:
>         found_in_db=cache_miss_func(missing)
>         for k,v in found_in_db.iteritems():
>             memcache.set(k, v, timeout)
>         found.update(found_in_db)
>     return found
This is pretty much what I want to do. What makes it complex, though,
is the key prefixes I prepend. How do you handle it if some of the
keys are user IDs and others are some other kind of ID? It would work
if there were namespaces or prefix-aware functions (see below).
Otherwise it looks great on paper and in pseudocode, but I think it
misses one big detail, which I have run into while actually trying to
work out a solution right now.
> 1) One multi-GET call.
1b) if(count($returned_from_cache) == count($requested)) { return }
> 2) One SQL query for the misses.
> 3) Two memcached sets for the missing records.
It's funny you wrote this email when you did: I was coming to my
computer to scribble down some pseudocode and notes about this very
subject. That is the ideal three-step setup I am aiming for, with one
minor change, the added step 1b. Thanks to Brian, I believe, for
pointing out the obvious: there is no need for additional processing
if the cache had every item.
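
For what it's worth, here is how I picture steps 1, 1b, 2 and 3
together, as a rough Python sketch in the same spirit as the quoted
function. DictCache and load_users are hypothetical stand-ins (an
in-memory dict instead of a real memcached client, a fake loader
instead of the SQL query), and the count comparison in 1b assumes the
requested keys contain no duplicates.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self.store = {}
        self.sets = 0

    def get_multi(self, keys):
        # Return only the keys that are present, like a multi-get.
        return {k: self.store[k] for k in keys if k in self.store}

    def set(self, key, value, timeout=0):
        self.store[key] = value  # timeout is ignored by this stand-in
        self.sets += 1


def get_cached(cache, keys, cache_miss_func, timeout=300):
    found = cache.get_multi(keys)            # 1) one multi-get
    if len(found) == len(keys):              # 1b) cache had every item
        return found
    missing = [k for k in keys if k not in found]
    from_db = cache_miss_func(missing)       # 2) one query for the misses
    for k, v in from_db.items():             # 3) one set per missing record
        cache.set(k, v, timeout)
    found.update(from_db)
    return found


def load_users(ids):
    # Hypothetical loader standing in for the single SQL query.
    return {i: "user-%d" % i for i in ids}


cache = DictCache()
cache.set(1, "user-1")
cache.set(2, "user-2")
result = get_cached(cache, [1, 2, 3, 4], load_users)
```

With two of four records already cached, this does one multi-get, one
loader call and two sets; a repeat call returns at step 1b without
touching the loader at all.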
Anyway, I was thinking about this a few minutes ago in the shower...
the place where all great (or crazy?) ideas come from.
I think what I am looking for could actually be accomplished by a
couple minor tweaks to the memcache client itself.
*** PSEUDO CODE ALERT ***
#1) First, add a "key prefix" parameter. This string (or whatever) is
prepended to the requested keys before they are sent to memcached and
stripped from the keys on the way back. That leaves no need for
interpreted PHP code to assemble and disassemble the key names, which
I believe is only workable by rebuilding a new array item by item.
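
To make the prefix idea concrete, here is a minimal sketch of the
behavior I mean, written in Python only because the quoted example
was. It is not an existing client API: DictCache and
get_multi_prefixed are hypothetical names, and the cache is just a
dict. The point is that the caller passes bare keys and gets bare keys
back.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self, initial=None):
        self.store = dict(initial or {})

    def get_multi(self, keys):
        # Return only the keys that are present, like a multi-get.
        return {k: self.store[k] for k in keys if k in self.store}


def get_multi_prefixed(cache, keys, prefix):
    # Map each prefixed ("wire") key back to the caller's original key.
    wire = {prefix + str(k): k for k in keys}
    found = cache.get_multi(list(wire))
    # Strip the prefix by translating back to the original keys.
    return {wire[w]: v for w, v in found.items()}


cache = DictCache({
    "user:1": "alice",
    "user:2": "bob",
    "photo:1": "cat.jpg",
})
users = get_multi_prefixed(cache, [1, 2, 3], "user:")
photos = get_multi_prefixed(cache, [1, 2, 3], "photo:")
```

The same ID 1 resolves to different records depending on the prefix,
which is exactly the user-IDs-versus-other-IDs case from above.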
#2 and #3 are different ways of accomplishing the same thing. My
favorite (I think) is #2.
#2) Return two arrays, hits and misses (with $prefix stripped, as in
the idea above):
list($hits, $misses) = memcache_get($keys, $prefix) ...
This will allow you to easily do a SELECT * FROM foo WHERE ID
IN(implode(',', $misses)) for numeric keys, or for string keys (or
whatever you want quoted)
SELECT * FROM foo WHERE ID IN("'".implode("','", $misses)."'")
(You could also use array_walk() with a callback that quickly checks
whether a key needs mysql_escape_string: test
strpos($string, "'") !== false and escape only then. Again, this is
only for string keys.)
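
As an alternative to quoting and escaping the keys by hand, the IN
list can be built from placeholders so the driver handles the quoting.
This is a swapped-in technique, not what the PHP above does. The
sketch assumes Python's sqlite3 module and a made-up two-column foo
table purely so it runs on its own; fetch_by_ids is a hypothetical
helper name.

```python
import sqlite3


def fetch_by_ids(conn, ids):
    """Fetch all rows whose id is in ids, using one query."""
    if not ids:
        return {}  # an empty IN () list is not valid SQL
    placeholders = ",".join("?" for _ in ids)
    sql = "SELECT id, name FROM foo WHERE id IN (%s)" % placeholders
    # The driver binds each value, so quotes in keys need no escaping.
    return {row[0]: row[1] for row in conn.execute(sql, list(ids))}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [("a", "Alice"), ("o'b", "O'Brien"), ("c", "Carol")],
)
rows = fetch_by_ids(conn, ["a", "o'b", "zzz"])
```

Only the placeholders are interpolated into the SQL string, never the
key values themselves.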
Then just merge the two arrays: $hits + $dbhits (the + operator keeps
numeric keys intact, where array_merge() would renumber them).
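
Here is roughly what #2 would look like from the caller's side, again
as a Python sketch rather than real client code. DictCache and
get_hits_and_misses are hypothetical, and the "database" is faked with
a dict comprehension.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self, initial=None):
        self.store = dict(initial or {})

    def get_multi(self, keys):
        # Return only the keys that are present, like a multi-get.
        return {k: self.store[k] for k in keys if k in self.store}


def get_hits_and_misses(cache, keys, prefix=""):
    # Prefix on the way in, translate back to bare keys on the way out.
    wire = {prefix + str(k): k for k in keys}
    found = cache.get_multi(list(wire))
    hits = {wire[w]: v for w, v in found.items()}
    # Misses keep the order in which the keys were requested.
    misses = [k for k in keys if k not in hits]
    return hits, misses


cache = DictCache({"user:1": "alice", "user:3": "carol"})
hits, misses = get_hits_and_misses(cache, [1, 2, 3, 4], "user:")

# Stand-in for the single SQL query over the misses.
db_hits = {k: "from-db-%d" % k for k in misses}

# Merge cache hits and database rows, keeping the original keys.
merged = dict(hits)
merged.update(db_hits)
```

The misses list is already in the shape the IN clause needs, so no
extra pass over the results is required in the calling code.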
#3) Or add a parameter to the get function specifying what to fill in
on a cache miss. It could be anything. I say a parameter since that
lets the end user decide: a generic "false" might actually be a
legitimate cached value, so we can't just blindly use that.
I don't like this as much because it requires one more array iteration
on the interpreted level:
$needed = array();
foreach($hits as $k => $v) {
    // $miss_value is the fill-in value that was passed to the get call
    if($v === $miss_value) { $needed[] = $k; }
}
... do the db call for $needed here, combine the two arrays again ...
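
The reason the fill-in has to be chosen by the caller is easiest to
see in a small sketch. This is Python with hypothetical names
(DictCache, get_with_fill, MISS), not an existing client feature: a
unique sentinel object can never collide with a cached value, while a
plain false can.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self, initial=None):
        self.store = dict(initial or {})

    def get_multi(self, keys):
        # Return only the keys that are present, like a multi-get.
        return {k: self.store[k] for k in keys if k in self.store}


# A unique object that no cached value can ever be identical to.
MISS = object()


def get_with_fill(cache, keys, fill):
    """Return a value for every key, using fill where the cache missed."""
    found = cache.get_multi(keys)
    return {k: found.get(k, fill) for k in keys}


# "a" is a legitimate cached False; "c" is not cached at all.
cache = DictCache({"a": False, "b": 0})

values = get_with_fill(cache, ["a", "b", "c"], MISS)
needed = [k for k, v in values.items() if v is MISS]

# Filling with False instead makes the cached False look like a miss.
ambiguous = get_with_fill(cache, ["a", "b", "c"], False)
wrongly_needed = [k for k, v in ambiguous.items() if v is False]
```

This still costs the extra pass over the results that I mentioned, but
it shows why the fill value cannot be hard-coded.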
I'm thinking that moving some of this logic into the module tier would
speed things up quite a bit. I could be over-engineering this, though.
Does anyone have any feedback? If it sounds like a stupid idea, I
won't even bother trying to hack the module source. Otherwise I might
try, but I probably would not write the most efficient code. If it
sounds like a sane idea, I'd be willing to pay someone who could
produce something properly efficient and reusable and push it to the
actual PECL module itself...
Thanks!