On Oct 31, 2007, at 5:40 PM, mike wrote:

Yes, but with key prefixes not matching the database keys, I have to
change the key indexes in PHP back and forth to suit the needs. Is
there a better way? Not to mention I have to assign the key name to
be the whole key name, or the array_diff_key/related functions
can't do a diff properly.

This seems horribly redundant:
$cache_keys[$prefix.$key] = $prefix.$key;

the memcache_get wants the parameters in the array VALUES, where
array_diff_key() wants array KEY and memcache_get returns the key name
in the KEY as well. Aligning them requires reiteration - a total of 3
iterations in my current code. I am quite sure I can cut it down to 2
somehow. Possibly also do more optimal looping...


I'll admit I don't know a lot of PHP, but I'd imagine a function that looks something like this (I typed this Python in my mail client, so I don't know that it actually works):

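def get_cached(keys, cache_miss_func, timeout=300):
        # One multi-GET; the result is a dict holding only the keys that hit.
        # (get_multi is the python-memcached spelling; adjust for your client.)
        found=memcache.get_multi(keys)
        missing=[k for k in keys if k not in found]
        if missing:
                # One call to the DB covering every miss.
                found_in_db=cache_miss_func(missing)
                for k,v in found_in_db.items():
                        memcache.set(k, v, timeout)
                found.update(found_in_db)
        return found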

Using something like the above, you can just pass in the function that gets the data from the DB itself. For example (even more pseudo pseudo-code):

def get_from_db(keys):
        # Parenthesized so the expression can continue onto the next line.
        query=("select * from something where id in (" +
                ', '.join(['?' for x in keys]) + ")")
        # Assuming a DB cursor is coming from somewhere.
        cursor.execute(query, keys)
        cached_objects=[make_object(row) for row in cursor.fetchall()]
        return dict([(o.id, o) for o in cached_objects])
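If you want something you can actually run, here is a minimal sketch of the same cache-miss function against sqlite3. The table name "something" comes from the query above; the "name" column, the sample rows, and the make_get_from_db helper are made up for illustration, and a different DB driver may want a different placeholder style than '?'.

```python
import sqlite3

def make_get_from_db(conn):
    """Build a cache-miss function bound to an open sqlite3 connection."""
    def get_from_db(keys):
        if not keys:
            # "in ()" is not valid SQL, so short-circuit on an empty list.
            return {}
        # One '?' placeholder per key, so the driver does the quoting.
        placeholders = ", ".join("?" for _ in keys)
        query = "select id, name from something where id in (%s)" % placeholders
        rows = conn.execute(query, list(keys)).fetchall()
        # Map id -> value; ids that are not in the table are simply absent.
        return {row[0]: row[1] for row in rows}
    return get_from_db

# Hypothetical sample data, only here so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("create table something (id integer primary key, name text)")
conn.executemany("insert into something values (?, ?)",
                 [(1, "one"), (2, "two"), (4, "four")])
get_from_db = make_get_from_db(conn)
```

The important property is that the function takes a list of keys and returns a dict keyed the same way the cache is, so the caller never has to re-map anything.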


With something like that, you could have an efficient caching interface that you can use like this:

        objs=get_cached([1, 2, 3, 4, 5], get_from_db)

On a given call, say 1, 3, and 5 are cached. Those are returned from the cache, then get_from_db([2, 4]) is called, its results are populated into the cache, and all five results are returned. In the worst case of this scenario (both of the misses exist in the DB), that'd be:

        1) One multi-GET call.
        2) One SQL query for the misses.
        3) Two memcached sets for the missing records.
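To check that count without a memcached server, here is a rough sketch that swaps in a dict-backed stand-in for the client. FakeCache, its call log, and the "row-N" values are all invented for this example, and get_cached takes the cache as an explicit argument here only so the stand-in can be passed in; I'm assuming the real client exposes a multi-get that returns a dict of hits, called get_multi below.

```python
class FakeCache:
    """Dict-backed stand-in for a memcached client (hypothetical)."""
    def __init__(self):
        self.data = {}
        self.calls = []  # log of (operation, argument) for inspection

    def get_multi(self, keys):
        self.calls.append(("get_multi", list(keys)))
        # Like a real multi-get: only keys that hit appear in the result.
        return {k: self.data[k] for k in keys if k in self.data}

    def set(self, key, value, timeout=0):
        self.calls.append(("set", key))
        self.data[key] = value


def get_cached(cache, keys, cache_miss_func, timeout=300):
    found = cache.get_multi(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        found_in_db = cache_miss_func(missing)
        for k, v in found_in_db.items():
            cache.set(k, v, timeout)
        found.update(found_in_db)
    return found


db_calls = []

def get_from_db(keys):
    # Stand-in for the SQL query; records which keys were asked for.
    db_calls.append(list(keys))
    return {k: "row-%d" % k for k in keys}


cache = FakeCache()
cache.data.update({1: "row-1", 3: "row-3", 5: "row-5"})  # 1, 3, 5 are cached
objs = get_cached(cache, [1, 2, 3, 4, 5], get_from_db)
```

After the call, cache.calls holds one get_multi followed by two sets, and db_calls holds the single list [2, 4]. A second call with the same keys hits the cache for all five and never touches the DB.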

--
Dustin Sallings


