On Thu, Apr 16, 2009 at 6:36 AM, Henrik Schröder <[email protected]> wrote:
> Whatever is in $sql will always be unique with no risk of collisions. But if
> you hash $sql and use that as a key, you risk getting collisions, which
> means that when you look up the cached value for one query, you will get the
> results of some other query. This is potentially fatal for your application.
>

The chances of an md5 collision are pretty low, and even lower if you go
with a sha1 hash instead of md5.

> Memcached keys have limitations, they can't contain space, and they can't be
> longer than 250 characters, so if you want to store the results of queries
> and key by query, you have to do some transformation of the key, but using
> md5 on everything is just lazy and not well-thought out. A smarter
> transformation would be to replace all " " with "-", and for keys that are
> still longer than 250 characters, change it to the first 240 chars + md5 of
> the remaining chars. You still have a small possibility of collisions, but
> you avoid a lot of unnecessary hashing, and since you probably have very few
> queries that are that long, you've reduced the collisions possibility. Oh,
> and your keys are still readable when you debug which makes it a lot easier
> to see what exactly gets stored and fetched.
>

I disagree on the query length here. If you're using queries as keys, the
chances of them being longer than 250 characters are pretty good. Using
hashes definitely makes keys harder to debug, though. Logging the query
alongside the resulting hash can help with that a bit.
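
To make the two approaches concrete, here is a rough Python sketch of both. It is only an illustration, not code from anyone in this thread: the function names, the logger name, and the constants are made up, and it assumes plain ASCII queries with no control characters other than whitespace. One detail differs from the suggestion quoted above: a 32-character md5 hex digest appended to the first 240 characters comes to 272 characters, which is over the limit, so the sketch keeps only the first 218.

```python
import hashlib
import logging
import re

log = logging.getLogger("cache_keys")  # hypothetical logger name

MAX_KEY_LEN = 250   # memcached key length limit
MD5_HEX_LEN = 32    # length of an md5 hex digest


def hashed_key(sql):
    """Key a query by the sha1 hex digest of its full text."""
    key = hashlib.sha1(sql.encode("utf-8")).hexdigest()
    # Log the mapping so a hash seen while debugging can be traced back.
    log.debug("cache key %s <- %s", key, sql)
    return key


def readable_key(sql):
    """Keep the key readable and hash only the overflow of long queries."""
    # Whitespace is not allowed in keys, so map each whitespace char to "-".
    key = re.sub(r"\s", "-", sql)
    if len(key) <= MAX_KEY_LEN:
        return key
    # Leave room for the digest so the result is exactly MAX_KEY_LEN long.
    head_len = MAX_KEY_LEN - MD5_HEX_LEN
    head, tail = key[:head_len], key[head_len:]
    return head + hashlib.md5(tail.encode("utf-8")).hexdigest()
```

With these, readable_key("SELECT * FROM users WHERE id = 1") gives "SELECT-*-FROM-users-WHERE-id-=-1", while hashed_key() always gives a 40-character hex string. Note that the readable form can collide on its own, since "a b" and "a-b" map to the same key.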
