I would think MD5/SHA1 would be perfectly fine. Brian, are we worried about someone generating enough random variations of content to force a collision and get someone else's content from the cache? A brute-force attack would require generating so many requests to the server and cache that it seems infeasible. This is a closed system, so the hashes themselves are never exposed publicly. Am I missing something? It seems like we only care about the functional requirement: a hash with a very low probability of collision, plus a collision detection mechanism.
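
A rough sketch of that functional requirement, for illustration only: key the cache on a SHA-1 digest of the content, and keep the original input next to the cached value so a colliding key is detected instead of returning someone else's entry. The class name DigestKeyedCache and its methods are hypothetical, not an existing Shindig API, and the in-memory HashMap stands in for whatever cache is really used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache keyed by a content digest; not an existing Shindig class.
final class DigestKeyedCache {
  // Keeps the original input next to the cached output so collisions can be detected.
  private static final class Entry {
    final byte[] input;
    final String output;

    Entry(byte[] input, String output) {
      this.input = input;
      this.output = output;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  // Returns the SHA-1 digest of the content as a lowercase hex string.
  static String sha1Hex(byte[] content) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-1");
      StringBuilder hex = new StringBuilder();
      for (byte b : md.digest(content)) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  void put(String content, String rewritten) {
    byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
    entries.put(sha1Hex(bytes), new Entry(bytes, rewritten));
  }

  // Returns the cached value, or null on a miss or on a detected collision.
  String get(String content) {
    byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
    Entry entry = entries.get(sha1Hex(bytes));
    if (entry == null || !Arrays.equals(entry.input, bytes)) {
      return null;
    }
    return entry.output;
  }
}
```

The obvious cost is that the original content is stored alongside each entry, so this trades memory for certainty; swapping "SHA-1" for "SHA-256" in getInstance is a one-line change.
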
On Tue, Sep 9, 2008 at 4:38 PM, Brian Eaton <[EMAIL PROTECTED]> wrote:

> On Tue, Sep 9, 2008 at 4:25 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
> > I briefly considered String hashCode, but quickly recognized that was a bad
> > idea. MD5 of contents sounds reasonable. Brian, thoughts?
>
> I suspect using the entire input body contents is out of the question,
> though that was my initial thought.
>
> Don't use MD5. Nobody knows how to attack it for this kind of
> application, yet, but a lot of progress has been made. SHA1 is
> probably OK. SHA-256 would be great, HMAC-SHA1 would be great, except
> then you have to worry about keying, which is a pain. This cache is
> potentially shared across multiple servers, right?
>
> If it's a single server cache, HMAC-SHA1 with a random key.
>
> The cache key generated by the HTTP content fetchers might be useful
> for this as well, assuming you can get ahold of it somehow.
>
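
For the single-server case Brian describes, HMAC-SHA1 with a random key might look roughly like the sketch below. It assumes the key is generated once per process and never shared, which is exactly why it does not work for a cache shared across servers: each server would compute different keys for the same content. HmacCacheKeys and keyFor are made-up names for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper that derives cache keys with HMAC-SHA1 under a per-process random key.
final class HmacCacheKeys {
  private final SecretKeySpec key;

  HmacCacheKeys() {
    // 20 random bytes, matching the SHA-1 output size; lost when the process exits.
    byte[] raw = new byte[20];
    new SecureRandom().nextBytes(raw);
    this.key = new SecretKeySpec(raw, "HmacSHA1");
  }

  // Returns the HMAC-SHA1 tag of the content as a lowercase hex string.
  String keyFor(String content) {
    try {
      Mac mac = Mac.getInstance("HmacSHA1");
      mac.init(key);
      byte[] tag = mac.doFinal(content.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : tag) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (GeneralSecurityException e) {
      // HmacSHA1 is a standard algorithm, so this should not happen.
      throw new IllegalStateException(e);
    }
  }
}
```

Because an outside party never learns the random key, they cannot compute cache keys offline to search for colliding content. The keying pain Brian mentions shows up as soon as the key has to survive restarts or be distributed to other servers.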

