Re: Where to cache rewritten content

Louis Ryan Tue, 09 Sep 2008 16:59:59 -0700

I would think MD5/SHA1 would be perfectly fine.

Brian, are we worried about someone generating enough random variations of
content to force collisions and get someone else's content from the cache? A
brute-force attack would require generating so many requests to the server
and cache that it seems infeasible. This is a closed system, so the hashes
themselves are never exposed publicly. Am I missing something? It seems like
we would only care about the functional requirement: a hash with a very low
probability of collision, plus a collision detection mechanism.
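
To make "low collision probability plus collision detection" concrete, here
is a rough sketch, not actual Shindig code. The class name
RewrittenContentCache and its methods are made up for illustration, and it
assumes an in-memory map rather than whatever cache we end up using. The key
is the SHA-1 of the original content, and the original is stored next to the
rewritten output so a lookup can confirm the hit really belongs to the same
input.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache of rewritten content keyed by SHA-1 of the input.
final class RewrittenContentCache {
  // Keeps the original next to the rewritten output for collision detection.
  private static final class Entry {
    final String original;
    final String rewritten;

    Entry(String original, String rewritten) {
      this.original = original;
      this.rewritten = rewritten;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  // Hex-encoded SHA-1 digest of the UTF-8 bytes of the content.
  public static String sha1Hex(String content) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-1");
      byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
      StringBuilder sb = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is a required algorithm on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  public void put(String original, String rewritten) {
    entries.put(sha1Hex(original), new Entry(original, rewritten));
  }

  // Returns the rewritten content, or null on a miss or a detected collision.
  public String get(String original) {
    Entry e = entries.get(sha1Hex(original));
    if (e == null || !e.original.equals(original)) {
      return null;
    }
    return e.rewritten;
  }
}
```

The obvious cost is that each entry holds the full original content, so this
only makes sense if that memory overhead is acceptable.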


On Tue, Sep 9, 2008 at 4:38 PM, Brian Eaton <[EMAIL PROTECTED]> wrote:

> On Tue, Sep 9, 2008 at 4:25 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
> > I briefly considered String hashCode, but quickly recognized that was a
> > bad idea. MD5 of contents sounds reasonable. Brian, thoughts?
>
> I suspect using the entire input body contents is out of the question,
> though that was my initial thought.
>
> Don't use MD5.  Nobody knows how to attack it for this kind of
> application, yet, but a lot of progress has been made.  SHA1 is
> probably OK.  SHA-256 would be great, HMAC-SHA1 would be great, except
> then you have to worry about keying, which is a pain.  This cache is
> potentially shared across multiple servers, right?
>
> If it's a single server cache, HMAC-SHA1 with a random key.
>
> The cache key generated by the HTTP content fetchers might be useful
> for this as well, assuming you can get ahold of it somehow.
>
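
For reference, the single-server option Brian describes (HMAC-SHA1 with a
random key) might look roughly like the sketch below. KeyedCacheKeys and
keyFor are hypothetical names, and the sketch assumes the key is generated
once per process and never shared, which is exactly why it would not work
unchanged for a cache shared across servers.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: cache keys derived with HMAC-SHA1 and a random key.
final class KeyedCacheKeys {
  private final SecretKeySpec key;

  public KeyedCacheKeys() {
    // Random 160-bit key, generated once for the lifetime of this instance.
    byte[] raw = new byte[20];
    new SecureRandom().nextBytes(raw);
    this.key = new SecretKeySpec(raw, "HmacSHA1");
  }

  // Hex-encoded HMAC-SHA1 of the content under this instance's key.
  public String keyFor(String content) {
    try {
      Mac mac = Mac.getInstance("HmacSHA1");
      mac.init(key);
      byte[] tag = mac.doFinal(content.getBytes(StandardCharsets.UTF_8));
      StringBuilder sb = new StringBuilder(tag.length * 2);
      for (byte b : tag) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (GeneralSecurityException e) {
      // HmacSHA1 is a required algorithm on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

Because an outsider cannot compute the keys without the secret, crafting
colliding inputs offline is not possible. The flip side is that the keys
change on every restart, so cached entries do not survive one.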
