Using the full body contents is probably unavoidable. Depending on the
implementation, this may be a distributed cache, so we should code for that.
I'm inclined to agree with Louis that MD5 or SHA1 seems fine. HashUtils (in
common) provides the former, while DigestUtils in apache.commons.codec
provides SHA (and, it seems, another MD5 implementation).
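
For concreteness, here is a rough sketch of deriving a cache key from the
full body contents using only the JDK's MessageDigest. The class and method
names (ContentCacheKey, sha1Hex) are made up for illustration and are not
existing code in common. It emits hex rather than base32 just to keep the
example short. Because the key depends only on the content, every server
sharing a distributed cache would compute the same key.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the existing codebase.
public final class ContentCacheKey {
  private ContentCacheKey() {}

  /** Returns the lowercase hex SHA-1 digest of the body's UTF-8 bytes. */
  public static String sha1Hex(String body) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-1");
      byte[] digest = md.digest(body.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // Mask to an unsigned value so negative bytes format as two hex digits.
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is a required algorithm on every standard Java platform.
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }
}
```

Swapping "SHA-1" for "SHA-256" in getInstance would be the only change needed
if we decide on the stronger digest.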

--John

On Tue, Sep 9, 2008 at 4:45 PM, Kevin Brown <[EMAIL PROTECTED]> wrote:

> On Tue, Sep 9, 2008 at 4:38 PM, Brian Eaton <[EMAIL PROTECTED]> wrote:
>
> > On Tue, Sep 9, 2008 at 4:25 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
> > > I briefly considered String hashCode, but quickly recognized that was a
> > > bad idea. MD5 of contents sounds reasonable. Brian, thoughts?
> >
> > I suspect using the entire input body contents is out of the question,
> > though that was my initial thought.
> >
> > Don't use MD5.  Nobody knows how to attack it for this kind of
> > application, yet, but a lot of progress has been made.  SHA1 is
> > probably OK.  SHA-256 would be great, HMAC-SHA1 would be great, except
> > then you have to worry about keying, which is a pain.  This cache is
> > potentially shared across multiple servers, right?
> >
> > If it's a single server cache, HMAC-SHA1 with a random key.
> >
> > The cache key generated by the HTTP content fetchers might be useful
> > for this as well, assuming you can get ahold of it somehow.
>
>
> There's a utility checked into common that produces a base32-encoded SHA1,
> which can be used for this.
>
