A little reading would indicate the MD5 is about 40% faster to compute than
SHA1 so I would suggest we use it. It would also make sense to stuff the MD5
of the cached content into an HTTP header on the cached version of the
original content so we can avoid re-computing it.

On Tue, Sep 9, 2008 at 6:28 PM, Louis Ryan <[EMAIL PROTECTED]> wrote:

> I would be a very wary of relying on VM implementation specifics like
> XXXRweriter.class.hashCode(). You certainly don't want the same class
> generating different coes on different instances in a cluster where the
> cache is shared. Also dont use serialVersionUID as that is only intended for
> serialization compatability. Reading the class as an InputStream from the
> ClassLoader and then computing a hash might work but seems more trouble than
> its worth.
>
> In general I think a a manually maintained version no. might actually be
> your best bet.
>
>
> On Tue, Sep 9, 2008 at 5:29 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
>
>> Back to this sub-thread...
>> Adding a getVersion() parameter seems error prone and a little burdensome.
>> In theory the version needs to change when any modification in potential
>> output for a given input is made, ie. not just optimization. That's not
>> always the easiest thing to analyze and get right. Meanwhile, invalidating
>> a
>> bunch of cache keys (by registering a new rewriter impl) isn't especially
>> costly: if the cache is useful at all, it will be hit often, ie. refreshed
>> quickly.
>>
>> --John
>>
>> On Tue, Sep 9, 2008 at 1:50 PM, Louis Ryan <[EMAIL PROTECTED]> wrote:
>>
>> > The versioning should be more explicit than that I think. Maybe add a
>> > getVersion function to the rewriter interface so they can manage their
>> own
>> > changes
>> >
>> > On Tue, Sep 9, 2008 at 12:39 PM, John Hjelmstad <[EMAIL PROTECTED]>
>> wrote:
>> >
>> > > Interesting suggestion. I can include the rewriter class names and
>> their
>> > > class hash codes or some other such versioning construct.
>> > >
>> > > On Tue, Sep 9, 2008 at 12:32 PM, Ben Laurie <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > On Tue, Sep 9, 2008 at 8:17 PM, John Hjelmstad <[EMAIL PROTECTED]>
>> > wrote:
>> > > > > Excellent, agreed. CLs forthcoming.
>> > > >
>> > > > Can you include the version numbers of the rewriters in the cache
>> key?
>> > > >
>> > > > >
>> > > > > On Tue, Sep 9, 2008 at 12:14 PM, Kevin Brown <[EMAIL PROTECTED]>
>> > wrote:
>> > > > >
>> > > > >> #2 is the only really viable option. If we have to put caching
>> logic
>> > > in
>> > > > 10
>> > > > >> different places we'll screw it up 9 different times :).
>> > > > >>
>> > > > >> On Tue, Sep 9, 2008 at 12:11 PM, John Hjelmstad <
>> [EMAIL PROTECTED]>
>> > > > wrote:
>> > > > >>
>> > > > >> > As discussed on a few threads and tracked in JIRA issue (
>> > > > >> > http://issues.apache.org/jira/browse/SHINDIG-579), we need to
>> > move
>> > > > >> > rewriting
>> > > > >> > logic out of AbstractHttpCache. Yet we should maintain
>> rewritten
>> > > > content
>> > > > >> > caching capability. The question is where to put it.
>> > > > >> > I see two options, at a high level:
>> > > > >> > 1. In code that calls
>> > > > >> ContentRewriterRegistry.rewrite(HttpResponse|Gadget).
>> > > > >> > Eg. MakeRequestHandler, ProxyHandler, ViewContentFetcher,
>> > > > GadgetServer,
>> > > > >> and
>> > > > >> > the near-future Renderer and Preloader. This allows
>> finer-grained
>> > > > control
>> > > > >> > over caching behavior in context, at the cost of distributing
>> > > caching
>> > > > >> logic
>> > > > >> > in various places.
>> > > > >> > 2. In ContentRewriterRegistry.rewrite(HttpResponse|Gadget)
>> itself,
>> > > if
>> > > > so
>> > > > >> > chosen. Caching logic can be consolidated in
>> > > > >> > CachingContentRewriterRegistry,
>> > > > >> > for instance (which will no longer subclass
>> > > > CachingWebRetrievalFactory),
>> > > > >> > and
>> > > > >> > be considered an optimization to rewriting.
>> > > > >> >
>> > > > >> > I'm inclined to go #2. Rewriters themselves can be augmented
>> with
>> > > > caching
>> > > > >> > hints if necessary, and be assumed deterministic for a given
>> cache
>> > > key
>> > > > in
>> > > > >> > the meantime. Consolidating rewriting logic makes it easier to
>> > share
>> > > > the
>> > > > >> > cache itself.
>> > > > >> >
>> > > > >> > Still, I might be missing situations in which additional
>> context
>> > > > inherent
>> > > > >> > to
>> > > > >> > the calling context is needed to make a caching decision.
>> > > > >> >
>> > > > >> > --John
>> > > > >> >
>> > > > >>
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Reply via email to