On 3/30/06, Ross Gardler <[EMAIL PROTECTED]> wrote:
> Tim Williams wrote:
> > On 3/30/06, Ross Gardler <[EMAIL PROTECTED]> wrote:
> >
> >> Tim Williams wrote:
> >>
> >>> I've been looking into FOR-711 [1]/FOR-732 [2] and I'm in need of some
> >>> new thoughts on the subject. My "fix" for FOR-732 (changes to
> >>> locationmap having no effect) has effectively reversed much of the
> >>> benefit of caching the responses to begin with. I'll explain...
> >>
> >> Sorry for not responding sooner; I've been trying to find the time to
> >> do so properly. I'm not finding any time at the moment, so I'll just
> >> give an "off the top of my head" response, in the hope that it helps
> >> in some small way.
> >>
> >>> The highlights of what I've done:
> >>> o) Created isValid() on AbstractNode.
> >>> o) Implemented isValid() on each node as appropriate.
> >>> o) LocationMapModule now tests whether caching is turned on &&
> >>>    isValid() before returning a cached result.
> >>> o) If the traversed isValid() returns an invalid result back to the
> >>>    LocationMapModule, the cache is flushed.
> >>>
> >>> The results I'm seeing: slow, but the correct behavior... did I
> >>> mention slow [3]?
> >>
> >> Hmmm... that's quite a performance hit. I don't really understand why
> >> it is so bad with caching turned on. It's probably a really stupid
> >> question, but are we using the right kind of data structure for the
> >> cache data?
> >
> > Because it's making ~315 validity tests per request.
>
> Wow. Why is that the case? There are only a handful of locationmap files
> in most projects.
>
> (You answer this further down.)
>
> >> One other thought: are we checking all isValid() methods or just the
> >> one for the file in which the cached data was found? There is no need
> >> to traverse the cache files *below* the one we currently use for
> >> giving a result.
> >
> > We currently stop only when an invalid file is found. Otherwise all
> > nodes are traversed.
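The lookup flow Tim describes (serve from cache only when caching is on *and* the traversed isValid() passes, flushing the whole cache on an invalid result) can be sketched roughly as follows. This is a hypothetical Python model of the logic, not the actual Forrest (Java) source; all names here are illustrative.

```python
# Hypothetical sketch of the validity-gated cache lookup described above.
# Names (LocationMapModule, get_attribute, is_valid) mirror the discussion
# but are NOT the real Forrest API.

class LocationMapModule:
    def __init__(self, resolver, is_valid, caching=True):
        self._resolve = resolver      # hint -> location (stands in for the real traversal)
        self._is_valid = is_valid     # () -> bool, the traversed isValid() test
        self._caching = caching
        self._cache = {}

    def get_attribute(self, hint):
        # Only trust a cached entry when caching is on AND the node tree
        # still reports itself valid.
        if self._caching and hint in self._cache:
            if self._is_valid():
                return self._cache[hint]
            self._cache.clear()       # an invalid result flushes the whole cache
        result = self._resolve(hint)
        self._cache[hint] = result
        return result
```

The cost Tim measures comes from calling `is_valid()` on every one of the ~315 cache hits in a request, even though the answer rarely changes between them.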
> > If we had insight into what file a given hint was located in, we could
> > just test that directly without needing to do the whole
> > AbstractNode.isValid()-recursive thing. Inside getAttribute() we don't
> > have insight into which locationmap returned the value to us.
>
> Therein lies the problem. We should be testing on a per-file basis, not
> a per-node-entry basis.
I think I'm not explaining it well enough. "n" (e.g. 315) is not the
number of nodes (SelectNode, MountNode, etc.); "n" is the number of
references to "lm:some-resource" involved in a given request. In other
words, traversing all of the nodes is actually really fast; traversing
all of the nodes for each of the 315 references to an lm:some-resource
is, in total, slow.

> >>> So what to do and what am I asking?
> >>
> >> ... even if we were to
> >>
> >>> implement our own store ... it seems to me it's the sheer number
> >>> of accesses that's the issue rather than storage location.
> >>
> >> ...
> >>
> >>> Having a configurable timeout (so that validity tests are only done
> >>> every X number of minutes) would help but seems ultra-hacky.
> >>
> >> Why hacky? Isn't it standard practice for a cache to have a
> >> configurable TTL? Then you use it to fine-tune performance, adjusting
> >> the cache on different requests.
> >
> > Yeah, our cache isn't that sophisticated. When I'm talking timeout, I'm
> > talking about the *whole* cache (HashMap). I think it's standard
> > practice to evict individual items in the store based on TTL, but not
> > to potentially invalidate the whole thing.
>
> By caching at the file level, rather than the node level, we solve this
> problem. We can define a TTL in the LM file itself. For example, there
> is little point in caching the Forrest core locationmaps since they will
> never change in production.
>
> In my (aborted) experiments with the Cocoon MultiFileValidity I had this
> working, to an extent. The problem was with how the locationmaps are
> mounted; it was not possible to create such a validity object with the
> current way the locationmap works (see archives). I've not had time to
> address this yet.

Yeah, with the node traversal I get the same effect as MultiFileValidity
by way of traversal rather than globally exposing some aggregated
validity object.

--tim
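Ross's idea of declaring a TTL per locationmap file (so stable files such as the Forrest core locationmaps can be cached indefinitely while project files revalidate periodically) could look roughly like this. A Python sketch with invented names; a TTL of `None` stands in for "will never change in production":

```python
import time

# Illustrative per-file TTL entry. The TTL would come from the LM file
# itself, as suggested above; this class and its names are hypothetical.
# A clock function is injected so freshness is testable deterministically.

class TtlEntry:
    def __init__(self, value, ttl_seconds, now=time.monotonic):
        self._value = value
        self._ttl = ttl_seconds      # None means "never revalidate"
        self._now = now
        self._stored = now()

    def fresh(self):
        if self._ttl is None:
            return True              # e.g. a Forrest core locationmap
        return (self._now() - self._stored) < self._ttl

    def value(self):
        # Expired entries answer None, signalling the caller to re-resolve.
        return self._value if self.fresh() else None
```

Under this scheme the ~315 per-request validity tests collapse to a cheap clock comparison, and only files whose TTL has elapsed trigger a real revalidation.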
