Tim Williams wrote:
On 3/30/06, Ross Gardler <[EMAIL PROTECTED]> wrote:

Tim Williams wrote:

I've been looking into FOR-711 [1]/FOR-732 [2] and I'm in need of some
new thoughts on the subject.  My "fix" for FOR-732 (changes to
locationmap having no effect) has effectively reversed much of the
benefit of caching the responses to begin with.  I'll explain...

Sorry for not responding sooner, I've been trying to find the time to do
so properly. I'm not finding any time at the moment, so I'll just give
an "off the top of my head" response, in the hope that it helps in some
small way.


The highlights of what I've done:
o) Created isValid() on AbstractNode.
o) Implemented isValid() on each node as appropriate.
o) LocationMapModule now tests whether caching is turned on &&
isValid() before returning a cached result.
o) If the traversed isValid() call reports an invalid result back to the
LocationMapModule, the cache is flushed.
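The flow in that list can be sketched roughly as follows. Only isValid() and the class name LocationMapModule come from the changes described above; every other name, field, and method is an illustrative assumption, not the actual Forrest code:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the caching/validity flow described above.
// Only isValid() and LocationMapModule are from the thread; the rest
// (FileNode, resolve(), timestamps) is made up for illustration.
abstract class AbstractNode {
    // Returns false when the underlying locationmap source has changed.
    abstract boolean isValid();
}

class FileNode extends AbstractNode {
    private final long cachedLastModified;
    private long currentLastModified;

    FileNode(long lastModified) {
        this.cachedLastModified = lastModified;
        this.currentLastModified = lastModified;
    }

    // Simulate the underlying file changing on disk.
    void touch(long newLastModified) {
        this.currentLastModified = newLastModified;
    }

    @Override
    boolean isValid() {
        return cachedLastModified == currentLastModified;
    }
}

class LocationMapModule {
    private final Map<String, String> cache = new HashMap<>();
    private final boolean cachingEnabled;
    private final AbstractNode root;

    LocationMapModule(boolean cachingEnabled, AbstractNode root) {
        this.cachingEnabled = cachingEnabled;
        this.root = root;
    }

    String getAttribute(String hint) {
        // Only return a cached result when caching is on AND the
        // traversed nodes are still valid; otherwise flush the cache.
        if (cachingEnabled && root.isValid()) {
            String cached = cache.get(hint);
            if (cached != null) {
                return cached;
            }
        } else {
            cache.clear();
        }
        String resolved = resolve(hint);
        cache.put(hint, resolved);
        return resolved;
    }

    // Stand-in for the real locationmap traversal.
    private String resolve(String hint) {
        return "location-for-" + hint;
    }

    int cacheSize() {
        return cache.size();
    }
}
```

Note that in this sketch the expensive part is hidden inside root.isValid(): in the real tree that call recurses over every node, which is where the ~315 tests per request come from.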

The results I'm seeing: Slow, but the correct behavior... did I
mention slow [3]?

Hmmm... that's quite a performance hit. I don't really understand why it
is so bad with caching turned on. It's probably a really stupid question,
but are we using the right kind of data structure for the cache data?


Because it's making ~315 validity tests per request.

Wow. Why is that the case? There are only a handful of locationmap files in most projects.

(you answer this further down)


One other thought, are we checking all isValid() methods or just the one
 for the file in which the cached data was found? There is no need to
traverse the cache files *below* the one we currently use for giving a
result.


We currently stop only when an invalid file is found.  Otherwise all
nodes are traversed.  If we had insight into what file a given hint
was located in, we could just test that directly without needing to do
the whole AbstractNode.isValid()-recursive thing.  Inside
getAttribute() we don't have insight into which locationmap returned
the value to us.

Therein lies the problem. We should be testing validity on a per-file basis, not per node entry.
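One way to get that insight would be to tag each cached hint with the locationmap file it was resolved from, so a lookup re-checks one file's timestamp instead of recursing over every node. This is a hypothetical sketch of that idea; none of these class or method names exist in Forrest:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-file validity checking: each cached hint remembers which
// locationmap file it came from, so a lookup costs one timestamp
// comparison instead of one isValid() per node. All names are illustrative.
class PerFileCache {
    static final class Entry {
        final String value;
        final String sourceFile;
        final long lastModified;

        Entry(String value, String sourceFile, long lastModified) {
            this.value = value;
            this.sourceFile = sourceFile;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Map<String, Long> fileTimestamps = new HashMap<>();

    // Record (or update) the current timestamp of a locationmap file.
    void recordFile(String file, long lastModified) {
        fileTimestamps.put(file, lastModified);
    }

    void put(String hint, String value, String sourceFile) {
        cache.put(hint, new Entry(value, sourceFile,
                fileTimestamps.getOrDefault(sourceFile, 0L)));
    }

    // Returns the cached value, or null if the source file has changed
    // (in which case only entries from the stale file are affected).
    String get(String hint) {
        Entry e = cache.get(hint);
        if (e == null) {
            return null;
        }
        long current = fileTimestamps.getOrDefault(e.sourceFile, -1L);
        if (current != e.lastModified) {
            cache.remove(hint);
            return null;
        }
        return e.value;
    }
}
```

The design point is that invalidating one file no longer flushes results that came from the other, unchanged files.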

So what to do and what am I asking?

> ... even if we were to implement our own store ... it seems to me it's
> the sheer number of accesses that's the issue rather than storage
> location.

...


Having a
configurable timeout (so that validity tests are only done every X
number of minutes) would help but seems ultra-hacky.

Why hacky? Isn't it standard practice for a cache to have a configurable
TTL? You then use it to fine-tune performance, adjusting the cache for
different requests.


Yeah, our cache isn't that sophisticated. When I'm talking about a
timeout, I'm talking about the *whole* cache (a HashMap). I think it's
standard practice to evict individual items in the store based on a TTL,
but not to potentially invalidate the whole thing.

By caching at the file level rather than the node level, we solve this problem. We can define a TTL in the LM file itself. For example, there is little point in caching the Forrest core locationmaps, since they will never change in production.
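A per-file TTL along these lines might look like the sketch below: the (potentially expensive) validity check is only re-run once the file's TTL has elapsed, and a very large TTL effectively pins files that never change, such as the core locationmaps in production. The class and method names are assumptions for illustration:

```java
// Sketch of a per-file TTL guard: a file's validity is only re-checked
// once ttlMillis has elapsed since the last check. A huge TTL effectively
// pins a file that never changes. All names are illustrative.
class TtlGuard {
    private final long ttlMillis;
    private long lastChecked;

    TtlGuard(long ttlMillis, long now) {
        this.ttlMillis = ttlMillis;
        this.lastChecked = now;
    }

    // Returns true when a validity check is due; callers skip the
    // expensive isValid() traversal until then.
    boolean checkDue(long now) {
        if (now - lastChecked >= ttlMillis) {
            lastChecked = now;
            return true;
        }
        return false;
    }
}
```

Between checks the cached result is trusted as-is, which trades a bounded window of staleness for skipping the validity traversal on most requests.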

In my (aborted) experiments with the Cocoon MultiFileValidity I had this working, to an extent. The problem was with how the locationmaps are mounted: it was not possible to create such a validity object with the way the locationmap currently works (see archives). I've not had time to address this yet.

Ross
