On 3/30/06, Ross Gardler <[EMAIL PROTECTED]> wrote:
> Tim Williams wrote:
> > On 3/30/06, Ross Gardler <[EMAIL PROTECTED]> wrote:
> >
> >> Tim Williams wrote:
> >>
> >>> I've been looking into FOR-711 [1]/FOR-732 [2] and I'm in need of some
> >>> new thoughts on the subject. My "fix" for FOR-732 (changes to
> >>> locationmap having no effect) has effectively reversed much of the
> >>> benefit of caching the responses to begin with. I'll explain...
> >>
> >> Sorry for not responding sooner; I've been trying to find the time to
> >> do so properly. I'm not finding any time at the moment, so I'll just
> >> give an "off the top of my head" response, in the hope that it helps
> >> in some small way.
> >>
> >>> The highlights of what I've done:
> >>> o) Created isValid() on AbstractNode.
> >>> o) Implemented isValid() on each node as appropriate.
> >>> o) LocationMapModule now tests whether caching is turned on &&
> >>>    isValid() before returning a cached result.
> >>> o) If the traversed isValid() returns an invalid result back to the
> >>>    LocationMapModule, the cache is flushed.
> >>>
> >>> The results I'm seeing: slow, but the correct behavior... did I
> >>> mention slow [3]?
> >>
> >> Hmmm... that's quite a performance hit. I don't really understand why
> >> it is so bad with caching turned on. It's probably a really stupid
> >> question, but are we using the right kind of data structure for the
> >> cache data?
> >
> > Because it's making ~315 validity tests per request.
>
> Wow. Why is that the case? There are only a handful of locationmap files
> in most projects.
>
> (You answer this further down.)
>
> >> One other thought: are we checking all isValid() methods or just the
> >> one for the file in which the cached data was found? There is no need
> >> to traverse the cache files *below* the one we currently use for
> >> giving a result.
> >
> > We currently stop only when an invalid file is found. Otherwise all
> > nodes are traversed.
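The lookup flow Tim describes (serve from cache only when caching is on *and* the traversed isValid() passes, flushing the whole cache on an invalid result) can be sketched roughly as follows. This is a hypothetical Python model of the logic, not the actual Forrest (Java) source; all names here are illustrative.

```python
# Hypothetical sketch of the validity-gated cache lookup described above.
# Names (LocationMapModule, get_attribute, is_valid) mirror the discussion
# but are NOT the real Forrest API.

class LocationMapModule:
    def __init__(self, resolver, is_valid, caching=True):
        self._resolve = resolver      # hint -> location (stands in for the real traversal)
        self._is_valid = is_valid     # () -> bool, the traversed isValid() test
        self._caching = caching
        self._cache = {}

    def get_attribute(self, hint):
        # Only trust a cached entry when caching is on AND the node tree
        # still reports itself valid.
        if self._caching and hint in self._cache:
            if self._is_valid():
                return self._cache[hint]
            self._cache.clear()       # an invalid result flushes the whole cache
        result = self._resolve(hint)
        self._cache[hint] = result
        return result
```

The cost Tim measures comes from calling `is_valid()` on every one of the ~315 cache hits in a request, even though the answer rarely changes between them.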
> > If we had insight into what file a given hint was located in, we could
> > just test that directly without needing to do the whole
> > AbstractNode.isValid()-recursive thing. Inside getAttribute() we don't
> > have insight into which locationmap returned the value to us.
>
> Therein lies the problem. We should be testing on a per-file basis, not
> a per-node-entry basis.
I think I'm not explaining it well enough. "n" (e.g. 315) is not the
number of nodes (SelectNode, MountNode, etc.); "n" is the number of
references to "lm:some-resource" involved in a given request. In other
words, traversing all of the nodes is actually really fast; traversing
all of the nodes for each of the 315 references to an lm:some-resource
is, in total, slow.

> >>> So what to do and what am I asking?
> >>
> >> ... even if we were to
> >>
> >>> implement our own store ... it seems to me it's the sheer number
> >>> of accesses that's the issue rather than storage location.
> >>
> >> ...
> >>
> >>> Having a configurable timeout (so that validity tests are only done
> >>> every X number of minutes) would help but seems ultra-hacky.
> >>
> >> Why hacky? Isn't it standard practice for a cache to have a
> >> configurable TTL? Then you use it to fine-tune performance, adjusting
> >> the cache on different requests.
> >
> > Yeah, our cache isn't that sophisticated. When I'm talking timeout, I'm
> > talking about the *whole* cache (HashMap). I think it's standard
> > practice to evict individual items in the store based on TTL, but not
> > to potentially invalidate the whole thing.
>
> By caching at the file level, rather than the node level, we solve this
> problem. We can define a TTL in the LM file itself. For example, there
> is little point in caching the Forrest core locationmaps since they will
> never change in production.
>
> In my (aborted) experiments with the Cocoon MultiFileValidity I had this
> working, to an extent. The problem was with how the locationmaps are
> mounted; it was not possible to create such a validity object with the
> current way the locationmap works (see archives). I've not had time to
> address this yet.

Yeah, with the node traversal I get the same effect as MultiFileValidity
by way of traversal rather than globally exposing some aggregated
validity object.

--tim
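Ross's idea of declaring a TTL per locationmap file (so stable files such as the Forrest core locationmaps can be cached indefinitely while project files revalidate periodically) could look roughly like this. A Python sketch with invented names; a TTL of `None` stands in for "will never change in production":

```python
import time

# Illustrative per-file TTL entry. The TTL would come from the LM file
# itself, as suggested above; this class and its names are hypothetical.
# A clock function is injected so freshness is testable deterministically.

class TtlEntry:
    def __init__(self, value, ttl_seconds, now=time.monotonic):
        self._value = value
        self._ttl = ttl_seconds      # None means "never revalidate"
        self._now = now
        self._stored = now()

    def fresh(self):
        if self._ttl is None:
            return True              # e.g. a Forrest core locationmap
        return (self._now() - self._stored) < self._ttl

    def value(self):
        # Expired entries answer None, signalling the caller to re-resolve.
        return self._value if self.fresh() else None
```

Under this scheme the ~315 per-request validity tests collapse to a cheap clock comparison, and only files whose TTL has elapsed trigger a real revalidation.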
