Conal Tuohy wrote:
>
> I've just been looking at "inclusion" recently and noticed that the
> cinclude transformer had caching but the xinclude transformer didn't,
> and I wondered if there was some arcane reason or was it just a
> historical accident ;-) And BTW I think Carsten is right - the
> inclusion code should be united.
>
Thanks :)

> But actually my question is about caching of the DirectoryGenerator and
> sub-classes. It seems to me that these generators should also be
> cacheable; maybe this is just an oversight too? Or is there something
> tricky I haven't foreseen? ;-)
AFAIK, the directory generator can be made cacheable as well; the
only reason it isn't is that no one has done it yet, I think.

>
> I have a pipeline based on a directory listing of xml docs; with 9
> documents it takes 2 or 3 seconds to complete - and with 100 docs it
> will be impossible. But the files aren't often changed.
>
> Anyway, I thought I might make an attempt at adding caching to the
> DirectoryGenerator. It seems to me that the generator could perform the
> search and traverse the file system every time a request is made, but hash
> the resulting xml for the CacheValidity. Is that right? This would be my
> first attempt.
OK, first question: for which version are you planning to add the caching?
If for 2.0.2/2.0.3, then the Cacheable interface is the right one; if you
are planning it for 2.1, then CacheableProcessingComponent is the correct
one.

Both interfaces do the same two things; they differ only in the types
they return.
First, a cacheable component returns a unique key. For the directory
generator this could be the full path of the directory together with the
settings (include pattern, exclude pattern etc).
Second, a CacheValidity (Cacheable) or a SourceValidity
(CacheableProcessingComponent) object is returned, which is used to detect
whether the source (= directory) of the generator has changed. So if you
use, e.g., the last modification date of the directory here, this should
work. (The last modification date of the directory changes when a file is
added to or removed from it. On most file systems, editing an existing
file in place does not update it, so that case would need the files' own
dates as well.)
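
As a rough sketch of these two pieces (plain Java using only java.io.File,
not the actual Cocoon interfaces; the class name, the method names and the
"depth" setting are made up for illustration), the key and the
timestamp-based validity could look something like this:

```java
import java.io.File;

/** Hypothetical helper, NOT part of Cocoon: cache key plus timestamp validity. */
final class DirectoryCacheSketch {

    /** Key = directory path plus every setting that influences the generated XML. */
    static String cacheKey(File dir, int depth, String include, String exclude) {
        return dir.getAbsolutePath()
                + "|depth=" + depth
                + "|include=" + (include == null ? "" : include)
                + "|exclude=" + (exclude == null ? "" : exclude);
    }

    /** Validity token: the directory's last-modified time (0 if it does not exist). */
    static long validity(File dir) {
        return dir.lastModified();
    }

    /** A cached response may be reused while the token is unchanged. */
    static boolean isStillValid(long cachedToken, File dir) {
        return cachedToken == validity(dir);
    }
}
```

In a real generator the key would be returned from the key method of the
interface and the token wrapped in the matching validity object; this
sketch only shows what goes into each of them.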

>
> I could enhance it to only traverse PART of the file system again to check
> the validity, by keeping a record of the file and directory objects
> involved in the last search. In the case of a directory search like
> "images/*.jpg", or "docs/*.xml", the generator should only need to check
> the timestamp of the "images" or "docs" directories, is that right? I've
> also used searches like "images/{1}.jpg" with the ImageDirectoryGenerator,
> to get the image width and height into a pipeline that converts the jpg
> to svg, for scaling, etc. In this case the validity could depend on the
> timestamp of that single file.

Yes, collecting the files involved together with their last modification
date is another possibility.
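
A minimal sketch of that idea, again in plain Java rather than against the
real Cocoon validity classes (the class and method names are hypothetical,
and it only looks at the entries directly inside one directory): record
each path with its last modification date while generating, and compare
against a fresh snapshot when checking validity.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, NOT part of Cocoon: validity based on a set of timestamps. */
final class FileSetValiditySketch {

    /** Records path -> last-modified for the directory and each entry directly inside it. */
    static Map<String, Long> snapshot(File dir) {
        Map<String, Long> stamps = new TreeMap<>();
        stamps.put(dir.getPath(), dir.lastModified());
        File[] children = dir.listFiles(); // null if dir is not a readable directory
        if (children != null) {
            for (File child : children) {
                stamps.put(child.getPath(), child.lastModified());
            }
        }
        return stamps;
    }

    /** Valid only if the same entries exist with the same timestamps. */
    static boolean isStillValid(Map<String, Long> cached, File dir) {
        return cached.equals(snapshot(dir));
    }
}
```

This still lists the directory on every check, but it avoids reading or
parsing any file, and it notices added, removed and modified entries.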

HTH
Carsten

Carsten Ziegeler     Chief Architect     Open Source Group, S&N AG
------------------------------------------------------------------
             Cocoon Consulting, Training and Projects
------------------------------------------------------------------
mailto:[EMAIL PROTECTED]                  http://www.s-und-n.de
                    http://ziegeler.bei.t-online.de

