I've changed the caching architecture in the current development version,
in a big way.
Previously, the caching code was mingled in amongst other stuff in the
Sitescooper modules. Now, it's been cleanly refactored out into its own
modules as follows. (skip to </boringdiscussionofsitescooperinternals>
if you're not interested...)
<boringdiscussionofsitescooperinternals>
There are now several modules:

  Sitescooper::CacheFactory   used to handle "global" cache stuff,
                              and to get hold of:

  Sitescooper::PerSiteCache   per-site cache objects, which in turn
                              allow access to:

  Sitescooper::CacheObject    individual cached pages

Those are the abstract base classes for implementations of the caching
code. Currently there's only one implementation, made up of two modules:
Sitescooper::DirCacheFactory and Sitescooper::PerSiteDirCache. This is
a reworked version of the old caching mechanism.
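
To make the layering concrete, here is a rough sketch of the
factory -> per-site cache -> cached page relationship. It's in Python purely
for illustration; it is not the actual Perl code, and every method name
(site_cache, get, put, text) and the in-memory "Memory*" classes are
made up for this example.

```python
from abc import ABC, abstractmethod
from typing import Optional


class CacheObject(ABC):
    """One cached page (stand-in for Sitescooper::CacheObject)."""

    @abstractmethod
    def text(self) -> str:
        """Return the cached text of the page."""


class PerSiteCache(ABC):
    """Cache for a single site (stand-in for Sitescooper::PerSiteCache)."""

    @abstractmethod
    def get(self, url: str) -> Optional[CacheObject]:
        """Return the cached page for url, or None if it is not cached."""

    @abstractmethod
    def put(self, url: str, text: str) -> None:
        """Store text as the cached copy of url."""


class CacheFactory(ABC):
    """Global entry point (stand-in for Sitescooper::CacheFactory)."""

    @abstractmethod
    def site_cache(self, site_name: str) -> PerSiteCache:
        """Return the per-site cache for site_name."""


# A toy in-memory implementation, playing the role that the
# DirCacheFactory / PerSiteDirCache pair plays for on-disk storage.
class MemoryPage(CacheObject):
    def __init__(self, text: str) -> None:
        self._text = text

    def text(self) -> str:
        return self._text


class MemorySiteCache(PerSiteCache):
    def __init__(self) -> None:
        self._pages = {}

    def get(self, url: str) -> Optional[CacheObject]:
        return self._pages.get(url)

    def put(self, url: str, text: str) -> None:
        self._pages[url] = MemoryPage(text)


class MemoryCacheFactory(CacheFactory):
    def __init__(self) -> None:
        self._sites = {}

    def site_cache(self, site_name: str) -> PerSiteCache:
        # One cache object per site, created on first use.
        return self._sites.setdefault(site_name, MemorySiteCache())
```

The point of the split is that a different storage backend only has to
supply its own factory and per-site classes; callers keep talking to the
abstract interfaces.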
</boringdiscussionofsitescooperinternals>
Here are the details of the DirCache mechanism, and how it differs from
the old one. The big wins are as follows:
- each site maintains a separate cache; so e.g. the Salon front page and
the Salon archives sites will both be able to scoop the same articles
without conflicting.
- the modification dates of cached files, and their URLs, are stored
in a Berkeley DB (or whatever database is available on the platform)
instead of a text file, so it should be faster when starting up
and generally more efficient.
- the new modules allow new caching implementations to be plugged in
easily, if desired.
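
As an illustration of the first two points, here's a small sketch of a
per-site index that maps URLs to modification times in whatever dbm-style
database is available. It's Python rather than the real Perl code, and the
directory layout, the "index" file name and the function names are all
assumptions made for the example.

```python
import dbm
import os


def open_site_index(cache_dir, site_name):
    """Open (creating if needed) the URL -> mod-time index for one site.

    dbm.open() picks whichever dbm backend the platform provides.
    """
    site_dir = os.path.join(cache_dir, site_name)
    os.makedirs(site_dir, exist_ok=True)
    return dbm.open(os.path.join(site_dir, "index"), "c")


def record_page(index, url, mtime):
    """Remember the modification time (seconds since epoch) of a URL."""
    index[url.encode("utf-8")] = str(int(mtime)).encode("ascii")


def cached_mtime(index, url):
    """Return the stored modification time for url, or None if unknown."""
    key = url.encode("utf-8")
    if key not in index:
        return None
    return int(index[key].decode("ascii"))
```

Because each site gets its own index under its own directory, two sites
can record the same URL without stepping on each other.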
Downsides:
- any existing cached pages become invalid and will be expired, so the
first time you run the new version, you'll get lots of articles you've
already read. Sorry about that.
- the use of db/dbm databases could cause trouble portability-wise.
- I wasn't able to get rid of the current need to cache the entirety of
a page's text; too many neat features (like resilience against
start/end-pattern changes, and diffing) rely on it. Ah well.
- version number has been bumped up to 3.1.0, as it's a big change.
(Not really a downside ;)
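
To show why diffing needs the whole page text in the cache, here's a
minimal sketch that compares a cached copy against a freshly fetched one
and keeps only the lines that are new. It's Python, uses the standard
difflib module, and is not how Sitescooper itself does its diffing; the
function name is invented for the example.

```python
import difflib


def new_lines(cached_text, fresh_text):
    """Return the lines that appear in fresh_text but not in cached_text."""
    diff = difflib.ndiff(cached_text.splitlines(), fresh_text.splitlines())
    # ndiff prefixes added lines with "+ "; strip that marker off.
    return [line[2:] for line in diff if line.startswith("+ ")]
```

Without the full previous text there is nothing to diff against, which is
why storing only a checksum or a date wouldn't be enough for this feature.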
Anyway -- it works fine on RedHat 6.2 and, I'd imagine, on any other UNIX
with a perl built with Berkeley DB support. But I'd like volunteers from
the Windows and Mac communities, if possible, to try it out (the dev
version on sitescooper.org) and tell me if it installs and works OK...
--j.