I've changed the caching architecture in the current development version,
in a big way.

Previously, the caching code was mingled in amongst other stuff in the
Sitescooper modules.  Now, it's been cleanly refactored out into its own
modules as follows.  (skip to </boringdiscussionofsitescooperinternals>
if you're not interested...)


<boringdiscussionofsitescooperinternals>

  There are now several modules:

  Sitescooper::CacheFactory     used to handle "global" cache stuff,
                                and to get hold of:

  Sitescooper::PerSiteCache     per-site cache objects, which in turn
                                allow access to:

  Sitescooper::CacheObject      individual cached pages

  Those are the abstract base classes for implementations of the caching
  code.  Currently there's only one impl, with 2 modules:
  Sitescooper::DirCacheFactory and Sitescooper::PerSiteDirCache.  This is
  a reworked version of the old caching mechanism.
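
  To illustrate how the three layers relate, here is a rough sketch in
  Python rather than Sitescooper's Perl.  The class names mirror the
  modules above, but every method name (site_cache, get, put) and the
  in-memory implementation are assumptions made up for illustration, not
  the real Sitescooper API.  CacheObject is shown as a plain class for
  brevity.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class CacheObject:
    """One cached page: its URL, full text, and modification time."""

    def __init__(self, url: str, text: str, mtime: float) -> None:
        self.url = url
        self.text = text
        self.mtime = mtime


class PerSiteCache(ABC):
    """Abstract per-site cache; hands out CacheObject instances."""

    @abstractmethod
    def get(self, url: str) -> Optional[CacheObject]:
        ...

    @abstractmethod
    def put(self, obj: CacheObject) -> None:
        ...


class CacheFactory(ABC):
    """Abstract "global" entry point; hands out PerSiteCache instances."""

    @abstractmethod
    def site_cache(self, site: str) -> PerSiteCache:
        ...


# Hypothetical in-memory implementation, standing in for the
# DirCacheFactory / PerSiteDirCache pair.
class PerSiteMemoryCache(PerSiteCache):
    def __init__(self) -> None:
        self._pages: Dict[str, CacheObject] = {}

    def get(self, url: str) -> Optional[CacheObject]:
        return self._pages.get(url)

    def put(self, obj: CacheObject) -> None:
        self._pages[obj.url] = obj


class MemoryCacheFactory(CacheFactory):
    def __init__(self) -> None:
        self._sites: Dict[str, PerSiteMemoryCache] = {}

    def site_cache(self, site: str) -> PerSiteCache:
        # Create the per-site cache on first request, then reuse it.
        return self._sites.setdefault(site, PerSiteMemoryCache())
```

  Code written against the two abstract classes does not need to know
  which implementation it was handed.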

</boringdiscussionofsitescooperinternals>


Here are the details of the DirCache mechanism, and how it differs from the
old one. The big wins are as follows:

  - each site maintains a separate cache; so e.g. the Salon front page and
    the Salon archives sites will both be able to scoop the same articles
    without conflicting.

  - the modification dates of cached files, and their URLs, are stored
    in a Berkeley DB (or whatever database is available on the platform)
    instead of a text file, so it should be faster when starting up
    and generally more efficient.

  - the new modules allow new caching implementations to be plugged in
    easily, if desired.
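
The first two points can be sketched roughly as follows, again in Python
for illustration only.  Python's dbm module picks whichever dbm-style
database is available on the platform, which is similar in spirit to the
"Berkeley DB or whatever is available" approach.  The SiteIndex class,
the directory layout and the "index" file name are all assumptions, not
the actual DirCache on-disk format.

```python
import dbm
import os
import tempfile
from typing import Optional


class SiteIndex:
    """URL -> modification time, kept in one dbm file per site."""

    def __init__(self, cache_root: str, site: str) -> None:
        # Each site gets its own directory, so two sites can record
        # the same URL without stepping on each other.
        site_dir = os.path.join(cache_root, site)
        os.makedirs(site_dir, exist_ok=True)
        self._path = os.path.join(site_dir, "index")

    def set_mtime(self, url: str, mtime: int) -> None:
        with dbm.open(self._path, "c") as db:
            db[url] = str(mtime)

    def get_mtime(self, url: str) -> Optional[int]:
        with dbm.open(self._path, "c") as db:
            raw = db.get(url)
            # dbm returns values as bytes; None means "not cached".
            return int(raw.decode()) if raw is not None else None


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    front = SiteIndex(root, "salon-front")
    archives = SiteIndex(root, "salon-archives")
    front.set_mtime("http://example.com/article", 961000000)
    print(front.get_mtime("http://example.com/article"))
    print(archives.get_mtime("http://example.com/article"))
```

A lookup touches only the one key it needs, rather than parsing a whole
text file at startup.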

Downsides:

  - all existing cached pages become invalid and will be expired, so the
    first time you run the new version, you'll get lots of articles you've
    already read. Sorry about that.

  - the use of db/dbm databases could cause trouble portability-wise.

  - I wasn't able to get rid of the current need to cache the entirety of
    a page's text; too many neat features (like resilience against
    start/end-pattern changes, and diffing) rely on it.  Ah well.

  - version number has been bumped up to 3.1.0, as it's a big change.
    (Not really a downside ;)
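
To show why diffing needs the whole page text in the cache, here is a
minimal sketch using Python's difflib.  It is not how Sitescooper itself
diffs pages; the new_lines helper is hypothetical.

```python
import difflib
from typing import List


def new_lines(cached_text: str, fresh_text: str) -> List[str]:
    """Return the lines present in fresh_text but not in cached_text."""
    old = cached_text.splitlines()
    new = fresh_text.splitlines()
    # ndiff prefixes lines that appear only in the second sequence with "+ ".
    return [line[2:] for line in difflib.ndiff(old, new)
            if line.startswith("+ ")]
```

Without the previously cached text there is nothing to compare the fresh
page against, so a checksum or a date alone would not be enough.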

Anyway -- it works fine on Red Hat 6.2 and, I'd imagine, on any other UNIX
with a perl built with Berkeley DB support.  But I'd like a volunteer from
the Windows and Mac communities, if possible, to try it out (the dev version
on sitescooper.org) and tell me if it installs and works OK...

--j.

_______________________________________________
Sitescooper-talk mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/sitescooper-talk
