Thanks Garth,

I put it aside until I could find enough clear time to read it through. It
makes for good reading, and your comment about caching being 'funny'/suiting
different situations in different ways is definitely correct. I think you've
helped me understand/clarify some mistaken assumptions!

I think that was where the previous part of the conversation was going (and I
will be interested to hear how others respond), and it explains why I've
previously had a tendency to avoid adding such an extra 'level' (or indeed,
levels) to any solution.

In a 'newspaper' situation, the number of db entries/pages is small in relation
to the number of inbound (HTTP) requests, and that has an important bearing on
the likelihood of success. Too many of the applications where I have considered
a cache had an expected ratio/hit rate too low to justify the extra cost and
complexity, so it always seemed easier to look at the twin anathemas of
bolstering the hardware and/or slimming down the ambitions behind the content...
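As a rough back-of-envelope sketch of that ratio argument (all numbers are
hypothetical, and the even-traffic model is an assumption, not anything from
the thread), the expected hit rate falls straight out of pages-times-TTL-windows
versus request volume:

```php
<?php
// Hypothetical back-of-envelope check: is a cache worth it?
// Assumes requests are spread roughly evenly over $pages distinct
// pages, and each cached copy stays valid for $ttlSeconds.
function expectedHitRate(int $requestsPerHour, int $pages, int $ttlSeconds): float
{
    // Each distinct page costs at most one miss per TTL window;
    // everything else is served from the cache.
    $windowsPerHour = 3600 / $ttlSeconds;
    $missesPerHour  = min($requestsPerHour, $pages * $windowsPerHour);
    return 1.0 - ($missesPerHour / $requestsPerHour);
}

// Newspaper-like: few pages, heavy traffic -> caching pays off.
echo expectedHitRate(100000, 500, 300), "\n";  // ~0.94
// Many pages, light traffic -> every request is a miss.
echo expectedHitRate(1000, 5000, 300), "\n";   // 0
```

The second case is the "too low to justify the cost" situation described above:
when there are more distinct pages than requests per TTL window, the cache never
gets a chance to be warm.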

Thanks for taking the time,
=dn


----- Original Message -----
From: "Garth Dahlstrom" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: 23 January 2002 05:16
Subject: Re: [PHP] Re: How should I cache database data for php?


> DL,
>
> Since I have a willing audience I'll ramble on a bit... :)
>
> > =were the page subcomponents formatted as HTML or simple text?
>
> The templates were an XML "scripting" language that got
> rendered through a Java App server into cached HTML.
>
> In place of the XML "scripting", one could also use custom tags
> that hooked Java classes, but no one on our team knew Java well enough
> and I was able to demonstrate that the XML scripting stuff
> was good-enough(tm).
>
> The system for publishing is called Content Server by OpenMarket.
> The company I worked for is Toronto's biggest newspaper -
> http://www.thestar.com
>
> Some neat concepts, geared towards publishers, and they let you
> get inside and muck about with most of the code that drove
> the interface and content controls - publish events (all done in XML
> script)...
>
> > =how was the fact that a particular page used certain subcomponents
> > tracked? (eg to know what to update, when)
>
> This was done in a database; there were objects for
> pages (instances of templates), templates (page layout
> slots + base render code), articles, links, pictures,
> collections (lists of articles, links, pictures), and you
> could define your own object types and their relationships
> to the predefined ones.
>
> When a page was changed, it was flagged for publish in another
> table; then all of the subcomponents associated with that
> page were purged from the cache and re-rendered
> as pieces (by calling them through a certain URL), and
> then the page-level caches were purged. (We hacked this
> part in ourselves.)
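A minimal sketch of that purge-and-rerender publish flow (every class and
function name here is a hypothetical stand-in, not OpenMarket's actual API):

```php
<?php
// Hypothetical sketch of the purge-and-rerender flow described above.
// None of these names come from OpenMarket Content Server.

class FragmentCache
{
    private array $store = [];

    public function put(string $key, string $html): void { $this->store[$key] = $html; }
    public function purge(string $key): void { unset($this->store[$key]); }
    public function has(string $key): bool { return isset($this->store[$key]); }
}

// Stand-in render step: in the real system this was an HTTP request
// against a special URL that re-rendered one subcomponent.
function renderPiece(string $pieceId): string
{
    return "<div>rendered $pieceId</div>";
}

function publishPage(FragmentCache $cache, string $pageId, array $pieceIds): void
{
    // 1. Purge every subcomponent associated with the page...
    foreach ($pieceIds as $pieceId) {
        $cache->purge($pieceId);
    }
    // 2. ...re-render each piece back into the cache...
    foreach ($pieceIds as $pieceId) {
        $cache->put($pieceId, renderPiece($pieceId));
    }
    // 3. ...then purge the page-level cache so the next request
    //    reassembles the page from the fresh pieces.
    $cache->purge("page:$pageId");
}
```

The ordering matters: pieces are warmed before the page cache is dropped, so
the first request after a publish only pays for page assembly, not for
re-rendering every fragment.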
>
> It's a combo of disk and memory caching handled by
> OpenMarket. I told my coworkers to stay away from
> publishing static content (3 years of that left us with
> over 200,000 pages of outdated stuff, with no one knowing what
> was important and what wasn't; of course, they weren't using
> a DB when I started there either, but that's another story ;) ).
>
> > =why did/do you think the newspaper made the decision to do the
> > former, than to try this ADODB-type solution?
>
> Well, they had a custom-built in-house solution that created
> those 200k pages I mentioned, and a perl trawler that
> used to go through each one of them to update parts of it by
> seeking out special comments (I didn't write it, but that
> was FUNNY stuff).  So they wanted a vendor product...
> quite understandable.
>
> Also, I didn't hear about ADODB till this year.
>
> > =they both achieve the stated objective (reduce load on the
> > db-server). Is one somewhat lesser/better than the
> > other?
>
> The DB server is used only for handling session state (but there
> isn't much to that); all of the content is cached to disk
> or to memory (this was needed because NAS2.1 was *SO* slow
> to render the content in the first place).
>
> Where the system is "not so good, Al" is the dynamic content
> on each page view or per visitor; for example, personalisation
> was not really a possibility because of the rigid cache setup.
>
> This is where an ADODB-cached data set solution would be
> my preferred choice, because you could mirror some data out to the
> webservers (weather, scores, etc)... you still have to hit the
> DB for session stuff (unless you are fond of sticky bits tying
> your state to one web server in a multi-web-server env).
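A hand-rolled sketch of that mirroring idea (the function name and paths are
assumptions made for illustration; ADOdb's real CacheExecute() provides the
equivalent for query recordsets):

```php
<?php
// Hand-rolled TTL cache in the spirit of ADOdb's CacheExecute():
// mirror slow-changing data (weather, scores) onto each webserver's
// local disk so page views don't hit the DB. Names and the cache
// path are assumptions, not part of any real API.

function cachedFetch(string $key, int $ttlSeconds, callable $fetchFromDb): array
{
    $file = sys_get_temp_dir() . '/mirror_' . md5($key) . '.json';

    // Serve the local copy while it is still fresh.
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        return json_decode(file_get_contents($file), true);
    }

    // Stale or missing: hit the DB once, rewrite the local mirror.
    $data = $fetchFromDb();
    file_put_contents($file, json_encode($data));
    return $data;
}

// Usage: only the first call per TTL window touches the "DB".
$weather = cachedFetch('weather', 300, function (): array {
    return ['toronto' => '-5C'];  // stand-in for a real query
});
```

Because every webserver keeps its own mirror file, this works in a multi-server
setup without sticky sessions, at the cost of each box being up to one TTL
behind the database.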
>
> Caching is a funny thing; the behaviour you want to use really
> depends on the nature of the content you deliver... You
> won't find a silver-bullet method that does it perfectly for all
> content.
>
> We used to force the caches to purge when we published new stuff,
> but that worked because publishing was done in batches at certain
> times (infinite TTL). But if you were publishing info
> every minute (e.g. stock quotes), then you might want to
> cache solely on a 1-minute TTL for the data... and perhaps not
> even cache at the page level if it changes so much.
>
> Anyway... I've rambled long enough... :)
>
> -Garth
>
> Northern.CA ===--
> http://www.northern.ca
> Canada's Search Engine
>
>
>


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
