Thanks Garth - I put it aside until I could find enough clear time to read it
through. It makes for good reading - and your comment about caching being
'funny'/suiting different situations in different ways is definitely correct.
I think you've helped me understand/clarify some mistaken assumptions!
I think that was where the previous part of the conversation was going (and I
will be interested to hear how others respond) - and why I've previously had a
tendency to avoid adding such an extra 'level' (or indeed, levels) to any
solution. In a 'newspaper' situation, the number of db entries/pages is small
relative to the number of inbound (HTTP) requests, and that has an important
bearing on the likelihood of success. Too many of the applications where I have
considered a cache had an expected ratio/hit rate too low to justify the extra
costs and complexity, so it always seemed easier to look at the twin anathemas
of bolstering the hardware and/or slimming down the ambitions behind the
content...

Thanks for taking the time,
=dn

----- Original Message -----
From: "Garth Dahlstrom" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: 23 January 2002 05:16
Subject: Re: [PHP] Re: How should I cache database data for php?

> DL,
>
> Since I have a willing audience, I'll ramble on a bit... :)
>
> > =were the page subcomponents formatted as HTML or simple text?
>
> The templates were an XML "scripting" language that got
> rendered through a Java app server into cached HTML.
>
> In place of the XML "scripting", one could also use custom tags
> that hooked into Java classes, but no one on our team knew Java well
> enough, and I was able to demonstrate that the XML scripting stuff
> was good-enough(tm).
>
> The system used for publishing is called Content Server, by OpenMarket.
> The company I worked for is Toronto's biggest newspaper -
> http://www.thestar.com
>
> Some neat concepts, geared towards publishers, but they let you
> get inside and muck about with most of the code that drove
> the interface and the content controls - publish events (all done in
> XML script)...
>
> > =how was the fact that a particular page used certain subcomponents
> > tracked?
> > (eg to know what to update, when)
>
> This was done in a database; there were objects for
> pages (instances of templates), templates (page layout
> slots + base render code), articles, links, pictures,
> collections (lists of articles, links, pictures), and you
> could define your own object types and their relationships
> to the predefined ones.
>
> When a page was changed, it was flagged for publish in another
> table, and then all of the subcomponents associated with that
> page were purged from the cache and re-rendered
> as pieces (by calling them through a certain URL), and
> then the page-level caches were purged. (We hacked this
> part in ourselves.)
>
> It's a combo of disk and memory caching handled by
> OpenMarket. I told my coworkers to stay away from
> publishing static content (3 years of that left us with
> over 200,000 pages of outdated stuff where no one knew what
> was important and what wasn't; of course, they weren't using
> a DB when I started there either, but that's another story ;) ).
>
> > =why did/do you think the newspaper made the decision to do the
> > former, rather than to try this ADODB-type solution?
>
> Well, they had a custom-built, in-house solution that created
> those 200k pages I mentioned, and a Perl trawler that
> used to go through each one of them to update parts of it by
> seeking out special comments (I didn't write it, but that
> was FUNNY stuff). So they wanted a vendor product...
> quite understandable.
>
> Also, I didn't hear about ADODB till this year.
>
> > =they both achieve the stated objective (reduce load on the
> > db-server). Is one somewhat lesser/better than the
> > other?
>
> The DB server is used only for handling session state (but there
> isn't much to that); all of the content is cached to disk
> or to memory (this was needed because NAS2.1 was *SO* slow
> to render the content in the first place).
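[Editorially, the purge-and-republish flow described above - flag the changed
page, purge its cached subcomponents, re-render each piece, then purge the
page-level cache - can be sketched in a few lines of PHP. Every name here
(FRAGMENT_DIR, render_fragment, republish_page) is a hypothetical illustration,
not OpenMarket's actual API:]

```php
<?php
// Sketch of purge-and-republish cache invalidation, assuming a simple
// file-per-fragment cache. All names are illustrative only.

define('FRAGMENT_DIR', '/tmp/fragment-cache');

// Stand-in for fetching a fragment's fresh HTML (the real system did
// this by requesting each piece through a special URL).
function render_fragment($pageId, $fragment)
{
    return "<div><!-- $pageId/$fragment rendered at " . time() . " --></div>";
}

function republish_page($pageId, array $fragments)
{
    foreach ($fragments as $fragment) {
        $path = FRAGMENT_DIR . "/$pageId-$fragment.html";
        if (file_exists($path)) {
            unlink($path);                       // purge the stale copy
        }
        file_put_contents($path, render_fragment($pageId, $fragment));
    }
    // Finally, purge the page-level cache so the next request
    // reassembles the page from the fresh fragments.
    $page = FRAGMENT_DIR . "/$pageId-page.html";
    if (file_exists($page)) {
        unlink($page);
    }
}

if (!is_dir(FRAGMENT_DIR)) {
    mkdir(FRAGMENT_DIR, 0777, true);
}
republish_page('frontpage', array('headline', 'weather', 'scores'));
```

[The point of the shape is that invalidation is driven by the publish event,
not by a timer - which is why it suits batch publishing, as discussed below.]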
>
> Where the system is "not so good, Al" is the dynamic content
> on each page view or per visitor; so, for example, personalisation
> was not really a possibility because of the rigid cache setup.
>
> This is where an ADODB-cached data set solution would be
> my preferred choice, because you could mirror some data out to the
> webservers (weather, scores, etc)... you still have to hit the
> DB for session stuff (unless you are fond of sticky bits holding
> your state to one web server in a multi-web-server env).
>
> Caching is a funny thing; the behaviour you want to use really
> depends on the nature of the content you deliver... You
> won't find a silver-bullet method to do it perfectly for all
> content.
>
> We used to force the caches to purge when we published new stuff,
> but that worked because publishing was done in batches at certain
> times (infinite TTL). But if you were publishing info
> every minute (i.e. stock quotes), then you might want to
> cache solely on a 1-minute TTL for data... and perhaps not
> even cache at a page level if it changes so much.
>
> Anyway... I've rambled long enough... :)
>
> -Garth
>
> Northern.CA ===--
> http://www.northern.ca
> Canada's Search Engine

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
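[Editorial footnote: the 1-minute-TTL strategy Garth suggests for fast-changing
data is what ADODB's query cache gives you (CacheExecute() takes a
seconds-to-live argument before the SQL). The same idea, stripped down to a
self-contained file-based sketch - fetch_quotes, quote_cache_get, and the
cache path are all hypothetical names, not part of any library:]

```php
<?php
// Sketch of TTL-based data caching, per the stock-quote example above:
// serve the cached copy if it is younger than the TTL, otherwise rebuild.

define('QUOTE_CACHE_FILE', '/tmp/quotes.cache');
define('QUOTE_CACHE_TTL', 60);   // one minute, as in the stock-quote example

// Stand-in for the expensive DB query we are trying to avoid.
function fetch_quotes()
{
    return array('ACME' => 12.34, 'NRTH' => 56.78);
}

function quote_cache_get()
{
    if (file_exists(QUOTE_CACHE_FILE) &&
        (time() - filemtime(QUOTE_CACHE_FILE)) < QUOTE_CACHE_TTL) {
        // Cache hit: younger than the TTL, so skip the DB entirely.
        return unserialize(file_get_contents(QUOTE_CACHE_FILE));
    }
    // Cache miss (or expired): rebuild and re-stamp the cache file.
    $quotes = fetch_quotes();
    file_put_contents(QUOTE_CACHE_FILE, serialize($quotes));
    return $quotes;
}

$first = quote_cache_get();   // first call hits the "DB"
$again = quote_cache_get();   // within 60s: served from the cache file
```

[Unlike the purge-on-publish approach, nothing here needs to know when the
data changed; staleness is simply bounded by the TTL, which is why it suits
content that updates on its own schedule.]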