Perrin Harkins wrote:
Christopher L. Everett wrote:
But I haven't
been able to wrap my skull around knowing when the data in MySQL is
fresher than what is in the cache without doing a major portion of the
work needed to generate that web page to begin with.
There are three ways to handle cache synchronization:
1) Time to Live (TTL). This approach just keeps the data cached for a
certain amount of time and ignores possible updates. This is the most
popular because it is easy to implement and gives good performance.
Cache::Cache and friends work this way.
I'm cursed by my installed base. Our users go into our site to "make
sure" their changes are up correctly. I don't think a 15 second TTL
would do us any good :)
2) Polling. This involves checking the freshness of the data before
serving it from cache. This is only feasible if you have a way to check
freshness that is faster than re-generating the data. This is difficult
in most situations.
3) Invalidation. This approach involves removing cache entries whenever
you update something that would make them out of date. This is only
feasible if you have total control over the update mechanism and can
calculate all the dependencies quickly.
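
To make the difference between the three approaches concrete, here is a minimal Python sketch. The class name `PageCache`, its method names, and the `version_of` callback are hypothetical, invented for illustration; they are not the API of Cache::Cache or any other library. The clock is injectable only so the behavior can be checked without sleeping.

```python
import time


class PageCache:
    """Toy in-memory cache showing TTL, polling, and invalidation."""

    def __init__(self, ttl=15.0, clock=time.monotonic):
        self.ttl = ttl          # seconds an entry may be served under TTL
        self.clock = clock      # injectable time source (assumption, for testing)
        self.entries = {}       # key -> (value, stored_at, version)

    def put(self, key, value, version=None):
        self.entries[key] = (value, self.clock(), version)

    def get_ttl(self, key):
        """1) TTL: serve until the entry is older than ttl, ignoring updates."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, stored_at, _ = entry
        if self.clock() - stored_at > self.ttl:
            del self.entries[key]
            return None
        return value

    def get_polled(self, key, version_of):
        """2) Polling: version_of(key) must be cheaper than regenerating."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, _, version = entry
        if version_of(key) != version:
            del self.entries[key]
            return None
        return value

    def invalidate(self, key):
        """3) Invalidation: the update path removes stale entries itself."""
        self.entries.pop(key, None)
```

A `None` return means "regenerate the page and call `put` again". Polling only pays off when `version_of` is something cheap, such as a single timestamp or counter lookup, rather than the full set of queries behind the page.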
I see where one could combine polling and invalidation, for instance
by having empty files representing a page that get touched when the
data for them go out of date.
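
A rough Python sketch of that marker-file idea, assuming one cached file and one empty marker file per page. The function names and the file layout are hypothetical; the only real calls used are `os.utime` and `os.path.getmtime`.

```python
import os


def mark_stale(marker_path):
    """Touch the marker file; called by whatever code updates the data."""
    with open(marker_path, "a"):
        pass                        # create the file if it does not exist
    os.utime(marker_path, None)     # set its mtime to the current time


def cached_page_is_fresh(cache_path, marker_path):
    """Poll two mtimes instead of re-running the page's queries."""
    if not os.path.exists(cache_path):
        return False                # nothing cached yet
    if not os.path.exists(marker_path):
        return True                 # never marked stale
    # Equal timestamps are treated as stale, to err on the side of freshness.
    return os.path.getmtime(cache_path) > os.path.getmtime(marker_path)
```

The freshness check costs two `stat` calls per request, which is the "faster than re-generating" test that polling needs. The hard part, knowing which markers to touch when a row changes, is left to the caller.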
But again, there is the issue of mapping changed data onto dependent
pages. I guess one way to do that is to track which database rows
appear in which pages in the database. Since typically I do several
database operations to generate a page, adding one more delete or
insert operation whenever a new page is generated won't kill me.
Could get nasty in a big hurry if I'm not careful though. Perhaps
a cache manager object/class that handles cache mappings & invalidation
would be handy. Or maybe do that as part of the PageKit base Model class.
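
One possible shape for such a cache manager, sketched in Python with an in-memory SQLite table standing in for the real database. Everything here is an assumption made for illustration: the `CacheManager` class, the `page_deps` table and its columns, and keeping page bodies in a dict. None of it is PageKit API.

```python
import sqlite3


class CacheManager:
    """Tracks which (table, row id) pairs each cached page was built from."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS page_deps ("
            " page TEXT NOT NULL, tbl TEXT NOT NULL, row_id INTEGER NOT NULL,"
            " PRIMARY KEY (page, tbl, row_id))"
        )
        self.pages = {}  # page -> cached HTML (stand-in for a real cache)

    def store(self, page, html, deps):
        """Cache a page and replace its recorded dependencies."""
        with self.conn:  # one transaction: commit on success, roll back on error
            self.conn.execute("DELETE FROM page_deps WHERE page = ?", (page,))
            self.conn.executemany(
                "INSERT OR IGNORE INTO page_deps VALUES (?, ?, ?)",
                [(page, tbl, row_id) for tbl, row_id in deps],
            )
        self.pages[page] = html

    def fetch(self, page):
        return self.pages.get(page)

    def row_changed(self, tbl, row_id):
        """Invalidate every page that used the changed row; return their names."""
        rows = self.conn.execute(
            "SELECT page FROM page_deps WHERE tbl = ? AND row_id = ?",
            (tbl, row_id),
        ).fetchall()
        stale = [r[0] for r in rows]
        with self.conn:
            for page in stale:
                self.conn.execute("DELETE FROM page_deps WHERE page = ?", (page,))
                self.pages.pop(page, None)
        return stale
```

Usage would look something like `mgr.store("/jobs/1", html, [("jobs", 1), ("employers", 7)])` at page-generation time and `mgr.row_changed("employers", 7)` in the update path. The extra writes per generated page are the cost mentioned above, and they grow with the number of rows a page touches.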
One more thing. Perrin Harkins' eToys case study casually mentions a
means of removing files from the mod_proxy cache directory so that
mod_proxy had to go back to the application servers to get an up to
date copy. I haven't seen anything in the mod_proxy docs that says
this is possible. Does something like that exist outside of eToys?
Not in mod_proxy. We added it ourselves. I don't have the code for
that anymore, but it's not hard to do if you have a competent C hacker
handy. Maybe mod_accel has this feature.
Well, I like to think I'm language independent, heh. But reinventing
the wheel isn't cheap. I'll root around some more.
--
Christopher L. Everett
Chief Technology Officer
The Medical Banner Exchange
Physicians Employment on the Internet