As with most situations, the answer is probably: it depends. My suggestion is this: any data that is a) guaranteed to be consistent until expiration (i.e. will never need to be invalidated), b) really small, and c) accessed very, very frequently can be stored locally (though I'd probably still have that local cache pull from memcached rather than directly from the database).
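To make that concrete, here is a rough sketch of the kind of local layer I have in mind, assuming each entry simply carries an "expire by" time and is never invalidated early. The names (LocalTTLCache, fetch_shared, ttl_seconds) are made up for illustration, and fetch_shared stands in for whatever memcached client call you already use; a plain dict plays that role below so the snippet runs on its own.

```python
import time


class LocalTTLCache:
    """Tiny in-process cache that sits in front of a shared cache.

    Only suitable for data that is small, hot, and safe to serve
    until its expiry time (no early invalidation is supported).
    """

    def __init__(self, fetch_shared, ttl_seconds, clock=time.monotonic):
        # fetch_shared: hypothetical callable, key -> value or None.
        # In a real deployment this would wrap a memcached "get".
        self._fetch_shared = fetch_shared
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Still fresh: serve locally, no network call.
            return entry[1]
        # Missing or expired: fall back to the shared cache.
        value = self._fetch_shared(key)
        if value is not None:
            self._entries[key] = (now + self._ttl, value)
        else:
            self._entries.pop(key, None)
        return value


# Stand-in for memcached so the sketch is self-contained.
shared_cache = {"site:title": b"Example"}
local = LocalTTLCache(shared_cache.get, ttl_seconds=30.0)

first = local.get("site:title")   # goes to the shared cache
second = local.get("site:title")  # served from the local copy
```

Note that every front end machine running this holds its own copy of each hot key, which is exactly why I would restrict it to data matching the three conditions above.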
Anything else, put in memcached. Our own testing has shown memcached to be really fast and the network access times to be really low, and honestly, I suspect memcached alone will be all you need. If you find that those network access delays are causing a problem, then data that fits the model above can be put into a local cache. If it doesn't fit that model, though, I'd avoid caching it locally because of all the problems other people have mentioned. You'll be caching the same data multiple times. You'll have to manage multiple caches. You'll have to invalidate bad data across multiple caches. Blah blah blah. It's a lot of work and it's fairly inefficient.

I really like memcached because it's easy, it's fast, and it Just Works in so many cases.

Nicholas

On Tue, May 12, 2009 at 10:07 AM, JonB wrote:
> On May 12, 12:56 pm, Henrik Schröder wrote:
> > If you went with the application cache in your case, how would you handle
> > cache invalidations? Would you have to develop a mechanism to tell all of
> > your front end machines that a certain key has expired, or isn't your
> > application affected if it gets stale data from the cache? Yes, memcached
> > is slower than a local application cache, but it's synchronized, which
> > means that if one of your front end machines updates or deletes a key,
> > that change is visible immediately to your entire application. To compare
> > a local application cache with memcached, you would have to factor in both
> > the network overhead of fetching something from memcached, and the network
> > overhead of synchronizing your local application caches. The latter case
> > scales really, really badly if you have a lot of front end machines.
> >
> > /Henrik
>
> If we went with a local cache, it wouldn't be shared between machines
> - in reality, a lot of the front end machines would end up with almost
> 'identical' caches (as they often run the same queries) - but it
> wouldn't be shared.
>
> Expiry would be the same as memcached - when data's added to the local
> cache, it gets an 'expire by' time stored with it.
>
> By the sound of it, memcached will work - it'll cost me network calls
> (but not to the MySQL server) - the only remaining doubt I have is
> over the data being stored in it.
>
> Not a lot of our data can be aggregated before being cached - so we're
> left again, looking in the cache for very small bits of data (a few
> bytes to a few hundred bytes).
>
> I'll have to sit down and work out where in the application it's
> likely to sit, and whether the cost of the network calls etc. to get a
> few bytes, is actually worth it.
>
> I guess having a 'localhost' cache would avoid some of the overhead,
> as would UDP.
>
> I have a feeling this is all going to end in a 'write it and see' kind
> of solution :)
>
> At least in my head it seems like it might play out - after all, even
> if the SQL queries usually return very little data - some of them are
> quite expensive for the server to run, and very static in nature - so
> we're trading cycles on the front end (making it less efficient) to
> push that load to the front end, rather than the MySQL servers.
