I would actually say that the size of the data isn't what's relevant
here. What matters is that you don't need a synchronized cache and that
generating the cached data is fairly cheap, so in your case local caches
with short expiration times would be a much better fit.

Memcached will work fine for you, and it will be much faster than going to
the db every time. But from your description, your application can't really
benefit from the main point of memcached, which is that it is a distributed,
synchronized cache. It works best when you have objects that are expensive
to create but can be cached for a long time and properly invalidated or
updated in the cache, and where you need the cache to be synchronized.

I would definitely try a local cache as well in your case and measure that
against your memcached implementation.


/Henrik

On Tue, May 12, 2009 at 20:58, JonB wrote:

>
>
>
> On May 12, 6:31 pm, Henrik Schröder wrote:
> > Yes, but how would you do cache invalidations?
> >
> > Right now in your existing application, you don't have to worry about
> > when the underlying data changes, since all reads go to the one and
> > only storage of this data. But if you add a cache layer, you have to
> > start worrying about cache invalidation. If you have a local cache, and
> > one machine changes the underlying data, you somehow have to tell all
> > your other machines to refresh parts of their caches as well. How would
> > you do that? How much would that mechanism cost?
>
> If the data was cached locally by the application, it would use the
> same system as memcached, i.e. the data is given an 'expire by time'
> when it should be discarded - on that machine. This will obviously be
> different on each machine, so each machine would discard it at
> different times, but so long as the data is invalidated, say 5 minutes
> after it was last read - that isn't an issue.
>
> The application doesn't care if 'Machine A' changes data that's in
> 'Machine B's cache - so long as it knows 'Machine B's cache will only
> persist for 5 minutes.
>
> > On the other hand, if you use memcached, the cache is shared, so if one
> > machine changes what's in memcached, all other machines will pick up the
> > new values immediately. This way, updates are cheaper, but at a higher
> > read cost. What's the ratio of updates/reads in your application?
> >
> > Or is cache invalidation irrelevant for your application? Will it run
> > fine with stale data?
>
> It's not quite irrelevant, it's just 'very tolerant' of stale data.
>
> At the end of the day it comes down to:
>
>  - Do I make a single function call, to return a few bytes of data
> from a 'global slab' of RAM on the local machine.
>
> or,
>
>  - Do I make a memcached call to retrieve what could be a few bytes of
> data, from the memcache 'system'.
>
> Both will do what I want - I guess what I'm looking for is for someone
> to say either:
>
> "It sounds like, for the small data sets you're working with, there
> won't be much benefit to using memcache - when you take into account the
> processing/network overheads etc."
>
> or,
>
> "The overheads aren't that great - they'll still be a huge amount less
> than hitting the MySQL server, they'll still save the server for more
> write bandwidth, and at least it'll scale and give a coherent cache
> view across 'n' number of machines".
>
> I think I'll just install it - see how easy it is to integrate, and
> then run some test loads through the system and see what it does... I
> have no doubt it'll make the MySQL server's life easier (shielding it
> from the bulk of several hundred SELECTs a second) - and if the front
> end is still spending more time 'doing useful work' than the overhead
> of using the cache, we'll go with it.
>
> Phew - I think I just answered the question :)
