Re: Algorithm for automatic cache invalidation

Jakub Łopuszański Tue, 22 May 2012 15:08:09 -0700

Well, this is already deployed to a production system and works just great.
The reason it is faster is that in some situations you cannot easily shard
a database, because some queries would cross shard boundaries. Of course,
this means that a regular DB will not scale unless you change your queries.
A cache, on the other hand, is easier to distribute/shard/scale.
It is also easier (and in fact I do that in production) to have a memcached
running on each frontend machine (whereas it would be a strange idea to put
database shards on the frontends, which are usually diskless).
Of course one could argue that you should just avoid complicated queries
and shard everything, but even then the cost of communicating over the
network between a frontend and a MySQL backend can be larger than the cost
of a single multiget to memcached (even if the network latency is similar,
I believe that the network stack of memcached is one of the fastest).


In my particular example I have 6 diskless frontend machines, each running
apache2 and memcached, plus a single database and a single global memcached.
Most queries are resolved without any network communication at all, as they
result in a local cache hit.
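
To make the lookup order in that setup concrete, here is a minimal sketch in
Python. It is an illustration only, not the production code: the class name
TieredCache, the network_trips counter, and the use of plain dicts as
stand-ins for the memcached instances are all assumptions, and so is the
policy of filling both cache tiers on a miss. Invalidation is deliberately
left out.

```python
from typing import Callable, Dict, Optional


class TieredCache:
    """Read-through lookup: local cache, then global cache, then database.

    Hypothetical sketch: plain dicts stand in for the memcached instances.
    """

    def __init__(self, local: Dict[str, str], shared: Dict[str, str],
                 load_from_db: Callable[[str], Optional[str]]) -> None:
        self.local = local                # memcached on this frontend
        self.shared = shared              # the single global memcached
        self.load_from_db = load_from_db  # query against the database
        self.network_trips = 0            # lookups that left this machine

    def get(self, key: str) -> Optional[str]:
        # 1. Local hit: answered without any network communication.
        if key in self.local:
            return self.local[key]
        # 2. Local miss: one trip to the global cache.
        self.network_trips += 1
        value = self.shared.get(key)
        if value is None:
            # 3. Global miss: one more trip, this time to the database.
            self.network_trips += 1
            value = self.load_from_db(key)
            if value is not None:
                self.shared[key] = value
        # Fill the local tier so the next lookup stays on this machine.
        if value is not None:
            self.local[key] = value
        return value


# Two frontends sharing one global cache and one database.
db = {"user:1": "alice"}
shared: Dict[str, str] = {}
frontend_a = TieredCache({}, shared, db.get)
frontend_b = TieredCache({}, shared, db.get)

frontend_a.get("user:1")  # misses both caches, then reads the database
frontend_a.get("user:1")  # served from frontend A's local cache
frontend_b.get("user:1")  # misses locally, hits the global cache
```

In this sketch a repeated read on the same frontend never leaves the
machine, and a first read on another frontend costs one trip to the global
cache instead of a database query.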

I think that the developers of MySQL have no way (or incentive) to build as
powerful a cache into the database, as no single database server has as
much RAM, as many network cards, or as many CPUs as the cloud of 55
memcached instances used at nk.pl.


On Friday, 11 May 2012 19:57:58 UTC+2, Perrin Harkins wrote:
>
> On Fri, Apr 27, 2012 at 6:14 PM, Jakub Łopuszański <[email protected]> 
> wrote: 
> > Hi, I’d like to share with you an algorithm for cache invalidation that
> > I’ve come up with, and successfully implemented in a real world
> > application.
>
> This may be a silly question, but have you benchmarked your cached 
> application against just going straight to the database?  I've always 
> had the impression that keeping a perfect cache of a database that 
> beats it in performance was not possible because the overhead of cache 
> invalidation (both in the cache and in the application) would ruin the 
> performance gains on reads.  Caches usually beat database performance 
> by sacrificing accuracy (e.g. allowing race conditions and non-ACID 
> behavior) and freshness of data. 
>
> To put it another way, if it was possible to have an up-to-date cache 
> that outperforms an RDBMS with the same data, wouldn't the makers of 
> that RDBMS simply build that cache into their product? 
>
> - Perrin 
>
