> Is there any reason why Memcached couldn't be baked into a database
> interface module?

I have seen a memcached-based 2nd level cache for Hibernate:
http://code.google.com/p/hibernate-memcached/

> Has anyone worked out a system using memcached that guarantees
> memcached has the most recent data? Essentially, tying memcached to the
> database so memcached is never allowed to contain any stale data?

Yes, I have tried to achieve this at Lifeblob by building an
application-level caching layer above the memcache API (Dustin's lib)
which takes care of concurrent updates to the cache by using locking
(a lock variable with a small expiry) and/or CAS operations.

So essentially it works like this:

-> Trying to add something to cache
-> CAS operation enabled?
   -> No
      -> Do an add
      -> If the add fails, add a lock to the cache (with a retry
         count/timeout for acquiring the lock)
      -> Once the lock is acquired, get the latest data from either the
         DB or the application
      -> Add to cache and remove the lock
   -> Yes
      -> Perform CAS
      -> If the CAS response finds the CAS value out of sync (something
         else replaced the cached value), then resolve the conflict
         between the current cached value and the value from the
         DB/application
      -> Add to cache (by taking and later releasing the lock)

The above steps can more or less guarantee that the cache has recent
data. But note that they obviously include the overhead of conditionally
acquiring and releasing a lock.

In our case, in places where we need a high level of cache consistency
(and for data which we cache for a very long time), we follow the above
approach. For data that has a very small expiry, the above method would
be overkill and we avoid it.

-XP
Lifeblob

On Thu, Jun 19, 2008 at 9:32 PM, Daniel <[EMAIL PROTECTED]> wrote:

> Hi
>
> Thanks Dustin...
>
> > Adding further metadata to the cache and changing the protocol to
> > return it isn't really an option at this point, but it's easy to add
> > to the data in your application.
>
> Yeah, unless someone really needs it.
> It seems like this feature could
> be added to the current protocol easily enough just by adding an
> "expired" flag to the return data, and perhaps a command line switch to
> enable memcached to return expired data with a timestamp.
>
> > Why would you disable caching just because something's writing?
> > There's always a last write.
>
> Ok, first I better mention what I was referring to with "guaranteed"
> current data.
>
> Database programmers are fanatical about making sure the data is
> consistent to any reader during a whole transaction. They get into all
> sorts of permutations with things like read committed, serializable,
> MVCC processing etc. to handle the issue. The problem is a truly thorny
> one, which explains to me why databases can run so darn slowly!
>
> From my research into concurrent transactions, things just get too hairy
> to try to keep track of all the possible concurrent issues when caching
> anything that's being updated in transactions.
>
> If it has been done, is there anyone interested in sharing how they did
> it, since the locking/transaction/caching issues can be hard to solve?
>
> Speaking of this, it seems that a better solution would have the
> database interface module include caching. It could create a huge
> performance boost by integrating memcached functionality into a
> database interface module. Rather than having all database functionality
> run on a core set of machines, you could have a database pre-processor
> with memcached functionality running on the client machines. The
> pre-processor could handle all of the SQL parsing, the memcached server,
> and communication to the core database machines.
>
> If done right, the database interface wouldn't change a bit, but reads
> from the database could occur so much faster.
>
> To do this, the database programmers would have to be convinced that
> they could provide "guaranteed" accurate cached data.
>
> I've tried talking with the PostgreSQL programmers, and they are stone
> cold to the idea... Perhaps Sun could get a few guys working on it for
> MySQL since they seem to now have an interest in both memcached and
> MySQL, and the technique could do so much better than the current
> caching implemented in MySQL!!!
>
> > Most of this is just tired ramblings of someone guessing at
> > requirements, though. Once there are particular constraints for an
> > application, the mechanism to ensure correctness becomes more clear.
>
> So, just in case my ramblings were not especially clear, has anyone
> dealt with the constraints of caching data in the middle of multiple
> on-going transactions in a way that is demonstrably effective?
>
> Is there any reason why Memcached couldn't be baked into a database
> interface module?
>
> Thanks
>
> Daniel
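
For anyone who wants to try the add/lock/CAS flow outlined in the reply above, here is a minimal sketch in Python. It is not the Lifeblob code. FakeMemcache is a hypothetical in-memory stand-in that only mimics memcached's add/gets/cas/delete semantics, and the names with_lock, put_without_cas, put_with_cas, load_latest and resolve are made up for illustration. With a real server the lock key would also be stored with a short expiry, which this stand-in leaves out.

```python
import itertools
import time


class FakeMemcache:
    """Hypothetical in-memory stand-in for a memcached client (no expiry)."""

    def __init__(self):
        self._data = {}                  # key -> (value, cas_token)
        self._tokens = itertools.count(1)

    def add(self, key, value):
        # Store only if the key is absent, like memcached "add".
        if key in self._data:
            return False
        self._data[key] = (value, next(self._tokens))
        return True

    def set(self, key, value):
        self._data[key] = (value, next(self._tokens))
        return True

    def gets(self, key):
        # Return (value, cas_token), or (None, None) on a miss.
        return self._data.get(key, (None, None))

    def cas(self, key, value, token):
        # Store only if the key is unchanged since `token` was read.
        current = self._data.get(key)
        if current is None or current[1] != token:
            return False
        self._data[key] = (value, next(self._tokens))
        return True

    def delete(self, key):
        return self._data.pop(key, None) is not None


def with_lock(cache, key, action, retries=5, delay=0.01):
    """Run action() while holding a lock key; give up after `retries` tries."""
    lock_key = "lock:" + key
    for _ in range(retries):
        if cache.add(lock_key, 1):       # add succeeds for one writer only
            try:
                return action()
            finally:
                cache.delete(lock_key)   # always release the lock
        time.sleep(delay)
    raise TimeoutError("could not acquire " + lock_key)


def put_without_cas(cache, key, value, load_latest):
    """No-CAS branch: try add; if the key exists, refresh it under a lock."""
    if cache.add(key, value):
        return value

    def refresh():
        latest = load_latest()           # latest data from DB / application
        cache.set(key, latest)
        return latest

    return with_lock(cache, key, refresh)


def put_with_cas(cache, key, value, token, load_latest, resolve):
    """CAS branch: try cas; on a stale token, resolve the conflict under a lock."""
    if cache.cas(key, value, token):
        return value

    def repair():
        cached, _ = cache.gets(key)
        merged = resolve(cached, load_latest())
        cache.set(key, merged)
        return merged

    return with_lock(cache, key, repair)
```

A caller would read the token with gets, compute a new value, and pass both to put_with_cas along with a resolve function such as `lambda cached, from_db: from_db`. How conflicts should actually be resolved depends on the application, so that choice is left to the caller here.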