Hi, thanks Dormando...
> Sorry, I'll weigh in a little here. I've only been skimming so forgive
> if this is a repeat:

Thank you... I appreciate everyone's considered opinions.

> - Overcomplicated. See dustin's rails examples for easy abstractions on
> getting to the 90% mark.

You are absolutely correct... But wouldn't it be worth building a memcached/database combo that's overcomplicated if every app got a performance boost simply by using it the way it already uses a database?

> - Crazy in-memory databases probably aren't that much faster. If you rip
> out most of MySQL's parser and optimize your schema so InnoDB clustered
> indexes and adaptive hash indexes morph into life, that's probably close
> to as far as you're going to go.

First, in-memory databases are, or at least can be, far faster than databases that require spinning-disk access. Flash disk accesses and remote in-memory accesses are somewhere in between. Even with a full-fledged database cluster doing the work, the cluster ends up running more slowly because it has to handle all of the extra database needs: replication, journaling, failover, version control, blocking, etc. In the system I'm describing, the app and the CDD are on the same machine, which avoids extra remote accesses.

More important, however... If we can find a way to do more database processing and data reads "outside of the core", more performance is available than when the core database is handling both reads and writes. This idea helps to separate the reading, in the CDD, from the writes.

FYI, Oracle's product TimesTen seems to have some performance metrics that I believe apply... On page 44 they report one financial trading system seeing an order-of-magnitude performance increase:
http://www.oracle.com/technology/products/timesten/pdf/oow2007/oow07_s291347_timesten_caching_use_cases.pdf

I just thought of another way of describing it.
The CDDs are like reader databases, while the core database handles all writing.

> - Limits your ability to cache the results of processing on multiple
> queries... If you issue three queries, parse the results a little, then
> use that elsewhere, you want to cache the cumulation if it's possible.
> - Shouldn't limit your memcaching to the database :\
> If you're displaying a blog, the amount of processing time you blow on
> the template will likely outweigh any DB activity. Cache the results of
> the entirety or chunks of template renderings. I believe you can KISS
> all sections of this without wasting too much time.

Yes, that is a better memcached-specific application design. What I'm suggesting will be significantly slower than an app designed around memcached from the start. That's why I believe it's worthwhile to include the current memcached functionality in the CDD.

----

I don't get it... Here in "memcached land" we're dealing with situations where failing to warm up the cache before going live can make sites blow up; meanwhile people are saying/thinking that a generic memcached/database combination isn't worth the trouble. I think that combining the two will provide a huge performance boost, as long as all the design decisions made support non-blocked processing for the core database.

Memcached is great as it is, but it sometimes returns old data. The database knows when this data changes. Why can't we develop a system in which the database alerts/updates the cache when data is changed? Additionally, for example, the CDDs could have a special protocol that lets them access data directly, which should be faster than even a pre-compiled SQL query.

As another way of thinking about it: first implement the things that memcached can do easily, and let the more complex tasks fall through to the database. Surely that can be done without slowing the database down.
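
To make the "database updates the cache, everything else falls through" idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the class names CoreDatabase and CDD, the subscribe() callback, and the reads counter are all made up for this example and do not correspond to any real memcached or database API.

```python
class CoreDatabase:
    """Stand-in for the core database: owns all writes, notifies subscribers."""

    def __init__(self):
        self._rows = {}
        self._subscribers = []
        self.reads = 0  # how many reads actually reached the core

    def subscribe(self, callback):
        # Hypothetical hook: a CDD registers to hear about changed keys.
        self._subscribers.append(callback)

    def write(self, key, value):
        self._rows[key] = value
        # Push the change out so no CDD keeps serving the old value.
        for notify in self._subscribers:
            notify(key, value)

    def read(self, key):
        self.reads += 1
        return self._rows.get(key)


class CDD:
    """Hypothetical caching database daemon running on the app machine."""

    def __init__(self, core):
        self._core = core
        self._cache = {}
        core.subscribe(self._on_change)

    def _on_change(self, key, value):
        # Refresh only keys this CDD already holds; anything else
        # is picked up lazily on the first read.
        if key in self._cache:
            self._cache[key] = value

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._core.read(key)  # fall through to the core database
        self._cache[key] = value
        return value


core = CoreDatabase()
cdd = CDD(core)
core.write("user:1", "alice")
print(cdd.get("user:1"))  # first read falls through to the core
core.write("user:1", "bob")
print(cdd.get("user:1"))  # served from the cache, already refreshed
```

In this toy version the core pushes every change to every CDD synchronously, which is exactly the kind of overhead a real design would have to keep off the write path (for example by batching or sending notifications asynchronously).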
For the fun of it, let me revisit one of the more complex tasks, and see if you can't see how this could result in an incredible performance boost.

Live Queries

We're all familiar with those gnarly queries that need to be run repeatedly. I believe it's possible to create a system where the query is kept up to date by the Caching Database Daemons (CDDs), adding only a very small overhead to the core database, provided there are enough CDDs to keep the underlying data in cache memory.

Each CDD would be aware of the live query. When the core database updates the CDD with data involving tables in the query, the CDD would then rerun the query on the new data. In doing so, it will have to make many requests to other CDDs for info on any joined records, and alert other CDDs whose joined records changed, but who cares, as long as no additional requests go back to the core database.

When the application wants the results, its CDD simply sends a request to all of the other CDDs and applies the sort/limit/distinct to the combined results.

What a cool performance boost!

Thanks

Daniel

On Fri, 2008-06-20 at 23:18 -0700, dormando wrote:
> Sorry, I'll weigh in a little here. I've only been skimming so forgive
> if this is a repeat:
>
> - Overcomplicated. See dustin's rails examples for easy abstractions on
> getting to the 90% mark.
> - Crazy in-memory databases probably aren't that much faster. If you rip
> out most of MySQL's parser and optimize your schema so InnoDB clustered
> indexes and adaptive hash indexes morph into life, that's probably close
> to as far as you're going to go.
> - Limits your ability to cache the results of processing on multiple
> queries... If you issue three queries, parse the results a little, then
> use that elsewhere, you want to cache the cumulation if it's possible.
> - Shouldn't limit your memcaching to the database :\
>
> If you're displaying a blog, the amount of processing time you blow on
> the template will likely outweigh any DB activity. Cache the results of
> the entirety or chunks of template renderings. I believe you can KISS
> all sections of this without wasting too much time.
>
> -Dormando
>
> > So, in conclusion, the end goal of this is to provide memcached type
> > caching to the database in such a way that the data it returns is always
> > accurate. I'm not saying this would be easy, but it does seem to be well
> > worth the effort.
> >
> > Thanks
> >
> > Daniel
