Hi, thanks Dean and Josef for your responses...
Josef, what's the name of the Microsoft caching software? I'm not familiar with it. I know what you are saying is true to a degree, but I think it is worth doing so that every application that uses a database could gain the benefits of memcached without requiring changes to the application code. Wouldn't it be nice to get the speed boost of caching in all parts of your application without needing to complicate your code with memcached requests AND database requests? I'm not aware of any open source database that is set up with a memory caching system that can be as large, as fast, or as distributed as memcached... It truly is a brilliant solution as is.

> Integrating memcached into a database server API wouldn't be hard but
> I'm not sure it wouldn't cause a lot more problems than writing a
> database caching system from scratch. What you're talking about would
> require a great deal of work on the server's part to track which
> memcached server the data is stored on, to make sure that data is
> replaced when an update to the database is made, etc.

Why would it take so much work on the server's part to track which memcached server the data is stored on? Could not the core database just use a hash? In fact, couldn't the hashing space be dynamically controlled by the core database, to handle moving hashing space from one caching database daemon (CDD) to another? Of course, this solution should include all of the current memcached API, to support the current users, and to allow fast caching/sharing of application data that doesn't need to be backed by a database.

> From what I understand of what you're asking, you basically want the
> database to be able to cache the result of a given query with a given
> set of parameters, so that if the query is made a second time with the
> exact same parameters it can just "look it up" in its cache and return
> the results directly.

No, that's the dumb way of caching.
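To make the "just use a hash" idea concrete, here is a minimal sketch of a consistent-hash ring mapping row keys to CDD nodes. The `HashRing` class and the node names are my own illustration, not part of memcached; the point is that moving hashing space from one CDD to another then amounts to adding or removing a node's points on the ring, which remaps only the affected arc of keys rather than rehashing everything.

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring mapping cache keys to CDD nodes (hypothetical sketch)."""

    def __init__(self, nodes, replicas=100):
        # Each node gets many points on the ring so keys spread evenly.
        self._points = []  # sorted list of (hash, node) pairs
        for node in nodes:
            for i in range(replicas):
                h = self._hash(f"{node}#{i}")
                bisect.insort(self._points, (h, node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        """Return the CDD responsible for this key: the next point clockwise."""
        h = self._hash(key)
        idx = bisect.bisect(self._points, (h, ""))
        if idx == len(self._points):
            idx = 0  # wrap around the ring
        return self._points[idx][1]

# Example: the same row key always resolves to the same CDD.
ring = HashRing(["cdd1:11211", "cdd2:11211", "cdd3:11211"])
owner = ring.node_for("row:accounts:42")  # deterministic owner for this row
```

Because the mapping is a pure function of the key and the ring membership, every node (and the core database) can compute a key's owner independently, with no lookup table to keep in sync.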
Surprisingly, even that way of caching can provide incredible performance gains at times, but I'm going to describe what I believe to be a much smarter way. In essence, every row from any table is cached as an element. The CDD has enough database smarts to process the SQL, joins, etc. from tables and indexes. It's just that rather than having to go through the core database for every read, far more data that is known to be good will be available in the distributed cache.

> Then, the database would have some mechanism so that it would "know"
> when those cached queries are made invalid by *other* queries and
> automatically evict the invalidated results from the cache.

There's the beauty and the challenge... By having the core database communicate back to the cache when an update occurs, the cached data stays current.

> If it were really that simple, believe me, they'd all be doing it.
> That'd kill your TPC-C scores!

By "kill", I assume you mean show a factor of 10 improvement or more??? I don't know, so I'll detail my ideas on an implementation. For discussion, let's only talk about the database type of queries.

For the record, I believe the caching database daemon (CDD) should essentially implement the current memcached API, if for no other reason than that is how I see the different CDDs communicating amongst themselves without bothering the core database. The amount of memcached support implemented in the CDD is completely variable. Although I see it evolving to the point where the cached data includes a timestamp in order to work smoothly with transactions, it could provide a significant database performance boost by just pre-processing the SQL requests and caching non-transaction-related data.

Let's look at how I see simple interactions working. On the database side, imagine a key/value pair is hashed and distributed for every database row. The SQL query is requesting the most recent data from a single row, not in a transaction.
The CDD decodes the SQL, determines the hash for that row, and returns with data from a memcached get. If the get fails, then the CDD controlling the hash will request the row from the database core, so there aren't any issues of duplicate requests or race conditions. In the case of a SQL query involving multiple rows from one or more tables, the CDD gets a list of all rows from the table, or from an index if appropriate, and processes each row as above. Simple row updates not involved in a transaction are again passed through the appropriate CDD, to avoid race conditions.

Now for some magic...

During a transaction things can get much more complicated, but we can start by handling simple cases. In the simplest transaction, when it is committed, the core database sends each CDD a list of its rows that were modified, on a priority channel. The priority channel is used so the CDD will expire the affected rows, and optionally add the new rows, before the CDD handles any more normal requests. Within a transaction, row data requests can include a timestamp or revision index in an attempt to get data that was current at an earlier point in time. I believe this will then allow the caching system to duplicate the functionality of an MVCC database. Transactional updates will be passed to the core database. The core database will be modified so it too can take advantage of the cache: instead of going to the disk drive to request a row, it may request the data from the appropriate CDD.

And now, when things go wrong...

In my understanding of this, things become the most complicated when communications between the nodes fail. My current best idea involves a heartbeat sharing of disabled nodes. The goal is that when any node cannot talk to another, it disables that node and tells every other node about the problem on a priority channel. The calling node then falls back on the core database to handle all requests for that node.
When the connection is restored, that node gets updated or reset by the core database before restarting.

So, in conclusion, the end goal of this is to provide memcached-style caching to the database in such a way that the data it returns is always accurate. I'm not saying this would be easy, but it does seem to be well worth the effort.

Thanks,
Daniel