> -----Original Message-----
> From: Ben Kloosterman [mailto:[EMAIL PROTECTED]
>
> > In our case we must implement Object Query - long story, business
> > requirement. However, I see these two ideas complementing each other;
> > there is no conflict.
>
> [Ben Kloosterman]
> You can use them as you can get some of the data from the
> cache. Often O/R layers become much simpler.

No, it gets more complicated. To be blunt: caching data is a tedious task which often goes wrong. Also, the purpose of caching is misunderstood.

My dreaded caching-is-hard example: say you have 50 customer objects in the cache. The application wants to load all customers who bought a product X. This will always cause a database query, as you can't rely on the in-memory cache: you don't know whether all customer objects available in the database are actually IN the in-memory cache. So this will first load the customer data from the db, then you have to update the in-memory cache with the loaded data and return the updated objects from the cache plus the new ones which weren't in the cache yet. This has to be done with every query on the data except PK fetches, which can first check the cache.

However, even then it can become a nightmare, as the logic doesn't know if the data is from the cache or from the db, so you can run into application state issues which will cause wrong decisions being made in your logic, because it works with stale data. (Example: a desktop app runs on machines A and B. Entity E is loaded in both instances. E is updated by A. B doesn't see that update. B's logic wants E again. E is in the cache, so the logic gets the cached E. Not good: it misses A's updated data.)

A cache can therefore only be used to get uniquing for entity instances (that is: the data of an entity instance in the persistent storage, loaded into memory). However, this goes wrong as well on many occasions. Say I create a webapp. I store an entity in the viewstate, which is read from the cache. Now when the page gets a postback and the appDomain has been recycled, the page again gets the same instance from the viewstate but a different instance from the data layer. Same data, different instance.

Moving on to the hard part: where to put the cache? And what do you want to cache? User state or application state? User state is not that hard, but hardly effective.
Application state is more effective but impossible to do: in a webfarm, where to put the application state cache? On a separate box? A similar problem occurs when a desktop app is querying the same database on a lot of desktops. A separate box sounds appealing, but it requires access security to get a safe cache, it requires connections to get the cached data... hmmm... sounds familiar, the database also has that. Better yet: the database system also caches data in memory, caches query plans (so checking whether an object matching some predicate is there is already optimized in the database) and does other nice things to make performance as high as possible. Creating your own cache will likely result in rewriting an engine similar to SqlServer, but then for objects.

> > What strategy do you use to keep the cache in sync with the database,
> > especially in the context of a Winforms App?
>
> [Ben Kloosterman]
> This is the most interesting bit :-) It really depends on
> how much you want to scale - with caching your Mid Tiers are
> very fast and can often handle hundreds or thousands of users
> - so often 1 server and a standby is enough. On a Single
> Server what I do is this:
> On a successful DB update, delete or insert, I update the
> cache. In fact I even do this with multiple servers when
> there is 1 per geographic location (Sydney doesn't need to
> know Melbourne's updates).
>
> For multiple servers per location you have at least 3 strategies.
> 1. Put time stamps on your records and poll the tables
>    looking for changes. Data is out of sync by the time of the poll.
> 2. Send cache updates to all other servers. This requires
>    the mid tier to know about the other servers.
> 3. DB table triggers

But why all this overhead? What do you win? You need a lot of overhead to get everything in sync, and for what? To save some connections to a database? Isn't that rather moot when you need database polling to get that efficiency gain?

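To make the customers-who-bought-product-X example from above concrete, here is a minimal sketch in Python, with an in-memory SQLite database standing in for the real one. The schema, the `cache` dictionary and both function names are made up for illustration; this is not taken from any particular O/R mapper.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (customer_id INTEGER, product TEXT);
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO purchase VALUES (1, 'X'), (3, 'X'), (2, 'Y');
""")

cache = {}  # customer id -> dict; one in-memory instance per entity


def load_by_pk(customer_id):
    """PK fetch: the only case where the cache can be checked first."""
    if customer_id in cache:
        return cache[customer_id]
    row = db.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    cache[customer_id] = {"id": row[0], "name": row[1]}
    return cache[customer_id]


def customers_who_bought(product):
    """Predicate query: always hits the database, then merges into the cache."""
    rows = db.execute(
        "SELECT DISTINCT c.id, c.name FROM customer c "
        "JOIN purchase p ON p.customer_id = c.id "
        "WHERE p.product = ? ORDER BY c.id",
        (product,),
    ).fetchall()
    result = []
    for cid, name in rows:
        # Reuse the cached instance if there is one, otherwise cache a new one.
        entity = cache.setdefault(cid, {"id": cid, "name": name})
        entity["name"] = name  # refresh the cached instance with the loaded data
        result.append(entity)
    return result
```

Calling `load_by_pk(1)` and then `customers_who_bought("X")` gives back the already cached instance for customer 1 plus a newly cached one for customer 3, but only because the query went to the database. The cache on its own, holding just customer 1, could not have answered it.
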
I also doubt the 'with caching your Mid Tiers are very fast and can often handle hundreds or thousands of users - so often 1 server and a standby is enough' claim. Based on which facts is this claim made? I gave a simple example which makes caching for performance a farce. (You also say later on that caching for performance is not the goal, which is correct, as such a goal can't be reached.)

> Remember though the goal is not performance (as you are
> replacing functions SQL server already does), the cache
> ensures that the business layer has immediate access to
> frequent / key information, this allows a good middle tier
> design which is simple and allows simple DB interactions.

This is not true. The BL always has to consult the only real repository in the system, the persistent storage, to make sure the data it HAS TO work with is correct. Like the example I gave. You can NEVER rely on an in-memory cache for the data being correct, because right before the database query is executed another thread could have added a customer who bought product X. Your query consulting stale data will miss that customer, which could cause wrong decisions being made.

'Immediate access' is also not that easy. True, storing customer objects in a hashtable based on their single-field PK value is not that hard, and finding them back isn't either. It gets tougher when you want to get a set of data based on ANY given predicate. Even 'all customers from France' is read faster from the DB than from a cache, because the cache will require an index IN MEMORY on the country field; otherwise you'll get a linear search in memory through objects, which I have the feeling is slower than what the average RDBMS is able to put on the table.

The only kinds of caching which DO work are: caching of processed results and caching of never-changing data. For example a rendered webcontrol, cached for 1 minute. Not only do you save the database roundtrips, you also save the processing time.

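Caching a processed result for a fixed time could look roughly like this. It is a sketch, not ASP.NET's output cache: `TimedResultCache`, `get_or_render` and the injectable `clock` parameter are hypothetical names, and Python is used only to keep the example short.

```python
import time


class TimedResultCache:
    """Keeps the output of an expensive render function for a fixed time."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable, so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, rendered output)

    def get_or_render(self, key, render):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            # Still fresh: no database roundtrip and no processing time.
            return entry[1]
        # Missing or expired: render once and keep the result for ttl seconds.
        output = render()
        self._entries[key] = (now + self._ttl, output)
        return output
```

With a 60-second lifetime, `render` (which would do the database roundtrips and build the output) runs at most once per key per minute, however many requests come in. The data shown can be up to a minute old, which is the trade-off you accept with this kind of caching.
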
People should think about why they want caching in the first place. To save webserver power because the website gets 500,000 hits per day? Perhaps page caching with a lifetime of 1 minute per page will help. Often that will give a much bigger performance boost than low-level caching with a lot of overhead.

        Frans.

===================================
This list is hosted by DevelopMentor®  http://www.develop.com

View archives and manage your subscription(s) at http://discuss.develop.com