Hello again, more inline...

-----Original Message-----
From: Moderated discussion of advanced .NET topics.
[mailto:[EMAIL PROTECTED]] On Behalf Of Thomas Tomiczek
Sent: Wednesday, 9 October 2002 7:03 PM
To: [EMAIL PROTECTED]
Subject: Re: [ADVANCED-DOTNET] Strongly-Typed DataSets vs. Strongly-Typed Collections
Hello. Inline...

-----Original Message-----
From: Ben Kloosterman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 9 October 2002 12:26
To: [EMAIL PROTECTED]
Subject: Re: [ADVANCED-DOTNET] Strongly-Typed DataSets vs. Strongly-Typed Collections

1. With a DataSet you can do ordered updates, as you can get all records that were inserted, deleted, etc. and pass these to the DB. In fact you can do very complicated parent/child updates easily - e.g. do all deletes, do child inserts then updates, and then do parent inserts/updates.

*** YES, that's exactly where you run into problems having REALLY complicated updates. Know it - been there :-) ALSO you have to code the ORDER yourself. See the difference? YOU have to call the table inserts/updates etc. YOURSELF. Our EntityBroker has a set of objects and mappings and generates this order ITSELF. There is no need to control the order of updates in your code at all. Now, how is this inferior to a crude manual approach?

Yes, but it took me two days to build an automated update for DataSets which does the same automatically. Just run through the tree and follow the parent/child relationships. (A sketch of such an ordered pass appears at the end of this message.)

2. Delayed updates are also done with DataSets, except you set an initial id value of a negative number. When updating a key (to the correct value on the insert), the DataSet will update all references to the negative key. This also avoids nulls.

*** WRONG. That's pretty SHITTY, because this means you get a changing primary key. You can not take the PK out of the "object" and store it until you have committed to the database. It also means that for every update you have to ask for the value and have this returned back. That's VASTLY inferior to having a number generator in the software (which, btw., you could also do with DataSets). Hate to tell you this, but this is NOT cutting it.

Nothing wrong with a changing primary key for in-memory references like DataSets - it is just bad for a DB. You only change it on inserts, never on updates - which allows other records to reference it - and this way you still have a single insert/update and avoid multiple round trips. (The placeholder-key setup is sketched at the end of this message.) Yes, you can use generated numbers (GUIDs). I prefer this method, as the keys are easy for people to use (e.g. part number xxxx), while GUID-style number generators waste too much space and make index searches very slow. Back to the GUID versus int debate...

*** Also, this does NOT solve our problem with delayed updates and backlinks that I was referring to.

*** See this as a sample of what I mean:

Table a -> b
Table b -> a (backlink)

Sort of: a is a customer, b are his bank accounts; every bank account (or credit card or shipping address) has a link to its customer, and the customer has a backlink back to the "default or current" value.

Problem: You can NOT insert this into the database in one pass - SQL Server, at least, does NOT support delayed integrity checking like some other databases. You can not insert b, as the link to a needs to resolve, and you can not insert a, as the link to b needs to resolve.

Solution: Insert b with NULL in place of the link, insert a, then update b with the value.

EntityBroker: automatic (the link is marked with "Backlink=true").
DataSet: by hand.

Now, this might sound like a not too common problem - it is pretty common when you orient your DB design around your DataSet.

This is automatic in DataSets if only one record is new. (A by-hand sketch of the general workaround appears at the end of this message.) If both records are new, then we have the following:
TableA - new record, id -1, bindex -1
TableB - new record, id -1, aindex -1

When you send it to the update, it first does TableA. TableA gets put in the DB as new record id 1012, bindex -1 (you have to allow b indexes to be negative). When it returns, the DataSet automatically updates, e.g.:

TableA - new record, id 1012, bindex -1
TableB - new record, id -1, aindex 1012

Now TableB is sent to the DB and the new record is given an id of 620. When it returns, the DataSet looks like this:

TableA - new record, id 1012, bindex 620
TableB - new record, id 620, aindex 1012

This is now out of sync with the DB (the DB still holds bindex -1 for record 1012), but the DataSet has the change marked. Solution: in the update routine, put two lines:

    if (dataset.HasChanges())
        UpdateDataset(dataset);

It should be said that DataSets are not good for DBs that have a high update ratio, but these are pretty rare. <--- I think it is more a case of people not being very familiar with DataSets and how to use them.

DataSets give good performance and are flexible and rich in features. They do take too much memory, though, which means inter-machine performance can suffer unless they are compressed.

*** DataSets give mediocre performance and are terrible to use in larger, distributed applications. Memory consumption is really bad - which can be a problem when building a cache of objects in memory, which is THE key to boosting your performance in some scenarios by sometimes more than 1000%. Again, here: EntityBroker (O/R mapper approach): semi-automatic (possibly controlled by attributes). DataSet: by hand. You lose.

Yes, I agree memory consumption is an issue, but the CPU performance of DataSets is excellent. I like your approach, but I have done a similar thing based around DataSets. I use a client cache and a server cache, but sometimes keep the DataSets in compressed form. (Updates are chained up and down with one or more DataRows at a time.) I use two forms of compression:

4 MByte/second at 20:1 compression
30-40 MByte/second at 7:1 compression

This is on an AMD Duron 750. The 7:1 compressor can compress 1 MB to about 130 KB in about 30 ms. This way I massively increase my cache capacity, and when the client requests a table it is sent compressed. (A compression sketch appears at the end of this message.) For tables where row access is frequent, the DataSet is obviously not stored compressed - nor is it cost-effective to compress a single row when sending it to the client.

Both ways have their advantages, but compared to a custom collection, DataSets have many features which require no code: XML dumps and imports, change tracking, binding, sorting (via DataView), ISerializable, and best of all Select - so on a keyed search you only pass one record to the client (without going to the DB). All mostly bug-free, and with the likelihood of improvements in the next release. Pretty good for a first release, and very usable.

They do need a bit of work, particularly in terms of caching and memory consumed, and I have been working on a framework for this - but the same applies to custom solutions. The framework I am writing also involves automatic creation of DataSets and data access. E.g. to get a Company record, this is your code:

    MiddleServerCache.GetData("COMPANY", recordID);
    // or for the whole table
    MiddleServerCache.GetData("COUNTRIES");

This requires NO code when using my component, provided you use SQL Server. (A hypothetical sketch of such a facade appears at the end of this message.) Anyway, this is similar to what you have done; it is just based around DataSets to retain future compatibility. Both ways have their merits - my main focus has been on automatic data access and caching (compressed and uncompressed), and I am stuck with any changes Microsoft makes.
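A minimal sketch of the ordered parent/child update pass described above - walking each table's pending rows in dependency order with DataTable.Select and a row-state filter. The Parent/Child table names, the adapters, and the FK direction (child holds the key into the parent) are assumptions for illustration, not from the thread:

    using System.Data;
    using System.Data.SqlClient;

    public class OrderedUpdater
    {
        // Push a DataSet's pending changes in an order that keeps
        // foreign keys valid at every step.
        public static void Update(DataSet ds,
            SqlDataAdapter parentAdapter, SqlDataAdapter childAdapter)
        {
            DataTable parent = ds.Tables["Parent"];
            DataTable child = ds.Tables["Child"];

            // 1. Child deletes first, so no foreign keys are left dangling.
            childAdapter.Update(child.Select("", "", DataViewRowState.Deleted));

            // 2. Parent inserts/updates, so new children have rows to point at.
            parentAdapter.Update(parent.Select("", "",
                DataViewRowState.Added | DataViewRowState.ModifiedCurrent));

            // 3. Child inserts/updates.
            childAdapter.Update(child.Select("", "",
                DataViewRowState.Added | DataViewRowState.ModifiedCurrent));

            // 4. Parent deletes last.
            parentAdapter.Update(parent.Select("", "", DataViewRowState.Deleted));
        }
    }

Updating the rows in place (rather than a GetChanges copy) matters here: identity values fetched back by the InsertCommand land in the original DataSet, so the key cascades sketched next can fire.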
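The negative placeholder keys from point 2 fall out of two DataColumn settings plus a cascade rule. A sketch, with hypothetical Customer/Account tables, assuming the adapter's InsertCommand returns the real identity (e.g. via a trailing SELECT SCOPE_IDENTITY()):

    using System.Data;

    public class PlaceholderKeys
    {
        public static void Configure(DataSet ds)
        {
            // In-memory keys count down from -1 so they can never
            // collide with real identity values from the database.
            DataColumn pk = ds.Tables["Customer"].Columns["CustomerID"];
            pk.AutoIncrement = true;
            pk.AutoIncrementSeed = -1;
            pk.AutoIncrementStep = -1;

            DataRelation rel = ds.Relations.Add("Customer_Account",
                ds.Tables["Customer"].Columns["CustomerID"],
                ds.Tables["Account"].Columns["CustomerID"]);

            // When the insert writes the real key back into the parent row,
            // the cascade rewrites every reference to the negative key -
            // the "update all references" behaviour described above.
            rel.ChildKeyConstraint.UpdateRule = Rule.Cascade; // the default, made explicit
        }
    }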
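The "by hand" workaround for the backlink case might look like the following; the Customer/Account schema, the DefaultAccountID column, and the adapter names are assumptions for illustration:

    using System;
    using System.Data;
    using System.Data.SqlClient;

    public class BacklinkInsert
    {
        public static void Insert(DataSet ds,
            SqlDataAdapter customerAdapter, SqlDataAdapter accountAdapter)
        {
            // Pass 1: insert the customer with the backlink still NULL.
            DataRow customer = ds.Tables["Customer"].NewRow();
            customer["Name"] = "Acme";
            customer["DefaultAccountID"] = DBNull.Value;
            ds.Tables["Customer"].Rows.Add(customer);
            customerAdapter.Update(new DataRow[] { customer }); // real CustomerID returns

            // Pass 2: insert the account, which can now reference the customer.
            DataRow account = ds.Tables["Account"].NewRow();
            account["CustomerID"] = customer["CustomerID"];
            ds.Tables["Account"].Rows.Add(account);
            accountAdapter.Update(new DataRow[] { account });   // real AccountID returns

            // Pass 3: patch the backlink and update the customer a second time.
            customer["DefaultAccountID"] = account["AccountID"];
            customerAdapter.Update(new DataRow[] { customer });
        }
    }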
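Ben's compressed cache is not shown in the thread, but the general shape - serialize the DataSet, run it through a compression stream, cache the bytes - can be sketched as below. GZipStream (System.IO.Compression) postdates the 1.x framework this thread was written against, so treat it as a stand-in for whichever compressor is actually at hand:

    using System.Data;
    using System.IO;
    using System.IO.Compression;
    using System.Runtime.Serialization.Formatters.Binary;

    public class DataSetCompressor
    {
        public static byte[] Compress(DataSet ds)
        {
            MemoryStream buffer = new MemoryStream();
            using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                new BinaryFormatter().Serialize(gzip, ds); // DataSet is ISerializable
            }
            return buffer.ToArray(); // cache this, or ship it to the client
        }

        public static DataSet Decompress(byte[] bytes)
        {
            using (GZipStream gzip = new GZipStream(
                new MemoryStream(bytes), CompressionMode.Decompress))
            {
                return (DataSet)new BinaryFormatter().Deserialize(gzip);
            }
        }
    }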
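Ben's MiddleServerCache component appears in the thread only through its two quoted calls; a hypothetical facade matching those calls could look like this (table names doubling as cache keys, and an "ID" primary-key column, are both assumptions):

    using System.Collections;
    using System.Data;
    using System.Data.SqlClient;

    public class MiddleServerCache
    {
        private static Hashtable cache = Hashtable.Synchronized(new Hashtable());
        private static string connectionString = "..."; // elided

        // Whole table: fill once from SQL Server, then serve from memory.
        public static DataSet GetData(string tableName)
        {
            DataSet ds = (DataSet)cache[tableName];
            if (ds == null)
            {
                ds = new DataSet();
                SqlDataAdapter adapter = new SqlDataAdapter(
                    "SELECT * FROM " + tableName, connectionString);
                adapter.Fill(ds, tableName);
                cache[tableName] = ds;
            }
            return ds;
        }

        // Single record: Select() filters in memory, so a keyed search
        // passes one row to the client without going back to the DB.
        public static DataRow GetData(string tableName, int recordID)
        {
            DataRow[] rows = GetData(tableName).Tables[tableName]
                .Select("ID = " + recordID);
            return rows.Length > 0 ? rows[0] : null;
        }
    }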
Ben Kloosterman

Thomas Tomiczek
THONA Consulting Ltd.
(Microsoft MVP C#/.NET)

You can read messages from the Advanced DOTNET archive, unsubscribe from Advanced DOTNET, or subscribe to other DevelopMentor lists at http://discuss.develop.com.