Hello again, more inline...

-----Original Message-----
From: Moderated discussion of advanced .NET topics.
[mailto:[EMAIL PROTECTED]] On Behalf Of Thomas Tomiczek
Sent: Wednesday, 9 October 2002 7:03 PM
To: [EMAIL PROTECTED]
Subject: Re: [ADVANCED-DOTNET] Strongly-Typed DataSets vs. Strongly-Typed
Collections


Hello. Inline...

-----Original Message-----
From: Ben Kloosterman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 9 October 2002 12:26
To: [EMAIL PROTECTED]
Subject: Re: [ADVANCED-DOTNET] Strongly-Typed DataSets vs.
Strongly-Typed Collections


1. With a DataSet you can do ordered updates, because you can get all records
that were inserted, deleted, etc. and pass these to the DB. In fact you
can do very complicated parent/child updates easily, e.g. do all deletes,
do child inserts then updates, and then do parent inserts/updates.

*** YES, that's exactly where you run into problems once you have REALLY
complicated updates. Know it - been there :-) ALSO you have to code the
ORDER yourself. See the difference? YOU have to call the table
inserts/updates etc. YOURSELF. Our EntityBroker has a set of objects
and mappings and generates this order ITSELF. There is no need to
control the order of updates in your code at all. Now, how is this
inferior to a crude manual approach?

Yes, but it took me two days to build an automated update for DataSets which
does the same automatically: just run through the tree and follow the
parent/child relationships (roughly as in the sketch below).
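
A minimal sketch of that relation-walking update, assuming one configured
SqlDataAdapter per table; it uses the usual ordering (deletes child-first,
inserts/updates parent-first), and all names are illustrative:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;

static class OrderedUpdate
{
    // Order tables so parents come before children, following the
    // DataSet's own DataRelations. True cycles (backlinks) would need the
    // two-pass handling discussed further down the thread.
    static List<DataTable> TablesParentFirst(DataSet ds)
    {
        var ordered = new List<DataTable>();
        var pending = ds.Tables.Cast<DataTable>().ToList();
        while (pending.Count > 0)
        {
            var next = pending.First(t =>
                t.ParentRelations.Cast<DataRelation>()
                 .All(r => r.ParentTable == t || ordered.Contains(r.ParentTable)));
            ordered.Add(next);
            pending.Remove(next);
        }
        return ordered;
    }

    // Deletes child-first, then inserts/updates parent-first, using one
    // adapter per table (the adapters are assumed to be configured).
    public static void UpdateInOrder(DataSet ds, IDictionary<string, SqlDataAdapter> adapters)
    {
        var parentFirst = TablesParentFirst(ds);

        foreach (var table in Enumerable.Reverse(parentFirst))
        {
            var deletes = table.Select("", "", DataViewRowState.Deleted);
            if (deletes.Length > 0) adapters[table.TableName].Update(deletes);
        }
        foreach (var table in parentFirst)
        {
            var upserts = table.Select("", "",
                DataViewRowState.Added | DataViewRowState.ModifiedCurrent);
            if (upserts.Length > 0) adapters[table.TableName].Update(upserts);
        }
    }
}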

2. Delayed updates are also done with DataSets, except that you set a
negative initial id value. When you update the key (to the correct
value for the insert) the DataSet will update all references to the negative
key. This also avoids nulls.
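
A minimal sketch of that convention, using made-up Orders/OrderLines tables:
the DataColumn auto-increment settings hand out -1, -2, ... as temporary keys,
and the relation's default Cascade update rule rewrites the child references
once the real key comes back from the database:

using System.Data;

var orders = new DataTable("Orders");
var orderId = orders.Columns.Add("OrderId", typeof(int));
orderId.AutoIncrement = true;
orderId.AutoIncrementSeed = -1;    // temporary keys: -1, -2, -3, ...
orderId.AutoIncrementStep = -1;    // so they never collide with real identities
orders.PrimaryKey = new[] { orderId };

var lines = new DataTable("OrderLines");
lines.Columns.Add("OrderId", typeof(int));
lines.Columns.Add("Product", typeof(string));

var ds = new DataSet();
ds.Tables.Add(orders);
ds.Tables.Add(lines);

// Relations.Add also creates a ForeignKeyConstraint whose UpdateRule defaults
// to Rule.Cascade, so when the real identity is written back into
// Orders.OrderId after the insert, every child row still holding the
// negative placeholder is rewritten automatically.
ds.Relations.Add("Order_Lines", orders.Columns["OrderId"], lines.Columns["OrderId"]);

DataRow order = orders.NewRow();          // OrderId is assigned -1 here
orders.Rows.Add(order);
lines.Rows.Add(order["OrderId"], "Widget");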

*** WRONG. That's pretty SHITTY, because it means you get a changing
primary key. You cannot take the PK out of the "object" and store it
until you have committed to the database. It also means that for every
update you have to ask for the value and have it returned back. That's
VASTLY inferior to having a number generator in the software (which, btw.,
you could also do with DataSets). Hate to tell you this, but this is NOT
cutting it.
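
For comparison, a minimal sketch of the software-generated-key approach with
a DataSet (GUID keys here; the table and column names are made up): the key
is final before the row ever reaches the database, so it never changes:

using System;
using System.Data;

var customers = new DataTable("Customers");
customers.Columns.Add("CustomerId", typeof(Guid));
customers.Columns.Add("Name", typeof(string));
customers.PrimaryKey = new[] { customers.Columns["CustomerId"] };

DataRow row = customers.NewRow();
row["CustomerId"] = Guid.NewGuid();   // known before the INSERT, no round trip
row["Name"] = "Acme Pty Ltd";
customers.Rows.Add(row);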

Nothing wrong with a changing primary key for in-memory references like
DataSets - it is just bad for a DB. You only change it for inserts, never
for updates, which allows other records to reference it; this way you still
have a single insert/update pass and avoid multiple round trips. Yes, you can
use generated numbers (GUIDs), but I prefer this method as the keys are easy
for people to use (e.g. part number xxxx), and generated numbers waste too
much space and make index searches very slow. Compared to the DB your
memory... Back to the GUID versus int debate....

*** Also, this does NOT solve our problem with delayed updates and
backlinks that I was referring to.

*** See this as a sample of what I was meaning.
Table a -> b
Table b -> a (backlink)

Sort of: a is the customer, b is his bank accounts; every bank account (or
credit card or shipping address) has a link to its customer, and the
customer has a backlink back to the "default or current" value.

Problem: You can NOT insert this into the database in one pass - SQL
Server at least does NOT support delayed integrity checking, like some
other databases do. You cannot insert b, as the link to a needs to
resolve, and you cannot insert a, as the link to b needs to resolve.

Solution: Insert b with NULL in place of the link, insert a, then
update b with the value.
EntityBroker: automatic (the link is marked with "Backlink=true").
DataSet: by hand.
Now, this might sound like a not too common problem - it is pretty
common when you orient your DB design around your DataSet.
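
By hand, against hypothetical Customer/Account tables with identity keys,
that two-pass insert looks roughly like this (SCOPE_IDENTITY and all the
column names are assumptions for the sketch, not anything from the thread):

using System;
using System.Data.SqlClient;

static void InsertCustomerAndAccount(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            // 1. Insert the account (b) with a NULL customer link.
            var insertAccount = new SqlCommand(
                "INSERT INTO Account (CustomerId, Number) VALUES (NULL, @num); " +
                "SELECT SCOPE_IDENTITY();", conn, tx);
            insertAccount.Parameters.AddWithValue("@num", "12-3456");
            int accountId = Convert.ToInt32(insertAccount.ExecuteScalar());

            // 2. Insert the customer (a), which can now reference the account.
            var insertCustomer = new SqlCommand(
                "INSERT INTO Customer (Name, DefaultAccountId) VALUES (@name, @acc); " +
                "SELECT SCOPE_IDENTITY();", conn, tx);
            insertCustomer.Parameters.AddWithValue("@name", "Acme Pty Ltd");
            insertCustomer.Parameters.AddWithValue("@acc", accountId);
            int customerId = Convert.ToInt32(insertCustomer.ExecuteScalar());

            // 3. Patch the account's link back to its customer.
            var patch = new SqlCommand(
                "UPDATE Account SET CustomerId = @cust WHERE AccountId = @acc", conn, tx);
            patch.Parameters.AddWithValue("@cust", customerId);
            patch.Parameters.AddWithValue("@acc", accountId);
            patch.ExecuteNonQuery();

            tx.Commit();
        }
    }
}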

This is automatic in DataSets if only one record is new. If both records are
new, then we have the following:

Tablea - new record, id -1, bindex -1
Tableb - new record, id -1, aindex -1

When you send it to the update, tablea goes first - table a gets put in the
DB as

new record, id 1012, bindex -1;   // have to allow b indexes to be negative

When it returns, the DataSet automatically cascades the new key, e.g.

Tablea - new record, id 1012, bindex -1
Tableb - new record, id -1, aindex 1012

Now table b is sent to the DB and the new record is given an id of 620. When
it returns, the DataSet looks like this:

Tablea - new record, id 1012, bindex 620
Tableb - new record, id 620, aindex 1012

This is now out of sync with the DB (table a still has bindex -1 in the
database), but the DataSet has the change marked. The solution is to put two
lines in the update routine (sketched below):

if (dataset.HasChanges()) UpdateDataSet();
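
A minimal sketch of that update routine, assuming one configured
SqlDataAdapter per table whose InsertCommand returns the new identity so it
is written back into the row (which is what triggers the cascade); the table
and adapter names are made up:

using System.Data;
using System.Data.SqlClient;

static void UpdateWithBacklink(DataSet ds, SqlDataAdapter adapterA, SqlDataAdapter adapterB)
{
    adapterA.Update(ds.Tables["Tablea"]);   // -1 becomes e.g. 1012
    adapterB.Update(ds.Tables["Tableb"]);   // -1 becomes e.g. 620

    // The cascade has now fixed Tablea.bindex in memory, but the database
    // still holds the negative placeholder, so push the marked change again.
    if (ds.HasChanges())
        adapterA.Update(ds.Tables["Tablea"]);
}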

It should be said that DataSets are not good for DBs that have a high
update ratio, but those are pretty rare.


I think it is more a case of people not being very familiar with
DataSets and how to use them. DataSets give good performance and are
flexible and rich in features. They do take too much memory though,
which means inter-machine performance can suffer unless they are
compressed.

*** DataSets give mediocre performance and are terrible to use in
larger, distributed applications. Memory consumption is really bad -
which can be a problem when building a cache of objects in memory, which is
the KEY to BOOSTING your performance in some scenarios by sometimes more
than 1000%. Again, here: EntityBroker (O/R mapper approach):
semi-automatic (possibly controlled by attributes). DataSet: by hand. You
lose.

Yes, I agree memory consumption is an issue, but the CPU performance of
DataSets is excellent. I like your approach, but I have done a similar thing
based around DataSets. I use a client and a server cache, but sometimes keep
the DataSets in compressed form. (Updates are chained up and down with one
or more DataRows at a time.) I use two forms of compression:

4 MByte/second - 20:1 compression
30-40 MByte/second - 7:1 compression (this is on an AMD Duron 750).

The 7:1 compression can compress 1 MB to about 130 KB in about 30 ms. This
way I massively increase my cache capacity, and when the client requests a
table it is sent compressed. For tables where row access is frequent the
DataSet is obviously not stored compressed - nor is it cost effective
to compress a single row when sending it to the client.
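
The compressors above are custom, but as an illustration of the idea,
serialising the DataSet as a DiffGram (so the change tracking survives the
round trip) and running it through a stock stream compressor looks roughly
like this - GZipStream here is just a stand-in, not the real thing:

using System.Data;
using System.IO;
using System.IO.Compression;

static byte[] CompressDataSet(DataSet ds)
{
    using (var buffer = new MemoryStream())
    {
        // true = leave the MemoryStream open after the gzip stream closes.
        using (var gzip = new GZipStream(buffer, CompressionMode.Compress, true))
        {
            ds.WriteXml(gzip, XmlWriteMode.DiffGram);
        }
        return buffer.ToArray();
    }
}

static DataSet DecompressDataSet(byte[] data, DataSet schema)
{
    var ds = schema.Clone();   // same tables/columns, no rows
    using (var buffer = new MemoryStream(data))
    using (var gzip = new GZipStream(buffer, CompressionMode.Decompress))
    {
        ds.ReadXml(gzip, XmlReadMode.DiffGram);
    }
    return ds;
}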

Both ways have their advantages, but compared to a custom collection,
DataSets have many features which require no code - XML dumps and imports,
change tracking, binding, sorting (via DataView), ISerializable, and
best of all Select, so on a keyed search you only pass one record to the
client (without going to the DB). All mostly bug-free, and with the
likelihood of advancements in the next release. Pretty good for a first
release and very usable.
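
The keyed-search point is just DataTable.Select (or Rows.Find when a primary
key is set) against the cached table - the table and column names below are
illustrative:

using System.Data;

static DataRow[] FindCompany(DataTable companies, int companyId)
{
    // The filter runs entirely in memory, so only the matching row goes
    // to the client and the database is never touched.
    return companies.Select("CompanyId = " + companyId);
    // With a primary key defined, companies.Rows.Find(companyId) is cheaper still.
}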

They do need a bit of work, particularly in terms of caching and memory
consumed, and I have been working on a framework for this - but the same
applies to custom solutions. The framework I am writing also involves
automatic creation of DataSets and data access, e.g. to get a Company record
this is your code:

MiddleServerCache.GetData("COMPANY", recordID);   // or, for the whole table:
MiddleServerCache.GetData("COUNTRIES");

This requires NO code when using my component, provided you use SQL Server.
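
MiddleServerCache is Ben's own component, so this is only a rough sketch of
what a table-level DataSet cache with that shape of API might look like; the
connection string, the SQL, and the primary-key handling are all assumptions:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static class MiddleServerCacheSketch
{
    static readonly Dictionary<string, DataTable> cache = new Dictionary<string, DataTable>();
    static readonly string connectionString = "...";   // supplied by the host

    // Whole table: fill once from SQL Server, then serve from memory.
    // Table names are assumed to come from a fixed, trusted set.
    public static DataTable GetData(string tableName)
    {
        DataTable table;
        if (!cache.TryGetValue(tableName, out table))
        {
            table = new DataTable(tableName);
            using (var adapter = new SqlDataAdapter(
                "SELECT * FROM " + tableName, connectionString))
            {
                adapter.Fill(table);
            }
            cache[tableName] = table;
        }
        return table;
    }

    // Single record: keyed lookup against the cached table.
    public static DataRow GetData(string tableName, object recordId)
    {
        DataTable table = GetData(tableName);
        if (table.PrimaryKey.Length == 0)
            table.PrimaryKey = new[] { table.Columns[0] };   // assume first column is the key
        return table.Rows.Find(recordId);
    }
}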

Anyway, this is similar to what you have done; it is just based around
DataSets to retain future compatibility. Both ways have their merits - my
main focus has been on automatic data access and caching (compressed and
uncompressed). I am stuck with any changes Microsoft make.

Ben Kloosterman




Thomas Tomiczek
THONA Consulting Ltd.
(Microsoft MVP C#/.NET)

You can read messages from the Advanced DOTNET archive, unsubscribe from
Advanced DOTNET, or
subscribe to other DevelopMentor lists at http://discuss.develop.com.

