Let me give an example. Suppose you have a system that displays two account balances to the user, both currently equal to $20. The user was told at a meeting this morning to make certain that the balances add up to $50, so he modifies one of the accounts to contain $30 instead. Now, at approximately the same time, somebody else (who just got back from the same meeting) loads the two accounts and decides to take care of the change himself, so he adds $10 to the other account.
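Here is a minimal sketch of that interleaving, with plain Java threads standing in for the two users (no database or Castor code involved; the account fields are just illustrative):

    public class LostUpdateDemo {
        static volatile int accountA = 20;
        static volatile int accountB = 20;

        public static void main(String[] args) throws InterruptedException {
            // Both users load the accounts while they still read 20/20.
            final int seenA = accountA;
            final int seenB = accountB;

            // User 1: saw 20 + 20 = 40, so raises account A to 30 to reach 50.
            Thread user1 = new Thread(() -> accountA = seenA + 10);
            // User 2: saw the same stale 40, so adds 10 to account B instead.
            Thread user2 = new Thread(() -> accountB = seenB + 10);
            user1.start();
            user2.start();
            user1.join();
            user2.join();

            // Neither write overwrote the other, and each looked correct in
            // isolation, yet the invariant is broken: prints "Total: 60".
            System.out.println("Total: " + (accountA + accountB));
        }
    }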
The end result is that the accounts contain $60, much to the chagrin of their boss.

I use users in the example because this type of algorithm is very popular when a user is in the loop, potentially making the transaction long running. However, the danger is there whenever you expect this type of algorithm to keep things consistent. This type of bug tends to surface only occasionally, unless you are running at high loads; it is the kind of thing that causes 'inexplicable' data corruption.

Oh, and I was slightly incorrect earlier. The 'check timestamps on write' algorithm is slightly stronger than the 'read committed' (level 2) transaction isolation level, but not as strong as the 'repeatable read' (level 3) transaction isolation level. 'Read committed' is what you get if you don't do anything at all. If you want to see the weakness another way, consider that this algorithm is equivalent to holding a write lock from the last read before the write until the final write. That is much more limited in what it protects against than full read/write locking.

For an example of what the serializable isolation level is needed for, suppose you have a record which represents the sum of a column of a table. Whenever somebody adds a row to the table, he also adds to the total record. Now a batch process starts to recompute this sum: first it reads the table, then it adds everything up, then it writes the total. While it is doing its addition, some other transaction adds a record and adds to the total that is about to get overwritten. This system has a bug when running at anything less than the serializable ANSI isolation level (level 4), because the new record can be missed by the batch process.

-----Original Message-----
From: Ilia Iourovitski [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 31, 2001 5:59 PM
To: [EMAIL PROTECTED]
Subject: Re: [castor-dev] suggestion

What kind of "hard to replicate race condition bugs"?

Ilia

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 31, 2001 5:47 PM
To: [EMAIL PROTECTED]
Subject: Re: [castor-dev] suggestion

>Another solution is to use a trigger to update a "Number of Times
>Modified (NOTM)" column in the table, as ODD recommended, and make
>Castor check it before update.
>As for a clustering environment, it won't help a lot in case of
>frequent updates by different cluster nodes.

If you only check the NOTM on records which you are writing, then again you are only implementing the second ANSI transaction isolation level (read committed). Often this is good enough, but users of the resulting library need to be well educated about the implications. Otherwise there is a real risk of developing ugly, hard to replicate race condition bugs.
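To make the NOTM check concrete, here is a hedged sketch in plain JDBC (not Castor's actual API; the account table and its id, balance, and notm columns are made up for illustration). The UPDATE only succeeds if the row still carries the NOTM value that was read, so a concurrent writer to the same record is detected, but nothing protects other rows that were merely read, which is exactly the weakness described above:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class NotmCheck {
        public static void updateBalance(Connection con, int id, int newBalance)
                throws SQLException {
            // Read the row's current NOTM value along with the data.
            int notm;
            try (PreparedStatement read = con.prepareStatement(
                    "SELECT notm FROM account WHERE id = ?")) {
                read.setInt(1, id);
                try (ResultSet rs = read.executeQuery()) {
                    if (!rs.next()) throw new SQLException("no such account");
                    notm = rs.getInt(1);
                }
            }
            // ... application logic computes newBalance here ...
            // The trigger bumps NOTM on every update, so the WHERE clause
            // fails if anybody else has written the row since our read.
            try (PreparedStatement write = con.prepareStatement(
                    "UPDATE account SET balance = ? WHERE id = ? AND notm = ?")) {
                write.setInt(1, newBalance);
                write.setInt(2, id);
                write.setInt(3, notm);
                if (write.executeUpdate() == 0) {
                    // Concurrent modification detected; retry or report it.
                    throw new SQLException("record was modified concurrently");
                }
            }
        }
    }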

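And going back to the batch-total example earlier in the message, here is a similar hypothetical sketch (the entry(amount) and total(grand_total) tables are made up). Unless the connection is explicitly raised to serializable, a row inserted and added to the total between the SELECT and the UPDATE is silently lost when the total is overwritten:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class RecomputeTotal {
        // Recompute the stored grand total from scratch. The explicit
        // serializable isolation level is the point of the example: at
        // anything weaker, an insert that commits between our read of
        // the entry table and our write of the total is simply missed.
        public static void recompute(Connection con) throws SQLException {
            con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
            con.setAutoCommit(false);
            try {
                int sum = 0;
                try (Statement st = con.createStatement();
                     ResultSet rs = st.executeQuery("SELECT amount FROM entry")) {
                    while (rs.next()) {
                        sum += rs.getInt(1);
                    }
                }
                try (PreparedStatement write = con.prepareStatement(
                        "UPDATE total SET grand_total = ?")) {
                    write.setInt(1, sum);
                    write.executeUpdate();
                }
                con.commit();
            } catch (SQLException e) {
                con.rollback(); // serializable conflicts surface as exceptions
                throw e;
            }
        }
    }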