On Jul 23, 2009, at 6:24 PM, Ian Hickson wrote:
On Thu, 16 Jul 2009, Nikunj R. Mehta wrote:
The spec should not restrict implementations to any one level of concurrency unless there are specific undesirable effects. Restricting the database to a single writer means that if there are separate workers or background threads working to update non-overlapping portions, then they have to wait for the lone current writer. Implementations can certainly compete to produce the level of concurrency that developers need.
Specifically, I propose that the following text

[[
If the mode is read/write, the transaction must have an exclusive write lock over the entire database. If the mode is read-only, the transaction must have a shared read lock over the entire database. The user agent should wait for an appropriate lock to be available.
]]
be replaced with the following text

[[
Multiple read-only transactions may share the same data as long as there is no transaction attempting to write the data being read. The user agent must wait for transactions that are reading some data before allowing a read/write transaction on the same data to continue.
]]
Since there's no way for the author to say ahead of time which rows or cells the transactions are going to use, how can you do the above without ending up with some transactions failing unpredictably?
Database developers (whether experienced DBAs or newcomer WebApp
programmers) identify the data set they are using through statements
they execute (within or outside transactions). It is the database's
job to find out which records are being used.
The concepts of transaction processing apply no matter the granularity of a data item, whether it is a record, a disk block, or a whole file. There are many kinds of failures (and yes, failures are always unpredictable) [1]. Let's focus on failures arising from concurrency control enforcement, which is probably the one most people worry about from a programming perspective. In the following discussion I use the term "locking" for whatever protocol guarantees serializability, i.e., correct interleaving of concurrent transactions, even though protocols other than locking have been developed and are in use.
A knowledgeable database programmer reads the smallest set of data in a transaction so as to avoid locking the entire database against concurrent operations. This approach also minimizes starvation, i.e., the time a program must wait to obtain permission to exclusively access data.
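For illustration, here is the difference between a broad and a narrow read, sketched with Python's sqlite3 module (the table and values are invented for the example). Under record-level locking, the broad read would hold locks on every row; the narrow one only on the row it actually touches:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO accounts VALUES (?, ?)",
               [(1, 100), (2, 200), (3, 300)])

# Broad read: touches every row, so a record-level locker
# would have to lock them all against concurrent writers.
all_rows = db.execute("SELECT * FROM accounts").fetchall()

# Narrow read: only the one row actually needed participates,
# leaving the other rows free for concurrent transactions.
one_row = db.execute("SELECT balance FROM accounts WHERE id = ?",
                     (2,)).fetchone()

print(len(all_rows), one_row[0])  # 3 200
```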
Transactions can fail even if locking occurs at the whole-database level. As an example, consider these situations:
1. A read-only transaction times out because some read-write transaction went on for too long.
2. A read-write transaction times out because some read-only transaction went on for too long.
3. A read-only transaction contains a nested read-write transaction and runs concurrently with other read-only transactions. The other read-only transactions do not complete before the read-write transaction times out, so the read-write transaction fails.
Experience has shown that there is no easy way out when dealing with
transactions, and locking at the whole database level is no solution
to failures.
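As a concrete illustration, whole-database write locking can be observed in SQLite itself. A sketch using Python's sqlite3 module (the table, file path, and timeouts are invented for the example): a second writer fails while the first holds the write lock, even though a reader still proceeds.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# First writer takes the whole-database write lock.
writer = sqlite3.connect(path, timeout=0.1, isolation_level=None)
writer.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
writer.execute("BEGIN IMMEDIATE")        # acquire the write lock up front
writer.execute("INSERT INTO t VALUES (1, 'a')")

other = sqlite3.connect(path, timeout=0.1, isolation_level=None)

# A shared read still succeeds, and sees no uncommitted data...
count = other.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# ...but a second writer gives up once its 0.1 s timeout expires:
# the whole database is locked, not just the row being written.
try:
    other.execute("BEGIN IMMEDIATE")
    second_writer_failed = False
except sqlite3.OperationalError:
    second_writer_failed = True

writer.execute("COMMIT")
print(count, second_writer_failed)  # 0 True
```

The second writer here fails even though the two transactions touch no common record, which is exactly the cost of database-granularity locking.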
I hope that answers your question.
Nikunj
http://o-micron.blogspot.com
[1] Fundamentals of Database Systems, 4th ed., Elmasri and Navathe, Section 17.1.4, 2003