I welcome SQL standard experts opinion, but I believe either implementation is correct. Especially with read committed I don't
think the standard says anything one way or another.

My experience with users is that most would prefer the Derby
implementation, as they tend to have the system retain the least
amount of locks possible - especially in read committed mode.  I
agree that it is confusing.  I also think some DB's add to the
SQL to allow the behavior that you were expecting, ie. lock every
row in the select exclusively and hold the lock until end of
transaction (or maybe in UPDATE vs. exclusive mode).  But I don't
think the standard addresses this, but again I am not an expert
in the SQL standards.


The above comment assumes a lock based isolation control system. Versioning would actually be better for many db clients. Does anyone
know of public domain information about implementing a version
isolation system - sort of like what Oracle does.

Kristian Waagan wrote:
Thanks for the input Mike. Great explanation!
Should a link to "Locking, concurrency, and isolation" (or even "Types and scope of locks in Derby systems" - "Update locks") in the "Tuning Derby" manual be added to the reference manual for the description of the SELECT statement?

Regarding my test, you are right about the Derby behavior - my mistake; the update is not allowed when running with repeatable read or serializable. My test sets both isolation level and autocommit explicitly. It also does next() on all resultsets.

The behavior of Derby compared with MySQL (started with --default-storage-engine=innodb) and PostgreSQL is still different though. PostgreSQL and MySQL does not allow the update no matter what transaction isolation level is used. Can you say anything about which behavior is correct/expected?



--
Kristian

Mike Matrigali wrote:

In order to understand and compare across multiple db's there are 2
important things we need to know:  transaction isolation and autocommit
behavior.

The default in derby is read committed isolation level, and
autocommit=true.  I don't know what the defaults are for the other
databases.  To be safe when comparing I suggest any test explicitly
set both as part of the test.  From your description it sounds like
you have autocommit=false, but it is good to be explicit.

Does your test do any next() calls on the result set?  Derby almost
always streams rows to the application, so locking for all rows
affected by a query is only guaranteed if the application actually
does a next() through all rows.  Because your table does not have
any indexes as described, derby will use row locking for repeatable
read, and read committed, but will use table locking for serializable.

Derby in read committed will release locks at end of statement for
any row which is actually not updated as part of select for update,
in read committed isolation level.
During processing of the query it actually gets update locks which
are either upgraded to exclusive and held to end of transaction or
released if the row is not updated.   If your application needs
to hold locks until end transaction on selects it must use either
repeatable read or serializable isolation.

Did you see the described behavior using repeatable read and doing
next() calls in both the first and second select?

Kristian Waagan wrote:

Hello,

I stumbled across a possible bug with the SELECT ... FOR UPDATE clause.
I have found several sources of information regarding this statement,
with partly conflicting contents.

First of all, I wrote a simple JDBC test application making use of the
statement. I ran this on Derby and two other database systems. Derby did
not behave as the two others (more on this later).

Second, the reference manual states that the statement is supported, and
that it must be used to obtain updateable resultsets.

Third, the JIRA issue 231
(http://issues.apache.org/jira/browse/DERBY-231) is regarding support
for SELECT ... FOR UPDATE. It is unresolved and unassigned. Is this a
stale issue?

The JDBC application I wrote consisted of two threads accessing a single
table: locktesttbl (ID INT, VALUE INT). I inserted 5 rows
(1,1),(2,2),...,(5,5). The first thread executes 'SELECT * FROM
locktesttbl WHERE ID=1 FOR UPDATE', waits 10 seconds, closes the
resultset, executes 'SELECT * FROM locktesttbl WHERE ID=1', closes
resultset and commits. The second thread, which is started 2 seconds
after the first one, executes 'UPDATE locktesttbl SET VALUE=100 WHERE
ID=1' then commits.

The only time the selected VALUE field in the first thread was equal at
the beginning and the end of the transcation, was when the transcation
isolation level was set to SERIALIZABLE. At all other levels, VALUE was
100 at the end of the transaction (before commit). When I did this with
the two other systems (MySQL and PostgreSQL), VALUE was always 1 within
the transaction. This suggests SELECT .. FOR UPDATE is broken in Derby,
and that the single instance of correct behavior seen is due to the
transaction isolation level alone. I have not looked into the source
code on this.

Does anyone have any comments on this?

I will add a JIRA bug issue under category SQL for this one in a few
days (awaiting comments).


--
Kristian







Reply via email to