Hi all,

    this is actually a follow-up to a post of mine on derby-users; it started 
getting a bit technical, and it was recommended that I ask here.


I've been having problems with locks on SYSCOLUMNS when inserting rows into 
tables with an auto-incremented column.
The Derby docs state:

"Derby keeps track of the last increment value for a column in a cache. It also 
stores the value of what the next increment value will be for the column on 
disk in the AUTOINCREMENTVALUE column of the SYS.SYSCOLUMNS system table. 
Rolling back a transaction does not undo this value, and thus rolled-back 
transactions can leave "gaps" in the values automatically inserted into an 
identity column. Derby behaves this way to avoid locking a row in 
SYS.SYSCOLUMNS for the duration of a transaction and keeping concurrency high."

When I've got lots of threads inserting at once, what I'm seeing is that every 
so often this lock on SYSCOLUMNS gets held for a very long time, and other 
inserts then fail with lock timeouts on the SYSCOLUMNS entry.

Now, I've done some experimentation & digging through the source-code, and I 
discovered the following:

- when you try to do an insert involving an auto-increment column, Derby 
creates a nested transaction for the required update of SYSCOLUMNS
- however, if this update using the nested transaction fails, Derby retries it 
using the parent transaction - which will not be committed at the end of the 
update.
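The fallback behaviour above can be sketched in plain Java as a simulation - this is not Derby's actual code, and the class and method names are mine - as a no-wait tryLock followed by a blocking acquire whose lock is only released when the parent transaction commits:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Simulation of the update strategy described above (not Derby source).
// A nested transaction tries to take the SYSCOLUMNS row lock without
// waiting; if that fails, the update is retried under the parent
// transaction, which keeps the lock until the parent commits.
public class SyscolumnsLockSketch {
    // Stands in for the lock on the SYSCOLUMNS row of one identity column.
    static final ReentrantLock syscolumnsRowLock = new ReentrantLock();

    // Returns "nested" if the short-lived nested transaction won the lock,
    // or "parent" if we fell back to the long-lived parent transaction.
    static String updateAutoincrementValue() {
        if (syscolumnsRowLock.tryLock()) {   // nested txn: fail fast, no wait
            try {
                // ... update AUTOINCREMENTVALUE here ...
                return "nested";
            } finally {
                syscolumnsRowLock.unlock();  // nested txn commits immediately
            }
        }
        // Fallback: the lock is taken under the parent transaction and is NOT
        // released here - it stays held until the parent commits, which is
        // what allows the contention to cascade.
        syscolumnsRowLock.lock();
        return "parent";
    }

    public static void main(String[] args) throws Exception {
        // Uncontended: the nested transaction succeeds and releases at once.
        System.out.println(updateAutoincrementValue());   // prints "nested"

        // Simulate one slow SYSCOLUMNS update holding the row lock.
        CountDownLatch holderHasLock = new CountDownLatch(1);
        Thread slowUpdater = new Thread(() -> {
            syscolumnsRowLock.lock();
            holderHasLock.countDown();
            try { Thread.sleep(100); } catch (InterruptedException e) { }
            syscolumnsRowLock.unlock();
        });
        slowUpdater.start();
        holderHasLock.await();
        // tryLock fails, so we block and then hold the lock "until commit".
        System.out.println(updateAutoincrementValue());   // prints "parent"
        slowUpdater.join();
    }
}
```

(The real code paths are in Derby's InsertResultSet/transaction handling; the sketch just captures the fail-fast-then-fall-back shape.)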

Normally, the SYSCOLUMNS update takes about 1ms.  Under heavy load, however, 
it occasionally takes much longer (I've seen up to about 1/3 of a second).

Now, when this update takes longer, any other update (and there may be several 
within that 1/3 of a second) will fail immediately, as the subtransaction 
deliberately does not wait if it comes across any locks.

This means that the failed transactions will then get re-executed under the 
parent transaction, which may not commit for several seconds (under heavy 
load).  Because these transactions are now locking the SYSCOLUMNS row 
themselves, every other update starts failing & being re-executed under the 
parent transaction as well.

The end result is that as soon as one update of the SYSCOLUMNS row fails due to 
a lock, I get a massive cascade of locks that cause everything to come to a 
grinding halt - although once most of the inserts have failed, it starts 
sorting itself out again.
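The trigger for the cascade can be reproduced in miniature with plain Java locks (again a simulation with invented names, not Derby code): while one artificially slow update holds the row lock, every concurrent no-wait attempt fails immediately and would get pushed onto the parent-transaction path:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Simulation only: many concurrent "inserts" race for the single
// SYSCOLUMNS row lock while one slow update holds it for 300ms (about
// the worst case I've observed).  Every no-wait attempt fails, which is
// exactly the condition that triggers the parent-transaction fallback.
public class LockCascadeSim {
    public static int failedNoWaitAttempts(int threads) throws Exception {
        ReentrantLock rowLock = new ReentrantLock();
        AtomicInteger fallbacks = new AtomicInteger();
        CountDownLatch slowHasLock = new CountDownLatch(1);

        Thread slow = new Thread(() -> {
            rowLock.lock();                  // the one slow SYSCOLUMNS update
            slowHasLock.countDown();
            try { Thread.sleep(300); } catch (InterruptedException e) { }
            rowLock.unlock();
        });
        slow.start();
        slowHasLock.await();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                if (!rowLock.tryLock()) {          // nested txn: no wait
                    fallbacks.incrementAndGet();   // would retry under parent
                } else {
                    rowLock.unlock();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        slow.join();
        return fallbacks.get();
    }

    public static void main(String[] args) throws Exception {
        // All 20 concurrent no-wait attempts fail while the slow update runs.
        System.out.println(failedNoWaitAttempts(20));
    }
}
```

In the real system each of those failed attempts then holds the SYSCOLUMNS lock under its own parent transaction, which is what turns one slow update into a pile-up.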

My question is basically - why does the nested transaction need to fail 
immediately if it encounters a lock?
If it's to avoid blocking and keep concurrency high, it seems to have the 
opposite effect, as executing under the parent transaction appears worse.  Or 
is there some subtle issue I'm not aware of?  (I'm not a Derby internals guru!)

I've tried rebuilding Derby with the nested transaction using the default lock 
timeout, and even with 100 concurrent clients, I see no lock timeouts in the 
clients.  I've run it through the derbylang unit tests, and the autoincrement 
test still passes.

So - is this a viable change, or is it flawed in some way?

Many thanks for any advice,

        Andrew Lawrenson
