Posting the complete information from derby.log may help.  Just post
all of it, from the most recent boot message through the end of the
error.

My best guess, with the limited information so far, is that you are
encountering an index split deadlock.  In the case below, transaction
6154 is the writer, 6159 is the reader, and 6160 is an internal
transaction trying to do an index split on behalf of 6154.  The
(10,1) row lock is a special internal Derby lock that allows better
performance for scanning but reduced concurrency for splitting; it
basically protects the scan "position" on a page while the scan does
not hold a latch on the page.  I can't be sure, but from your program
description 6154 holds row locks on every row in the table, is likely
in the insert phase, and needs to split the btree index.  It forks a
child transaction, 6160, to do the split, and that child is waiting
on 6159, the reader.  The reader in turn is blocked on row (2,110),
which is one of the many row locks held by the writer.
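
Spelled out from the lock dump you posted below, the wait cycle is:

  6159 (reader)       waits for row (2,110), held X by 6154 (writer)
  6160 (split child)  waits for row (10,1), held S by 6159 (reader)
  6154 (writer)       cannot release (2,110) until its child 6160
                      finishes the split

so none of the three can make progress.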

As a workaround, since your application knows it is going to delete
all the rows in the table, it may work to simply use table-level
locking for the update transaction that deletes all the rows.  This
may actually perform slightly better than acquiring an individual
row lock on each row, and in the future it may also allow better
space-management optimizations for the delete:
http://db.apache.org/derby/docs/10.2/ref/rrefsqlj40506.html
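
A minimal sketch of what I mean, using the LOCK TABLE statement that
the page above documents (the table name is taken from your lock
dump; the JDBC connection setup is assumed):

  // Inside the writer's transaction, with autocommit off.
  Statement s = conn.createStatement();
  // One exclusive table lock instead of ~300 individual row locks.
  s.executeUpdate("LOCK TABLE CAL_TIMERANGE_INDEX IN EXCLUSIVE MODE");
  s.executeUpdate("DELETE FROM CAL_TIMERANGE_INDEX");
  // ... insert the replacement rows on the same connection ...
  conn.commit();   // the table lock is released at commit

Note that the table lock only lasts until the end of the transaction,
so LOCK TABLE has to run in the same transaction as the delete and
the re-inserts.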


The best way to track down what is going on with the thread that you
don't think should be active is to reproduce the problem with some
extra information being logged.  That way it should be obvious
whether the extra transaction has committed or not, and whether it is
an internal transaction.  The derby.log may also contain stack
traces, which would give me a better idea of what is going on:

Set derby.locks.monitor=true and derby.locks.deadlockTrace=true in
derby.properties.  This will print the lock table to derby.log when a
timeout or deadlock happens.  If you also set
derby.language.logStatementText=true, then all the statements
executed on the system will be written to derby.log.  The statement
information includes the id of the transaction executing the
statement, so you can take the transaction id holding the lock in the
lock table dump and do a reverse search for it in derby.log,
narrowing down which statements may be holding the locks.
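
For reference, here is what those three entries look like in
derby.properties (just the properties named above, nothing else):

  derby.locks.monitor=true
  derby.locks.deadlockTrace=true
  derby.language.logStatementText=true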



Randy Letness wrote:
Mike Matrigali wrote:

Also, just posting the deadlock message with the lock information
for the locks in the deadlock cycle will answer a lot of questions.
The behavior may depend on whether there are indexes, how many rows
are in the table, and whether lock escalation happened.  Also make
sure to include which version of the software you are running.

I'm running Derby 10.2.1.6.

The message is:

2007-01-31 16:57:53.828 GMT Thread[reader0,5,main] (XID = 6159), (SESSIONID = 231), (DATABASE = target/test-db), (DRDAID = null), Failed Statement is: select distinct i.* from item i, cal_timerange_index ctri1 where i.parentid=? and i.isactive=1 and ctri1.itemid=i.id and (ctri1.componenttype = ? and ((ctri1.startdate < case when ctri1.isfloating=1 then '20080218T115000' else ? end and ctri1.enddate > case when ctri1.isfloating=1 then '20080117T115000' else ? end) or (ctri1.startdate >= case when ctri1.isfloating=1 then '20080117T115000' else ? end and ctri1.startdate < case when ctri1.isfloating=1 then '20080218T115000' else ? end)))

So the select statement is a join of two tables.  The statement
returns about 30 rows, and there are roughly 300 rows in
cal_timerange_index.  The update transaction removes all of these
rows and adds the same number back, while the read transaction
selects 30 of the rows that are being deleted by the update
transaction.

The lock dump shows:

*** The following row is the victim ***
6159 |ROW |S |0 |(2,110) |WAIT |T |NULL |CAL_TIMERANGE_INDEX |

Here are the locks involved:
6154 |ROW |X |1 |(2,110) |GRANT|T |NULL               |CAL_TIMERANGE_INDEX |
6159 |ROW |S |1 |(10,1)  |GRANT|T |SQL070131105108180 |CAL_TIMERANGE_INDEX |
6160 |ROW |X |0 |(10,1)  |WAIT |T |SQL070131105108180 |CAL_TIMERANGE_INDEX |

There are also these table locks:
6154 |TABLE |IX |315 |Tablelock |GRANT|T |NULL |CAL_TIMERANGE_INDEX |
6159 |TABLE |IS |2   |Tablelock |GRANT|T |NULL |CAL_TIMERANGE_INDEX |
6160 |TABLE |IX |1   |Tablelock |GRANT|T |NULL |CAL_TIMERANGE_INDEX |

A couple of things to note.  There are 3 transactions, yet only 2
should be active at a time (1 writer thread, 1 reader thread).  Also,
none of these transactions is in a traditional deadlock (where they
are waiting on each other).  Instead, the deadlock occurs because tx
6154 doesn't release its locks: 6154 isn't waiting on anything, yet
its locks are still held.  It's almost as if the locks aren't getting
released.  I think I'm going to try to work on a simpler example to
reproduce this.
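
Roughly along these lines (the schema, table name, and row counts
here are placeholders I made up, not my real ones):

  import java.sql.*;

  public class DeadlockRepro {
      static final String URL = "jdbc:derby:target/test-db;create=true";

      public static void main(String[] args) throws Exception {
          // Load the embedded driver explicitly (needed on 10.2).
          Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
          Connection c = DriverManager.getConnection(URL);
          // Assumes a fresh database directory.
          c.createStatement().executeUpdate(
              "CREATE TABLE T (ID INT PRIMARY KEY, V INT)");
          c.close();
          new Thread(new Writer(), "writer0").start();
          new Thread(new Reader(), "reader0").start();
      }

      // One update transaction per iteration: delete every row,
      // re-insert the same number, then commit.
      static class Writer implements Runnable {
          public void run() {
              try {
                  Connection c = DriverManager.getConnection(URL);
                  c.setAutoCommit(false);
                  Statement del = c.createStatement();
                  PreparedStatement ins =
                      c.prepareStatement("INSERT INTO T VALUES (?, ?)");
                  while (true) {
                      del.executeUpdate("DELETE FROM T");
                      for (int i = 0; i < 300; i++) {
                          ins.setInt(1, i);
                          ins.setInt(2, i);
                          ins.executeUpdate();
                      }
                      c.commit();
                  }
              } catch (SQLException e) {
                  e.printStackTrace();  // deadlock/timeout shows up here
              }
          }
      }

      // One read transaction per iteration, selecting a slice of
      // the rows the writer is churning.
      static class Reader implements Runnable {
          public void run() {
              try {
                  Connection c = DriverManager.getConnection(URL);
                  c.setAutoCommit(false);
                  Statement sel = c.createStatement();
                  while (true) {
                      ResultSet rs = sel.executeQuery(
                          "SELECT * FROM T WHERE ID < 30");
                      while (rs.next()) { rs.getInt(1); }
                      rs.close();
                      c.commit();
                  }
              } catch (SQLException e) {
                  e.printStackTrace();
              }
          }
      }
  }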

-Randy

Kristian Waagan wrote:

Randy Letness skrev:

A little more info:

I have a test program with a "writer thread", a thread that
continuously performs update transactions, and a "reader thread", a
thread that does selects on the data being updated.  So with two
threads, there should be at most 2 transactions in progress at any
instant.  When the deadlock occurs, the lock table dump shows
otherwise: there are 3 distinct transaction ids (1, 2, 3 for
simplicity).  Looking at the locks, I can tell that tx1 and tx3 are
update transactions (they have exclusive locks) and tx2 is the read
tx (it has shared locks).  This doesn't make sense, because there is
only a single thread doing updates, so there should be only a single
update transaction active.  Tx2 is waiting on a lock held by tx1, and
tx3 is waiting on a lock held by tx2.  For some reason the locks held
by tx1 aren't released, and that is why the deadlock occurs.  It's
almost as if the locks from the update transaction aren't being fully
released after a commit.  Tx1 isn't waiting on anything, but the
locks are still there.  So something screwy is going on.  Is there
such a thing as a phantom lock?

Also, I can't reproduce this on a MySQL database, using the same code
with a different JDBC driver.


Hello Randy,

I recommend you log a Jira issue with a simple reproduction
script/program.  Then the community can have a look at it and
determine whether it is a bug or not.  Even if it turns out not to be
a bug, it will be nice to have it documented in the issue tracking
system with an explanation of why Derby behaves this way.

Jira (Derby's issue tracking system) can be accessed here: https://issues.apache.org/jira/browse/DERBY


thanks,





