[
https://issues.apache.org/jira/browse/DERBY-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Knut Anders Hatlen updated DERBY-5073:
--------------------------------------
Attachment: derby-5073-1b.diff
It looks to me as if the problem is that the deadlock detection gets confused
by other transactions that are waiting for shared locks on the same rows as the
transactions involved in the deadlock.
Those other transactions are not really involved in the deadlock. But because
they started waiting for the locks earlier than the transactions actually
involved in the deadlock, they are first in the queue of waiters. The deadlock
detection algorithm incorrectly assumes that these transactions are blocking
the transactions involved in the deadlock.
The assumption is incorrect because they are all waiting for shared locks, so
once the first waiter in the queue is granted its lock, the second waiter will
be granted its lock too. This means that the second waiter isn't waiting for
the first waiter here; both are waiting for some other lock that is blocking
them both.
The attached modified patch (1b) changes the algorithm so that it skips past
chains of waiters that have compatible lock requests. That made the repro fail
with exactly one deadlock exception.
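To illustrate the idea of skipping past compatible waiters, here is a minimal
Java sketch. It is not the code in the patch and does not use Derby's lock
manager classes. The names WaitForSketch, Request, Mode and blockers are
hypothetical, and the lock model (one list of granted locks plus a FIFO queue
of waiters, with only shared and exclusive modes) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of one lock's granted list and wait queue.
// This only illustrates the "skip compatible waiters" idea.
class WaitForSketch {
    enum Mode { SHARED, EXCLUSIVE }

    // A lock request made by one transaction (hypothetical type).
    static final class Request {
        final String txn;
        final Mode mode;
        Request(String txn, Mode mode) { this.txn = txn; this.mode = mode; }
    }

    // Assumption: two requests are compatible only if both are shared.
    static boolean compatible(Mode a, Mode b) {
        return a == Mode.SHARED && b == Mode.SHARED;
    }

    // Returns the transactions that the waiter at queue position 'index'
    // should be considered blocked by when building the wait-for graph.
    static List<String> blockers(List<Request> granted,
                                 List<Request> queue, int index) {
        Request me = queue.get(index);
        List<String> result = new ArrayList<>();
        // Walk toward the head of the queue. Waiters with compatible
        // requests would be granted together with us, so skip them.
        for (int i = index - 1; i >= 0; i--) {
            Request ahead = queue.get(i);
            if (!compatible(me.mode, ahead.mode)) {
                result.add(ahead.txn); // first incompatible waiter blocks us
                return result;
            }
        }
        // Nothing incompatible ahead of us in the queue, so we are blocked
        // by the incompatible locks that are already granted.
        for (Request g : granted) {
            if (!g.txn.equals(me.txn) && !compatible(me.mode, g.mode)) {
                result.add(g.txn);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Request> granted = Arrays.asList(new Request("T1", Mode.EXCLUSIVE));
        List<Request> queue = Arrays.asList(
                new Request("T2", Mode.SHARED),
                new Request("T3", Mode.SHARED));
        // T3 is queued behind T2, but both want shared locks, so T3 is
        // reported as waiting for the exclusive holder T1, not for T2.
        System.out.println(blockers(granted, queue, 1)); // prints [T1]
    }
}
```

In this model, treating the waiter directly ahead as the blocker would add an
edge from T3 to T2, which is the kind of false dependency described above.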
> Derby deadlocks without recourse on simultaneous correlated subqueries
> ----------------------------------------------------------------------
>
> Key: DERBY-5073
> URL: https://issues.apache.org/jira/browse/DERBY-5073
> Project: Derby
> Issue Type: Bug
> Components: Services
> Affects Versions: 10.0.2.1, 10.1.2.1, 10.2.2.0, 10.3.3.0, 10.4.2.0,
> 10.5.3.0, 10.6.2.1, 10.7.1.1, 10.8.0.0
> Reporter: Karl Wright
> Attachments: Derby5073.java, derby-5073-1a.diff, derby-5073-1b.diff
>
>
> When the following two queries are run against tables that contain the
> necessary fields, using multiple threads, Derby deadlocks and none of the
> queries ever returns. Derby apparently detects no deadlock condition, either.
> SELECT t0.* FROM jobqueue t0 WHERE EXISTS(SELECT 'x' FROM carrydown t1 WHERE
> t1.parentidhash IN (?) AND t1.childidhash=t0.dochash AND t0.jobid=t1.jobid)
> AND t0.jobid=?
> SELECT t0.* FROM jobqueue t0 WHERE EXISTS(SELECT 'x' FROM carrydown t1 WHERE
> t1.parentidhash IN (?) AND t1.childidhash=t0.dochash AND t0.jobid=t1.jobid
> AND t1.newField=?) AND t0.jobid=?
> These queries come from Apache ManifoldCF, and the deadlock has occurred when
> five or more threads try to execute them at the same time.
> Originally we found this on 10.5.3.0. It was hoped that 10.7.1.1 would fix
> the problem, but it hasn't.