[jira] Commented: (DERBY-5073) Derby deadlocks without recourse on simultaneous correlated subqueries

Knut Anders Hatlen (JIRA) Thu, 10 Mar 2011 08:08:24 -0800

    [ 
https://issues.apache.org/jira/browse/DERBY-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005128#comment-13005128
 ]


Knut Anders Hatlen commented on DERBY-5073:
-------------------------------------------

The code is in Deadlock.handle():

                        // See if the checker is in the deadlock and we
                        // already picked as a victim
                        if ((checker.equals(space))
                                && (deadlockWake == Constants.WAITING_LOCK_DEADLOCK)) {
                                victim = checker;
                                break;
                        }

That check never kicks in. Instead, the method continues further down and 
wakes another victim:

                ActiveLock victimLock = (ActiveLock) waiters.get(victim);

                victimLock.wakeUp(Constants.WAITING_LOCK_DEADLOCK);

The new victim wakes up from its waiting state in 
ActiveLock.waitForGrant()/ConcurrentLockSet.lockObject(), calls checkDeadlock() 
and ends up in Deadlock.handle() again.

I think the problem may be caused by the following piece of code in 
Deadlock.look():

                                } else {
                                        // simply waiting on another waiter
                                        space = waitingLock.getCompatabilitySpace();
                                }

As far as I can see, this code doesn't make any sense. space will already have 
the same value as waitingLock.getCompatabilitySpace(), so the assignment is 
actually a no-op. (waitingLock is obtained by calling waiters.get(space), and 
the waiters Map is built up from (waitingLock.getCompatabilitySpace(), 
waitingLock) key/value pairs; see LockControl.addWaiters().) Furthermore, this 
leads to "space" being considered twice in a row by the deadlock detection, so 
that it thinks that the transaction owning that compatibility space is waiting 
for one of its own locks. It therefore detects the deadlock prematurely, 
before it has seen all transactions involved in it, and incorrectly concludes 
that the original victim wasn't involved.
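
To make the no-op argument concrete, here is a minimal standalone sketch. It is 
not Derby code: WaitersMapSketch, WaitingLock, addWaiter() and reassign() are 
hypothetical stand-ins, and the map models only the (compatibility space, 
waiting lock) entries described above. The real waiters Map may hold other 
entries as well.

```java
import java.util.HashMap;
import java.util.Map;

public class WaitersMapSketch {

    /** Hypothetical stand-in for a waiting lock; it only remembers its owner. */
    public static final class WaitingLock {
        private final Object space;

        public WaitingLock(Object space) {
            this.space = space;
        }

        public Object getCompatabilitySpace() {
            return space;
        }
    }

    /** Registers a lock under its own compatibility space (assumed map layout). */
    public static void addWaiter(Map<Object, Object> waiters, WaitingLock lock) {
        waiters.put(lock.getCompatabilitySpace(), lock);
    }

    /** Mirrors the assignment under discussion and returns the new "space". */
    public static Object reassign(Map<Object, Object> waiters, Object space) {
        WaitingLock waitingLock = (WaitingLock) waiters.get(space);
        return waitingLock.getCompatabilitySpace();
    }

    public static void main(String[] args) {
        Map<Object, Object> waiters = new HashMap<>();
        Object space = new Object();
        addWaiter(waiters, new WaitingLock(space));

        // The lock was looked up by "space", so it hands back the same object.
        System.out.println(reassign(waiters, space) == space); // prints "true"
    }
}
```

Under that assumed layout, looking a lock up by its space and then asking the 
lock for its space can only return the key that was used for the lookup.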

By changing that last piece of code from a no-op to actually moving one step 
ahead in the wait graph, the repro does fail with a deadlock error. That is, 
change the assignment to:

    space = ((ActiveLock) waitOn).getCompatabilitySpace();

I tried running the regression tests with that change, and they all passed. I 
do find the deadlock detection code a bit hard to follow, so I'm not totally 
convinced this is the right change.
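
The effect of the one-line change can be illustrated with a toy wait graph. 
This is only a sketch under simplifying assumptions, not the algorithm in 
Deadlock.look(): WaitGraphSketch, Edge, findCycle() and sampleGraph() are 
hypothetical names, each transaction waits on exactly one other, and the flag 
advanceOnWaiter switches between the no-op assignment and the proposed one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WaitGraphSketch {

    /** Hypothetical edge: whom a space waits on, and whether that is a waiter. */
    public static final class Edge {
        final String target;
        final boolean targetIsWaiter;

        public Edge(String target, boolean targetIsWaiter) {
            this.target = target;
            this.targetIsWaiter = targetIsWaiter;
        }
    }

    /** Walks from the checker and returns the first cycle found, or an empty list. */
    public static List<String> findCycle(Map<String, Edge> waitsOn, String checker,
                                         boolean advanceOnWaiter) {
        List<String> chain = new ArrayList<>();
        String space = checker;
        while (space != null) {
            int seenAt = chain.indexOf(space);
            if (seenAt >= 0) {
                // A space was seen twice: report everything from its first visit.
                return new ArrayList<>(chain.subList(seenAt, chain.size()));
            }
            chain.add(space);
            Edge edge = waitsOn.get(space);
            if (edge == null) {
                return new ArrayList<>(); // the chain ends, so there is no deadlock
            }
            if (edge.targetIsWaiter && !advanceOnWaiter) {
                continue; // models the no-op: "space" keeps its current value
            }
            space = edge.target; // move one step ahead in the wait graph
        }
        return new ArrayList<>();
    }

    /** T1 waits on T2, T2 waits on waiter T3, T3 waits on T1. */
    public static Map<String, Edge> sampleGraph() {
        Map<String, Edge> waitsOn = new HashMap<>();
        waitsOn.put("T1", new Edge("T2", false));
        waitsOn.put("T2", new Edge("T3", true));
        waitsOn.put("T3", new Edge("T1", false));
        return waitsOn;
    }

    public static void main(String[] args) {
        System.out.println(findCycle(sampleGraph(), "T1", false)); // prints [T2]
        System.out.println(findCycle(sampleGraph(), "T1", true));  // prints [T1, T2, T3]
    }
}
```

With the no-op step, T2 is visited twice in a row, so the reported cycle is 
just [T2] and the checker T1 appears not to be part of it. With the advancing 
step, the walk returns to T1 and the full cycle, including the checker, is 
found.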

> Derby deadlocks without recourse on simultaneous correlated subqueries
> ----------------------------------------------------------------------
>
>                 Key: DERBY-5073
>                 URL: https://issues.apache.org/jira/browse/DERBY-5073
>             Project: Derby
>          Issue Type: Bug
>          Components: Services
>    Affects Versions: 10.0.2.1, 10.1.2.1, 10.2.2.0, 10.3.3.0, 10.4.2.0, 
> 10.5.3.0, 10.6.2.1, 10.7.1.1, 10.8.0.0
>            Reporter: Karl Wright
>         Attachments: Derby5073.java
>
>
> When the following two queries are run against tables that contain the 
> necessary fields, using multiple threads, Derby deadlocks and none of the 
> queries ever returns.  Derby apparently detects no deadlock condition, either.
> SELECT t0.* FROM jobqueue t0 WHERE EXISTS(SELECT 'x' FROM carrydown t1 WHERE 
> t1.parentidhash IN (?) AND t1.childidhash=t0.dochash AND t0.jobid=t1.jobid) 
> AND t0.jobid=?
> SELECT t0.* FROM jobqueue t0 WHERE EXISTS(SELECT 'x' FROM carrydown t1 WHERE 
> t1.parentidhash IN (?) AND t1.childidhash=t0.dochash AND t0.jobid=t1.jobid 
> AND t1.newField=?) AND t0.jobid=?
> This code comes from Apache ManifoldCF, and has occurred when there are five 
> or more threads trying to execute these two queries at the same time.  
> Originally we found this on 10.5.3.0.  It was hoped that 10.7.1.1 would fix 
> the problem, but it hasn't.
