[jira] Updated: (DERBY-3493) stress.multi times out waiting on testers with blocked testers waiting on the same statement

Knut Anders Hatlen (JIRA) Fri, 07 Mar 2008 07:23:05 -0800

     [ 
https://issues.apache.org/jira/browse/DERBY-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Knut Anders Hatlen updated DERBY-3493:
--------------------------------------

    Attachment: d3493-1a.diff

Attaching a patch which I believe solves the hang.

The patch basically makes ConcurrentCache.create() use ConcurrentHashMap.get() 
directly instead of going through ConcurrentCache.getEntry(), which will block 
until the identity has been set. Then create() fails immediately if the object 
already exists in the cache. Since this introduced yet another difference 
between find() and create() in findOrCreateObject(), I also followed Øystein's 
suggestion from his review of DERBY-2911 and split findOrCreateObject() into a 
number of smaller methods, which I think makes the code easier to follow.
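
To illustrate the difference between a blocking lookup and a fail-fast
create, here is a minimal sketch. It is not the code in d3493-1a.diff and not
Derby's ConcurrentCache: the class FailFastCache, its Entry type and the
methods reserve() and complete() are hypothetical names made up for this
example, and putIfAbsent() is used in place of the get() call mentioned above
so that the check and the insert happen in one step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for a cache; NOT Derby's ConcurrentCache. */
class FailFastCache<K, V> {
    /** An entry that is visible in the map before its value has been set. */
    private static final class Entry<V> {
        final CountDownLatch ready = new CountDownLatch(1);
        volatile V value;
    }

    private final ConcurrentHashMap<K, Entry<V>> map =
            new ConcurrentHashMap<K, Entry<V>>();

    /** Blocking lookup: waits until the entry is fully initialized. */
    V find(K key) {
        Entry<V> e = map.get(key);
        if (e == null) {
            return null;
        }
        try {
            e.ready.await();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ie);
        }
        return e.value;
    }

    /** Inserts an uninitialized entry; never waits for anything. */
    void reserve(K key) {
        Entry<V> fresh = new Entry<V>();
        if (map.putIfAbsent(key, fresh) != null) {
            // Fail right away, even if the existing entry is not initialized.
            throw new IllegalStateException("already in cache: " + key);
        }
    }

    /** Sets the value of a reserved entry and releases waiting finders. */
    void complete(K key, V value) {
        Entry<V> e = map.get(key);
        e.value = value;
        e.ready.countDown();
    }

    /** Fail-fast create: reserve the key, then initialize the entry. */
    void create(K key, V value) {
        reserve(key);
        complete(key, value);
    }
}
```

In this sketch a second create() for a key that has been reserved but not yet
completed throws immediately, whereas find() on the same key would wait until
complete() is called.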

I have started the full regression suite (which seems to run fine) and will 
also have stress.multi running in a loop for some time to verify that the hang 
really has been fixed.

The hang seems to have been caused by the two table descriptor caches in 
DataDictionaryImpl (nameTdCache and OIDTdCache) trying to keep each other 
consistent. When you insert an object into one of these caches, their 
setIdentity() methods try to automatically insert it into the other one as 
well. What happened was that one thread inserted an object into one of the 
caches while another thread inserted an object with the same identity into 
the other cache. Each cache then tried to update the same object in the other 
cache, so the two threads ended up waiting for each other to finish. Since 
creating an object that already exists should fail, there's no reason to wait 
for a partially initialized object to become fully initialized before 
failing. By failing as soon as such a situation is detected, the two threads 
don't wait for each other to finish, and the deadlock is avoided.
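
The interleaving described above can be replayed on a single thread. The
sketch below is a simplified model under stated assumptions, not Derby code:
MirroredCache, markInitialized() and replay() are hypothetical names, both
caches are keyed by the same string for brevity, and the mirroring that
setIdentity() performs is written out as explicit calls.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of one of two caches that mirror inserts. */
class MirroredCache {
    // FALSE = inserted but identity not yet set; TRUE = fully initialized.
    private final ConcurrentHashMap<String, Boolean> entries =
            new ConcurrentHashMap<String, Boolean>();

    /** Fail-fast create: false at once if the key is present in any state. */
    boolean create(String key) {
        return entries.putIfAbsent(key, Boolean.FALSE) == null;
    }

    /** Marks an entry that this cache already holds as fully initialized. */
    void markInitialized(String key) {
        entries.replace(key, Boolean.TRUE);
    }

    boolean isInitialized(String key) {
        return Boolean.TRUE.equals(entries.get(key));
    }

    /** Replays the interleaving; returns the four create() results in order. */
    static boolean[] replay(MirroredCache byName, MirroredCache byId,
                            String key) {
        boolean a = byName.create(key); // thread 1 inserts into first cache
        boolean b = byId.create(key);   // thread 2 inserts into second cache
        boolean c = byId.create(key);   // thread 1 mirrors: fails, no waiting
        boolean d = byName.create(key); // thread 2 mirrors: fails, no waiting
        byName.markInitialized(key);    // thread 1 finishes its own entry
        byId.markInitialized(key);      // thread 2 finishes its own entry
        return new boolean[] { a, b, c, d };
    }
}
```

If the two mirroring calls waited for the other cache's entry to become
initialized instead of returning false, neither markInitialized() call would
ever be reached, which corresponds to the hang.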

> stress.multi times out waiting on testers with blocked testers waiting on the 
> same statement
> --------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3493
>                 URL: https://issues.apache.org/jira/browse/DERBY-3493
>             Project: Derby
>          Issue Type: Bug
>          Components: Regression Test Failure, SQL, Test
>    Affects Versions: 10.4.0.0
>         Environment: IBM 1.5 Linux
>            Reporter: Kathey Marsden
>            Assignee: Knut Anders Hatlen
>         Attachments: d3493-1a.diff, threaddump-1204806990660.tdump
>
>
> The diff is:
> 7 del
> < ...running last checks via final.sql
> 7 add
>  > ...timed out trying to kill all testers,
>  >    skipping last scripts (if any).  NOTE: the
>  >    likely cause of the problem killing testers is
>  >    probably not enough VM memory OR test cases that
>  >    run for very long periods of time (so testers do not
>  >    have a chance to notice stop() requests
> Test Failed.
> The testers that are stuck are stuck on the same statement e.g.
> -- 
> update main2 set y = 'zzz' where x = 5;
> ERROR 08000: Connection closed by unknown interrupt.
> ERROR XJ001: Java exception: ': java.lang.InterruptedException'.
> The interrupt exception shows:
> java.lang.InterruptedException
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:199)
>         at
> org.apache.derby.impl.sql.GenericStatement.prepMinion(GenericStatement.java:195)
>         at
> org.apache.derby.impl.sql.GenericStatement.prepare(GenericStatement.java:88)
>         at
> org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(GenericLanguageConnectionContext.java:768)
>         at
> org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:606)
>         at
> org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:555)
>         at org.apache.derby.impl.tools.ij.ij.executeImmediate(ij.java:329)
>         at
> org.apache.derby.impl.tools.ij.utilMain.doCatch(utilMain.java:508)
>         at
> org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:350)
> The code at line 195 of GenericStatement shows:
>           ....
>                 try {
>                     preparedStmt.wait();
>                 } catch (InterruptedException ie) {
>                     throw StandardException.interrupt(ie);
>                 }
> My first guess is that this is perhaps some type of Statement cache
> concurrency bug, but perhaps
> I am reading it wrong.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
