[ 
https://issues.apache.org/jira/browse/DERBY-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Knut Anders Hatlen updated DERBY-3693:
--------------------------------------

    Attachment: GetTables.java

Thanks David, that was a very useful piece of information! I think I (almost) 
see what's happening in your case.

Derby periodically checks whether a compiled plan should be recompiled 
(configurable with the derby.language.stalePlanCheckInterval property). When it 
detects that the size of SYS.SYSTABLES has changed considerably since 
getTables() was compiled the last time (because you have added more tables), 
it'll recompile the plan to see if it can create a plan that is more efficient.

The recompilation happens in a nested transaction, which runs into a lock 
conflict with the parent transaction when it tries to store the compiled plan 
in SYS.SYSSTATEMENTS, and therefore it hangs for the duration specified by 
derby.locks.waitTimeout. This hang is DERBY-177. This is the part I don't quite 
understand. The parent transaction shouldn't have any locks in this case, since 
it hasn't performed any operations before the call to getTables(). My guess is 
that it still holds some locks after querying the system tables to see if a 
recompilation is needed.

Normally, after the nested transaction times out, the timeout exception is 
intercepted and the recompilation is retried in the parent transaction. 
However, because derby.locks.deadlockTrace is set, the timeout exception has a 
different SQLState than what the retry logic expects (40XL2 instead of 40XL1), 
and the exception is not intercepted.

I have attached a java class (GetTables.java) that simulates your application. 
It has a loop which continuously executes CREATE TABLE and getTables(). If I 
run the class without derby.locks.deadlockTrace, I see a hang each time 
getTables() is recompiled. It continues with no error after the hang. If I run 
with derby.locks.deadlockTrace, it fails with a timeout exception the first 
time getTables() is recompiled.

Now, I think the patch I attached earlier (deadlock.diff) will help so that the 
timeout exception isn't thrown, but as Svata noted, you'll still see the hang.

I see the following possible workarounds:

  1) Don't run with derby.locks.deadlockTrace. You'll still see the hangs, but 
it won't fail. You may also want to lower derby.locks.waitTimeout to reduce the 
problem with the hangs.

  2) Increase the value of derby.language.stalePlanCheckInterval (default 100) 
so that you manage to create all your tables without recompiling getTables().

To fix the issue permanently, we need to find out why the parent transaction 
holds locks in the system tables before it recompiles the getTables() query, 
and find out how we can ensure that these locks don't conflict with the nested 
transaction.

> Deadlocks accessing DB metadata
> -------------------------------
>
>                 Key: DERBY-3693
>                 URL: https://issues.apache.org/jira/browse/DERBY-3693
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC
>    Affects Versions: 10.3.2.1, 10.3.3.0, 10.4.1.3
>         Environment: Linux amd64 or Windows 32bit
>            Reporter: Svata Dedic
>         Attachments: d3693.java, deadlock.diff, derby.log, GetTables.java
>
>
> My code changes DB structure (create a column), then immediately after 
> setting autocommit back to true, the code rescans the DB metadata
> DatabaseMetaData.getColumns(catalog, schemaName, tableName, null);
> I am sometimes getting a deadlock with these operations:
> 2008-04-16 19:50:47.833 GMT Thread[Default RequestProcessor,1,system] (XID = 
> 569844), (SESSIONID = 2), (DATABASE = /..../a3/.config/localdb/db), (DRDAID = 
> null), Cleanup action starting 2008-04-16 19:50:47.833 GMT Thread[Default 
> RequestProcessor,1,system] (XID = 569844), (SESSIONID = 2),  (DATABASE = 
> /..../IJCProjects/a3/.config/localdb/db), (DRDAID = null), Failed Statement 
> is: EXECUTE STATEMENT SYS."getColumns" ERROR 40XL2: A lock could not be 
> obtained within the time requested.
> The lockTable dump is:
> 2008-04-16 19:50:47.796 GMT
> XID       |TYPE         |MODE|LOCKCOUNT|LOCKNAME                              
>                    |STATE|TABLETYPE /LOCKOBJ                   |INDEXNAME / 
> CONTAINER_ID / MODE for LATCH only)  |TABLENAME / CONGLOM_ID                |
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> *** The following row is the victim ***
> 569852    |ROW          |X   |0        |(44,7)                                
>                 |WAIT |S                      |NULL |SYSSTATEMENTS            
>              | 
> *** The above row is the victim ***
> The stacktrace of the operation causing the deadlock is 
>         at org.apache.derby.iapi.error.StandardException.newException(Unknown 
> Source)
>         at 
> org.apache.derby.impl.services.locks.Timeout.createException(Unknown Source)
>         at 
> org.apache.derby.impl.services.locks.Timeout.buildException(Unknown Source)
>         at 
> org.apache.derby.impl.services.locks.ConcurrentLockSet.lockObject(Unknown 
> Source)
>         at 
> org.apache.derby.impl.services.locks.AbstractPool.lockObject(Unknown Source)
>         at 
> org.apache.derby.impl.services.locks.ConcurrentPool.lockObject(Unknown Source)
>         at 
> org.apache.derby.impl.store.raw.xact.RowLocking3.lockRecordForWrite(Unknown 
> Source)
>         at 
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
>         at 
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3.lockRowOnPage(Unknown
>  Source)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3._lockScanRow(Unknown
>  Source)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLockingRR.lockScanRow(Unknown
>  Source)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(Unknown 
> Source)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeScan.fetchNext(Unknown Source) 
>         at org.apache.derby.impl.sql.catalog.TabInfoImpl.updateRow(Unknown 
> Source) 
>         at org.apache.derby.impl.sql.catalog.TabInfoImpl.updateRow(Unknown 
> Source)
>         at 
> org.apache.derby.impl.sql.catalog.DataDictionaryImpl.updateSPS(Unknown Source 
> )
>         at 
> org.apache.derby.iapi.sql.dictionary.SPSDescriptor.updateSYSSTATEMENTS(Unknown
>  Source)
>         at 
> org.apache.derby.iapi.sql.dictionary.SPSDescriptor.getPreparedStatement(Unknown
>  Source)
>         at 
> org.apache.derby.iapi.sql.dictionary.SPSDescriptor.getPreparedStatement(Unknown
>  Source)
>         at org.apache.derby.impl.sql.compile.ExecSPSNode.generate(Unknown 
> Source)
>         at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown 
> Source)
>         at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source)
>         at 
> org.apache.derby.impl.sql.GenericPreparedStatement.rePrepare(Unknown Source)
>         at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown 
> Source)
>         at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>         at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
> Source)
>         at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(Unknown 
> Source) 
>         at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.doGetCols(Unknown 
> Source)
>         at 
> org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.getColumns(Unknown Source)
> I've seen the deadlock occur during several get-metadata operations 
> (getImportedKeys, ...), each time waiting on the SYSSTATEMENTS apparently 
> because of internally constructed PreparedStatement. The lock eventually 
> times out and the locked out operation completes without error.
> When the deadlock occurs, the "real" SQL into systables is being compiled (I 
> traced this sql to be the value of "getColumns" key in 
> org/apache/derby/impl/jdbc/metadata.properties file):
> ---------%<-----------------------%<--------------
> Begin compiling prepared statement: SELECT CAST ('' AS VARCHAR(128)) AS 
> PKTABLE_CAT, S.SCHEMANAME AS PKTABLE_SCHEM, TABLENAME AS
> PKTABLE_NAME, COLS.COLUMNNAME AS PKCOLUMN_NAME, CAST ('' AS VARCHAR(128)) AS 
> FKTABLE_CAT, FKTABLE_SCHEM, FKTABLE_NAME, FKCOLUMN_NAME, CAST ...
> ---------%<-----------------------%<--------------

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to