[ 
https://issues.apache.org/jira/browse/DERBY-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Knut Anders Hatlen updated DERBY-3693:
--------------------------------------

    Attachment: dontWait.stat
                dontWait.diff

Here's a patch (dontWait.diff) which attacks the problem from a different 
angle. Instead of preventing the invalidation from obtaining a lock on the 
system tables in the user transaction, this patch makes the nested transaction 
used for recompiling the meta-data query fail immediately in case of a lock 
conflict. The original code also attempts this, but only for table-level locks. 
This patch makes all lock requests in the nested transaction time out 
immediately if there's a lock conflict, so that there's no hang before we retry 
in the user transaction.

The approach taken is this:

The CompatibilitySpace class used in the lock manager has an owner field, which 
normally is the transaction object. This field is however of type 
java.lang.Object and can theoretically be anything. In the patch, I changed 
this field to be of the type LockOwner, which is a new interface under 
iapi.services.locks. The interface has a method noWait() that returns a boolean 
value. The lock manager will check the return value of this method if 
lockObject() or zeroDurationlockObject() can't lock the object immediately, and 
if it is true, a timeout exception is thrown instead of waiting.

In SPSDescriptor, I set a flag in the nested transaction so that it returns 
true from its noWait() method. This means we'll not see the lock conflicts 
between the nested transaction and its parent as hangs anymore.

I have successfully run the repro, the regression test that was added to 
DatabaseMetaDataTest, and a set of tests that had problems with the previous 
patches. Didn't see any problems in those tests. Will start the full regression 
test suite and report back.

> Deadlocks accessing DB metadata
> -------------------------------
>
>                 Key: DERBY-3693
>                 URL: https://issues.apache.org/jira/browse/DERBY-3693
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC
>    Affects Versions: 10.3.2.1, 10.3.3.0, 10.4.1.3
>         Environment: Linux amd64 or Windows 32bit
>            Reporter: Svata Dedic
>            Assignee: Knut Anders Hatlen
>         Attachments: d3693.java, deadlock.diff, derby.log, dontWait.diff, 
> dontWait.stat, GetTables.java, nested_transaction.diff, 
> nested_transaction_v2.diff
>
>
> My code changes DB structure (create a column), then immediately after 
> setting autocommit back to true, the code rescans the DB metadata
> DatabaseMetaData.getColumns(catalog, schemaName, tableName, null);
> I am sometimes getting a deadlock with these operations:
> 2008-04-16 19:50:47.833 GMT Thread[Default RequestProcessor,1,system] (XID = 
> 569844), (SESSIONID = 2), (DATABASE = /..../a3/.config/localdb/db), (DRDAID = 
> null), Cleanup action starting 2008-04-16 19:50:47.833 GMT Thread[Default 
> RequestProcessor,1,system] (XID = 569844), (SESSIONID = 2),  (DATABASE = 
> /..../IJCProjects/a3/.config/localdb/db), (DRDAID = null), Failed Statement 
> is: EXECUTE STATEMENT SYS."getColumns" ERROR 40XL2: A lock could not be 
> obtained within the time requested.
> The lockTable dump is:
> 2008-04-16 19:50:47.796 GMT
> XID       |TYPE         |MODE|LOCKCOUNT|LOCKNAME                              
>                    |STATE|TABLETYPE /LOCKOBJ                   |INDEXNAME / 
> CONTAINER_ID / MODE for LATCH only)  |TABLENAME / CONGLOM_ID                |
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> *** The following row is the victim ***
> 569852    |ROW          |X   |0        |(44,7)                                
>                 |WAIT |S                      |NULL |SYSSTATEMENTS            
>              | 
> *** The above row is the victim ***
> The stacktrace of the operation causing the deadlock is 
>         at org.apache.derby.iapi.error.StandardException.newException(Unknown 
> Source)
>         at 
> org.apache.derby.impl.services.locks.Timeout.createException(Unknown Source)
>         at 
> org.apache.derby.impl.services.locks.Timeout.buildException(Unknown Source)
>         at 
> org.apache.derby.impl.services.locks.ConcurrentLockSet.lockObject(Unknown 
> Source)
>         at 
> org.apache.derby.impl.services.locks.AbstractPool.lockObject(Unknown Source)
>         at 
> org.apache.derby.impl.services.locks.ConcurrentPool.lockObject(Unknown Source)
>         at 
> org.apache.derby.impl.store.raw.xact.RowLocking3.lockRecordForWrite(Unknown 
> Source)
>         at 
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
>         at 
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3.lockRowOnPage(Unknown
>  Source)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3._lockScanRow(Unknown
>  Source)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLockingRR.lockScanRow(Unknown
>  Source)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(Unknown 
> Source)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeScan.fetchNext(Unknown Source) 
>         at org.apache.derby.impl.sql.catalog.TabInfoImpl.updateRow(Unknown 
> Source) 
>         at org.apache.derby.impl.sql.catalog.TabInfoImpl.updateRow(Unknown 
> Source)
>         at 
> org.apache.derby.impl.sql.catalog.DataDictionaryImpl.updateSPS(Unknown Source 
> )
>         at 
> org.apache.derby.iapi.sql.dictionary.SPSDescriptor.updateSYSSTATEMENTS(Unknown
>  Source)
>         at 
> org.apache.derby.iapi.sql.dictionary.SPSDescriptor.getPreparedStatement(Unknown
>  Source)
>         at 
> org.apache.derby.iapi.sql.dictionary.SPSDescriptor.getPreparedStatement(Unknown
>  Source)
>         at org.apache.derby.impl.sql.compile.ExecSPSNode.generate(Unknown 
> Source)
>         at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown 
> Source)
>         at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source)
>         at 
> org.apache.derby.impl.sql.GenericPreparedStatement.rePrepare(Unknown Source)
>         at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown 
> Source)
>         at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>         at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
> Source)
>         at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(Unknown 
> Source) 
>         at org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.doGetCols(Unknown 
> Source)
>         at 
> org.apache.derby.impl.jdbc.EmbedDatabaseMetaData.getColumns(Unknown Source)
> I've seen the deadlock occur during several get-metadata operations 
> (getImportedKeys, ...), each time waiting on the SYSSTATEMENTS apparently 
> because of internally constructed PreparedStatement. The lock eventually 
> times out and the locked out operation completes without error.
> When the deadlock occurs, the "real" SQL into systables is being compiled (I 
> traced this sql to be the value of "getColumns" key in 
> org/apache/derby/impl/jdbc/metadata.properties file):
> ---------%<-----------------------%<--------------
> Begin compiling prepared statement: SELECT CAST ('' AS VARCHAR(128)) AS 
> PKTABLE_CAT, S.SCHEMANAME AS PKTABLE_SCHEM, TABLENAME AS
> PKTABLE_NAME, COLS.COLUMNNAME AS PKCOLUMN_NAME, CAST ('' AS VARCHAR(128)) AS 
> FKTABLE_CAT, FKTABLE_SCHEM, FKTABLE_NAME, FKCOLUMN_NAME, CAST ...
> ---------%<-----------------------%<--------------

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to