[ 
https://issues.apache.org/jira/browse/DERBY-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristian Waagan updated DERBY-5447:
-----------------------------------

    Issue & fix info:   (was: Patch Available)
       Fix Version/s: 10.9.0.0

Committed patch 2a to trunk with revision 1180790. Doing a final run on 10.8 
before backporting.
Patch 2a ran AutomaticIndexStatisticsTest nearly 1700 times without errors.

Dag, thanks for looking at the patch.
My concern with patch 1a is that it places a burden on the caller of 
releaseExclusive: the caller must ensure that the monitor on 'this' isn't 
already acquired. This can easily be broken, for instance by making 
BasePage.unlatch synchronized, or adding a new synchronized method calling 
releaseExclusive. This may have already been broken, as it's hard to verify. 
Since BasePage is a very mature piece of code, this would probably not be a 
problem in practice. On the other hand, I'm sure there are better ways to 
address the root cause. 
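
To illustrate the concern, here is a minimal sketch of a fixed monitor order. It 
is not Derby code: the class OrderedMonitorsDemo, its two plain lock objects and 
the method bodies are hypothetical stand-ins, and the order "container monitor 
first, then page monitor" is an assumption based on the name of patch 1a.

```java
// Hypothetical stand-ins for the container handle and page monitors; not Derby code.
class OrderedMonitorsDemo {
    static final Object container = new Object();
    static final Object page = new Object();
    static int released = 0; // only touched while holding both monitors
    static int closed = 0;   // only touched while holding both monitors

    // Assumed shape of "obtain monitors in order": container first, then page.
    static void releaseExclusive() {
        synchronized (container) {
            synchronized (page) {
                released++; // e.g. unregister the page as an observer
            }
        }
    }

    // Mirrors the close() path: the container monitor is held while the
    // observer callback takes the page monitor.
    static void closeContainer() {
        synchronized (container) {
            synchronized (page) {
                closed++;
            }
        }
    }

    // Runs both paths concurrently; returns true if both threads finish.
    static boolean bothFinish(int iterations) {
        Thread scan = new Thread(() -> {
            for (int i = 0; i < iterations; i++) releaseExclusive();
        }, "scan");
        Thread shutdown = new Thread(() -> {
            for (int i = 0; i < iterations; i++) closeContainer();
        }, "shutdown");
        scan.setDaemon(true);
        shutdown.setDaemon(true);
        scan.start();
        shutdown.start();
        try {
            scan.join(10_000);
            shutdown.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !scan.isAlive() && !shutdown.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + bothFinish(10_000));
    }
}
```

The ordering only holds as long as no caller already owns the page monitor. If 
the call to releaseExclusive() were wrapped in synchronized (page), as a 
synchronized unlatch method would effectively do, that thread would take the 
page monitor before the container monitor and the inversion would be back.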

Now that this issue is fixed, I don't know how to provoke the problem. I think 
it must involve two threads using the Observable API, and an 
Observer.update method that acquires the monitors of both the observed object 
and the observer, and calls back into the observed object (typically 
deleteObserver). If my IDE isn't fooling me, there are five other 
update-methods. A quick check didn't reveal the synchronization issue in those 
five implementations.
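
The interaction described above can be sketched outside Derby. Everything in 
the following example is hypothetical (the Container and Page classes, the 
thread names and the latch used to force the timing); only java.util.Observable, 
java.util.Observer and the ThreadMXBean calls are real JDK API. It is a sketch 
of the lock pattern visible in the stack trace, not a reproduction of the Derby 
code paths.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Observable;
import java.util.Observer;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-ins for BaseContainerHandle and BasePage; not Derby code.
@SuppressWarnings("deprecation") // java.util.Observable is deprecated since Java 9
class ObserverDeadlockDemo {

    // Counts down and waits, so both threads hold their first monitor
    // before either one asks for the second.
    static void rendezvous(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static class Container extends Observable {
        final CountDownLatch ready = new CountDownLatch(2);

        // Holds the container monitor while notifying observers.
        synchronized void close() {
            rendezvous(ready);
            setChanged();
            notifyObservers(); // calls Page.update, which needs the page monitor
        }
    }

    static class Page implements Observer {
        private final Container owner;

        Page(Container owner) {
            this.owner = owner;
            owner.addObserver(this);
        }

        @Override
        public void update(Observable o, Object arg) {
            synchronized (this) {
                // stands in for a synchronized check such as isLatched()
            }
        }

        // Holds the page monitor, then needs the container monitor
        // because Observable.deleteObserver is synchronized.
        synchronized void releaseExclusive() {
            rendezvous(owner.ready);
            owner.deleteObserver(this);
        }
    }

    // Returns true if the JVM reports a monitor deadlock within about 5 seconds.
    static boolean deadlocks() {
        Container container = new Container();
        Page page = new Page(container);
        Thread scan = new Thread(page::releaseExclusive, "scan");
        Thread shutdown = new Thread(container::close, "shutdown");
        scan.setDaemon(true);     // daemon threads do not keep the JVM alive
        shutdown.setDaemon(true);
        scan.start();
        shutdown.start();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            if (threads.findMonitorDeadlockedThreads() != null) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + deadlocks());
    }
}
```

The latch only removes the dependency on timing. Without it the two threads 
would have to interleave in exactly this way by chance, which would make the 
problem rare and hard to provoke.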
                
> Deadlock in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete 
> (BasePage.releaseExclusive and Observable.deleteObserver 
> (BaseContainerHandle))
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5447
>                 URL: https://issues.apache.org/jira/browse/DERBY-5447
>             Project: Derby
>          Issue Type: Bug
>          Components: Services
>    Affects Versions: 10.9.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Critical
>             Fix For: 10.9.0.0
>
>         Attachments: derby-5447-1a-obtain_monitors_in_order.diff, 
> derby-5447-2a-change_istat_shutdown.diff
>
>
> Java deadlock involving BasePage.releaseExclusive and 
> Observable.deleteObserver, the observable being a BaseContainerHandle 
> instance, seen when running  
> AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete.
> The activities involved are a scan of a conglomerate and the index statistics 
> daemon being stopped as part of the database shutdown.
> Here are the relevant parts of the stack trace:
> "index-stat-thread" daemon prio=10 tid=0x00007f4e34244000 nid=0x380b waiting 
> for monitor entry [0x00007f4e30aef000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Observable.deleteObserver(Observable.java:95)
>         - waiting to lock <0x00000000f51132d0> (a 
> org.apache.derby.impl.store.raw.data.BaseContainerHandle)
>         at 
> org.apache.derby.impl.store.raw.data.BasePage.releaseExclusive(BasePage.java:1819)
>         - locked <0x00000000f6d5a280> (a 
> org.apache.derby.impl.store.raw.data.StoredPage)
>         at 
> org.apache.derby.impl.store.raw.data.CachedPage.releaseExclusive(CachedPage.java:531)
>         at 
> org.apache.derby.impl.store.raw.data.StoredPage.releaseExclusive(StoredPage.java:1066)
>         at 
> org.apache.derby.impl.store.raw.data.BasePage.unlatch(BasePage.java:1371)
>         at 
> org.apache.derby.impl.store.access.btree.ControlRow.release(ControlRow.java:926)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeScan.savePositionAndReleasePage(BTreeScan.java:2146)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(BTreeForwardScan.java:442)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(BTreeScan.java:1681)
>         at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl$KeyComparator.fetchRows(IndexStatisticsDaemonImpl.java:1221)
>         at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(IndexStatisticsDaemonImpl.java:463)
>         at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.generateStatistics(IndexStatisticsDaemonImpl.java:323)
>         at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.processingLoop(IndexStatisticsDaemonImpl.java:794)
>         at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.run(IndexStatisticsDaemonImpl.java:710)
>         at java.lang.Thread.run(Thread.java:679)
> "main":
>         at 
> org.apache.derby.impl.store.raw.data.BasePage.isLatched(BasePage.java:1383)
>         - waiting to lock <0x00000000f6d5a280> (a 
> org.apache.derby.impl.store.raw.data.StoredPage)
>         at 
> org.apache.derby.impl.store.raw.data.BasePage.update(BasePage.java:1621)
>         at java.util.Observable.notifyObservers(Observable.java:159)
>         at java.util.Observable.notifyObservers(Observable.java:115)
>         at 
> org.apache.derby.impl.store.raw.data.BaseContainerHandle.informObservers(BaseContainerHandle.java:1008)
>         at 
> org.apache.derby.impl.store.raw.data.BaseContainerHandle.close(BaseContainerHandle.java:414)
>         - locked <0x00000000f51132d0> (a 
> org.apache.derby.impl.store.raw.data.BaseContainerHandle)
>         at 
> org.apache.derby.impl.store.access.btree.OpenBTree.close(OpenBTree.java:490)
>         at 
> org.apache.derby.impl.store.access.btree.BTreeScan.closeForEndTransaction(BTreeScan.java:2021)
>         at 
> org.apache.derby.impl.store.access.btree.index.B2IForwardScan.closeForEndTransaction(B2IForwardScan.java:107)
>         at 
> org.apache.derby.impl.store.access.RAMTransaction.closeControllers(RAMTransaction.java:245)
>         at 
> org.apache.derby.impl.store.access.RAMTransactionContext.cleanupOnError(RAMTransactionContext.java:97)
>         at 
> org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextManager.java:343)
>         at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.stop(IndexStatisticsDaemonImpl.java:919)
>         - locked <0x00000000f4a5a070> (a java.util.ArrayList)
>         at 
> org.apache.derby.impl.sql.catalog.DataDictionaryImpl.disableIndexStatsRefresher(DataDictionaryImpl.java:13891)
>         at 
> org.apache.derby.impl.db.DatabaseContextImpl.cleanupOnError(DatabaseContextImpl.java:69)
>         at 
> org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextManager.java:343)
>         at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(TransactionResourceImpl.java:437)
>         at 
> org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:633)
>         at 
> org.apache.derby.impl.jdbc.EmbedConnection30.<init>(EmbedConnection30.java:73)
>         at 
> org.apache.derby.impl.jdbc.EmbedConnection40.<init>(EmbedConnection40.java:51)
>         at 
> org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:70)
>         at 
> org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:248)
>         at 
> org.apache.derby.jdbc.EmbeddedDataSource.getConnection(EmbeddedDataSource.java:480)
>         at 
> org.apache.derby.jdbc.EmbeddedDataSource.getConnection(EmbeddedDataSource.java:424)
>         at 
> org.apache.derbyTesting.junit.JDBCDataSource.shutdownDatabase(JDBCDataSource.java:266)
>         at 
> org.apache.derbyTesting.functionTests.tests.store.AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete(AutomaticIndexStatisticsTest.java:187)
> I have access to a machine on which this can be reproduced fairly easily. 
> It's an Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz running Fedora 15.
> Java is:
> OpenJDK Runtime Environment (IcedTea6 1.10.2) (fedora-58.1.10.2.fc15-x86_64)
> OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
> In a way this bug is critical, since it causes two threads to hang forever. 
> However, for it to happen the database must be shut down while the index 
> statistics daemon is doing work and the timing must be right. There may be 
> other ways to trigger the bug where Observable.update and 
> BasePage.releaseExclusive are involved.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
