[ 
https://issues.apache.org/jira/browse/DERBY-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602303#comment-13602303
 ] 

Brett Bergquist commented on DERBY-5632:
----------------------------------------

Knut, originally this was all done using a script and IJ. The script invoked 
IJ to freeze the database, then did a file system backup (using bash and ZFS 
commands), and then invoked IJ again to unfreeze the database. Mike Matrigali 
indicated that the expectation was that the freeze/unfreeze would be done using 
the same connection, so I wrote a utility in Java that creates a connection, 
invokes the freeze, issues a system-level call to perform the file system 
backup, and then issues the unfreeze, all on one connection. The problem still 
occurs even when doing it this way, but the stack traces are probably from the 
original script-based approach.
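
For reference, the single-connection approach looks roughly like the sketch 
below. This is not the actual utility; it only assumes the documented 
SYSCS_UTIL.SYSCS_FREEZE_DATABASE and SYSCS_UTIL.SYSCS_UNFREEZE_DATABASE system 
procedures. The class name, method name, command-line arguments, and the exact 
"zfs snapshot" invocation are hypothetical.

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.concurrent.Callable;

// Hypothetical helper, not the utility mentioned in the comment.
class FrozenBackup {

    /**
     * Freezes the database, runs the backup step, and then always attempts
     * the unfreeze on the same connection, even if the backup step throws.
     * If the unfreeze itself throws, that exception replaces any exception
     * thrown by the backup step.
     */
    static void backupWhileFrozen(Connection conn, Callable<Void> backupStep)
            throws Exception {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE()");
            try {
                backupStep.call();
            } finally {
                stmt.execute("CALL SYSCS_UTIL.SYSCS_UNFREEZE_DATABASE()");
            }
        }
    }

    public static void main(String[] argv) throws Exception {
        String jdbcUrl = argv[0];       // assumed: a Derby JDBC URL
        String snapshotName = argv[1];  // assumed: a ZFS snapshot name

        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            backupWhileFrozen(conn, () -> {
                // System-level call that performs the file system backup.
                Process p = new ProcessBuilder("zfs", "snapshot", snapshotName)
                        .inheritIO()
                        .start();
                int exit = p.waitFor();
                if (exit != 0) {
                    throw new IOException("zfs snapshot exited with " + exit);
                }
                return null;
            });
        }
    }
}
```

The finally block only helps while the JVM running the utility is alive. If 
the process is killed between the two calls, the unfreeze is never issued.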

I really do have an issue with the expectation that this all be done in one 
connection. Even with the utility, there is a chance that it could fail 
between the time the freeze is done and the time the unfreeze is done. At 
that point Derby is stuck in the frozen state and has to be forcibly killed.

It would seem to me that it should always be possible to connect to the 
database engine.  It may be that specific requests on the connection will block 
when the database is frozen, but the one that should not block is the unfreeze.

I think not allowing the freeze/unfreeze from different connections is a bug.
                
> Logical deadlock happened when freezing/unfreezing the database
> ---------------------------------------------------------------
>
>                 Key: DERBY-5632
>                 URL: https://issues.apache.org/jira/browse/DERBY-5632
>             Project: Derby
>          Issue Type: Bug
>          Components: Documentation, Services
>    Affects Versions: 10.8.2.2
>         Environment: Oracle M3000/Solaris 10
>            Reporter: Brett Bergquist
>            Assignee: Knut Anders Hatlen
>              Labels: derby_triage10_10
>             Fix For: 10.10.0.0
>
>         Attachments: experimental-v1.diff, experimental-v2.diff, stack.txt
>
>
> Tried to make a quick database backup by freezing the database, performing a 
> ZFS snapshot, and then unfreezing the database.   The database was frozen but 
> then a connection to the database could not be established to unfreeze the 
> database.
> Looking at the stack trace of the network server, I see 3 threads that are 
> trying to process a connection request.   Each of these is waiting on:
>                 at 
> org.apache.derby.impl.store.access.RAMAccessManager.conglomCacheFind(Unknown 
> Source)
>                 - waiting to lock <0xfffffffd3a7fcc68> (a 
> org.apache.derby.impl.services.cache.ConcurrentCache)
> That object is owned by:
>                 - locked <0xfffffffd3a7fcc68> (a 
> org.apache.derby.impl.services.cache.ConcurrentCache)
>                 at 
> org.apache.derby.impl.store.access.RAMTransaction.findExistingConglomerate(Unknown
>  Source)
>                 at 
> org.apache.derby.impl.store.access.RAMTransaction.openGroupFetchScan(Unknown 
> Source)
>                 at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(Unknown
>  Source)
>                 at 
> org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.runExplicitly(Unknown
>  Source)
>                 at 
> org.apache.derby.impl.sql.execute.AlterTableConstantAction.updateStatistics(Unknown
>  Source)
> which itself is waiting for the object:
>                 at java.lang.Object.wait(Native Method)
>                 - waiting on <0xfffffffd3ac1d608> (a 
> org.apache.derby.impl.store.raw.log.LogToFile)
>                 at java.lang.Object.wait(Object.java:485)
>                 at 
> org.apache.derby.impl.store.raw.log.LogToFile.flush(Unknown Source)
>                 - locked <0xfffffffd3ac1d608> (a 
> org.apache.derby.impl.store.raw.log.LogToFile)
>                 at 
> org.apache.derby.impl.store.raw.log.LogToFile.flush(Unknown Source)
>                 at 
> org.apache.derby.impl.store.raw.data.BaseDataFileFactory.flush(Unknown Source)
> So basically what I think is happening is this: the database is frozen, and 
> the statistics are being updated on another thread, which has the 
> "org.apache.derby.impl.services.cache.ConcurrentCache" locked and is waiting 
> on the LogToFile lock. The connecting threads, which are the ones that would 
> unfreeze the database, are waiting to lock 
> "org.apache.derby.impl.services.cache.ConcurrentCache" in order to connect. 
> This is not a deadlock as far as the JVM is concerned, but it will never 
> leave this state either.
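
The state described in the report can be reproduced in miniature with plain 
Java monitors. The sketch below is not Derby code: the two Object fields only 
stand in for the ConcurrentCache and LogToFile instances from the stack trace, 
and all class and method names are hypothetical. One thread holds the "cache" 
monitor while waiting on the "log" object, a second thread blocks trying to 
lock the "cache", and the JVM's monitor deadlock detection reports nothing 
because there is no lock cycle.

```java
import java.lang.management.ManagementFactory;

// Hypothetical stand-in for the situation in the stack trace, not Derby code.
class FrozenStateDemo {
    private final Object cache = new Object(); // stands in for ConcurrentCache
    private final Object log = new Object();   // stands in for LogToFile
    private boolean frozen = true;             // guarded by the log monitor

    final Thread stats = daemon(this::updateStatistics, "stats");
    final Thread connect = daemon(this::openConnection, "connect");

    private static Thread daemon(Runnable r, String name) {
        Thread t = new Thread(r, name);
        t.setDaemon(true); // never keep the JVM alive if the demo gets stuck
        return t;
    }

    // Holds the cache monitor, then waits on the log while the database is
    // frozen. wait() releases the log monitor but not the cache monitor.
    private void updateStatistics() {
        synchronized (cache) {
            synchronized (log) {
                while (frozen) {
                    try {
                        log.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }
    }

    // A new connection needs the cache monitor before it can do anything,
    // including issuing the unfreeze.
    private void openConnection() {
        synchronized (cache) {
            // would go on to run the unfreeze
        }
    }

    // In the reported scenario nothing can reach this call, because the
    // thread that would make it is blocked in openConnection().
    void unfreeze() {
        synchronized (log) {
            frozen = false;
            log.notifyAll();
        }
    }

    static boolean jvmSeesDeadlock() {
        return ManagementFactory.getThreadMXBean()
                .findMonitorDeadlockedThreads() != null;
    }

    static void awaitState(Thread t, Thread.State wanted)
            throws InterruptedException {
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (t.getState() != wanted) {
            if (System.nanoTime() > deadline) {
                throw new IllegalStateException(t.getName() + " never reached " + wanted);
            }
            Thread.sleep(10);
        }
    }

    public static void main(String[] argv) throws InterruptedException {
        FrozenStateDemo demo = new FrozenStateDemo();
        demo.stats.start();
        awaitState(demo.stats, Thread.State.WAITING);
        demo.connect.start();
        awaitState(demo.connect, Thread.State.BLOCKED);
        System.out.println("JVM reports deadlock: " + jvmSeesDeadlock());
        demo.unfreeze(); // only possible here because main is a third thread
        demo.stats.join();
        demo.connect.join();
    }
}
```

The demo only terminates because the main thread performs the unfreeze from 
outside. In the reported case that call would have to come from one of the 
blocked connecting threads.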

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
