[ https://issues.apache.org/jira/browse/DERBY-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dag H. Wanvik updated DERBY-5325:
---------------------------------
Attachment: derby-5325a.stat
derby-5325a.diff
Uploading a patch for this issue, derby-5325a.
With NIO, writeRAFHeader has two methods leading to interruptible IO:
- getEmbryonicPage
- writeHeader
Currently, getEmbryonicPage may throw InterruptDetectedException, and hence so
may writeRAFHeader.
writeHeader may throw ClosedByInterruptException, AsynchronousCloseException
and ClosedChannelException because it uses RAFContainer4#writeAtOffset rather
than RAFContainer4#writePage, and writeAtOffset does not currently attempt to
recover after an interrupt.
So currently, clients of writeRAFHeader need to be prepared for all of
InterruptDetectedException, ClosedByInterruptException,
AsynchronousCloseException and ClosedChannelException.
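
To make the NIO side of this concrete, here is a small standalone sketch (not
taken from Derby or from the patch; the class name InterruptedChannelDemo and
the temp file are made up for illustration). One thread is interrupted and
writes to a FileChannel, then a second thread writes to the same channel. It
relies only on the documented behavior of interruptible channels in the JDK.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptedChannelDemo {

    /**
     * Returns the simple class names of the exceptions seen by
     * [0] the interrupted thread and [1] a later writer on the same channel.
     */
    static String[] run() throws Exception {
        Path file = Files.createTempFile("derby5325-demo", ".dat");
        String[] seen = { "none", "none" };
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            Thread worker = new Thread(() -> {
                // Simulate an interrupt arriving just before the write.
                Thread.currentThread().interrupt();
                try {
                    ch.write(ByteBuffer.wrap(new byte[8]), 0L);
                } catch (IOException e) {
                    seen[0] = e.getClass().getSimpleName();
                }
            });
            worker.start();
            worker.join(); // join() makes the worker's update of seen[0] visible here

            // This thread was never interrupted, but the channel is now closed.
            try {
                ch.write(ByteBuffer.wrap(new byte[8]), 0L);
            } catch (IOException e) {
                seen[1] = e.getClass().getSimpleName();
            }
        } finally {
            Files.deleteIfExists(file);
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String[] seen = run();
        System.out.println("interrupted thread saw: " + seen[0]);
        System.out.println("other thread saw:       " + seen[1]);
    }
}
```

The interrupted thread gets ClosedByInterruptException, and the later writer
gets a plain ClosedChannelException. A thread that is blocked inside an I/O
call at the moment the channel is closed would instead get
AsynchronousCloseException, which this sketch does not try to provoke.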
writeRAFHeader is used in three locations:
- RAFContainer#clean
- RAFContainer#run(CREATE_CONTAINER_ACTION)
- RAFContainer#run(STUBBIFY_ACTION)
RAFContainer#clean is prepared for InterruptDetectedException only. The issue
shows that ClosedChannelException may also occur, and it is not prepared for
that (this bug).
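
Schematically, the situation in clean looks like the sketch below. This is not
the Derby code: CleanRetrySketch, HeaderWrite, MAX_RETRIES and the stand-in
InterruptDetectedException (modeled as an IOException purely for brevity) are
all assumptions made for illustration. It only shows why a retry loop that
catches one exception type lets ClosedChannelException escape.

```java
import java.io.IOException;

public class CleanRetrySketch {

    /** Hypothetical stand-in; not Derby's InterruptDetectedException. */
    static class InterruptDetectedException extends IOException {
        private static final long serialVersionUID = 1L;
    }

    /** Hypothetical callback standing in for the header write. */
    interface HeaderWrite {
        void run() throws IOException;
    }

    /** Assumed retry bound; any real limit may differ. */
    static final int MAX_RETRIES = 5;

    /**
     * Runs the write, retrying only when InterruptDetectedException is
     * thrown. Returns the number of attempts that were made.
     */
    static int cleanWithRetry(HeaderWrite write) throws IOException {
        for (int attempt = 1; ; attempt++) {
            try {
                write.run();
                return attempt;
            } catch (InterruptDetectedException e) {
                if (attempt > MAX_RETRIES) {
                    throw e; // give up after too many retries
                }
                // Another thread saw the interrupt; loop and try again.
            }
            // Any other IOException, including ClosedChannelException,
            // is not caught here and propagates to the caller.
        }
    }
}
```

With a loop of this shape, a write that fails twice with
InterruptDetectedException succeeds on the third attempt, while a single
ClosedChannelException goes straight up to the caller, which matches the
failure reported in this issue.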
RAFContainer#run(CREATE_CONTAINER_ACTION) is prepared for
ClosedByInterruptException and AsynchronousCloseException. Since IO during
container creation is single-threaded, this is sufficient: it should never need
to handle ClosedChannelException/InterruptDetectedException, both of which
signal that another thread saw interrupt on the container channel.
RAFContainer#run(STUBBIFY_ACTION) is part of the removeContainer operation,
which should happen after the container is closed, so it should be
single-threaded on the container as well(?). It should handle
ClosedByInterruptException and AsynchronousCloseException and retry, but
currently does not.
If we let writeAtOffset clean up just like writePage,
RAFContainer4#writeAtOffset (and hence also writeHeader) would throw only
InterruptDetectedException, meaning another thread saw the interrupt and the
caller should retry. This
would simplify logic in RAFContainer: we could remove the retry logic from
RAFContainer#run(CREATE_CONTAINER_ACTION). This could also cover retry logic
for RAFContainer#run(STUBBIFY_ACTION) wrt its use of writeRAFHeader.
Next, RAFContainer#clean already handles InterruptDetectedException and, with
this change, would no longer see ClosedByInterruptException,
AsynchronousCloseException or ClosedChannelException. This should solve
DERBY-5325.
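
The general shape of "clean up inside writeAtOffset" can be sketched as
follows. This is only an illustration of the pattern, not the patch itself:
RecoveringWriterSketch and its InterruptDetectedException are hypothetical
names, the sketch is not thread-safe, and it leaves out the coordination
between threads that a real container needs while the channel is reopened.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RecoveringWriterSketch implements AutoCloseable {

    /** Hypothetical stand-in; not Derby's InterruptDetectedException. */
    static class InterruptDetectedException extends IOException {
        private static final long serialVersionUID = 1L;
    }

    private final Path file;
    private FileChannel channel;

    RecoveringWriterSketch(Path file) throws IOException {
        this.file = file;
        this.channel = FileChannel.open(file, StandardOpenOption.WRITE);
    }

    /** Writes all bytes at offset, reopening the channel if this thread is interrupted. */
    void writeAtOffset(long offset, byte[] bytes) throws IOException {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    ByteBuffer buf = ByteBuffer.wrap(bytes);
                    while (buf.hasRemaining()) {
                        channel.write(buf, offset + buf.position());
                    }
                    return;
                } catch (ClosedByInterruptException e) {
                    // This thread's interrupt closed the channel: clear the
                    // flag, reopen the channel and retry the whole write.
                    Thread.interrupted();
                    interrupted = true;
                    channel = FileChannel.open(file, StandardOpenOption.WRITE);
                } catch (ClosedChannelException e) {
                    // Also covers AsynchronousCloseException: another
                    // thread's interrupt closed the channel, so tell the
                    // caller to retry.
                    throw new InterruptDetectedException();
                }
            }
        } finally {
            if (interrupted) {
                // Restore the interrupt status for the caller.
                Thread.currentThread().interrupt();
            }
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

With this shape, callers see at most one interrupt-related exception type from
the write path, which is the simplification argued for above.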
I have not added a new test for this issue yet, since I don't know how to force
this scenario. We have only seen it once, I believe. I'll be running
InterruptResilienceTest continuously with this patch, along with the patch for
DERBY-5312, on several platforms to gain more confidence.
> Checkpoint fails with ClosedChannelException in InterruptResilienceTest
> -----------------------------------------------------------------------
>
> Key: DERBY-5325
> URL: https://issues.apache.org/jira/browse/DERBY-5325
> Project: Derby
> Issue Type: Bug
> Components: Store
> Affects Versions: 10.9.0.0
> Environment: Solaris 10 5/08 s10x_u5wos_10 X86
> Java(TM) SE Runtime Environment (build 1.7.0-b147)
> Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17 mixed mode)
> Reporter: Knut Anders Hatlen
> Assignee: Dag H. Wanvik
> Attachments: derby-5325a.diff, derby-5325a.stat, derby.log,
> error-stacktrace.out
>
>
> Seen here:
> http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.7/testing/testlog/sol/1144688-suitesAll_diff.txt
> There was 1 error:
> 1) testRAFWriteInterrupted(org.apache.derbyTesting.functionTests.tests.store.InterruptResilienceTest)java.sql.SQLException: The exception 'java.sql.SQLException: Log Record has been sent to the stream, but it cannot be applied to the store (Object null). This may cause recovery problems also.' was thrown while evaluating an expression.
> at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
> at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.executeUpdate(Unknown Source)
> at org.apache.derbyTesting.functionTests.tests.store.InterruptResilienceTest.testRAFWriteInterrupted(InterruptResilienceTest.java:217)
> (...)
> Caused by: java.nio.channels.ClosedChannelException
> at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:94)
> at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:691)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.writeAtOffset(Unknown Source)
> at org.apache.derby.impl.store.raw.data.FileContainer.writeHeader(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanCache(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAll(Unknown Source)
> at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.checkpoint(Unknown Source)
> at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(Unknown Source)
> at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(Unknown Source)
> at org.apache.derby.impl.store.raw.RawStore.checkpoint(Unknown Source)
> at org.apache.derby.impl.store.raw.log.LogToFile.performWork(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
> at java.lang.Thread.run(Thread.java:722)