[
https://issues.apache.org/jira/browse/DERBY-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029417#comment-13029417
]
Rick Hillegas commented on DERBY-4252:
--------------------------------------
I have run the log-file-inspecting tools against the files in the log directory
of the bad database attached to this issue. The log-file-inspecting tools can
be found on DERBY-5195. I can report the following facts about this data
corruption:
1) The control files log.ctrl and logmirror.ctrl appear to be intact and
identical.
2) There are 20 log files in the directory. The first 19 seem to be valid. At
least, the LogFileReader is able to successfully deserialize all of them.
3) The last log file is corrupt. It is only 17 bytes long so it doesn't even
have a valid header record. I get an EOF reading this file. The initial
formatableID, obsoleteVersion, and logFileNumber fields are correct but the
previousRecordLogInstant is truncated. This is what I see:
java LogFileReader wombat/log/log20.dat -n 0
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<logFile>
<formatableID>128</formatableID>
<obsoleteVersion>9</obsoleteVersion>
<logFileNumber>20</logFileNumber>
<message>java.io.EOFException: null</message>
<stackTrace>java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at LogFileReader$LogFile.readFileHeader(LogFileReader.java:250)
at LogFileReader$LogFile.readFile(LogFileReader.java:216)
at LogFileReader$LogFile.printMe(LogFileReader.java:199)
at LogFileReader.main(LogFileReader.java:88)
</stackTrace>
</logFile>
4) I have noticed the following oddity, which may be a red herring: The last
good log file (log19.dat) ends with a flurry of CheckpointOperations. Each
CheckpointOperation is separated from the preceding CheckpointOperation by only
a couple hundred bytes. In the other log files, the spacing of
CheckpointOperations is generally several tens of thousands of bytes.
I would be happy to continue exploring these files if someone would like to
suggest what I should look for. Alternatively, the LogFileReader is very easy
to use and would be even more powerful in the hands of a real Store expert.
> intermittent corruption./java.io.EOFException in RandomAccessFile.readInt()
> on boot after compress with background checkpoint
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-4252
> URL: https://issues.apache.org/jira/browse/DERBY-4252
> Project: Derby
> Issue Type: Bug
> Components: Store
> Affects Versions: 10.6.1.0
> Environment: Windows XP dualcore.
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pwi3260sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Windows XP x86-32
> jvmwi3260-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_lHdSMr
> JIT - r9_20090213_2028
> GC - 20090213_AA)
> JCL - 20090218_01
> Also tried with the latest development SR5 version and it reproduces.
> Reporter: Kathey Marsden
> Priority: Critical
> Labels: derby_triage10_5_2
> Attachments: corrupt_database_with_logs.zip,
> reproBackgroundCheckpoint.zip
>
>
> The attached reproduction reproBackgroundCheckpoint.zip occasionally results
> in an EOF exception on boot with IBM 1.6 (about 1 out of 200 runs) even
> after the fix for DERBY-4239. The program inserts into /deletes some data
> from a table and runs compress, and as a daemon thread that loops
> checkpoints. I have not seen it with the Sun JVM.
> Before the fix for DERBY-4239, this issue was much more frequent. Sometimes
> you would get the DERBY-4239 trace and sometimes this one. After the fix,
> this issue remains. Mike thought it looked like a different bug and asked
> that we file a separate issue.
> This is the trace:
> java.io.EOFException
> at java.io.RandomAccessFile.readInt(RandomAccessFile.java:739)
> at java.io.RandomAccessFile.readLong(RandomAccessFile.java:772)
> at
> org.apache.derby.impl.store.raw.log.Scan.getNextRecordForward(Unknown Source)
> at org.apache.derby.impl.store.raw.log.Scan.getNextRecord(Unknown
> Source)
> at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
> at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
> at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
> at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown
> Source)
> at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown
> Source)
> at
> org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
> at
> org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown
> Source)
> at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown
> Source)
> at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown
> Source)
> at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown
> Source)
> at
> org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
> at
> org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown
> Source)
> at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
> at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
> at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown
> Source)
> at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown
> Source)
> at
> org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
> at
> org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown
> Source)
> at
> org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown
> Source)
> at
> org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown
> Source)
> at
> org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
> at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
> at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
> at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
> at java.sql.DriverManager.getConnection(DriverManager.java:316)
> at java.sql.DriverManager.getConnection(DriverManager.java:273)
> at CheckTables.main(CheckTables.java:8)
> To reproduce, compile the java programs and run reprobckchkpt.ksh. You may
> want to increase the number of iterations and let it run overnight to see the
> failure or back out the fix for DERBY-4239 to see it happen quicker.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira