[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Mike Matrigali (JIRA) Thu, 28 May 2009 15:46:12 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mike Matrigali updated DERBY-4239:
----------------------------------

    Attachment: DERBY-4239_3.diff

Derby-4239_3.diff is the patch I intend to commit.  It passes complete set of 
nightly tests.

After looking at the backup code it seemed like backup really wanted to have 
the same behavior
that compress was looking for.  I also changed the behavior of the system 
procedure checkpoint
to match backup and compress checkpoint.  

I moved the waiting code into the subroutine so that it could differ between a 
checkpoint returning
false because another a checkpoint was in progress and a couple of other 
possible conditions.
Without this change system could get in a state where it looped forever trying 
to get a checkpoint
(one case was trying to force a clean shutdown after we had already closed down 
the logging
system).



> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot 
> redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 
> 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 
> jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build 
> pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 
> jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, 
> DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, 
> goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, 
> reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>
> I saw corruption on z/OS with the storerecovery tests and 10.5.1.1.  The 
> failure comes in oc_rec3 trying to connect to the database, but the actual 
> problem seems to have occurred with the prior test oc_rec2.  The problem is 
> somewhat intermittent, happening approximately 1/4 times.  I extracted the 
> case from the harness and will attach the reproduction and run the script 
> repro.ksh.  The script will loop up to 50 times until it gets the failure 
> which looks like.
> ERROR XSLA7: Cannot redo operation null in the log.
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown 
> Source)
>       at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
>       at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
>       at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
>       at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown 
> Source)
>       at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown 
> Source)
>       at 
> org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
>       at 
> org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown 
> Source)
>       at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown 
> Source)
>       at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown 
> Source)
>       at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown 
> Source)
>       at 
> org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
>       at 
> org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown 
> Source)
>       at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
>       at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
>       at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown 
> Source)
>       at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown 
> Source)
>       at 
> org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
>       at 
> org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown
>  Source)
>       at 
> org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown
>  Source)
>       at 
> org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown
>  Source)
>       at 
> org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
>       at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
>       at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
>       at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
>       at java.sql.DriverManager.getConnection(DriverManager.java:311)
>       at java.sql.DriverManager.getConnection(DriverManager.java:268)
>       at CheckTables.main(CheckTables.java:8)
> Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 
> 1073)), page dump follows: Hex dump:
> 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> <snip lots of 000's>
> I ran it with 10.3 and it completed all 50 iterations, so whether JVM or 
> Derby issue it seems new since 10.3. (I haven't tried with 10.4).  Oddly I 
> have run tests many times before on this machine using in the 10.5.1.1 
> release and the same jvm and have never seen this failure, so am looking into 
> whether maybe something changed on the machine or environment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery oc_rec? tests. ERROR XSLA7: Cannot redo operation null in the log.

Reply via email to