[ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-4239:
----------------------------------

    Description: 
corruption with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo 
operation null in the log when compress occurs during checkpoint, then jvm exits

I saw corruption on z/OS with the storerecovery tests and 10.5.1.1.  The 
failure occurs in oc_rec3 when it tries to connect to the database, but the 
actual problem seems to have occurred in the prior test, oc_rec2.  The problem 
is intermittent, occurring in roughly one run out of four.  I extracted the 
case from the harness and will attach the reproduction; to run it, use the 
script repro.ksh.  The script loops up to 50 times until it hits the failure, 
which looks like this:

ERROR XSLA7: Cannot redo operation null in the log.
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
        at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
        at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
        at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
        at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
        at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
        at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
        at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
        at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
        at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
        at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
        at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
        at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
        at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
        at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
        at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
        at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:311)
        at java.sql.DriverManager.getConnection(DriverManager.java:268)
        at CheckTables.main(CheckTables.java:8)
Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
<snip lots of 000's>

I ran it with 10.3 and it completed all 50 iterations, so whether this is a 
JVM or a Derby issue, it seems to be new since 10.3 (I haven't tried 10.4).  
Oddly, I have run these tests many times before on this machine with the 
10.5.1.1 release and the same JVM and have never seen this failure, so I am 
looking into whether something changed on the machine or in the environment.
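
For readers without the attachment, the loop-until-failure idea behind 
repro.ksh can be sketched in Java roughly as follows.  This is only an 
illustration and not the attached script: the class name ReproLoop, the 
helpers runUntilFailure and hasSqlState, and the simulated attempt in main are 
assumptions introduced here.  A real attempt would run the workload and then 
try to connect to the database; here a stub throws on the third call so the 
sketch runs without Derby on the classpath.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Derby or of the attached reproduction.
final class ReproLoop {

    // True if any SQLException in the cause chain or next-exception chain
    // carries the given SQLState (for example "XSLA7").
    static boolean hasSqlState(Throwable t, String state) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException) {
                for (SQLException s = (SQLException) c; s != null; s = s.getNextException()) {
                    if (state.equals(s.getSQLState())) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    // Runs the attempt up to maxIterations times. Returns the 1-based
    // iteration on which XSLA7 was seen, or -1 if every iteration passed.
    // Any other failure is rethrown so it is not mistaken for the bug.
    static int runUntilFailure(int maxIterations, Callable<Void> attempt) {
        for (int i = 1; i <= maxIterations; i++) {
            try {
                attempt.call();
            } catch (Exception e) {
                if (hasSqlState(e, "XSLA7")) {
                    return i;
                }
                throw new RuntimeException("unexpected failure on iteration " + i, e);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated attempt (assumption): fails with XSLA7 on the third call.
        int hit = runUntilFailure(50, () -> {
            if (++calls[0] == 3) {
                throw new SQLException("Cannot redo operation null in the log.", "XSLA7");
            }
            return null;
        });
        System.out.println(hit == -1
                ? "all iterations passed"
                : "failure reproduced on iteration " + hit);
    }
}
```

Checking the SQLState across the whole exception chain, rather than only the 
top-level message, is a design choice made for this sketch: a boot failure may 
be reported with the recovery error nested beneath another exception.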



        Summary: Possible corruption if SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE 
is called during checkpoint   (was: corruption with storerecovery oc_rec? 
tests.  ERROR XSLA7: Cannot redo operation null in the log when compress occurs 
during checkpoint, then jvm exits)

> Possible corruption if SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE is called 
> during checkpoint 
> -------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 
> 10.6.0.0
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 
> jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build 
> pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 
> jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Assignee: Mike Matrigali
>            Priority: Critical
>             Fix For: 10.1.4.0, 10.2.3.0, 10.3.4.0, 10.4.3.0, 10.5.2.0, 
> 10.6.0.0
>
>         Attachments: badlogsizes.txt, derby-4239_1.diff, DERBY-4239_2.diff, 
> DERBY-4239_3.diff, derby.log, derby.log, derby_dumponly.zip, 
> goodlogsizes.txt, identifyBadContainer.ksh, reproBackgroundCheckpoint.zip, 
> reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
