[jira] Updated: (DERBY-3607) Invalid checksum error in Derby 10.3.2.1

Mike Matrigali (JIRA) Fri, 13 Jun 2008 10:56:06 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mike Matrigali updated DERBY-3607:
----------------------------------


Thanks for the info. Without a repro I have been inspecting the code, and
anything that you can tell me about the app helps direct that effort.  Of
course, if you can come up with a repro that could be run, it is much more
likely that someone will be able to find the issue.  If you can't give a
repro, I am going to continue to ask questions about the app in your
environment, and maybe something will come of that.

How big is a zipped copy of the db?  Getting a copy of it may help, as it
would be interesting to look at the state of page 0 in the 2 corrupted
tables.  JIRA will allow up to 10 MB for a zipped attachment.  There may be
other places at Apache where we could load a bigger file; I am just not sure
right now.  Depending on when you enabled online backup, the log may include
a complete update history of the db.  Looking at it, or more likely comparing
it with a few other examples of the bug, may show what kinds of things lead
to the problem.

My current assumption is that the problem is caused by some I/O interaction
similar to DERBY-3347.  Could you describe the concurrency in the simplest
case in which you have been able to reproduce this problem?  Basically, how
many threads are involved, and do they run concurrently?  For example, are
the startup/shutdown of the 2 db's done independently on different threads?
While running, how many threads/connections are doing work in the 2 db's?

Can you describe when and how often you execute the command that "enables"
online backup?  It is at this point that Derby copies a number of database
files from the original db to your backup location, and it does have some
code that ensures that page 0 is up to date before the copy.  Is it possible
to run your test without online backup ever being called, just to see if
the bug still reproduces?
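
One way to run the test with online backup switched off, without changing
the rest of the application, is to guard the backup call with a flag.  The
following is only a minimal sketch: the class name BackupSwitch and the
system property test.skipOnlineBackup are hypothetical, and it assumes the
backup is taken with the plain SYSCS_UTIL.SYSCS_BACKUP_DATABASE procedure.
The application may well call a different backup procedure (for example one
that also enables log archive mode), in which case only the guard applies.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

public class BackupSwitch {
    // Hypothetical property name, chosen only for this sketch.
    static final String SKIP_PROPERTY = "test.skipOnlineBackup";

    // Derby system procedure for an online backup (assumed to be the one in use).
    static final String BACKUP_SQL = "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)";

    /** Backup is enabled unless -Dtest.skipOnlineBackup=true is set. */
    static boolean backupEnabled() {
        return !Boolean.getBoolean(SKIP_PROPERTY);
    }

    /** Runs the backup unless it is switched off; returns true if it ran. */
    static boolean backupIfEnabled(Connection conn, String backupDir)
            throws SQLException {
        if (!backupEnabled()) {
            return false;
        }
        CallableStatement cs = conn.prepareCall(BACKUP_SQL);
        try {
            cs.setString(1, backupDir);
            cs.execute();
        } finally {
            cs.close();
        }
        return true;
    }
}
```

Starting the JVM with -Dtest.skipOnlineBackup=true would then give a run in
which the backup code path is never entered.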

You mentioned "jvm hook"; does this mean you are also shutting down the JVM
during the test run?  If so, can you describe how often, i.e. for every
shutdown in the derby.log is there also a JVM shutdown?  I think this is
what you mean by your services comment.  Is the following what is going on:
o You start a service, which starts up a JVM and opens the 2 databases, and
some set of work is somehow done in both db's (is this work different or the
same for each try?).  Then you stop the service sometime later, which shuts
down the JVM, and as part of the JVM hooks your specific shutdown stuff
happens.
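
For reference, the scenario described above could look roughly like the
following.  This is a sketch under assumptions, not the reporter's code: the
class name DerbyShutdownHook is hypothetical, and the real application shuts
down through Hibernate and c3p0 first.  What is not an assumption is that
embedded Derby signals a successful shutdown by throwing an SQLException,
with SQLState XJ015 for the whole engine and 08006 for a single database.

```java
import java.sql.DriverManager;
import java.sql.SQLException;

public class DerbyShutdownHook {
    /** True if the exception is Derby's normal "shutdown succeeded" signal. */
    static boolean isExpectedShutdown(SQLException e) {
        String state = e.getSQLState();
        return "XJ015".equals(state) || "08006".equals(state);
    }

    /** Registers a JVM shutdown hook that shuts down the Derby engine. */
    static void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                try {
                    // Shuts down every database booted in this JVM.
                    DriverManager.getConnection("jdbc:derby:;shutdown=true");
                } catch (SQLException e) {
                    if (isExpectedShutdown(e)) {
                        System.err.println("Derby shut down cleanly");
                    } else {
                        System.err.println("Derby shutdown failed: " + e);
                    }
                }
            }
        }));
    }
}
```

Logging which of the two outcomes occurs on each service stop would show
whether every JVM shutdown is matched by a clean Derby shutdown.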

How are clients shut down?  Is it possible that in-progress clients are shut
down by "killing" them somehow?  I am not familiar with Hibernate, so this
could be a part of "the seesion factories (hibernate level)  shutdown"?

Does any of the work that is done during the tests include deletes, updates
to fields included in any index/constraint, inserts which fail due to
duplicate key errors, or DDL after the initial create of the database?  This
is interesting because such work may queue background work to reclaim
deleted space and thus add another point of concurrency to the work being
done.

Does each startup/shutdown phase always access the same tables?  I am
looking to verify, when you get an error on a particular table, whether it
is guaranteed that the table was working in the previous startup/shutdown
test phase.
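
If the application does not touch every table in every phase, one possible
way to get that guarantee is to run Derby's consistency check over all
tables just before each shutdown.  The sketch below is an assumption about
how that could be wired in, with a hypothetical class name TableCheck.  The
query itself uses the SYSCS_UTIL.SYSCS_CHECK_TABLE function and the system
catalogs.  Note that the function raises an error on the first inconsistent
table, so the loop stops there, and it is not certain that it exercises the
same code path that reports the checksum error.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TableCheck {
    // Runs SYSCS_CHECK_TABLE against every table listed in the catalogs.
    static final String CHECK_ALL_SQL =
        "SELECT schemaname, tablename, "
        + "SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename) "
        + "FROM sys.sysschemas s, sys.systables t "
        + "WHERE s.schemaid = t.schemaid";

    /** Prints one line per table that passes; throws on the first failure. */
    static void checkAll(Connection conn) throws SQLException {
        Statement st = conn.createStatement();
        try {
            ResultSet rs = st.executeQuery(CHECK_ALL_SQL);
            while (rs.next()) {
                System.out.println(rs.getString(1) + "." + rs.getString(2)
                        + " consistent=" + rs.getInt(3));
            }
            rs.close();
        } finally {
            st.close();
        }
    }
}
```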

Another thing I am looking at is the possibility that exhaustion of
resources is coming into play, which would explain why it takes multiple
db's.  Do you set any Derby properties as part of your application?  If so,
could you post all the ones that you change?  Can you estimate about how
many different Derby tables might be accessed during one of the
startup/shutdown phases of your test?  I know this is hard, as it should
also include how many indexes may be referenced.  The 2 resources I am
mostly thinking about are the page cache and the open container cache.  The
page cache defaults to 1000 and the open container cache defaults to 100.
Depending on your application (basically concurrent user threads and the
background thread used for post commit and checkpoints) we may have multiple
open "channels" on each container in the open container cache.  I am not
exactly sure what resource this maps to on Windows.
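
To collect the properties asked about above, something like the following
could be run inside the application's JVM.  It is a minimal sketch with a
hypothetical class name, and it only sees properties set as JVM system
properties; values set in a derby.properties file or stored in the database
would have to be reported separately.  The property names
derby.storage.pageCacheSize (page cache) and derby.storage.fileCacheSize
(open container cache) are given here as the likely candidates for the two
caches mentioned; the second one in particular should be treated as an
assumption.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class DerbyPropertyReport {
    /** Returns every property whose name starts with "derby.", sorted by name. */
    static Map<String, String> derbyProperties(Properties props) {
        Map<String, String> found = new TreeMap<String, String>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith("derby.")) {
                found.put(name, props.getProperty(name));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> found = derbyProperties(System.getProperties());
        for (Map.Entry<String, String> e : found.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        // Defaults quoted in the comment above: 1000 pages, 100 containers.
        if (!found.containsKey("derby.storage.pageCacheSize")) {
            System.out.println("derby.storage.pageCacheSize not set (default 1000)");
        }
        if (!found.containsKey("derby.storage.fileCacheSize")) {
            System.out.println("derby.storage.fileCacheSize not set (default 100)");
        }
    }
}
```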

> Invalid checksum error in Derby 10.3.2.1
> ----------------------------------------
>
>                 Key: DERBY-3607
>                 URL: https://issues.apache.org/jira/browse/DERBY-3607
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.3.2.1, 10.4.1.3
>         Environment: OS-WIN XP SP2, 1.86GHz, 2GB, JVM 1.5, disk caching 
> disabled, Hibernate 3.1.1.RC3,c3p0
>            Reporter: Shahbaz
>            Priority: Critical
>         Attachments: DB_10.4logs.zip, derby.log, derby.log, 
> hibernate.cfg.xml, hibernate.cfg.xml, hibernate.cfg.xml
>
>
> I am getting this exception whenever I try to restart my application
> java.sql.SQLException: Invalid checksum on Page Page(0,Container(0, 2033)), 
> expected=2,731,401,932, on-disk version=2,375,776,513, page dump follows: Hex 
> dump:
> 00000000: 0076 0000 0001 0000 0000 0000 0002 0000  .v..............
> 00000010: 0000 0006 0000 0000 0000 0000 0000 0000  ................
> 00000020: 0000 0000 0001 0000 0000 0000 0000 0000  ................
> 00000030: 0000 0000 0000 0000 0000 0000 ffff ffff  ................
> 00000040: ffff ffff 0000 0000 0000 0000 0000 0000  ................
> 00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000060: 0000 0000 0000 0000 0000 0000 5000 0000  ............P...
> at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.EmbedCallableStatement.executeStatement(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute(Unknown 
> Source)
>       at 
> com.mchange.v2.c3p0.impl.NewProxyCallableStatement.execute(NewProxyCallableStatement.java:3044)
>       at 
> ae.sphere.arena.database.management.backup.BackupStategy.createBackup(BackupStategy.java:56)
>       at 
> ae.sphere.arena.database.management.backup.BackupStategy.doSchedulerJob(BackupStategy.java:41)
>       at 
> ae.sphere.arena.common.jobscheduler.Scheduler$1.run(Scheduler.java:49)
>       at org.eclipse.core.internal.jobs.Worker.run(Worker.java:58)
> 00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000000b0: 0000 0

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
