Hi,
I am currently investigating the issue.

So far, I managed to get an initial copy of the TDB indexes which is not corrupted (~2.6GB). We then applied ~635 updates to it (and, for each transaction, I have the data which was submitted). I then re-applied the changes with a little program which uses TxTDB only (via TDBLoader.load(...)). At the end of this, the nodes.dat file is corrupted.

This is just doing:

    StoreConnection sc = StoreConnection.make(location) ;
    for ( int i = 1 ; i < 636 ; i++ ) {
        System.out.println(i) ;
        DatasetGraphTxn dsg = sc.begin(ReadWrite.WRITE) ;
        TDBLoader.load(dsg, "/tmp/updates/" + i + ".ttl") ;
        dsg.commit() ;
        dsg.close() ;
    }

I tried to apply the same changes to an initially empty TDB database and there were no problems.

Now, I am double-checking the integrity of my initial TDB indexes.
I will then proceed by applying one change at a time and verifying integrity (via a dump).
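
In outline, the check looks like this (a sketch rather than the exact program: the read-back loop stands in for the dump, since materialising every quad forces each NodeId to be decoded through nodes.dat; the no-argument find() and catching FileException are assumptions on my part):

    StoreConnection sc = StoreConnection.make(location) ;
    for ( int i = 1 ; i < 636 ; i++ ) {
        // Apply one update in its own WRITE transaction.
        DatasetGraphTxn dsg = sc.begin(ReadWrite.WRITE) ;
        TDBLoader.load(dsg, "/tmp/updates/" + i + ".ttl") ;
        dsg.commit() ;
        dsg.close() ;

        // Verify in a READ transaction: touching every quad decodes
        // every node; corruption shows up as a FileException.
        DatasetGraphTxn check = sc.begin(ReadWrite.READ) ;
        try {
            Iterator<Quad> iter = check.find() ;
            while ( iter.hasNext() )
                iter.next() ;
            System.out.println(i + ": ok") ;
        } catch (FileException e) {
            System.out.println(i + ": corrupted - " + e.getMessage()) ;
            break ;
        } finally {
            check.close() ;
        }
    }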

Paolo



Simon Helsen wrote:
Thanks Paolo,

This is related to JENA-91. In fact, that is how our problems started.

Glad someone else was able to reproduce it.

Simon



From: Paolo Castagna <[email protected]>
To: [email protected]
Date: 09/28/2011 06:47 AM
Subject: Re: TxTDB - com.hp.hpl.jena.tdb.base.file.FileException: Impossibly large object



The object file of the node table (i.e. nodes.dat) is corrupted.

I tried to read it sequentially; I get as far as:
(318670, java.nio.HeapByteBuffer[pos=0 lim=22 cap=22])
After that, the length of the next ByteBuffer is 909129782 (*).

Paolo

(*) Running a simple program to iterate through all the Pair<Long, ByteBuffer>
    in the ObjectFile and debugging it: ObjectFileDiskDirect, line 176.
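
For reference, the scan is roughly the following (a sketch, not the exact program I ran; FileFactory.createObjectFileDisk and ObjectFile.all() are what I'm assuming the API provides, and the path is a placeholder):

    import java.nio.ByteBuffer ;
    import java.util.Iterator ;

    import org.openjena.atlas.lib.Pair ;

    import com.hp.hpl.jena.tdb.base.file.FileFactory ;
    import com.hp.hpl.jena.tdb.base.objectfile.ObjectFile ;

    public class DumpObjectFile {
        public static void main(String[] args) {
            // Open nodes.dat directly and walk it sequentially; each entry
            // is a (byte offset, encoded node) pair. A corrupted entry shows
            // up as an impossibly large length on the following ByteBuffer.
            ObjectFile file = FileFactory.createObjectFileDisk("/path/to/DB/nodes.dat") ;  // placeholder path
            Iterator<Pair<Long, ByteBuffer>> iter = file.all() ;
            while ( iter.hasNext() )
                System.out.println(iter.next()) ;
            file.close() ;
        }
    }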


Paolo Castagna wrote:
Hi,
we are using/testing TxTDB.

In this case, we just perform a series of WRITE transactions (sequentially,
one after the other) and then issue a SPARQL query (as a READ transaction).
There are no exceptions during the WRITE transactions.
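
The read side is essentially the following (a sketch, not our actual code: the query is a stand-in, and wrapping the DatasetGraph with DatasetFactory.create(...) is my assumption about how to hand it to ARQ):

    DatasetGraphTxn dsg = sc.begin(ReadWrite.READ) ;
    try {
        // Any query that touches the node table will do; ours is more involved.
        Query query = QueryFactory.create("SELECT * { ?s ?p ?o }") ;
        QueryExecution qe = QueryExecutionFactory.create(query, DatasetFactory.create(dsg)) ;  // wrapping: an assumption
        ResultSet rs = qe.execSelect() ;
        ResultSetFormatter.outputAsJSON(System.out, rs) ;  // the exception below is thrown here
        qe.close() ;
    } finally {
        dsg.close() ;
    }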

This is the exception we see when we issue the SPARQL query:

com.hp.hpl.jena.tdb.base.file.FileException: ObjectFile.read(9863)[119398665][119079969]: Impossibly large object : 1752462448 bytes
    at com.hp.hpl.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:282)
    at com.hp.hpl.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:60)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:164)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:88)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:59)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:89)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:60)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:44)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:56)
    at com.hp.hpl.jena.tdb.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:44)
    at com.hp.hpl.jena.tdb.solver.BindingTDB.get1(BindingTDB.java:92)
    at com.hp.hpl.jena.sparql.engine.binding.BindingBase.get(BindingBase.java:106)
    at com.hp.hpl.jena.sparql.core.ResultBinding._get(ResultBinding.java:44)
    at com.hp.hpl.jena.sparql.core.QuerySolutionBase.get(QuerySolutionBase.java:20)
    at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:35)
    at com.hp.hpl.jena.sparql.resultset.JSONOutput.format(JSONOutput.java:23)
    at com.hp.hpl.jena.query.ResultSetFormatter.outputAsJSON(ResultSetFormatter.java:584)
    [...]

This was with an Oracle JVM, 1.6.0_25, 64-bit, on a VM (on EC2) with a
64-bit Ubuntu OS. We are using TxTDB packaged directly from SVN (r1176416).

This seems to be a similar (or related) issue to:
https://issues.apache.org/jira/browse/JENA-91

Paolo






