Hey Simon

The only code I am running is:

DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(tdbDir);
InputStream inputStream = new FileInputStream(dbpediaData);

BulkLoader bulkLoader = new BulkLoader();
bulkLoader.loadDataset(datasetGraph, inputStream, true);

No other processes or threads are running, and the application has exclusive
access to the TDB directory. Because of this, I suspect a timing issue within
TDB's own code, maybe somewhere in RecordBuffer or in the BPTree itself. I have
noticed that I can only reproduce the issue on fast drives such as an SSD.
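
For context on the first trace: it ends in java.nio.Buffer.position(int),
which throws IllegalArgumentException whenever the requested position is
negative or greater than the buffer's limit. The sketch below shows only that
JDK behaviour. It does not call TDB, and the class name, method name and
capacity are made up for the example. An offset computed from a bad record
count (such as the negative size in the second trace) would be rejected the
same way, though whether that is what happens inside RecordBuffer is an
assumption.

```java
import java.nio.ByteBuffer;

public class BufferPositionDemo {
    // Returns true if a buffer of the given capacity rejects newPosition.
    // Hypothetical helper for illustration only; not part of TDB.
    static boolean rejects(int capacity, int newPosition) {
        ByteBuffer buf = ByteBuffer.allocate(capacity);
        try {
            buf.position(newPosition);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when newPosition < 0 or newPosition > limit
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(16, 0));   // false: start of buffer is valid
        System.out.println(rejects(16, 16));  // false: position may equal the limit
        System.out.println(rejects(16, 17));  // true: past the limit
        System.out.println(rejects(16, -1));  // true: negative position
    }
}
```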

Thanks

-jp

On Fri, Jun 17, 2011 at 11:52 AM, Simon Helsen <[email protected]> wrote:

>
> TDB is not thread-safe. You have to protect read and write operations
> yourself (i.e. multiple readers but an exclusive writer: no reads during a write)
>
> Simon
>
>
> Simon Helsen, Ph.D.
> Advisory Software Engineer - Jazz Foundation Server
> ------------------------------
> Phone: 1-416-225-5717 | Mobile: 1-647-966-8280
> E-mail: [email protected]
>
>
>
>
> From: jp <[email protected]>
> To: [email protected]
> Date: 06/17/2011 11:39 AM
> Subject: BulkLoader error with large data and fast harddrive
> ------------------------------
>
>
>
> I recently updated my computer hardware and am receiving exceptions
> while loading a DBpedia dataset of ~19 million triples. I have been
> able to produce the errors below using the following code. I believe this
> might be a concurrency issue, as the same data loads with the same code
> on a similar machine with a standard hard drive.
>
> DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(tdbDir);
> InputStream inputStream = new FileInputStream(dbpediaData);
>
> BulkLoader bulkLoader = new BulkLoader();
> bulkLoader.loadDataset(datasetGraph, inputStream, true);
>
>
> My current specs are:
> 2.3 GHz quad-core i5 processor
> 4 GB RAM
> 128 GB SSD
>
> tested on both
> java version "1.6.0_22"
> OpenJDK Runtime Environment (IcedTea6 1.10.1) (6b22-1.10.1-0ubuntu1)
> OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
>
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
>
> Jena versions are as follows
> arq-2.8.8
> jena-2.6.4
> tdb-0.8.10
>
> Error while loading into an empty directory
> java.lang.IllegalArgumentException
>     at java.nio.Buffer.position(Buffer.java:235)
>     at com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:94)
>     at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:95)
>     at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.get(RecordBuffer.java:41)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.getSplitKey(BPTreeRecords.java:141)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.split(BPTreeNode.java:435)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:387)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.insert(BPTreeNode.java:167)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.addAndReturnOld(BPlusTree.java:297)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.add(BPlusTree.java:289)
>     at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performAdd(TupleIndexRecord.java:48)
>     at com.hp.hpl.jena.tdb.index.TupleIndexBase.add(TupleIndexBase.java:49)
>     at com.hp.hpl.jena.tdb.index.TupleTable.add(TupleTable.java:54)
>     at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:77)
>     at com.hp.hpl.jena.tdb.store.bulkloader.LoaderNodeTupleTable.load(LoaderNodeTupleTable.java:112)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:268)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:244)
>     at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:60)
>     at org.openjena.riot.lang.LangBase.parse(LangBase.java:71)
>     at org.openjena.riot.RiotReader.parseQuads(RiotReader.java:122)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadQuads$(BulkLoader.java:159)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadDataset(BulkLoader.java:117)
>     at com.nimblegraph.data.bin.SimpleDatasetLoader.main(SimpleDatasetLoader.java:24)
>
> Error when loading into a directory with one triple. The following is
> run before the bulk loader.
>
> datasetGraph.getDefaultGraph().add(new Triple(
>     Node.createURI("urn:hello"), RDF.type.asNode(), Node.createURI("urn:house")));
> datasetGraph.sync();
>
> java.lang.IllegalArgumentException: Out of bounds: idx=0, size=-866953722
>     at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.checkBounds(RecordBuffer.java:228)
>     at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.add(RecordBuffer.java:66)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.internalInsert(BPTreeRecords.java:112)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.insert(BPTreeNode.java:167)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.addAndReturnOld(BPlusTree.java:297)
>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.add(BPlusTree.java:289)
>     at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performAdd(TupleIndexRecord.java:48)
>     at com.hp.hpl.jena.tdb.index.TupleIndexBase.add(TupleIndexBase.java:49)
>     at com.hp.hpl.jena.tdb.index.TupleTable.add(TupleTable.java:54)
>     at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:77)
>     at com.hp.hpl.jena.tdb.store.bulkloader.LoaderNodeTupleTable.load(LoaderNodeTupleTable.java:112)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:268)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:244)
>     at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:60)
>     at org.openjena.riot.lang.LangBase.parse(LangBase.java:71)
>     at org.openjena.riot.RiotReader.parseQuads(RiotReader.java:122)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadQuads$(BulkLoader.java:159)
>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadDataset(BulkLoader.java:117)
>     at com.nimblegraph.data.bin.SimpleDatasetLoader.main(SimpleDatasetLoader.java:24)
>
> Any help tracking down the issue would be greatly appreciated.
> Thanks for the great software
>
> -jp
> [email protected]
>
>
>

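The rule quoted in the thread (multiple readers, one exclusive writer, no
reads during a write) can be sketched with a plain
java.util.concurrent.locks.ReentrantReadWriteLock. This is a minimal
illustration only: GuardedStore and its list of strings are hypothetical
stand-ins for a store, not TDB classes, and the single-threaded loader in this
thread would not need it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // Stand-in for the real store; a plain ArrayList is not thread-safe either.
    private final List<String> triples = new ArrayList<String>();

    // Writers take the exclusive lock: no readers or other writers may run.
    public void add(String triple) {
        lock.writeLock().lock();
        try {
            triples.add(triple);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock with each other but are blocked during writes.
    public int size() {
        lock.readLock().lock();
        try {
            return triples.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final GuardedStore store = new GuardedStore();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            final int id = i;
            writers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < 1000; n++) {
                        store.add("writer" + id + "-" + n);
                    }
                }
            });
            writers[i].start();
        }
        for (Thread t : writers) {
            t.join();
        }
        System.out.println(store.size()); // prints 4000
    }
}
```

Always releasing the lock in a finally block keeps a failed write from
leaving the store locked for everyone else.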