> Aside from shipping you my laptop is there anything I can provide you
> with to help track down the issue?
A complete log, with the exception, would help identify the point
where it fails. It's a possible clue.
Could you also try running tdbloader and tdbloader2 to bulk load the files?
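Something like the following (a sketch only - this assumes the TDB scripts are on your PATH, and the database directories are just example names; adjust to your setup):

```shell
# Decompress first, in case the loaders don't read .bz2 directly
bzip2 -dk mappingbased_properties_en.nt.bz2

# Load into two fresh database directories, one per loader
tdbloader  --loc=/home/jp/scratch/ssdtest/DB-loader1 mappingbased_properties_en.nt
tdbloader2 --loc=/home/jp/scratch/ssdtest/DB-loader2 mappingbased_properties_en.nt
```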
Andy
On 28/06/11 21:19, jp wrote:
Hey Andy,
Saw the Twitter message; a 29% load speed increase is pretty nice. Glad I
could give you the excuse to upgrade :) Though it worries me that you
don't receive the same exception I do. I consistently have loading
issues using the file posted at
http://www.kosmyna.com/mappingbased_properties_en.nt.bz2. I can get
the test program to complete by making the following changes, but it's
slow (30 minutes).
SystemTDB.setFileMode(FileMode.direct) ;
if ( true ) {
    String dir = "/home/jp/scratch/ssdtest/DB-X" ;
    FileOps.clearDirectory(dir) ;
    datasetGraph = TDBFactory.createDatasetGraph(dir) ;
}
Running the program with the sections of code below fails every time.
//SystemTDB.setFileMode(FileMode.direct) ;
if ( true ) {
    String dir = "/home/jp/scratch/ssdtest/DB-X" ;
    FileOps.clearDirectory(dir) ;
    datasetGraph = TDBFactory.createDatasetGraph(dir) ;
}
The exception:
java.lang.IllegalArgumentException
    at java.nio.Buffer.position(Buffer.java:235)
    at com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:94)
    at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:95)
    at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.get(RecordBuffer.java:41)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.getSplitKey(BPTreeRecords.java:141)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.split(BPTreeNode.java:435)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:387)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.insert(BPTreeNode.java:167)
    at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.addAndReturnOld(BPlusTree.java:297)
    at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.add(BPlusTree.java:289)
    at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performAdd(TupleIndexRecord.java:48)
    at com.hp.hpl.jena.tdb.index.TupleIndexBase.add(TupleIndexBase.java:49)
    at com.hp.hpl.jena.tdb.index.TupleTable.add(TupleTable.java:54)
    at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:77)
    at com.hp.hpl.jena.tdb.store.bulkloader.LoaderNodeTupleTable.load(LoaderNodeTupleTable.java:112)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:268)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:244)
    at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:60)
    at org.openjena.riot.lang.LangBase.parse(LangBase.java:71)
    at org.openjena.riot.RiotReader.parseQuads(RiotReader.java:122)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadQuads$(BulkLoader.java:159)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadDataset(BulkLoader.java:117)
    at com.nimblegraph.data.bin.ReportLoadOnSSD.main(ReportLoadOnSSD.java:68)
http://dbpedia.org/resource/Spirea_X
http://dbpedia.org/ontology/associatedBand
http://dbpedia.org/resource/Adventures_in_Stereo
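For what it's worth, the top frame is the stock java.nio check: Buffer.position(int) throws IllegalArgumentException when asked to move past the buffer's limit. So if RecordFactory.buildFrom computes an offset beyond the block's limit, this is exactly the exception you'd see. A minimal, stdlib-only illustration (not Jena code; the class name is mine):

```java
import java.nio.ByteBuffer;

public class PositionDemo {
    // Demonstrates the java.nio behaviour behind the top stack frame:
    // position(n) with n > limit throws IllegalArgumentException.
    public static boolean positionPastLimitThrows() {
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.limit(8);                 // valid region is [0, 8)
        try {
            bb.position(12);         // 12 > limit(8) -> IllegalArgumentException
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(positionPastLimitThrows()); // prints "true"
    }
}
```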
If I continue to let it run, I start seeing this error as well:
com.hp.hpl.jena.tdb.TDBException: No known block type for 4
    at com.hp.hpl.jena.tdb.base.block.BlockType.extract(BlockType.java:64)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.getType(BPTreeNodeMgr.java:166)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.access$200(BPTreeNodeMgr.java:22)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr$Block2BPTreeNode.fromByteBuffer(BPTreeNodeMgr.java:136)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.get(BPTreeNodeMgr.java:84)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.get(BPTreeNode.java:127)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:379)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.insert(BPTreeNode.java:167)
    at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.addAndReturnOld(BPlusTree.java:297)
    at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.add(BPlusTree.java:289)
    at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performAdd(TupleIndexRecord.java:48)
    at com.hp.hpl.jena.tdb.index.TupleIndexBase.add(TupleIndexBase.java:49)
    at com.hp.hpl.jena.tdb.index.TupleTable.add(TupleTable.java:54)
    at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:77)
    at com.hp.hpl.jena.tdb.store.bulkloader.LoaderNodeTupleTable.load(LoaderNodeTupleTable.java:112)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:268)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:244)
    at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:60)
    at org.openjena.riot.lang.LangBase.parse(LangBase.java:71)
    at org.openjena.riot.RiotReader.parseQuads(RiotReader.java:122)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadQuads$(BulkLoader.java:159)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadDataset(BulkLoader.java:117)
    at com.nimblegraph.data.bin.ReportLoadOnSSD.main(ReportLoadOnSSD.java:68)
Aside from shipping you my laptop, is there anything I can provide
to help track down the issue? I am comfortable building TDB from
source and setting conditional breakpoints while debugging if that
would be of any benefit.
Thanks for your help.
-jp
On Tue, Jun 28, 2011 at 7:17 AM, Andy Seaborne
<[email protected]> wrote:
Hi there,
I now have an SSD (256G from Crucial) :-)
/dev/sdb1 on /mnt/ssd1 type ext4 (rw,noatime)
and I ran the test program on jamendo-rdf and on
mappingbased_properties_en.nt, then on jamendo-rdf with existing data as in
the test case.
Everything works for me - the loads complete without an exception.
Andy
On 21/06/11 09:10, Andy Seaborne wrote:
On 21/06/11 06:01, jp wrote:
Hey Andy
I wasn't able to unzip the file
http://people.apache.org/~andy/jamendo.nt.gz; however, I ran the
program on my dataset and received an out of memory exception. I then
changed line 42 to true and received the original error. You can
download the data file I have been testing with from
http://www.kosmyna.com/mappingbased_properties_en.nt.bz2; unzipped it's
2.6 GB. This file has consistently failed to load.
downloads.dbpedia.org is back - I downloaded that file and loaded it
with the test program - no problems.
While trying other datasets and variations of the simple program I had
what seemed to be a successful BulkLoad however when I opened the
dataset and tried to query it there were no results. I don't have the
exact details of this run but can try to reproduce it if you think it
would be useful.
Yes please. At this point, any detail is a help.
Also, a complete log of the failed load of
mappingbased_properties_en.nt.bz2 would be useful.
Having looked at the stack traces and aligned them to the source code,
it appears the code passes an internal consistency check, then fails on
exactly the condition that check tests for.
Andy
-jp
On Mon, Jun 20, 2011 at 4:57 PM, Andy Seaborne
<[email protected]> wrote:
Fixed - sorry about that.
Andy
On 20/06/11 21:50, jp wrote:
Hey andy,
I assume the file you want me to run is
http://people.apache.org/~andy/ReportLoadOnSSD.java
When I try to download it I get a permissions error. Let me know when
I should try again.
-jp
On Mon, Jun 20, 2011 at 3:30 PM, Andy Seaborne
<[email protected]> wrote:
Hi there,
I tried to recreate this but couldn't; I don't have an SSD to hand at
the moment (being fixed :-).
I've put my test program and the data from the jamendo-rdf you sent me
in:
http://people.apache.org/~andy/
so we can agree on an exact test case. This code is single-threaded.
The conversion from .rdf to .nt wasn't pure.
I tried running using the in-memory store as well.
downloads.dbpedia.org was down at the weekend - I'll try to get the
same dbpedia data.
Could you run exactly what I was running? The file name needs changing.
You can also try uncommenting
SystemTDB.setFileMode(FileMode.direct) ;
and run it using non-mapped files in about 1.2 G of heap.
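Roughly, the difference between the two modes is the read path: mapped mode goes through MappedByteBuffers from FileChannel.map, while direct (non-mapped) mode reads blocks into ordinary heap buffers. A stdlib-only sketch of the two access paths (this is not TDB's actual code; class and method names are mine) - both should return identical bytes, which is what makes a mapped-mode-only failure interesting:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class FileModeDemo {

    // Read len bytes via a memory-mapped buffer (the "mapped" access path).
    static byte[] readMapped(Path p, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, 0, len);
            byte[] out = new byte[len];
            mb.get(out);
            return out;
        }
    }

    // Read len bytes via an explicit read into a heap buffer
    // (roughly the "direct" / non-mapped access path).
    static byte[] readHeap(Path p, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            ByteBuffer bb = ByteBuffer.allocate(len);
            while (bb.hasRemaining() && ch.read(bb) >= 0) { }
            return bb.array();
        }
    }

    // Both paths must see the same on-disk bytes.
    public static boolean sameBytes() throws IOException {
        Path p = Files.createTempFile("filemode", ".bin");
        try {
            byte[] data = "some index block bytes".getBytes();
            Files.write(p, data);
            return Arrays.equals(readMapped(p, data.length),
                                 readHeap(p, data.length));
        } finally {
            Files.deleteIfExists(p);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sameBytes()); // prints "true"
    }
}
```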
Looking through the stack trace, there is a point where the code has
passed an internal consistency check, then fails with something that
should be caught by that check - and the code is sync'ed or
single-threaded. This is, to put it mildly, worrying.
Andy
On 18/06/11 16:38, jp wrote:
Hey Andy,
My entire program is run in one JVM as follows.
public static void main(String[] args) throws IOException {
    DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(tdbDir);

    /* I saw the BulkLoader had two ways of loading data based on whether
       the dataset existed already. I did two runs, one with the following
       two lines commented out, to test both ways the BulkLoader runs.
       Hopefully this had the desired effect. */
    datasetGraph.getDefaultGraph().add(new Triple(Node.createURI("urn:hello"),
            RDF.type.asNode(), Node.createURI("urn:house")));
    datasetGraph.sync();

    InputStream inputStream = new FileInputStream(dbpediaData);
    BulkLoader bulkLoader = new BulkLoader();
    bulkLoader.loadDataset(datasetGraph, inputStream, true);
}
The data can be found here
http://downloads.dbpedia.org/3.6/en/mappingbased_properties_en.nt.bz2
I appended the ontology to the end of the file; it can be found here:
http://downloads.dbpedia.org/3.6/dbpedia_3.6.owl.bz2
The tdbDir is an empty directory.
On my system the error starts occurring after about 2-3 minutes and
8-12 million triples loaded.
Thanks for looking over this and please let me know if I can be of
further assistance.
-jp
[email protected]
On Jun 17, 2011 9:29 am, Andy wrote:
jp,
How does this fit with running:
datasetGraph.getDefaultGraph().add(new Triple(Node.createURI("urn:hello"),
        RDF.type.asNode(), Node.createURI("urn:house")));
datasetGraph.sync();
Is the preload of one triple a separate JVM or the same JVM as the
BulkLoader call? Could you provide a single, complete, minimal example?
In attempting to reconstruct this, I don't want to hide the problem by
guessing how things are wired together.
Also - exactly which dbpedia file are you loading (URL?) although I
doubt the exact data is the cause here.