Sorry for any confusion: tdbloader2 is working fine. I had a typo in my
$PATH variable. I'll post the results of the load ASAP.

-jp
On Tue, Jun 28, 2011 at 7:02 PM, jp <[email protected]> wrote:
> The complete log file is over 13 GB. I have posted the first 5000
> lines here: http://www.kosmyna.com/ReportLoadOnSSD.log.5000lines
> The run of tdbloader failed as well. The first 5000 lines can be
> found here: http://www.kosmyna.com/tdbloader.log.5000lines
>
> I could not run tdbloader2; I get the following error:
>
>     ./tdbloader2: line 14: make_classpath: command not found
>
> I have the TDBROOT environment variable correctly set and am using
> this version of TDB:
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/tags/TDB-0.8.10/bin
>
> -jp
>
>
> On Tue, Jun 28, 2011 at 4:30 PM, Andy Seaborne
> <[email protected]> wrote:
>>> Aside from shipping you my laptop, is there anything I can provide
>>> you with to help track down the issue?
>>
>> A complete log, with the exception, would help to identify the point
>> where it fails. It's a possible clue.
>>
>> Could you also try running tdbloader and tdbloader2 to bulk load the
>> files?
>>
>> Andy
>>
>>
>> On 28/06/11 21:19, jp wrote:
>>>
>>> Hey Andy,
>>>
>>> Saw the Twitter message; a 29% load speed increase is pretty nice.
>>> Glad I could give you the excuse to upgrade :) It does worry me,
>>> though, that you don't receive the same exception I do. I
>>> consistently have loading issues using the file posted at
>>> http://www.kosmyna.com/mappingbased_properties_en.nt.bz2. I can get
>>> the test program to complete by making the following changes, but
>>> it's slow (30 minutes):
>>>
>>>     SystemTDB.setFileMode(FileMode.direct) ;
>>>
>>>     if ( true ) {
>>>         String dir = "/home/jp/scratch/ssdtest/DB-X" ;
>>>         FileOps.clearDirectory(dir) ;
>>>         datasetGraph = TDBFactory.createDatasetGraph(dir);
>>>     }
>>>
>>> Running the program with the sections of code below fails every time.
>>>
>>>     //SystemTDB.setFileMode(FileMode.direct) ;
>>>
>>>     if ( true ) {
>>>         String dir = "/home/jp/scratch/ssdtest/DB-X" ;
>>>         FileOps.clearDirectory(dir) ;
>>>         datasetGraph = TDBFactory.createDatasetGraph(dir);
>>>     }
>>>
>>> The exception:
>>>
>>> java.lang.IllegalArgumentException
>>>     at java.nio.Buffer.position(Buffer.java:235)
>>>     at com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:94)
>>>     at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:95)
>>>     at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.get(RecordBuffer.java:41)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.getSplitKey(BPTreeRecords.java:141)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.split(BPTreeNode.java:435)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:387)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.insert(BPTreeNode.java:167)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.addAndReturnOld(BPlusTree.java:297)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.add(BPlusTree.java:289)
>>>     at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performAdd(TupleIndexRecord.java:48)
>>>     at com.hp.hpl.jena.tdb.index.TupleIndexBase.add(TupleIndexBase.java:49)
>>>     at com.hp.hpl.jena.tdb.index.TupleTable.add(TupleTable.java:54)
>>>     at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:77)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.LoaderNodeTupleTable.load(LoaderNodeTupleTable.java:112)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:268)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:244)
>>>     at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:60)
>>>     at org.openjena.riot.lang.LangBase.parse(LangBase.java:71)
>>>     at org.openjena.riot.RiotReader.parseQuads(RiotReader.java:122)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadQuads$(BulkLoader.java:159)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadDataset(BulkLoader.java:117)
>>>     at com.nimblegraph.data.bin.ReportLoadOnSSD.main(ReportLoadOnSSD.java:68)
>>> http://dbpedia.org/resource/Spirea_X
>>> http://dbpedia.org/ontology/associatedBand
>>> http://dbpedia.org/resource/Adventures_in_Stereo
>>>
>>> If I continue to let it run, I start seeing this error as well:
>>>
>>> com.hp.hpl.jena.tdb.TDBException: No known block type for 4
>>>     at com.hp.hpl.jena.tdb.base.block.BlockType.extract(BlockType.java:64)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.getType(BPTreeNodeMgr.java:166)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.access$200(BPTreeNodeMgr.java:22)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr$Block2BPTreeNode.fromByteBuffer(BPTreeNodeMgr.java:136)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNodeMgr.get(BPTreeNodeMgr.java:84)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.get(BPTreeNode.java:127)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:379)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalInsert(BPTreeNode.java:399)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.insert(BPTreeNode.java:167)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.addAndReturnOld(BPlusTree.java:297)
>>>     at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.add(BPlusTree.java:289)
>>>     at com.hp.hpl.jena.tdb.index.TupleIndexRecord.performAdd(TupleIndexRecord.java:48)
>>>     at com.hp.hpl.jena.tdb.index.TupleIndexBase.add(TupleIndexBase.java:49)
>>>     at com.hp.hpl.jena.tdb.index.TupleTable.add(TupleTable.java:54)
>>>     at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:77)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.LoaderNodeTupleTable.load(LoaderNodeTupleTable.java:112)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:268)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader$2.send(BulkLoader.java:244)
>>>     at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:60)
>>>     at org.openjena.riot.lang.LangBase.parse(LangBase.java:71)
>>>     at org.openjena.riot.RiotReader.parseQuads(RiotReader.java:122)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadQuads$(BulkLoader.java:159)
>>>     at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadDataset(BulkLoader.java:117)
>>>     at com.nimblegraph.data.bin.ReportLoadOnSSD.main(ReportLoadOnSSD.java:68)
>>>
>>> Aside from shipping you my laptop, is there anything I can provide
>>> you with to help track down the issue? I am comfortable building TDB
>>> from source and setting conditional breakpoints while debugging if
>>> that can be of any benefit.
>>>
>>> Thanks for your help.
>>> -jp
>>>
>>> On Tue, Jun 28, 2011 at 7:17 AM, Andy Seaborne
>>> <[email protected]> wrote:
>>>>
>>>> Hi there,
>>>>
>>>> I now have an SSD (256G from Crucial) :-)
>>>>
>>>>     /dev/sdb1 on /mnt/ssd1 type ext4 (rw,noatime)
>>>>
>>>> and I ran the test program on jamendo-rdf and on
>>>> mappingbased_properties_en.nt, then on jamendo-rdf with existing
>>>> data as in the test case.
>>>>
>>>> Everything works for me - the loads complete without an exception.
>>>>
>>>> Andy
>>>>
>>>> On 21/06/11 09:10, Andy Seaborne wrote:
>>>>>
>>>>> On 21/06/11 06:01, jp wrote:
>>>>>>
>>>>>> Hey Andy,
>>>>>>
>>>>>> I wasn't able to unzip the file
>>>>>> http://people.apache.org/~andy/jamendo.nt.gz; however, I ran it
>>>>>> on my dataset and I received an out-of-memory exception. I then
>>>>>> changed line 42 to true and received the original error. You can
>>>>>> download the data file I have been testing with from
>>>>>> http://www.kosmyna.com/mappingbased_properties_en.nt.bz2;
>>>>>> unzipped, it's 2.6 GB. This file has consistently failed to load.
>>>>>
>>>>> downloads.dbpedia.org is back - I downloaded that file and loaded
>>>>> it with the test program - no problems.
>>>>>
>>>>>> While trying other datasets and variations of the simple program
>>>>>> I had what seemed to be a successful bulk load; however, when I
>>>>>> opened the dataset and tried to query it there were no results.
>>>>>> I don't have the exact details of this run but can try to
>>>>>> reproduce it if you think it would be useful.
>>>>>
>>>>> Yes please. At this point, any detail is a help.
>>>>>
>>>>> Also, a complete log of the failed load of
>>>>> mappingbased_properties_en.nt.bz2 would be useful.
>>>>>
>>>>> Having looked at the stack traces, and aligned them to the source
>>>>> code, it appears the code passes an internal consistency check,
>>>>> then fails on something that the check tests for.
>>>>>
>>>>> Andy
>>>>>
>>>>>>
>>>>>> -jp
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 20, 2011 at 4:57 PM, Andy Seaborne
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Fixed - sorry about that.
>>>>>>>
>>>>>>> Andy
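A quick way to test for the silent-failure case jp describes above (a
load that appears to succeed but leaves an empty store) is to reopen the
directory and run a small query. The sketch below is not from the
thread; it assumes the Jena 2.x/ARQ API of the period and reuses the
DB-X directory path jp gives earlier.

    import com.hp.hpl.jena.query.Dataset;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;
    import com.hp.hpl.jena.query.ResultSet;
    import com.hp.hpl.jena.tdb.TDBFactory;

    public class VerifyLoad {
        public static void main(String[] args) {
            // Reopen the store the bulk load wrote (path from jp's messages).
            Dataset dataset = TDBFactory.createDataset("/home/jp/scratch/ssdtest/DB-X");
            QueryExecution qexec = QueryExecutionFactory.create(
                    "SELECT * { ?s ?p ?o } LIMIT 5", dataset);
            try {
                ResultSet results = qexec.execSelect();
                if (!results.hasNext())
                    System.out.println("Store is empty - the load left no data.");
                while (results.hasNext())
                    System.out.println(results.next());
            } finally {
                qexec.close();
            }
        }
    }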
>>>>>>>
>>>>>>> On 20/06/11 21:50, jp wrote:
>>>>>>>>
>>>>>>>> Hey Andy,
>>>>>>>>
>>>>>>>> I assume the file you want me to run is
>>>>>>>> http://people.apache.org/~andy/ReportLoadOnSSD.java
>>>>>>>>
>>>>>>>> When I try to download it I get a permissions error. Let me
>>>>>>>> know when I should try again.
>>>>>>>>
>>>>>>>> -jp
>>>>>>>>
>>>>>>>> On Mon, Jun 20, 2011 at 3:30 PM, Andy Seaborne
>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi there,
>>>>>>>>>
>>>>>>>>> I tried to recreate this but couldn't; I don't have an SSD to
>>>>>>>>> hand at the moment (being fixed :-)
>>>>>>>>>
>>>>>>>>> I've put my test program and the data from the jamendo-rdf you
>>>>>>>>> sent me in:
>>>>>>>>>
>>>>>>>>> http://people.apache.org/~andy/
>>>>>>>>>
>>>>>>>>> so we can agree on exactly the same test case. This code is
>>>>>>>>> single-threaded.
>>>>>>>>>
>>>>>>>>> The conversion from .rdf to .nt wasn't pure.
>>>>>>>>>
>>>>>>>>> I tried running using the in-memory store as well.
>>>>>>>>> downloads.dbpedia.org was down at the weekend - I'll try to
>>>>>>>>> get the same dbpedia data.
>>>>>>>>>
>>>>>>>>> Could you run exactly what I was running? The file name needs
>>>>>>>>> changing.
>>>>>>>>>
>>>>>>>>> You can also try uncommenting
>>>>>>>>>
>>>>>>>>>     SystemTDB.setFileMode(FileMode.direct) ;
>>>>>>>>>
>>>>>>>>> and running it using non-mapped files in about 1.2 GB of heap.
>>>>>>>>>
>>>>>>>>> Looking through the stack trace, there is a point where the
>>>>>>>>> code has passed an internal consistency test, then fails with
>>>>>>>>> something that should be caught by that test - and the code is
>>>>>>>>> sync'ed or single-threaded. This is, to put it mildly,
>>>>>>>>> worrying.
>>>>>>>>>
>>>>>>>>> Andy
>>>>>>>>>
>>>>>>>>> On 18/06/11 16:38, jp wrote:
>>>>>>>>>>
>>>>>>>>>> Hey Andy,
>>>>>>>>>>
>>>>>>>>>> My entire program is run in one JVM, as follows:
>>>>>>>>>>
>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>         DatasetGraphTDB datasetGraph =
>>>>>>>>>>             TDBFactory.createDatasetGraph(tdbDir);
>>>>>>>>>>
>>>>>>>>>>         /* I saw the BulkLoader had two ways of loading data
>>>>>>>>>>            based on whether the dataset existed already. I did
>>>>>>>>>>            two runs, one with the following two lines commented
>>>>>>>>>>            out, to test both ways the BulkLoader runs. Hopefully
>>>>>>>>>>            this had the desired effect. */
>>>>>>>>>>         datasetGraph.getDefaultGraph().add(new
>>>>>>>>>>             Triple(Node.createURI("urn:hello"), RDF.type.asNode(),
>>>>>>>>>>             Node.createURI("urn:house")));
>>>>>>>>>>         datasetGraph.sync();
>>>>>>>>>>
>>>>>>>>>>         InputStream inputStream = new FileInputStream(dbpediaData);
>>>>>>>>>>
>>>>>>>>>>         BulkLoader bulkLoader = new BulkLoader();
>>>>>>>>>>         bulkLoader.loadDataset(datasetGraph, inputStream, true);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> The data can be found here:
>>>>>>>>>> http://downloads.dbpedia.org/3.6/en/mappingbased_properties_en.nt.bz2
>>>>>>>>>> I appended the ontology to the end of the file; it can be
>>>>>>>>>> found here:
>>>>>>>>>> http://downloads.dbpedia.org/3.6/dbpedia_3.6.owl.bz2
>>>>>>>>>>
>>>>>>>>>> The tdbDir is an empty directory.
>>>>>>>>>> On my system the error starts occurring after about 2-3
>>>>>>>>>> minutes and 8-12 million triples loaded.
>>>>>>>>>>
>>>>>>>>>> Thanks for looking over this and please let me know if I can
>>>>>>>>>> be of further assistance.
>>>>>>>>>>
>>>>>>>>>> -jp
>>>>>>>>>> [email protected]
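jp's fragment above references two variables it does not define (tdbDir
and dbpediaData). A stand-alone version might look like the sketch
below; the values of those two variables, the class name, and the
import list (inferred from the package names in the stack traces) are
assumptions, not jp's actual file.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import com.hp.hpl.jena.graph.Node;
    import com.hp.hpl.jena.graph.Triple;
    import com.hp.hpl.jena.tdb.TDBFactory;
    import com.hp.hpl.jena.tdb.store.DatasetGraphTDB;
    import com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader;
    import com.hp.hpl.jena.vocabulary.RDF;

    public class ReportLoadTest {
        // Illustrative values; jp's actual ones are not given in the thread.
        static String tdbDir = "/home/jp/scratch/ssdtest/DB-X";       // empty directory
        static String dbpediaData = "mappingbased_properties_en.nt";  // decompressed data

        public static void main(String[] args) throws IOException {
            DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(tdbDir);

            // Pre-load one triple and sync so the BulkLoader sees existing
            // data; comment these lines out to exercise its empty-dataset
            // path instead.
            datasetGraph.getDefaultGraph().add(
                    new Triple(Node.createURI("urn:hello"), RDF.type.asNode(),
                               Node.createURI("urn:house")));
            datasetGraph.sync();

            InputStream inputStream = new FileInputStream(dbpediaData);

            // Invocation exactly as in jp's message above.
            BulkLoader bulkLoader = new BulkLoader();
            bulkLoader.loadDataset(datasetGraph, inputStream, true);
        }
    }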
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jun 17, 2011 9:29 am, Andy wrote:
>>>>>>>>>>>
>>>>>>>>>>> jp,
>>>>>>>>>>>
>>>>>>>>>>> How does this fit with running:
>>>>>>>>>>>
>>>>>>>>>>>     datasetGraph.getDefaultGraph().add(new
>>>>>>>>>>>         Triple(Node.createURI("urn:hello"), RDF.type.asNode(),
>>>>>>>>>>>         Node.createURI("urn:house")));
>>>>>>>>>>>     datasetGraph.sync();
>>>>>>>>>>>
>>>>>>>>>>> Is the preload of one triple a separate JVM or the same JVM
>>>>>>>>>>> as the BulkLoader call - could you provide a single complete
>>>>>>>>>>> minimal example?
>>>>>>>>>>>
>>>>>>>>>>> In attempting to reconstruct this, I don't want to hide the
>>>>>>>>>>> problem by guessing how things are wired together.
>>>>>>>>>>>
>>>>>>>>>>> Also - exactly which dbpedia file are you loading (URL?),
>>>>>>>>>>> although I doubt the exact data is the cause here.
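For reference, the workaround that emerges from the thread - forcing
direct, non-memory-mapped file access so the load completes, at a heavy
speed cost - can be assembled into a single program from the quoted
fragments. The import locations and the data file name below are
assumptions, and per Andy's note a direct-mode run wants roughly 1.2 GB
of heap (for example, -Xmx1200M).

    import java.io.FileInputStream;
    import java.io.IOException;

    import org.openjena.atlas.lib.FileOps;

    import com.hp.hpl.jena.tdb.TDBFactory;
    import com.hp.hpl.jena.tdb.base.file.FileMode;
    import com.hp.hpl.jena.tdb.store.DatasetGraphTDB;
    import com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader;
    import com.hp.hpl.jena.tdb.sys.SystemTDB;

    public class LoadDirectMode {
        public static void main(String[] args) throws IOException {
            // Must run before any other TDB call. Removing this line puts
            // TDB back on memory-mapped files - the mode that fails for jp.
            SystemTDB.setFileMode(FileMode.direct);

            String dir = "/home/jp/scratch/ssdtest/DB-X";
            FileOps.clearDirectory(dir);   // start from an empty store each run
            DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(dir);

            new BulkLoader().loadDataset(datasetGraph,
                    new FileInputStream("mappingbased_properties_en.nt"), true);
        }
    }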
