OutOfMemoryError with tdbquery

Timothy Lebo Fri, 28 Mar 2014 11:00:42 -0700

Jena,

I have a TDB dataset with 4.2 billion triples that I created with tdbloader.
It’s taken from the 2012 Billion Triples Challenge.
For each URL that the challenge crawl retrieved (its “context”), I assert three triples,
e.g. for the URL http://www.hyphen.info/rdf/30.xml:


<http://www.hyphen.info/rdf/30.xml> <http://purl.org/twc/vocab/between-the-edges/root> <http://www.hyphen.info> .
<http://www.hyphen.info> <http://purl.org/twc/vocab/between-the-edges/pld> <http://hyphen.info> .
<http://hyphen.info> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/twc/vocab/between-the-edges/PayLevelDomain> .


When I submit the following query with tdbquery:

select ?url
where { ?url <http://purl.org/twc/vocab/between-the-edges/root> <http://dbpedia.org> . }

the OutOfMemoryError shown at the end of this message is thrown.

I’m assuming that Jena is trying to build up all of the results before 
reporting them.
Is there a way to just get “the stream” to avoid the memory issue?

Thanks,
Tim Lebo

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at com.hp.hpl.jena.tdb.base.record.RecordFactory.create(RecordFactory.java:87)
        at com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:122)
        at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:107)
        at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.get(RecordBuffer.java:53)
        at com.hp.hpl.jena.tdb.base.recordbuffer.RecordRangeIterator.hasNext(RecordRangeIterator.java:130)
        at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
        at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:119)
        at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
        at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:181)
        at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
        at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:58)
        at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
        at com.hp.hpl.jena.sparql.resultset.ResultSetMem.<init>(ResultSetMem.java:95)
        at com.hp.hpl.jena.sparql.resultset.TextOutput.write(TextOutput.java:147)
        at com.hp.hpl.jena.sparql.resultset.TextOutput.write(TextOutput.java:130)
        at com.hp.hpl.jena.sparql.resultset.TextOutput.write(TextOutput.java:118)
        at com.hp.hpl.jena.sparql.resultset.TextOutput.format(TextOutput.java:65)
        at com.hp.hpl.jena.query.ResultSetFormatter.out(ResultSetFormatter.java:135)
        at com.hp.hpl.jena.sparql.util.QueryExecUtils.outputResultSet(QueryExecUtils.java:157)
        at com.hp.hpl.jena.sparql.util.QueryExecUtils.doSelectQuery(QueryExecUtils.java:199)
        at com.hp.hpl.jena.sparql.util.QueryExecUtils.executeQuery(QueryExecUtils.java:75)
        at arq.query.queryExec(query.java:186)
        at arq.query.exec(query.java:145)
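
The frames near the bottom of the trace (TextOutput.write calling ResultSetMem.<init>) appear consistent with the guess above: the text formatter seems to copy every row into memory before printing anything, presumably so that it can size the table columns. The sketch below is not Jena code. It is a plain-Java illustration, using hypothetical names (StreamingDemo, writeTable, writeStreaming), of why a column-aligned writer has to buffer the whole result while a line-per-row writer does not.

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Jena.
class StreamingDemo {

    // Column-aligned output: the widest value must be known before the first
    // row can be printed, so every row is copied into a list first.
    // Memory use grows with the number of results.
    static int writeTable(Iterator<String> rows, PrintWriter out) {
        List<String> buffered = new ArrayList<>();
        int width = 1; // minimum width keeps the format string valid for empty input
        while (rows.hasNext()) {
            String row = rows.next();
            buffered.add(row);
            width = Math.max(width, row.length());
        }
        for (String row : buffered) {
            out.println("| " + String.format("%-" + width + "s", row) + " |");
        }
        return buffered.size(); // rows that were held in memory at once
    }

    // Line-per-row output: each row is printed as soon as it is read and then
    // dropped, so memory use does not depend on the number of results.
    static int writeStreaming(Iterator<String> rows, PrintWriter out) {
        int count = 0;
        while (rows.hasNext()) {
            out.println(rows.next());
            count++;
        }
        return count;
    }
}
```

By analogy, an output path that iterates the result set directly, rather than formatting it as an aligned text table, would be the streaming case. Which tdbquery output formats behave that way is an assumption to check against the Jena documentation.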
