Damien Obrist created JENA-1758:
-----------------------------------

             Summary: Exception when preparing InfModel after TDB loading 
operation
                 Key: JENA-1758
                 URL: https://issues.apache.org/jira/browse/JENA-1758
             Project: Apache Jena
          Issue Type: Bug
          Components: TDB2
    Affects Versions: Jena 3.12.0
            Reporter: Damien Obrist


h2. Exception

I'm loading a few million triples into a TDB2 dataset. I'm using a custom 
loader (extending {{LoaderMain}}), since the triples being loaded are generated 
by a separate application and streamed in over HTTP.

After the loading is done, I try to reset an {{InfModel}} to recompute the 
inference taking into account the new triples, but I encounter the following 
exception:

{noformat}
org.apache.jena.atlas.RuntimeIOException: Out of bounds: (limit 
32834204)32834205
        at org.apache.jena.atlas.io.IO.exception(IO.java:254)
        at 
org.apache.jena.dboe.trans.data.TransBinaryDataFile.checkRead(TransBinaryDataFile.java:190)
        at 
org.apache.jena.dboe.trans.data.TransBinaryDataFile.read(TransBinaryDataFile.java:184)
        at 
org.apache.jena.tdb2.store.nodetable.TReadAppendFileTransport.read(TReadAppendFileTransport.java:71)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at 
org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:637)
        at 
org.apache.thrift.protocol.TCompactProtocol.readFieldBegin(TCompactProtocol.java:543)
        at 
org.apache.jena.riot.thrift.wire.RDF_IRI$RDF_IRIStandardScheme.read(RDF_IRI.java:318)
        at 
org.apache.jena.riot.thrift.wire.RDF_IRI$RDF_IRIStandardScheme.read(RDF_IRI.java:311)
        at org.apache.jena.riot.thrift.wire.RDF_IRI.read(RDF_IRI.java:258)
        at 
org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:319)
        at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224)
        at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
        at org.apache.thrift.TUnion.read(TUnion.java:138)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:80)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:197)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:108)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52)
        at 
org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:66)
        at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112)
        at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108)
        at 
org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53)
        at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
        at 
org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36)
        at 
org.apache.jena.tdb2.store.IteratorTxnTracker.next(IteratorTxnTracker.java:43)
        at 
java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
        at 
java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
        at 
java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
        at 
java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169)
        at 
java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
        at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
        at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
        at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:903)
        at org.apache.jena.atlas.iterator.Iter$1.hasNext(Iter.java:192)
        at 
org.apache.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:90)
        at 
org.apache.jena.reasoner.rulesys.impl.RETEEngine.fastInit(RETEEngine.java:155)
        at 
org.apache.jena.reasoner.rulesys.FBRuleInfGraph.prepare(FBRuleInfGraph.java:471)
        at 
org.apache.jena.rdf.model.impl.InfModelImpl.prepare(InfModelImpl.java:87)
        ...
{noformat}

h2. Code

I tried to come up with a minimal example but wasn't able to reproduce the 
issue outside of my more involved application environment, where the issue 
occurs consistently. It seems to depend on a specific timing, the number of 
triples being loaded, the inference rules used etc.

The code looks roughly like this:

{code:java}
// initialize
Dataset dataset = TDB2Factory.connectDataset(path);
Model unionModel = dataset.getNamedModel("urn:x-arq:UnionGraph");
InfModel infModel = ModelFactory.createInfModel(reasoner, unionModel);
Txn.executeRead(dataset, infModel::prepare);

// load
InputStream inputStream = // incoming HTTP stream
DataLoader loader = new CustomLoader(LoaderPlans.loaderPlanParallel, 
dataset.asDatasetGraph);
loader.startBulk();
try {
    RDFDataMgr.parse(loader.stream(), inputStream, RDFLanguages.NQUADS);
    loader.finishBulk();
} catch (RuntimeException e) {
    loader.finishException(e);
    throw e;
}

// reset model to recompute inference
Txn.executeRead(dataset, () -> {
    model.reset();
    model.prepare();
});
{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

Reply via email to