[ 
https://issues.apache.org/jira/browse/JENA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928617#comment-16928617
 ] 

Damien Obrist commented on JENA-1758:
-------------------------------------

h2. Investigation

Playing around I have observed the following:
* when catching the exception and trying to prepare the model a second time, 
the call succeeded and no exception was thrown
* when sleep for a few seconds after the loading and before preparing the 
model, the call succeeded and no exception was thrown at all

So I suspected that, even though the loading is complete, something must still 
be going on in the background.

Next I tried to close and reopen the dataset after the loading, in order to 
make sure that potential pending writes are terminated and have been flushed to 
disk:

{code:java}
TDBInternal.expel(dataset.asDatasetGraph());
dataset = TDB2Factory.connectDataset(path);
infModel = ModelFactory.createInfModel(reasoner, unionModel);
Txn.executeRead(dataset, infModel::prepare);
{code}

This lead to the following exception:

{noformat}
org.apache.jena.atlas.lib.InternalErrorException: BlockMgrFileAccess : already 
closed
        at 
org.apache.jena.dboe.base.block.BlockMgrFileAccess.checkNotClosed(BlockMgrFileAccess.java:78)
        at 
org.apache.jena.dboe.base.block.BlockMgrFileAccess.allocLimit(BlockMgrFileAccess.java:150)
        at 
org.apache.jena.dboe.base.block.BlockMgrWrapper.allocLimit(BlockMgrWrapper.java:87)
        at 
org.apache.jena.dboe.base.page.PageBlockMgr.allocLimit(PageBlockMgr.java:45)
        at 
org.apache.jena.dboe.trans.bplustree.BPlusTree._commitPrepare(BPlusTree.java:544)
        at 
org.apache.jena.dboe.trans.bplustree.BPlusTree._commitPrepare(BPlusTree.java:73)
        at 
org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.commitPrepare(TransactionalComponentLifecycle.java:98)
        at 
org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$executePrepare$12(TransactionCoordinator.java:687)
        at java.util.ArrayList.forEach(ArrayList.java:1257)
        at 
org.apache.jena.dboe.transaction.txn.TransactionCoordinator.executePrepare(TransactionCoordinator.java:685)
        at 
org.apache.jena.dboe.transaction.txn.Transaction.prepare(Transaction.java:151)
        at 
org.apache.jena.dboe.transaction.txn.Transaction.commit(Transaction.java:160)
        at 
org.apache.jena.tdb2.loader.main.DataToTuples.action(DataToTuples.java:139)
        at 
org.apache.jena.tdb2.loader.main.DataToTuples.lambda$startBulk$0(DataToTuples.java:101)
        at java.lang.Thread.run(Thread.java:748)
Exception in thread "Thread-67" java.lang.NullPointerException
        at 
org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.getDataState(TransactionalComponentLifecycle.java:251)
        at 
org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.abort(TransactionalComponentLifecycle.java:126)
        at org.apache.jena.dboe.transaction.txn.SysTrans.abort(SysTrans.java:45)
        at 
org.apache.jena.dboe.transaction.txn.Transaction.lambda$null$10(Transaction.java:189)
        at java.util.ArrayList.forEach(ArrayList.java:1257)
        at 
org.apache.jena.dboe.transaction.txn.Transaction.lambda$abort$$11(Transaction.java:189)
        at 
org.apache.jena.dboe.transaction.txn.TransactionCoordinator.executeAbort(TransactionCoordinator.java:726)
        at 
org.apache.jena.dboe.transaction.txn.Transaction.abort$(Transaction.java:189)
        at 
org.apache.jena.dboe.transaction.txn.Transaction.abort(Transaction.java:181)
        at 
org.apache.jena.tdb2.loader.main.DataToTuples.action(DataToTuples.java:142)
        at 
org.apache.jena.tdb2.loader.main.DataToTuples.lambda$startBulk$0(DataToTuples.java:101)
        at java.lang.Thread.run(Thread.java:748)
{noformat}

This shows that {{DataToTuples}} is the culprit still running after the loading 
operation has returned.

> Exception when preparing InfModel after TDB loading operation
> -------------------------------------------------------------
>
>                 Key: JENA-1758
>                 URL: https://issues.apache.org/jira/browse/JENA-1758
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.12.0
>            Reporter: Damien Obrist
>            Priority: Major
>
> h2. Exception
> I'm loading a few million triples into a TDB2 dataset. I'm using a custom 
> loader (extending {{LoaderMain}}), since the triples being loaded are 
> generated by a separate application and streamed in over HTTP.
> After the loading is done, I try to reset an {{InfModel}} to recompute the 
> inference taking into account the new triples, but I encounter the following 
> exception:
> {noformat}
> org.apache.jena.atlas.RuntimeIOException: Out of bounds: (limit 
> 32834204)32834205
>       at org.apache.jena.atlas.io.IO.exception(IO.java:254)
>       at 
> org.apache.jena.dboe.trans.data.TransBinaryDataFile.checkRead(TransBinaryDataFile.java:190)
>       at 
> org.apache.jena.dboe.trans.data.TransBinaryDataFile.read(TransBinaryDataFile.java:184)
>       at 
> org.apache.jena.tdb2.store.nodetable.TReadAppendFileTransport.read(TReadAppendFileTransport.java:71)
>       at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:637)
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readFieldBegin(TCompactProtocol.java:543)
>       at 
> org.apache.jena.riot.thrift.wire.RDF_IRI$RDF_IRIStandardScheme.read(RDF_IRI.java:318)
>       at 
> org.apache.jena.riot.thrift.wire.RDF_IRI$RDF_IRIStandardScheme.read(RDF_IRI.java:311)
>       at org.apache.jena.riot.thrift.wire.RDF_IRI.read(RDF_IRI.java:258)
>       at 
> org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:319)
>       at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224)
>       at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
>       at org.apache.thrift.TUnion.read(TUnion.java:138)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:80)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:197)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:108)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52)
>       at 
> org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:66)
>       at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112)
>       at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108)
>       at 
> org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53)
>       at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>       at 
> org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36)
>       at 
> org.apache.jena.tdb2.store.IteratorTxnTracker.next(IteratorTxnTracker.java:43)
>       at 
> java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
>       at 
> java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
>       at 
> java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
>       at 
> java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169)
>       at 
> java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
>       at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
>       at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265)
>       at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:903)
>       at org.apache.jena.atlas.iterator.Iter$1.hasNext(Iter.java:192)
>       at 
> org.apache.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:90)
>       at 
> org.apache.jena.reasoner.rulesys.impl.RETEEngine.fastInit(RETEEngine.java:155)
>       at 
> org.apache.jena.reasoner.rulesys.FBRuleInfGraph.prepare(FBRuleInfGraph.java:471)
>       at 
> org.apache.jena.rdf.model.impl.InfModelImpl.prepare(InfModelImpl.java:87)
>       ...
> {noformat}
> h2. Code
> I tried to come up with a minimal example but wasn't able to reproduce the 
> issue outside of my more involved application environment, where the issue 
> occurs consistently. It seems to depend on a specific timing, the number of 
> triples being loaded, the inference rules used etc.
> The code looks roughly like this:
> {code:java}
> // initialize
> Dataset dataset = TDB2Factory.connectDataset(path);
> Model unionModel = dataset.getNamedModel("urn:x-arq:UnionGraph");
> InfModel infModel = ModelFactory.createInfModel(reasoner, unionModel);
> Txn.executeRead(dataset, infModel::prepare);
> // load
> InputStream inputStream = // incoming HTTP stream
> DataLoader loader = new CustomLoader(LoaderPlans.loaderPlanParallel, 
> dataset.asDatasetGraph);
> loader.startBulk();
> try {
>     RDFDataMgr.parse(loader.stream(), inputStream, RDFLanguages.NQUADS);
>     loader.finishBulk();
> } catch (RuntimeException e) {
>     loader.finishException(e);
>     throw e;
> }
> // reset model to recompute inference
> Txn.executeRead(dataset, () -> {
>     model.reset();
>     model.prepare();
> });
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

Reply via email to