[ https://issues.apache.org/jira/browse/JENA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928617#comment-16928617 ]
Damien Obrist edited comment on JENA-1758 at 9/12/19 3:32 PM: -------------------------------------------------------------- h2. Investigation Playing around I have observed the following: * when catching the exception and trying to prepare the model a second time, the call succeeded and no exception was thrown * when adding sleep for a few seconds after the loading and before preparing the model, the call succeeded and no exception was thrown at all So I suspected that, even though the loading is complete, something must still be going on in the background. Next I tried to close and reopen the dataset after the loading, in order to make sure that potential pending writes are terminated and have been flushed to disk: {code:java} TDBInternal.expel(dataset.asDatasetGraph()); dataset = TDB2Factory.connectDataset(path); infModel = ModelFactory.createInfModel(reasoner, unionModel); Txn.executeRead(dataset, infModel::prepare); {code} This lead to the following exception: {noformat} org.apache.jena.atlas.lib.InternalErrorException: BlockMgrFileAccess : already closed at org.apache.jena.dboe.base.block.BlockMgrFileAccess.checkNotClosed(BlockMgrFileAccess.java:78) at org.apache.jena.dboe.base.block.BlockMgrFileAccess.allocLimit(BlockMgrFileAccess.java:150) at org.apache.jena.dboe.base.block.BlockMgrWrapper.allocLimit(BlockMgrWrapper.java:87) at org.apache.jena.dboe.base.page.PageBlockMgr.allocLimit(PageBlockMgr.java:45) at org.apache.jena.dboe.trans.bplustree.BPlusTree._commitPrepare(BPlusTree.java:544) at org.apache.jena.dboe.trans.bplustree.BPlusTree._commitPrepare(BPlusTree.java:73) at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.commitPrepare(TransactionalComponentLifecycle.java:98) at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$executePrepare$12(TransactionCoordinator.java:687) at java.util.ArrayList.forEach(ArrayList.java:1257) at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.executePrepare(TransactionCoordinator.java:685) at org.apache.jena.dboe.transaction.txn.Transaction.prepare(Transaction.java:151) at org.apache.jena.dboe.transaction.txn.Transaction.commit(Transaction.java:160) at org.apache.jena.tdb2.loader.main.DataToTuples.action(DataToTuples.java:139) at org.apache.jena.tdb2.loader.main.DataToTuples.lambda$startBulk$0(DataToTuples.java:101) at java.lang.Thread.run(Thread.java:748) Exception in thread "Thread-67" java.lang.NullPointerException at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.getDataState(TransactionalComponentLifecycle.java:251) at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.abort(TransactionalComponentLifecycle.java:126) at org.apache.jena.dboe.transaction.txn.SysTrans.abort(SysTrans.java:45) at org.apache.jena.dboe.transaction.txn.Transaction.lambda$null$10(Transaction.java:189) at java.util.ArrayList.forEach(ArrayList.java:1257) at org.apache.jena.dboe.transaction.txn.Transaction.lambda$abort$$11(Transaction.java:189) at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.executeAbort(TransactionCoordinator.java:726) at org.apache.jena.dboe.transaction.txn.Transaction.abort$(Transaction.java:189) at org.apache.jena.dboe.transaction.txn.Transaction.abort(Transaction.java:181) at org.apache.jena.tdb2.loader.main.DataToTuples.action(DataToTuples.java:142) at org.apache.jena.tdb2.loader.main.DataToTuples.lambda$startBulk$0(DataToTuples.java:101) at java.lang.Thread.run(Thread.java:748) {noformat} This shows that {{DataToTuples}} is the culprit still running after the loading operation has returned. was (Author: dobrist): h2. Investigation Playing around I have observed the following: * when catching the exception and trying to prepare the model a second time, the call succeeded and no exception was thrown * when sleep for a few seconds after the loading and before preparing the model, the call succeeded and no exception was thrown at all So I suspected that, even though the loading is complete, something must still be going on in the background. Next I tried to close and reopen the dataset after the loading, in order to make sure that potential pending writes are terminated and have been flushed to disk: {code:java} TDBInternal.expel(dataset.asDatasetGraph()); dataset = TDB2Factory.connectDataset(path); infModel = ModelFactory.createInfModel(reasoner, unionModel); Txn.executeRead(dataset, infModel::prepare); {code} This lead to the following exception: {noformat} org.apache.jena.atlas.lib.InternalErrorException: BlockMgrFileAccess : already closed at org.apache.jena.dboe.base.block.BlockMgrFileAccess.checkNotClosed(BlockMgrFileAccess.java:78) at org.apache.jena.dboe.base.block.BlockMgrFileAccess.allocLimit(BlockMgrFileAccess.java:150) at org.apache.jena.dboe.base.block.BlockMgrWrapper.allocLimit(BlockMgrWrapper.java:87) at org.apache.jena.dboe.base.page.PageBlockMgr.allocLimit(PageBlockMgr.java:45) at org.apache.jena.dboe.trans.bplustree.BPlusTree._commitPrepare(BPlusTree.java:544) at org.apache.jena.dboe.trans.bplustree.BPlusTree._commitPrepare(BPlusTree.java:73) at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.commitPrepare(TransactionalComponentLifecycle.java:98) at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$executePrepare$12(TransactionCoordinator.java:687) at java.util.ArrayList.forEach(ArrayList.java:1257) at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.executePrepare(TransactionCoordinator.java:685) at org.apache.jena.dboe.transaction.txn.Transaction.prepare(Transaction.java:151) at org.apache.jena.dboe.transaction.txn.Transaction.commit(Transaction.java:160) at org.apache.jena.tdb2.loader.main.DataToTuples.action(DataToTuples.java:139) at org.apache.jena.tdb2.loader.main.DataToTuples.lambda$startBulk$0(DataToTuples.java:101) at java.lang.Thread.run(Thread.java:748) Exception in thread "Thread-67" java.lang.NullPointerException at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.getDataState(TransactionalComponentLifecycle.java:251) at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.abort(TransactionalComponentLifecycle.java:126) at org.apache.jena.dboe.transaction.txn.SysTrans.abort(SysTrans.java:45) at org.apache.jena.dboe.transaction.txn.Transaction.lambda$null$10(Transaction.java:189) at java.util.ArrayList.forEach(ArrayList.java:1257) at org.apache.jena.dboe.transaction.txn.Transaction.lambda$abort$$11(Transaction.java:189) at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.executeAbort(TransactionCoordinator.java:726) at org.apache.jena.dboe.transaction.txn.Transaction.abort$(Transaction.java:189) at org.apache.jena.dboe.transaction.txn.Transaction.abort(Transaction.java:181) at org.apache.jena.tdb2.loader.main.DataToTuples.action(DataToTuples.java:142) at org.apache.jena.tdb2.loader.main.DataToTuples.lambda$startBulk$0(DataToTuples.java:101) at java.lang.Thread.run(Thread.java:748) {noformat} This shows that {{DataToTuples}} is the culprit still running after the loading operation has returned. > Exception when preparing InfModel after TDB loading operation > ------------------------------------------------------------- > > Key: JENA-1758 > URL: https://issues.apache.org/jira/browse/JENA-1758 > Project: Apache Jena > Issue Type: Bug > Components: TDB2 > Affects Versions: Jena 3.12.0 > Reporter: Damien Obrist > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > h2. Exception > I'm loading a few million triples into a TDB2 dataset. I'm using a custom > loader (extending {{LoaderMain}}), since the triples being loaded are > generated by a separate application and streamed in over HTTP. > After the loading is done, I try to reset an {{InfModel}} to recompute the > inference taking into account the new triples, but I encounter the following > exception: > {noformat} > org.apache.jena.atlas.RuntimeIOException: Out of bounds: (limit > 32834204)32834205 > at org.apache.jena.atlas.io.IO.exception(IO.java:254) > at > org.apache.jena.dboe.trans.data.TransBinaryDataFile.checkRead(TransBinaryDataFile.java:190) > at > org.apache.jena.dboe.trans.data.TransBinaryDataFile.read(TransBinaryDataFile.java:184) > at > org.apache.jena.tdb2.store.nodetable.TReadAppendFileTransport.read(TReadAppendFileTransport.java:71) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) > at > org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:637) > at > org.apache.thrift.protocol.TCompactProtocol.readFieldBegin(TCompactProtocol.java:543) > at > org.apache.jena.riot.thrift.wire.RDF_IRI$RDF_IRIStandardScheme.read(RDF_IRI.java:318) > at > org.apache.jena.riot.thrift.wire.RDF_IRI$RDF_IRIStandardScheme.read(RDF_IRI.java:311) > at org.apache.jena.riot.thrift.wire.RDF_IRI.read(RDF_IRI.java:258) > at > org.apache.jena.riot.thrift.wire.RDF_Term.standardSchemeReadValue(RDF_Term.java:319) > at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224) > at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213) > at org.apache.thrift.TUnion.read(TUnion.java:138) > at > org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:80) > at > org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103) > at > org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52) > at > org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:197) > at > org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:108) > at > org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52) > at > org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:66) > at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:112) > at org.apache.jena.tdb2.lib.TupleLib.quad(TupleLib.java:108) > at > org.apache.jena.tdb2.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:53) > at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270) > at > org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:36) > at > org.apache.jena.tdb2.store.IteratorTxnTracker.next(IteratorTxnTracker.java:43) > at > java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812) > at > java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294) > at > java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206) > at > java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169) > at > java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300) > at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681) > at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:265) > at org.apache.jena.atlas.iterator.Iter.hasNext(Iter.java:903) > at org.apache.jena.atlas.iterator.Iter$1.hasNext(Iter.java:192) > at > org.apache.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:90) > at > org.apache.jena.reasoner.rulesys.impl.RETEEngine.fastInit(RETEEngine.java:155) > at > org.apache.jena.reasoner.rulesys.FBRuleInfGraph.prepare(FBRuleInfGraph.java:471) > at > org.apache.jena.rdf.model.impl.InfModelImpl.prepare(InfModelImpl.java:87) > ... > {noformat} > h2. Code > I tried to come up with a minimal example but wasn't able to reproduce the > issue outside of my more involved application environment, where the issue > occurs consistently. It seems to depend on a specific timing, the number of > triples being loaded, the inference rules used etc. > The code looks roughly like this: > {code:java} > // initialize > Dataset dataset = TDB2Factory.connectDataset(path); > Model unionModel = dataset.getNamedModel("urn:x-arq:UnionGraph"); > InfModel infModel = ModelFactory.createInfModel(reasoner, unionModel); > Txn.executeRead(dataset, infModel::prepare); > // load > InputStream inputStream = // incoming HTTP stream > DataLoader loader = new CustomLoader(LoaderPlans.loaderPlanParallel, > dataset.asDatasetGraph); > loader.startBulk(); > try { > RDFDataMgr.parse(loader.stream(), inputStream, RDFLanguages.NQUADS); > loader.finishBulk(); > } catch (RuntimeException e) { > loader.finishException(e); > throw e; > } > // reset model to recompute inference > Txn.executeRead(dataset, () -> { > model.reset(); > model.prepare(); > }); > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)