[ https://issues.apache.org/jira/browse/JENA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066106#comment-17066106 ]
Bernhard Stiftner commented on JENA-1868:
-----------------------------------------
Thanks for pointing out that using {{dataset.commit()}} makes a difference -
I'm getting the same results here, i.e. those failures and exceptions disappear
when using {{dataset.commit()}} to finish read transactions.
After digging some more into that issue, I found that TDB2's BPlusTree
(org.apache.jena.dboe.trans.bplustree.BPlusTree) potentially alters its state
on _every_ transaction abort, regardless of whether the transaction is a read
or a write:
{code:java}
protected void _abort(TxnId txnId, BptTxnState state) {
    rootIdx = state.initialroot;
    // Truncate - logically in block manager space.
    nodeManager.resetAlloc(state.boundaryBlocksNode);
    recordsMgr.resetAlloc(state.boundaryBlocksRecord);
    stateManager.setState(state.initialroot, state.boundaryBlocksNode,
                          state.boundaryBlocksRecord);
    stateManager.sync();
}
{code}
... while a commit of a read transaction seems to be a no-op:
{code:java}
protected void _commit(TxnId txnId, BptTxnState state) {
    if ( isWriteTxn() ) {
        rootIdx = state.getRoot();
        stateManager.sync();
    }
}
{code}
I'm not sure why {{commit()}} and {{abort()}} behave differently for read
transactions. I'm probably being naive here, but I suggest making {{abort()}}
behave just like {{commit()}} for read txns, i.e. both being no-ops.
A patch doing just that:
[https://github.com/apache/jena/commit/a72371a09725e4e4e18e6e00771dfb8413b12502]
I'm not sure yet whether that's actually a good idea (although the stress test
works fine). I'll do some extensive testing in the next few days to check for
side effects.
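
To make the suggestion concrete, here is a minimal, self-contained sketch. It is a toy model and not Jena code: {{ToyIndex}}, {{TxnState}}, the {{readAbortIsNoOp}} flag and the interleaving in {{run}} are all made up for illustration. It only shows how an abort that unconditionally restores a snapshot can clobber shared state in such a model, and how a read-mode guard avoids that. It does not claim that this exact interleaving is what happens inside TDB2.

```java
/** Toy stand-in for a transactional index. Hypothetical; NOT Jena's BPlusTree. */
final class ToyIndex {
    enum Mode { READ, WRITE }

    /** Per-transaction snapshot taken at begin (loosely modelled on BptTxnState). */
    static final class TxnState {
        final Mode mode;
        final int initialRoot; // shared root as seen when the transaction began
        int root;              // root this transaction would publish on commit

        TxnState(Mode mode, int initialRoot) {
            this.mode = mode;
            this.initialRoot = initialRoot;
            this.root = initialRoot;
        }
    }

    private final boolean readAbortIsNoOp; // true = the suggested behaviour
    private int rootIdx = 0;               // shared state of the index

    ToyIndex(boolean readAbortIsNoOp) {
        this.readAbortIsNoOp = readAbortIsNoOp;
    }

    TxnState begin(Mode mode) {
        return new TxnState(mode, rootIdx);
    }

    /** Commit only touches shared state for write transactions. */
    void commit(TxnState state) {
        if (state.mode == Mode.WRITE) {
            rootIdx = state.root;
        }
    }

    /** Abort restores the snapshot, unless the read-mode guard is enabled. */
    void abort(TxnState state) {
        if (readAbortIsNoOp && state.mode == Mode.READ) {
            return; // read transactions changed nothing, so there is nothing to undo
        }
        rootIdx = state.initialRoot;
    }

    int rootIdx() {
        return rootIdx;
    }
}

final class ReadAbortSketch {
    /** Reader begins, a writer commits root 1, then the reader aborts. */
    static int run(boolean guarded) {
        ToyIndex index = new ToyIndex(guarded);
        ToyIndex.TxnState reader = index.begin(ToyIndex.Mode.READ);
        ToyIndex.TxnState writer = index.begin(ToyIndex.Mode.WRITE);
        writer.root = 1;
        index.commit(writer);
        index.abort(reader);
        return index.rootIdx();
    }

    public static void main(String[] args) {
        System.out.println("unguarded read abort -> root " + run(false));
        System.out.println("guarded read abort   -> root " + run(true));
    }
}
```

In this toy model the unguarded variant ends with root 0 (the reader's abort overwrites the writer's committed root), while the guarded variant ends with root 1. Write aborts still restore the snapshot in both variants.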
> TDB2 Concurrency: NPE in TransactionalComponentLifecycle
> --------------------------------------------------------
>
> Key: JENA-1868
> URL: https://issues.apache.org/jira/browse/JENA-1868
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB2
> Affects Versions: Jena 3.14.0
> Reporter: Bernhard Stiftner
> Priority: Major
> Attachments: TDB2StressTest.java
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We're evaluating moving from TDB1 to TDB2 and are hitting various
> concurrency/thread-safety issues that apparently didn't exist with TDB1.
> Our setting is as follows: one JVM, ~20 independent TDB1/TDB2 instances,
> highly concurrent workload involving every TDB1/TDB2 instance.
> A common issue we're hitting with TDB2 is this NullPointerException in
> TransactionalComponentLifecycle:
> {noformat}
> java.lang.NullPointerException
>     at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.getReadWriteMode(TransactionalComponentLifecycle.java:324)
>     at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.complete(TransactionalComponentLifecycle.java:143)
>     at org.apache.jena.dboe.transaction.txn.SysTrans.complete(SysTrans.java:47)
>     at org.apache.jena.dboe.transaction.txn.Transaction.lambda$endInternal$16(Transaction.java:220)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
>     at org.apache.jena.dboe.transaction.txn.Transaction.endInternal(Transaction.java:220)
>     at org.apache.jena.dboe.transaction.txn.Transaction.end(Transaction.java:209)
>     at org.apache.jena.dboe.transaction.txn.TransactionalBase._end(TransactionalBase.java:262)
>     at org.apache.jena.dboe.transaction.txn.TransactionalBase.abort(TransactionalBase.java:159)
>     at org.apache.jena.dboe.storage.system.DatasetGraphStorage.abort(DatasetGraphStorage.java:63)
>     at org.apache.jena.sparql.core.DatasetGraphWrapper.abort(DatasetGraphWrapper.java:253)
>     at org.apache.jena.sparql.core.DatasetImpl.abort(DatasetImpl.java:158)
>     at TDB2StressTest.randomRead(TDB2StressTest.java:87)
>     at TDB2StressTest.runStressTestWorker(TDB2StressTest.java:64)
>     at TDB2StressTest.lambda$runStressTest$0(TDB2StressTest.java:43)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> The attached "test case" manages to reproduce this issue most of the time on
> my machine (YMMV of course, since the test is based on quite some concurrency
> voodoo).
> The same test is working flawlessly when run against a TDB1 backend.