[jira] [Commented] (JENA-1868) TDB2 Concurrency: NPE in TransactionalComponentLifecycle

Bernhard Stiftner (Jira) Tue, 24 Mar 2020 11:57:28 -0700


    [ https://issues.apache.org/jira/browse/JENA-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066106#comment-17066106 ]


Bernhard Stiftner commented on JENA-1868:
-----------------------------------------

Thanks for pointing out that using {{dataset.commit()}} makes a difference. 
I'm getting the same results here: the failures and exceptions disappear 
when read transactions are finished with {{dataset.commit()}}.

After digging some more into that issue I found that TDB2's BPlusTree 
(org.apache.jena.dboe.trans.bplustree.BPlusTree) potentially alters its state 
on _every_ transaction abort, no matter if read or write:
{code:java}
protected void _abort(TxnId txnId, BptTxnState state) {
    rootIdx = state.initialroot;
    // Truncate - logically in block manager space.
    nodeManager.resetAlloc(state.boundaryBlocksNode);
    recordsMgr.resetAlloc(state.boundaryBlocksRecord);
    stateManager.setState(state.initialroot, state.boundaryBlocksNode,
                          state.boundaryBlocksRecord);
    stateManager.sync();
}
{code}
... while a commit of a read transaction seems to be a no-op:
{code:java}
protected void _commit(TxnId txnId, BptTxnState state) {
    if ( isWriteTxn() ) {
        rootIdx = state.getRoot();
        stateManager.sync();
    }
}
{code}
I'm not sure why {{commit()}} and {{abort()}} behave differently for read 
transactions. I may be naive here, but I suggest making {{abort()}} behave 
just like {{commit()}} for read txns, i.e. both become no-ops.
A patch doing just that: 
[https://github.com/apache/jena/commit/a72371a09725e4e4e18e6e00771dfb8413b12502]
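
To illustrate the idea, here is a minimal, self-contained sketch. It is a toy model, not Jena code: {{ReadAbortSketch}}, {{ToyTree}}, {{TxnState}} and the {{guardReadAbort}} flag are hypothetical names made up for this example, and the interleaving it plays out (a reader aborting after a writer has committed) is only an assumption about how an unguarded read abort might interfere, not a confirmed diagnosis of this issue.

```java
// Toy model only: none of these types are Jena classes.
class ReadAbortSketch {

    /** Snapshot taken at begin(); loosely mirrors the role of BptTxnState.initialroot. */
    static final class TxnState {
        final boolean write;
        final int initialRoot;

        TxnState(boolean write, int initialRoot) {
            this.write = write;
            this.initialRoot = initialRoot;
        }
    }

    /** Shared component whose only state is the current root index. */
    static final class ToyTree {
        private int rootIdx = 1;
        private final boolean guardReadAbort;

        ToyTree(boolean guardReadAbort) {
            this.guardReadAbort = guardReadAbort;
        }

        TxnState begin(boolean write) {
            return new TxnState(write, rootIdx);
        }

        /** Like the _commit() quoted above: only a write transaction changes state. */
        void commit(TxnState state, int newRoot) {
            if (state.write) {
                rootIdx = newRoot;
            }
        }

        /** With the guard, a read abort is a no-op; without it, the root is always reset. */
        void abort(TxnState state) {
            if (guardReadAbort && !state.write) {
                return;
            }
            rootIdx = state.initialRoot;
        }

        int root() {
            return rootIdx;
        }
    }

    /** Reader begins, writer commits root 2, reader aborts; returns the final root. */
    static int rootAfterInterleaving(boolean guardReadAbort) {
        ToyTree tree = new ToyTree(guardReadAbort);
        TxnState reader = tree.begin(false); // snapshot of root 1
        TxnState writer = tree.begin(true);
        tree.commit(writer, 2);              // writer publishes root 2
        tree.abort(reader);                  // reader finishes with abort
        return tree.root();
    }

    public static void main(String[] args) {
        System.out.println("unguarded read abort: root = " + rootAfterInterleaving(false));
        System.out.println("guarded read abort:   root = " + rootAfterInterleaving(true));
    }
}
```

In this toy model the unguarded abort puts the root back to the reader's stale snapshot (1), while the guarded abort leaves the writer's committed root (2) in place. Whether the real {{_abort()}} can be reached in such an interleaving is exactly what still needs testing.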
 
I'm not sure yet whether that's actually a good idea (although the stress 
test works fine with it). I'll do some extensive testing over the next few 
days to check for side effects.

> TDB2 Concurrency: NPE in TransactionalComponentLifecycle
> --------------------------------------------------------
>
>                 Key: JENA-1868
>                 URL: https://issues.apache.org/jira/browse/JENA-1868
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.14.0
>            Reporter: Bernhard Stiftner
>            Priority: Major
>         Attachments: TDB2StressTest.java
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We're evaluating moving from TDB1 to TDB2 and are hitting various 
> concurrency/thread-safety issues that apparently didn't exist with TDB1.
> Our setting is as follows: one JVM, ~20 independent TDB1/TDB2 instances, 
> highly concurrent workload involving every TDB1/TDB2 instance.
> A common issue we're hitting with TDB2 is this NullPointerException in 
> TransactionalComponentLifecycle:
> {noformat}
> java.lang.NullPointerException
>     at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.getReadWriteMode(TransactionalComponentLifecycle.java:324)
>     at org.apache.jena.dboe.transaction.txn.TransactionalComponentLifecycle.complete(TransactionalComponentLifecycle.java:143)
>     at org.apache.jena.dboe.transaction.txn.SysTrans.complete(SysTrans.java:47)
>     at org.apache.jena.dboe.transaction.txn.Transaction.lambda$endInternal$16(Transaction.java:220)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
>     at org.apache.jena.dboe.transaction.txn.Transaction.endInternal(Transaction.java:220)
>     at org.apache.jena.dboe.transaction.txn.Transaction.end(Transaction.java:209)
>     at org.apache.jena.dboe.transaction.txn.TransactionalBase._end(TransactionalBase.java:262)
>     at org.apache.jena.dboe.transaction.txn.TransactionalBase.abort(TransactionalBase.java:159)
>     at org.apache.jena.dboe.storage.system.DatasetGraphStorage.abort(DatasetGraphStorage.java:63)
>     at org.apache.jena.sparql.core.DatasetGraphWrapper.abort(DatasetGraphWrapper.java:253)
>     at org.apache.jena.sparql.core.DatasetImpl.abort(DatasetImpl.java:158)
>     at TDB2StressTest.randomRead(TDB2StressTest.java:87)
>     at TDB2StressTest.runStressTestWorker(TDB2StressTest.java:64)
>     at TDB2StressTest.lambda$runStressTest$0(TDB2StressTest.java:43)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> The attached "test case" manages to reproduce this issue most of the time on 
> my machine (YMMV of course, since the test is based on quite some concurrency 
> voodoo).
> The same test is working flawlessly when run against a TDB1 backend.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
