Hi Aaron,

So it is the trailing edge of the transaction.

That synchronization should be happening anyway, because flushing should be inside the transaction commit, before the overall transaction has finished committing. The TransactionCoordinator controls this.

The bug is that the flush is being triggered from the "CompleteFinish" step, not the "CommitFinish".

In TransactionCoordinator
  do CommitFinish
  releaseWriterLock
  do Complete steps

so other writers are released before the flush and cleanup.
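
To make the ordering concrete, here is a minimal sketch that replays the three steps above and logs when a flush would run. It is not the Jena code: the class CommitOrderSketch, its Listener interface and the log strings are hypothetical stand-ins, and only the step order is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitOrderSketch {
    /** Hypothetical stand-in for a transaction listener; NOT the Jena interface. */
    interface Listener {
        default void notifyCommitFinish(List<String> log) {}
        default void notifyCompleteFinish(List<String> log) {}
    }

    /** Replays the step order described above and records what happens when. */
    static List<String> commit(Listener listener) {
        List<String> log = new ArrayList<>();
        listener.notifyCommitFinish(log);   // CommitFinish: writer lock still held
        log.add("releaseWriterLock");       // from here on another writer may begin
        listener.notifyCompleteFinish(log); // Complete steps: lock already released
        return log;
    }

    public static void main(String[] args) {
        // Flush hooked to the Complete step (the behaviour reported as the bug).
        Listener buggy = new Listener() {
            @Override public void notifyCompleteFinish(List<String> log) { log.add("flush"); }
        };
        // Flush hooked to the Commit step (the proposed fix).
        Listener fixed = new Listener() {
            @Override public void notifyCommitFinish(List<String> log) { log.add("flush"); }
        };
        System.out.println("buggy: " + commit(buggy)); // buggy: [releaseWriterLock, flush]
        System.out.println("fixed: " + commit(fixed)); // fixed: [flush, releaseWriterLock]
    }
}
```

In the "buggy" ordering there is a window between releaseWriterLock and flush in which a second writer can begin, which is consistent with the "already buffering" error reported further down the thread.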

In NodeTableCache:

notifyCompleteFinish should be notifyCommitFinish

Around line 330:
    @Override
    //public void notifyCompleteFinish(Transaction transaction) {
    public void notifyCommitFinish(Transaction transaction) {
        if(transaction.isWriteTxn()) {
            updateCommit();
        }
    }

I have got a test case now.

I'll cancel the RC1 VOTE and redo the release.

    Andy


On 16/01/2020 03:39, Aaron Coburn wrote:
Hi Andy,

It appears that there is a subtle race condition in how the
ThreadBufferingCache coordinates write access to the underlying storage. In
debugging this, I made the following changes:
https://github.com/apache/jena/compare/feature/debug_concurrent_writes
Effectively,
I added a Semaphore object to the class. With that change, my test suite
passes (this is the Trellis codebase, in case you are wondering). I also
checked for any significant changes to the performance from 3.13.1 to
3.14.0 (with this change in place), and after running the tests repeatedly,
I didn't see any appreciable performance change one way or the other. If
this seems like a reasonable adjustment to that class, I can write up a
JIRA issue and submit this as a PR.
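
For readers who have not opened the branch, the general idea of guarding a buffering step with a semaphore can be sketched as below. This is an illustration only, not the change on the linked branch: the class BufferingGuardSketch and the bodies of its methods are assumptions, and only the "already buffering" failure mode is taken from the stack trace quoted later in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BufferingGuardSketch {
    // One permit: at most one writer may be between updateStart and updateCommit.
    private final Semaphore permit = new Semaphore(1);
    private boolean buffering = false;

    // Hypothetical stand-in for the check that raised the error in the stack trace.
    private void enableBuffering() {
        if (buffering) throw new IllegalStateException("already buffering");
        buffering = true;
    }

    private void flushBuffer() { buffering = false; }

    void updateStart() {
        permit.acquireUninterruptibly(); // wait until the previous writer has flushed
        enableBuffering();
    }

    void updateCommit() {
        flushBuffer();
        permit.release();                // only now may the next writer start buffering
    }

    /** Runs several writer threads and returns how many writes completed. */
    static int run(int threads, int iterations) {
        BufferingGuardSketch cache = new BufferingGuardSketch();
        AtomicInteger done = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    cache.updateStart();
                    cache.updateCommit();
                    done.incrementAndGet();
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed writes: " + run(2, 1000)); // completed writes: 2000
    }
}
```

A guard like this hides the symptom by serializing the buffering window; it does not by itself explain why a second writer was allowed to start before the first had flushed.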

Best, Aaron

On Wed, 15 Jan 2020 at 14:25, Aaron Coburn <aaron.cob...@gmail.com> wrote:

Hi Andy,
I'll dig a little deeper into what's going on and will put together a
reproducible test case for this. I first wanted to find out if it might be
something obvious.

Thanks,
Aaron

On Wed, 15 Jan 2020 at 13:44, Andy Seaborne <a...@apache.org> wrote:

Hi Aaron,

Could you say some more about how the concurrent writes are happening
and what they are doing?  Just from the stacktrace I haven't managed to
write a test case.

My guess is that another transaction is finishing a commit about the
same time. But if the other transaction is mid-processing then it's
something else.

If you are able to put in a JVM-suspend breakpoint at
ThreadBufferingCache:88 and capture a thread dump, that would be very
helpful - I realise it's not always easy to set up.

      Andy



On 15/01/2020 16:55, Aaron Coburn wrote:
This might be good to split off into a separate issue (and it doesn't
necessarily need to block the release), but I'm finding that, when using
TDB2 with this release candidate in a concurrent write context, I start
encountering a lot of errors. And those errors are definitely not present
with 3.13.1. Specifically, the issue seems to be related to contention over
the TDB2 ThreadBufferingCache. That buffering cache is present in the
3.13.1 release, and I'm not entirely sure what changed with 3.14.0 that
would trigger these errors, but this is the relevant part of the stack
trace:

Caused by: org.apache.jena.tdb2.TDBException: ThreadBufferingCache: already buffering
at org.apache.jena.tdb2.store.nodetable.ThreadBufferingCache.enableBuffering(ThreadBufferingCache.java:88)
at org.apache.jena.tdb2.store.nodetable.NodeTableCache.updateStart(NodeTableCache.java:352)
at org.apache.jena.tdb2.store.nodetable.NodeTableCache.notifyTxnStart(NodeTableCache.java:319)
at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$notifyBegin$14(TransactionCoordinator.java:915)
at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$listeners$0(TransactionCoordinator.java:207)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.listeners(TransactionCoordinator.java:207)
at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.notifyBegin(TransactionCoordinator.java:915)
at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.begin(TransactionCoordinator.java:553)
at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.begin(TransactionCoordinator.java:509)
at org.apache.jena.dboe.transaction.txn.TransactionalBase.begin(TransactionalBase.java:110)
at org.apache.jena.dboe.storage.system.DatasetGraphStorage.begin(DatasetGraphStorage.java:59)
at org.apache.jena.sparql.core.DatasetGraphWrapper.begin(DatasetGraphWrapper.java:233)
at org.apache.jena.sparql.core.DatasetImpl.begin(DatasetImpl.java:116)
at org.apache.jena.system.Txn.exec(Txn.java:76)
at org.apache.jena.system.Txn.executeWrite(Txn.java:125)
at org.apache.jena.rdfconnection.RDFConnectionLocal.update(RDFConnectionLocal.java:80)

Effectively, I get that error the first time a client attempts to
concurrently write to the TDB2 store. Subsequent attempts just hang.

Cheers,
Aaron





On Mon, 13 Jan 2020 at 11:23, Andy Seaborne <a...@apache.org> wrote:

Hi,

Here is a vote on the release of Apache Jena 3.14.0
This is the first proposed release candidate.

==== Changes:

https://s.apache.org/jena-3.14.0-jira

==== Release Vote

Everyone, not just committers, is invited to test and vote.
Please download and test the proposed release.

Staging repository:

https://repository.apache.org/content/repositories/orgapachejena-1035

Proposed dist/ area:
     https://dist.apache.org/repos/dist/dev/jena/

Keys:
     https://svn.apache.org/repos/asf/jena/dist/KEYS

Git commit (browser URL):
     https://github.com/apache/jena/commit/19d42a5

Git Commit Hash:
     19d42a57a9debc675047b2d1ce9769979c43e7d8

Git Commit Tag:
     jena-3.14.0

Please vote to approve this release:

           [ ] +1 Approve the release
           [ ]  0 Don't care
           [ ] -1 Don't release, because ...

This vote will be open until at least

       Thursday, 16th January 2020 at 187:00 UTC

If you expect to check the release but the time limit does not work
for you, please email within the schedule above with an expected time
and we can extend the vote period.

Thanks,

         Andy

Checking needed:

+ are the GPG signatures fine?
+ are the checksums correct?
+ is there a source archive?

+ can the source archive really be built?
             (NB This requires a "mvn install" first time)
+ is there a correct LICENSE and NOTICE file in each artifact
             (both source and binary artifacts)?
+ does the NOTICE file contain all necessary attributions?
+ have any licenses of dependencies changed due to upgrades?
              if so have LICENSE and NOTICE been upgraded appropriately?
+ does the tag/commit in the SCM contain reproducible sources?




