Hi Andy,

It appears that there is a subtle race condition in how the
ThreadBufferingCache coordinates write access to the underlying storage. In
debugging this, I made the following changes:

https://github.com/apache/jena/compare/feature/debug_concurrent_writes

Effectively, I added a Semaphore object to the class. With that change, my test suite
passes (this is the Trellis codebase, in case you are wondering). I also
checked for any significant changes to the performance from 3.13.1 to
3.14.0 (with this change in place), and after running the tests repeatedly,
I didn't see any appreciable performance change one way or the other. If
this seems like a reasonable adjustment to that class, I can write up a
JIRA issue and submit this as a PR.
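
To illustrate the idea, here is a minimal sketch of guarding a buffering
cache with a single-permit Semaphore. It is not the code from the branch
above and not Jena's ThreadBufferingCache: the class GuardedBufferingCache
and all of its methods are hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a write-buffering cache. This is NOT Jena's
// ThreadBufferingCache; every name here is an assumption for illustration.
class GuardedBufferingCache {
    // Committed entries, visible to readers.
    private final Map<String, String> base = new ConcurrentHashMap<>();
    // Pending entries of the single active writer, or null if none.
    private Map<String, String> buffer = null;
    // One permit: at most one writer may be buffering at any time.
    private final Semaphore writerPermit = new Semaphore(1);

    // Blocks until no other writer is buffering, then starts a new buffer.
    void enableBuffering() {
        writerPermit.acquireUninterruptibly();
        buffer = new HashMap<>();
    }

    // Records a pending entry; only valid between enable and flush/drop.
    void put(String key, String value) {
        requireBuffering();
        buffer.put(key, value);
    }

    // Commit: publish the pending entries and let the next writer in.
    void flushBuffer() {
        requireBuffering();
        base.putAll(buffer);
        endBuffering();
    }

    // Abort: discard the pending entries and let the next writer in.
    void dropBuffer() {
        requireBuffering();
        endBuffering();
    }

    int size() {
        return base.size();
    }

    private void requireBuffering() {
        if (buffer == null) {
            throw new IllegalStateException("not buffering");
        }
    }

    private void endBuffering() {
        buffer = null;
        writerPermit.release();
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedBufferingCache cache = new GuardedBufferingCache();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            final int id = i;
            writers[i] = new Thread(() -> {
                cache.enableBuffering();   // waits instead of failing
                cache.put("k" + id, "v" + id);
                cache.flushBuffer();
            });
            writers[i].start();
        }
        for (Thread t : writers) {
            t.join();
        }
        System.out.println("entries: " + cache.size());
    }
}
```

In this sketch a second writer waits in enableBuffering() until the first
one flushes or drops its buffer, rather than hitting an "already buffering"
error. A Semaphore (unlike a ReentrantLock) can be released by a thread
other than the one that acquired it, which may or may not matter here.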

Best, Aaron

On Wed, 15 Jan 2020 at 14:25, Aaron Coburn <[email protected]> wrote:

> Hi Andy,
> I'll dig a little deeper into what's going on and will put together a
> reproducible test case for this. I first wanted to find out if it might be
> something obvious.
>
> Thanks,
> Aaron
>
> On Wed, 15 Jan 2020 at 13:44, Andy Seaborne <[email protected]> wrote:
>
>> Hi Aaron,
>>
>> Could you say some more about how the concurrent writes are happening
>> and what they are doing?  Just from the stacktrace I haven't managed to
>> write a test case.
>>
>> My guess is that another transaction is finishing a commit at about the
>> same time. But if the other transaction is mid-processing then it's
>> something else.
>>
>> If you are able to put in a JVM-suspend breakpoint at
>> ThreadBufferingCache:88 and capture a thread dump, that would be very
>> helpful - I realise it's not always easy to set up.
>>
>>      Andy
>>
>>
>>
>> On 15/01/2020 16:55, Aaron Coburn wrote:
>> > This might be good to split off into a separate issue (and it doesn't
>> > necessarily need to block the release), but I'm finding that, when using
>> > TDB2 with this release candidate in a concurrent write context, I start
>> > encountering a lot of errors. And those errors are definitely not present
>> > with 3.13.1. Specifically, the issue seems to be related to contention over
>> > the TDB2 ThreadBufferingCache. That buffering cache is present in the
>> > 3.13.1 release, and I'm not entirely sure what changed with 3.14.0 that
>> > would trigger these errors, but this is the relevant part of the stack
>> > trace:
>> >
>> > Caused by: org.apache.jena.tdb2.TDBException: ThreadBufferingCache: already buffering
>> > at org.apache.jena.tdb2.store.nodetable.ThreadBufferingCache.enableBuffering(ThreadBufferingCache.java:88)
>> > at org.apache.jena.tdb2.store.nodetable.NodeTableCache.updateStart(NodeTableCache.java:352)
>> > at org.apache.jena.tdb2.store.nodetable.NodeTableCache.notifyTxnStart(NodeTableCache.java:319)
>> > at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$notifyBegin$14(TransactionCoordinator.java:915)
>> > at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.lambda$listeners$0(TransactionCoordinator.java:207)
>> > at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
>> > at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.listeners(TransactionCoordinator.java:207)
>> > at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.notifyBegin(TransactionCoordinator.java:915)
>> > at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.begin(TransactionCoordinator.java:553)
>> > at org.apache.jena.dboe.transaction.txn.TransactionCoordinator.begin(TransactionCoordinator.java:509)
>> > at org.apache.jena.dboe.transaction.txn.TransactionalBase.begin(TransactionalBase.java:110)
>> > at org.apache.jena.dboe.storage.system.DatasetGraphStorage.begin(DatasetGraphStorage.java:59)
>> > at org.apache.jena.sparql.core.DatasetGraphWrapper.begin(DatasetGraphWrapper.java:233)
>> > at org.apache.jena.sparql.core.DatasetImpl.begin(DatasetImpl.java:116)
>> > at org.apache.jena.system.Txn.exec(Txn.java:76)
>> > at org.apache.jena.system.Txn.executeWrite(Txn.java:125)
>> > at org.apache.jena.rdfconnection.RDFConnectionLocal.update(RDFConnectionLocal.java:80)
>> >
>> > Effectively, I get that error the first time a client attempts to
>> > concurrently write to the TDB2 store. Subsequent attempts just hang.
>> >
>> > Cheers,
>> > Aaron
>> >
>> >
>> >
>> >
>> >
>> > On Mon, 13 Jan 2020 at 11:23, Andy Seaborne <[email protected]> wrote:
>> >
>> >> Hi,
>> >>
>> >> Here is a vote on the release of Apache Jena 3.14.0
>> >> This is the first proposed release candidate.
>> >>
>> >> ==== Changes:
>> >>
>> >> https://s.apache.org/jena-3.14.0-jira
>> >>
>> >> ==== Release Vote
>> >>
>> >> Everyone, not just committers, is invited to test and vote.
>> >> Please download and test the proposed release.
>> >>
>> >> Staging repository:
>> >>     https://repository.apache.org/content/repositories/orgapachejena-1035
>> >>
>> >> Proposed dist/ area:
>> >>     https://dist.apache.org/repos/dist/dev/jena/
>> >>
>> >> Keys:
>> >>     https://svn.apache.org/repos/asf/jena/dist/KEYS
>> >>
>> >> Git commit (browser URL):
>> >>     https://github.com/apache/jena/commit/19d42a5
>> >>
>> >> Git Commit Hash:
>> >>     19d42a57a9debc675047b2d1ce9769979c43e7d8
>> >>
>> >> Git Commit Tag:
>> >>     jena-3.14.0
>> >>
>> >> Please vote to approve this release:
>> >>
>> >>           [ ] +1 Approve the release
>> >>           [ ]  0 Don't care
>> >>           [ ] -1 Don't release, because ...
>> >>
>> >> This vote will be open until at least
>> >>
>> >>       Thursday, 16th January 2020 at 187:00 UTC
>> >>
>> >> If you expect to check the release but the time limit does not work
>> >> for you, please email within the schedule above with an expected time
>> >> and we can extend the vote period.
>> >>
>> >> Thanks,
>> >>
>> >>         Andy
>> >>
>> >> Checking needed:
>> >>
>> >> + are the GPG signatures fine?
>> >> + are the checksums correct?
>> >> + is there a source archive?
>> >>
>> >> + can the source archive really be built?
>> >>             (NB This requires a "mvn install" first time)
>> >> + is there a correct LICENSE and NOTICE file in each artifact
>> >>             (both source and binary artifacts)?
>> >> + does the NOTICE file contain all necessary attributions?
>> >> + have any licenses of dependencies changed due to upgrades?
>> >>              if so have LICENSE and NOTICE been upgraded appropriately?
>> >> + does the tag/commit in the SCM contain reproducible sources?
>> >>
>> >
>>
>
