Ah, okay, I see the problem more clearly now. Thanks!
It seems to me now that the best immediate road forward is to go to true MR+SW
(a write lock for the dataset), since I take from your remarks that you think
that would be valuable in itself. That would be straightforward. I have read a
few papers that discuss doing MW by locking at the granularity of triple
patterns or BGPs, but I have to admit that it will take more study before I am
ready to implement something like that. {grin}
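Concretely, the dataset-wide write lock I have in mind would look something like this. Just a sketch with made-up names (LockingDataset, the functional-style wrappers), not working Jena code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// MR+SW via a single dataset-wide lock: any number of readers may
// proceed concurrently, but at most one writer at a time.
class LockingDataset {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    public <T> T read(Supplier<T> body) {
        lock.readLock().lock();          // shared: many readers at once
        try {
            return body.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    public void write(Runnable body) {
        lock.writeLock().lock();         // exclusive: the single writer
        try {
            body.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Since the write lock is exclusive for the whole dataset, a writer never sees a stale snapshot, which sidesteps the lost-update problem below at the cost of write concurrency.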
---
A. Soroka
The University of Virginia Library
On Aug 28, 2015, at 7:42 AM, Andy Seaborne <[email protected]> wrote:
> On 28/08/15 12:22, [email protected] wrote:
>> In fact, this is why I tried (for a first try) a design with only one
>> transaction committing at a time, which amounts to SW in terms of
>> serializability, I thought.
>
> No :-(
>
>> But I am allowing multiple writers to
>> assemble changes in multiple transactions at the same time, and I
>> think that is what will prevent the use of swap-into-commit. Maybe
>> this is a bad trade? Since JENA-624 contemplates very high
>> concurrency, is it worth doing a MR+SW design at all? But MRMW seems
>> very hard. {grin}
>>
>> I had some ideas about structuring indexes in such a way as to allow
>> for more fine-grained locking and using merge for actual MW, but as
>> you point out, locking down to particular resources is not able to
>> guarantee against conflicts between conceptual entities. I also had
>> some nightmares trying to think about how to manage bnodes across
>> multiple writers.
>
>
> See my example below for a counterexample.
>
> It's not two commits at once that must be avoided; the problem is that W2 is
> reading a pre-W1-commit view of the world.
>
> W1 starts and takes a start-of-transaction pointer to datastructures.
>
> W1 reads the account balance as 10
>
> W2 starts, ditto.
>
> W2 reads the account balance as 10
>
> W1 updates and commits
> The account balance visible to any new reader is 15
>
> W2 updates and commits
> The account balance visible to any new reader is 17
>
> but it should be 22. The +5 has been lost.
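To make sure I follow: the lost update above can be reproduced in miniature like this (a toy model with names of my own, not Jena code; each writer snapshots the balance at transaction start and commits its own computed result):

```java
// Minimal model of the lost-update anomaly: both writers read the
// same starting snapshot, so the second commit overwrites the first.
public class LostUpdateDemo {
    static int balance = 10;            // shared "database" value

    public static void main(String[] args) {
        int w1Snapshot = balance;       // W1 starts, reads 10
        int w2Snapshot = balance;       // W2 starts, reads 10

        balance = w1Snapshot + 5;       // W1 commits: balance = 15
        balance = w2Snapshot + 7;       // W2 commits: balance = 17, not 22

        System.out.println(balance);    // prints 17 -- the +5 is lost
    }
}
```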
>
> Your scheme keeps the database datastructures safe, but at the data model
> level, can cause inconsistency and loss of change.
>
> Either an application-level resolution of changes or something like two-phase
> locking is needed, and even then there are issues of non-repeatable reads and
> phantom reads.
>
> https://en.wikipedia.org/wiki/Isolation_%28database_systems%29
>
> It gets very nasty when aggregations (COUNT, SUM) happen. You can get
> answers that are not from any state of the data that ever existed.
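If I understand the aggregation point, it can be shown in miniature too (again a toy model of mine, not Jena code): a reader sums account balances one at a time while a writer transfers money between them mid-scan, so the reader's total corresponds to no state that ever existed.

```java
// Toy model of the aggregation anomaly: the accounts always sum to
// 20, yet a scan interleaved with a transfer computes 15.
public class PhantomSumDemo {
    static int[] accounts = {10, 10};   // invariant: sum == 20

    static int interleavedSum() {
        int sum = accounts[0];          // reader sees accounts[0] == 10
        accounts[0] += 5;               // writer transfers 5 between the
        accounts[1] -= 5;               //   accounts and commits mid-scan
        sum += accounts[1];             // reader now sees accounts[1] == 5
        return sum;                     // 15: a total from no real state
    }

    public static void main(String[] args) {
        System.out.println(interleavedSum());   // prints 15, not 20
    }
}
```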
>
> Andy