Ah, okay, I see the problem more clearly now. Thanks!
It seems to me now that the best immediate road forward is to go to true MR+SW
(a write lock for the dataset), since I take from your remarks that you think
that would be valuable in itself. That would be straightforward. I have read a
few papers that discuss doing MW by locking at the granularity of triple
patterns or BGPs, but I have to admit that it will take more study before I am
ready to implement something like that. {grin}
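Concretely, the dataset-wide write lock I have in mind would look something like this. Just a sketch with made-up names (LockingDataset, the functional-style wrappers), not working Jena code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// MR+SW via a single dataset-wide lock: any number of readers may
// proceed concurrently, but at most one writer at a time.
class LockingDataset {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    public <T> T read(Supplier<T> body) {
        lock.readLock().lock();          // shared: many readers at once
        try {
            return body.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    public void write(Runnable body) {
        lock.writeLock().lock();         // exclusive: the single writer
        try {
            body.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Since the write lock is exclusive for the whole dataset, a writer never sees a stale snapshot, which sidesteps the lost-update problem below at the cost of write concurrency.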
---
A. Soroka
The University of Virginia Library
On Aug 28, 2015, at 7:42 AM, Andy Seaborne <[email protected]> wrote:
> On 28/08/15 12:22, [email protected] wrote:
>> In fact, this is why I tried (for a first try) a design with only one
>> transaction committing at a time, which amounts to SW in terms of
>> serializability, I thought.
>
> No :-(
>
>> But I am allowing multiple writers to
>> assemble changes in multiple transactions at the same time, and I
>> think that is what will prevent the use of swap-into-commit. Maybe
>> this is a bad trade? Since JENA-624 contemplates very high
>> concurrency, is it worth doing a MR+SW design at all? But MRMW seems
>> very hard. {grin}
>>
>> I had some ideas about structuring indexes in such a way as to allow
>> for more fine-grained locking and using merge for actual MW, but as
>> you point out, locking down to particular resources is not able to
>> guarantee against conflicts between conceptual entities. I also had
>> some nightmares trying to think about how to manage bnodes across
>> multiple writers.
>
>
> See my example below for a counterexample.
>
> It's not two commits at once that must be avoided; the problem is that W2 is
> reading a pre-W1-commit view of the world.
>
> W1 starts and takes a start-of-transaction pointer to datastructures.
>
> W1 reads the account balance as 10
>
> W2 starts, ditto.
>
> W2 reads the account balance as 10
>
> W1 updates and commits
> The account balance visible to any new reader is 15
>
> W2 updates and commits
> The account balance visible to any new reader is 17
>
> but it should be 22. The +5 has been lost.
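To make sure I follow: the lost update above can be reproduced in miniature like this (a toy model with names of my own, not Jena code; each writer snapshots the balance at transaction start and commits its own computed result):

```java
// Minimal model of the lost-update anomaly: both writers read the
// same starting snapshot, so the second commit overwrites the first.
public class LostUpdateDemo {
    static int balance = 10;            // shared "database" value

    public static void main(String[] args) {
        int w1Snapshot = balance;       // W1 starts, reads 10
        int w2Snapshot = balance;       // W2 starts, reads 10

        balance = w1Snapshot + 5;       // W1 commits: balance = 15
        balance = w2Snapshot + 7;       // W2 commits: balance = 17, not 22

        System.out.println(balance);    // prints 17 -- the +5 is lost
    }
}
```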
>
> Your scheme keeps the database datastructures safe, but at the data model
> level, can cause inconsistency and loss of change.
>
> Either an application-level resolution of changes or something like two-phase
> locking is needed, and even then there are issues of non-repeatable reads and
> phantom reads.
>
> https://en.wikipedia.org/wiki/Isolation_%28database_systems%29
>
> It gets very nasty when aggregations (COUNT, SUM) happen. You can get
> answers that are not from any state of the data that ever existed.
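If I understand the aggregation point, it can be shown in miniature too (again a toy model of mine, not Jena code): a reader sums account balances one at a time while a writer transfers money between them mid-scan, so the reader's total corresponds to no state that ever existed.

```java
// Toy model of the aggregation anomaly: the accounts always sum to
// 20, yet a scan interleaved with a transfer computes 15.
public class PhantomSumDemo {
    static int[] accounts = {10, 10};   // invariant: sum == 20

    static int interleavedSum() {
        int sum = accounts[0];          // reader sees accounts[0] == 10
        accounts[0] += 5;               // writer transfers 5 between the
        accounts[1] -= 5;               //   accounts and commits mid-scan
        sum += accounts[1];             // reader now sees accounts[1] == 5
        return sum;                     // 15: a total from no real state
    }

    public static void main(String[] args) {
        System.out.println(interleavedSum());   // prints 15, not 20
    }
}
```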
>
> Andy