On 14/03/16 07:31, Joint wrote:


????
That doesn't read well...
I tested two types of triple storage, both of which use a concurrent map to
track the graphs. The first used the TripleTable and took write locks, so there
was one writer at a time per graph. The second used a concurrent skip list set
and no write locks, so there is no write contention.
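Roughly, the second variant looks like this, as a sketch (the class name is
mine and the string-based comparator is just a placeholder):

    import java.util.Comparator;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.ConcurrentSkipListSet;
    import org.apache.jena.graph.Node;
    import org.apache.jena.graph.Triple;

    // Per-graph triple storage with no write locks: each named graph
    // gets its own concurrent set, so adds to different graphs never
    // contend with each other.
    public class LockFreeGraphStore {
        // Triples have no natural order, so order by string form here;
        // a real comparator would compare S/P/O nodes directly.
        private static final Comparator<Triple> ORDER =
                Comparator.comparing(Triple::toString);

        private final ConcurrentMap<Node, ConcurrentSkipListSet<Triple>> graphs =
                new ConcurrentHashMap<>();

        public void add(Node graphName, Triple t) {
            graphs.computeIfAbsent(graphName,
                    g -> new ConcurrentSkipListSet<>(ORDER)).add(t);
        }

        public boolean contains(Node graphName, Triple t) {
            ConcurrentSkipListSet<Triple> set = graphs.get(graphName);
            return set != null && set.contains(t);
        }
    }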
Your dev code has a method canAbort set to return false. I was wondering what
the idea behind that was?

Where is canAbort?
Are you looking at the Jena code or Mantis code?
Do you mean supportsTransactionAbort?

A system can't provide a proper abort unless it can reconstruct the old state, either by having two copies (Txn in memory does this) or a log of some kind (TDB does this).

For example, plain synchronization MRSW locking can't provide an abort operation. It needs the cooperation of components to do that.
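As a minimal sketch of the two-copies approach (not Jena code; single writer
assumed):

    import java.util.HashSet;
    import java.util.Set;
    import java.util.concurrent.atomic.AtomicReference;

    // Two-copies abort: a writer works on a private copy of the state
    // and publishes it on commit; abort just drops the copy, so the
    // old state is still intact. Single writer assumed.
    public class SnapshotStore<T> {
        private final AtomicReference<Set<T>> committed =
                new AtomicReference<>(new HashSet<>());
        private Set<T> working = null;

        public void beginWrite() { working = new HashSet<>(committed.get()); }
        public void add(T item)  { working.add(item); }
        public void commit()     { committed.set(working); working = null; }
        public void abort()      { working = null; }  // old state untouched
        public Set<T> snapshot() { return committed.get(); }  // last commit
    }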

        Andy




Dick

-------- Original message --------
From: Andy Seaborne <[email protected]>
Date: 13/03/2016  7:54 pm  (GMT+00:00)
To: [email protected]
Subject: Re: SPI DatasetGraph creating Triples/Quads on demand using
   DatasetGraphInMemory

On 10/03/16 20:10, Dick Murray wrote:
Hi. Yes, re TriTable and TripleTable. I too like the storage interface, which
would work for my needs and make life simpler. A few points from me.
Currently I wrap an existing DSG and cache the additional tuples into what
I call the deferred DSG, or DDSG. The finds return a DSG iterator and a DDSG
iterator.

The DDSG is in memory and I have a number of concrete classes which achieve
the same end.

Firstly, I use a Jena core mem DSG and the find handlers just add tuples as
required into the HexTable, because I don't have a default graph, i.e. it's
never referenced because I need a graph URI to find the deferred data.

The second point, in common with the first: I have a concurrent map which
records which graphs have been deferred, then I use either a TriTable or a
concurrent set of tuples to store the graph contents. When I'm using the
TriTable I acquire the write lock and add tuples, so writes can occur in
parallel to different graphs. I've experimented with the concurrent set by
spoofing the write and just adding the tuples, i.e. no write lock contention
per graph. I notice the DatasetGraphStorage

????

does not support txn abort? This gives an in-memory DSG which doesn't have
lock contention because it never locks... This is applicable in some
circumstances, and I think the deferred-tuples case is one of them?

I also coded a DSG which supports a reentrant RW lock with upgrade, which
allowed me to combine the two DSGs because I could promote the read lock.
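For reference, the JDK's ReentrantReadWriteLock doesn't allow upgrade, so a
rough sketch of the pattern using StampedLock instead (it can attempt a
conversion, though unlike my lock it is not reentrant; cacheMiss and
persistTuples are placeholders):

    import java.util.concurrent.locks.StampedLock;

    // A read that upgrades to a write on a cache miss.
    public class UpgradeExample {
        private final StampedLock lock = new StampedLock();

        public void readOrFill() {
            long stamp = lock.readLock();
            try {
                if (cacheMiss()) {                   // placeholder check
                    long ws = lock.tryConvertToWriteLock(stamp);
                    if (ws != 0L) {
                        stamp = ws;                  // upgraded in place
                    } else {
                        lock.unlockRead(stamp);      // release, re-acquire;
                        stamp = lock.writeLock();    // must re-check state after
                    }
                    persistTuples();                 // placeholder write
                }
            } finally {
                lock.unlock(stamp);                  // handles read or write stamp
            }
        }

        private boolean cacheMiss()   { return true; }  // placeholder
        private void persistTuples()  { }               // placeholder
    }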

Andy, I notice your code has a txn interface with a read-to-write promotion
indicator? Is an upgrade method being considered for the txn interface? That
was an issue I hit, and it's why I have two DSGs: code further up the stack
calls a txn read, but a cache miss needs a write to persist the new tuples.

A dynamic adapter would support a defined set of handlers, and the find would
be shimmed to check whether any tuples need to be added. We could define a
set of interfaces to achieve this, which shouldn't be too difficult.
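Something like this, as a sketch (the DeferredHandler interface is just made
up to show the shape):

    import java.util.Iterator;
    import org.apache.jena.graph.Node;
    import org.apache.jena.sparql.core.DatasetGraph;
    import org.apache.jena.sparql.core.DatasetGraphWrapper;
    import org.apache.jena.sparql.core.Quad;

    // Shimmed find: before delegating, give a handler the chance to
    // materialise any deferred tuples for the requested graph.
    public class DeferredDatasetGraph extends DatasetGraphWrapper {
        public interface DeferredHandler {                 // made up for the sketch
            void materialise(DatasetGraph target, Node graphName);
        }

        private final DatasetGraph base;
        private final DeferredHandler handler;

        public DeferredDatasetGraph(DatasetGraph base, DeferredHandler handler) {
            super(base);
            this.base = base;
            this.handler = handler;
        }

        @Override
        public Iterator<Quad> find(Node g, Node s, Node p, Node o) {
            handler.materialise(base, g);    // add tuples on demand
            return super.find(g, s, p, o);
        }
    }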

On the subject of storage, is there any thought of providing granular
locking - whole DSG, per graph, dirty..?

Dick

Per-graph indexing only makes sense if the graphs are held separately.
A single quad table isn't going to work very well because some quads are in
one graph and some in another, yet all are in the same index structure.

So a ConcurrentHashMap holding separate graphs (cf. what is now called
DatasetGraphMapLink) would seem to make sense.
Contributions welcome.
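The rough shape, as a sketch only (class name is mine):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import org.apache.jena.graph.Graph;
    import org.apache.jena.graph.Node;
    import org.apache.jena.sparql.graph.GraphFactory;

    // Named graphs held separately, keyed by graph name, so each graph
    // can be indexed, and locked, on its own.
    public class GraphPerEntryStore {
        private final ConcurrentMap<Node, Graph> graphs = new ConcurrentHashMap<>();

        public Graph getGraph(Node graphName) {
            return graphs.computeIfAbsent(graphName,
                    name -> GraphFactory.createGraphMem());
        }
    }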

Transaction promotion is an interestingly tricky thing - it can mean a
system has to cause aborts or lower the isolation guarantees. (e.g. Txn1
starts Read; Txn2 starts a write, updates and commits; Txn1 continues and
can't see Txn2's changes (note Txn1 may have started before or after Txn2
ran); Txn1 then attempts to promote to a W transaction.) Read-committed
leads to non-repeatable reads (things like count() go wrong, for example).
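To make the non-repeatable read concrete, a toy timeline (not Jena code; a
synchronized set stands in for the shared dataset, and comments mark the
transaction steps):

    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Set;

    public class NonRepeatableRead {
        public static void main(String[] args) {
            Set<String> store = Collections.synchronizedSet(new HashSet<>());
            store.add("t1");

            int first = store.size();    // Txn1 (Read) counts: sees 1

            store.add("t2");             // Txn2 writes and commits

            int second = store.size();   // Txn1, read-committed: now sees 2
            System.out.println("count went from " + first + " to " + second
                    + " inside one read transaction");
        }
    }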

When you say "your code has a txn interface" I take it you mean non-Jena code?

That all said, this sounds like a simpler case - just because a read
transaction needs to update internal caches does not mean it's the fully
general case of transaction promotion.  A lock and weaker isolation may do.

        Andy




