*Wishlist*
- CLEREZZA-219: Rollback / atomicity
- A committed transaction is like a patch to a set of graphs
- A transaction can be copied and executed on another instance
- Optimistic concurrency control
- Code that doesn't modify MGraphs needs no transactions
- No more ConcurrentModificationException: iterators returned by MGraphs
point to the version at the moment of invocation
*Problems*
- Consistency with BNodes
- Suppose we have an MGraph containing the single triple: _:a rdf:type ex:Animal
- The transactions t1 and t2 both read this one triple of the MGraph
- t1 adds _:a rdf:type ex:Cat
- t2 adds _:a rdf:type ex:Dog
- Whichever transaction commits second should fail to commit, because the
context of _:a has changed since it was read
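The scenario above can be sketched in a few lines of Scala. This is only an illustration under assumptions: Node, BNode, Iri, Triple, contextOf and canCommit are hypothetical names, not Clerezza types, and the "context" of a bnode is simplified to every triple that mentions it.

```scala
// Hypothetical minimal model, not the Clerezza API.
sealed trait Node
final case class BNode(label: String) extends Node
final case class Iri(value: String) extends Node
final case class Triple(subject: Node, predicate: Iri, obj: Node)

object BNodeContext {
  // Simplifying assumption: the context of a bnode is every triple mentioning it.
  def contextOf(graph: Set[Triple], b: BNode): Set[Triple] =
    graph.filter(t => t.subject == b || t.obj == b)

  // A commit is allowed only if the context seen at read time is unchanged.
  def canCommit(current: Set[Triple], b: BNode, seenAtRead: Set[Triple]): Boolean =
    contextOf(current, b) == seenAtRead
}

object BNodeContextExample {
  import BNodeContext._

  val a = BNode("a")
  val rdfType = Iri("rdf:type")
  val base: Set[Triple] = Set(Triple(a, rdfType, Iri("ex:Animal")))

  // t1 and t2 both read the single triple before either commits.
  val seenByT1 = contextOf(base, a)
  val seenByT2 = contextOf(base, a)

  // t1 commits first: the context it read still matches the graph.
  val t1Commits = canCommit(base, a, seenByT1)
  val afterT1 = base + Triple(a, rdfType, Iri("ex:Cat"))

  // t2 now finds a changed context for _:a, so its commit is rejected.
  val t2Commits = canCommit(afterT1, a, seenByT2)
}
```

Here t1Commits is true and t2Commits is false. A real implementation would probably need a more careful definition of a bnode's context than "all triples mentioning it".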
*Solution approach*
I'm unfamiliar with the Java Transaction API, so I'm not sure how it could
replace or affect the following.
I imagine the API being used like this (in Scala; in Java some additional
interfaces would be needed):
val tm: TcTransactionManager = ...
val t = tm.createTransaction { tcManager =>
  val mGraph = tcManager.getMGraph(new UriRef("http://example.org/test.graph"))
  // within this block it appears as if there's no one else in the world,
  // so we don't have to care about locking
}
// retries up to 5 times if applying the resulting patch fails;
// transactions that perform only read operations produce an empty patch,
// so their commit never fails
val result = t.commit(5)
// we could also run the transaction without any change actually being made
// to the tcManager
// val result = t.simulate()
A transaction is specified by a function that takes a TcManager. The
functionality of the transaction is implemented in this function, and all
access to MGraphs occurs via the TcManager received as argument. The
list of triple collections, as well as the triple collections returned by this
TcManager, appear not to change except for the changes made within the
perform method.
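To make the idea more concrete, here is a self-contained toy version of such a transaction. Everything in it is an assumption made for illustration: Store, TxView, Patch and Transaction are hypothetical in-memory stand-ins, triples are plain strings, and the compatibility check covers only removed triples (not bnode contexts).

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-ins; none of these are Clerezza types.
final case class Triple(s: String, p: String, o: String)
final case class Patch(added: Set[Triple], removed: Set[Triple])

// The view handed to the transaction function: reads see the snapshot plus
// the transaction's own changes; writes are only recorded.
final class TxView(snapshot: Set[Triple]) {
  private val added = mutable.Set.empty[Triple]
  private val removed = mutable.Set.empty[Triple]
  def contains(t: Triple): Boolean = added(t) || (snapshot(t) && !removed(t))
  def add(t: Triple): Unit = { removed -= t; if (!snapshot(t)) added += t }
  def remove(t: Triple): Unit = { added -= t; if (snapshot(t)) removed += t }
  def patch: Patch = Patch(added.toSet, removed.toSet)
}

final class Store {
  private var triples = Set.empty[Triple]
  def snapshot: Set[Triple] = synchronized { triples }
  // Applies the patch only if every triple it removes is still present.
  def tryApply(p: Patch): Boolean = synchronized {
    val compatible = p.removed.subsetOf(triples)
    if (compatible) triples = (triples -- p.removed) ++ p.added
    compatible
  }
}

final class Transaction(store: Store, body: TxView => Unit) {
  // Runs the body against a fresh snapshot without touching the store.
  def simulate(): Patch = {
    val view = new TxView(store.snapshot)
    body(view)
    view.patch
  }
  // One attempt plus up to `retries` re-runs while the patch does not apply.
  def commit(retries: Int): Boolean = {
    var attemptsLeft = retries + 1
    var committed = false
    while (!committed && attemptsLeft > 0) {
      committed = store.tryApply(simulate())
      attemptsLeft -= 1
    }
    committed
  }
}

object TransactionExample {
  val store = new Store
  val t = new Transaction(store, view => view.add(Triple("ex:s", "ex:p", "ex:o")))
  val simulated = t.simulate()                      // computes the patch only
  val unchangedAfterSimulate = store.snapshot.isEmpty
  val committed = t.commit(5)
  val finalState = store.snapshot
}
```

Because a read-only function records nothing, its patch is empty and tryApply always accepts it, which matches the claim that such commits never fail.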
- When a transaction is committed or when a read access is done, all
previous write calls are transformed into a patch; this patch is associated
with the transaction
- All read accesses are performed against a base graph plus the set of
patches from transactions that have been committed but not yet applied to the
base graph, or that are associated with the transaction performing the read
operation.
- Changes within a transaction produce a patch. At commit time it
is checked that the MGraph is still compatible with the change (removed
triples are still there and the context of affected bnodes is unchanged)
- When a triple containing an existing bnode (one that has not been added
within the transaction) is added or removed, the context of this bnode in
the original graph is marked for removal and a replacement subgraph with
distinct bnode objects is created.
- A scheduled task monitors the read lock on MGraphs and chooses suitable
moments for applying committed patches to the base MGraph; only during this
operation is the base MGraph write-locked. No transaction function is
executed while patches are applied. No iterator should be open on the MGraph
at that time, or some mechanism to avoid ConcurrentModificationExceptions
must be applied.
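The last point, applying committed patches only at a suitable moment, could look roughly like the sketch below. It relies on java.util.concurrent.locks.ReentrantReadWriteLock, whose write lock tryLock() returns false immediately while a read lock is held. PatchedGraph and its methods are hypothetical names, and the scheduling itself (for example via a ScheduledExecutorService) is left out.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.locks.ReentrantReadWriteLock

// Hypothetical stand-ins, not Clerezza types.
final case class Triple(s: String, p: String, o: String)
final case class Patch(added: Set[Triple], removed: Set[Triple])

// A base graph together with its committed but not yet applied patches.
final class PatchedGraph {
  private val lock = new ReentrantReadWriteLock()
  private val pending = new ConcurrentLinkedQueue[Patch]()
  @volatile private var base = Set.empty[Triple]

  // Committing only queues the patch; no write lock on the base graph.
  def enqueue(p: Patch): Unit = pending.add(p)
  def pendingCount: Int = pending.size
  def baseSnapshot: Set[Triple] = base

  // Readers hold the read lock for as long as they use the base graph.
  def withReadLock[A](f: Set[Triple] => A): A = {
    lock.readLock().lock()
    try f(base) finally lock.readLock().unlock()
  }

  // Intended to be called periodically by a scheduled task.
  // Does nothing and returns false while a read lock is held.
  def applyPendingIfIdle(): Boolean = {
    val w = lock.writeLock()
    if (!w.tryLock()) false
    else try {
      var p = pending.poll()
      while (p != null) {
        base = (base -- p.removed) ++ p.added
        p = pending.poll()
      }
      true
    } finally w.unlock()
  }
}
```

With this shape the base graph is write-locked only for the duration of applyPendingIfIdle, and a call made while a reader is active simply leaves the patches queued for the next run.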
*Performance*
On read access, a set of patches has to be checked in addition to the base
graph, both for additions and removals of triples. As under normal
circumstances the number of patches should be relatively small, this
shouldn't be too bad. Write operations should get significantly faster, as
generating and adding a patch does not require a write lock on the base graph.
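As a rough illustration of the read path, the following sketch checks the pending patches before falling back to the base graph, so one lookup costs one set lookup per patch plus one on the base graph. Overlay, contains and view are hypothetical names, and the patches are assumed to be ordered oldest first.

```scala
// Hypothetical stand-ins, not Clerezza types.
final case class Triple(s: String, p: String, o: String)
final case class Patch(added: Set[Triple], removed: Set[Triple])

object Overlay {
  // Patches are ordered oldest first; the most recent patch that mentions
  // the triple decides, otherwise the base graph does.
  def contains(base: Set[Triple], patches: Seq[Patch], t: Triple): Boolean =
    patches.reverseIterator
      .collectFirst {
        case p if p.added(t)   => true
        case p if p.removed(t) => false
      }
      .getOrElse(base(t))

  // Full view: fold the patches over the base graph in commit order.
  def view(base: Set[Triple], patches: Seq[Patch]): Set[Triple] =
    patches.foldLeft(base)((g, p) => (g -- p.removed) ++ p.added)
}
```

The view function is the same computation the patch-applying task would perform on the base graph, so it can serve as a reference when testing contains.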