> One of the things that strikes me is that extending Quad to be a
> QuadOperation breaks being a Quad. It adds functionality a quad does not
> have. Two quads are equal if they have the same G/S/P/O and that's not true
> for QuadOperation.
> An operation is a pair - the action and the data - not data. e.g. Putting a
> QuadOperation into a DatasetGraph would cause problems.

Andy--

I've thought harder about this and I've realized that whether or not I can make a navel-gazing argument about correctness, the typing is obviously confusing and that's damnation enough. I'll fix this to stop extending Quad.

---
A. Soroka
The University of Virginia Library

On Jul 25, 2015, at 7:43 AM, Andy Seaborne <[email protected]> wrote:

> On 23/07/15 14:18, [email protected] wrote:
>> After a longish conversation with Andy Seaborne, I've worked up a simple
>> journaling DatasetGraph wrapping implementation. The idea is to use
>> journaling to support proper aborting behavior (which I believe this code
>> does) and to add to that a semantic for DatasetGraph::addGraph that copies
>> tuples instead of leaving a reference to the added Graph (which I believe
>> this code also does). Between these two behaviors, the idea is to be able to
>> support transactionality (MRSW only) reasonably well.
>>
>> The idea is (if this code looks like a reasonable direction) to move onwards
>> to an implementation that uses persistent data structures for covering
>> indexes in order to get at least to MR+SW and eventually to attack JENA-624:
>> "Develop a new in-memory RDF Dataset implementation".
>>
>> Feedback / advice / criticism greedily desired and welcome!
>>
>> https://github.com/ajs6f/jena/tree/JournalingDatasetgraph
>>
>> https://github.com/apache/jena/compare/master...ajs6f:JournalingDatasetgraph
>>
>> ---
>> A. Soroka
>> The University of Virginia Library
>>
>
> Hi there,
>
> A first look - there's quite a lot to do with the release at the moment.
>
> Having a separate set of functionality to the underlying DatasetGraph is good
> for the MRSW case and with that composition on multiple datasets, text
> indexes etc etc.
>
> For the MR+SW, I think the more connected nature of transactions and
> implementation might make it harder to have independent functionality but
> we'll see.
>
> https://github.com/afs/mantis/tree/master/dboe-transaction
> is a take on a transaction mechanism. I'm using it at the moment so I'm
> finding out what works ... and what does not.
>
>
> Yes - addGraph ought to be a copy. The general dataset where the app can put
> together a collection of different graph types is the exception but needed
> for the case of some graphs being inference, maybe some not.
>
>
> One of the things that strikes me is that extending Quad to be a
> QuadOperation breaks being a Quad. It adds functionality a quad does not
> have. Two quads are equal if they have the same G/S/P/O and that's not true
> for QuadOperation.
>
> An operation is a pair - the action and the data - not data.
>
> e.g. Putting a QuadOperation into a DatasetGraph would cause problems.
>
>
> ListBackedOperationRecord<OpType> extends ReversibleOperationRecord<OpType>
>
> [[
> public class ListBackedOperationRecord<OpType extends InvertibleOperation<?,
> ?, ?, ?>>
>     implements ReversibleOperationRecord<OpType> {
> ]]
>
>
> while, yes, a collection of operations could be an operation, datasets don't
> provide such composite operations so the abstraction is not used. And the
> reverse of it would be recursive - each operation needs reversing.
>
> I'd keep log (= list of operations) as a separate concept from the operations
> themselves. One key operation of a ListBackedOperationRecord is clear and
> Operations are
>
> Or this is a naming thing, is "record" the log entry or the log itself?
>
>
> Is there some specific reason as to why you override the DatasetGraphWithLock
> lock?
>
>
> My take on this is:
>
> https://github.com/afs/jena-workspace/tree/master/src/main/java/transdsg
>
> One difference is the notion of reversing an operation is not a feature of
> the operation itself, it's the way it is played back. Partially, this is
> efficiency (which may not matter) as it reduces the object churn but also it
> puts undo-playback in one place (e.g. reading and writing from storage, which
> might be non-heap memory, or a compacted form (or even a disk) for where
> large+long transactions even on in-memory lead to excessive object use. Just
> an idea.
>
> Andy
>
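
The two design points raised in this thread (an operation holds a quad rather than being one, and undo is a property of how the log is played back rather than of each operation) might look roughly like the sketch below. It is only an illustration under stated assumptions: Quad here is a hypothetical stand-in using strings rather than Jena's own class, and QuadOperation, Action and JournaledQuads are names made up for this example, not the code in either linked branch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class JournalSketch {

    /** Hypothetical stand-in for a quad; equality is G/S/P/O only. */
    static final class Quad {
        final String g, s, p, o;
        Quad(String g, String s, String p, String o) {
            this.g = g; this.s = s; this.p = p; this.o = o;
        }
        @Override public boolean equals(Object other) {
            if (!(other instanceof Quad)) return false;
            Quad q = (Quad) other;
            return g.equals(q.g) && s.equals(q.s) && p.equals(q.p) && o.equals(q.o);
        }
        @Override public int hashCode() { return Objects.hash(g, s, p, o); }
    }

    enum Action { ADD, DELETE }

    /** An operation is a pair (action, data): it has a Quad, it is not a Quad. */
    static final class QuadOperation {
        final Action action;
        final Quad quad;
        QuadOperation(Action action, Quad quad) {
            this.action = action; this.quad = quad;
        }
    }

    /** Hypothetical quad store with a journal kept separate from the operations. */
    static final class JournaledQuads {
        private final Set<Quad> quads = new HashSet<>();
        private final Deque<QuadOperation> journal = new ArrayDeque<>();

        // Only journal operations that actually changed the store,
        // so that undoing them restores the exact prior state.
        void add(Quad q) {
            if (quads.add(q)) journal.push(new QuadOperation(Action.ADD, q));
        }
        void delete(Quad q) {
            if (quads.remove(q)) journal.push(new QuadOperation(Action.DELETE, q));
        }
        boolean contains(Quad q) { return quads.contains(q); }
        int size() { return quads.size(); }

        /** Commit: the changes stand, so the journal can be dropped. */
        void commit() { journal.clear(); }

        /** Abort: play the journal back newest-first, inverting each action here. */
        void abort() {
            while (!journal.isEmpty()) {
                QuadOperation op = journal.pop();
                if (op.action == Action.ADD) quads.remove(op.quad);
                else quads.add(op.quad);
            }
        }
    }

    public static void main(String[] args) {
        JournaledQuads ds = new JournaledQuads();
        Quad a = new Quad("g", "s", "p", "o1");
        ds.add(a);
        ds.commit();
        ds.delete(a);
        ds.abort(); // the delete is undone by playback
        System.out.println("contains a after abort: " + ds.contains(a));
    }
}
```

Because the inversion lives in abort(), the operations stay plain data and no reversed copies are allocated, which is the object-churn point made above.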
