Thanks for the feedback Rob and Andy. I am on vacation for a bit, but will hopefully have an updated draft with some of your suggestions next week!
-Stephen

On Tue, May 20, 2014 at 10:43 AM, Andy Seaborne <[email protected]> wrote:

> On 19/05/14 21:22, Stephen Allen wrote:
>
>> All,
>>
>> I've been working on a design for remote transactions for SPARQL
>> (initially for the query and update endpoints, but most likely for GSP
>> as well). My initial draft is at [1]. I would appreciate any feedback,
>> particularly in the places where I have made notes (the @@ sections).
>>
>> Although I tried to design it to not preclude distributed transactions, I
>> am intentionally limiting the specification to what is necessary for a
>> single server currently.
>>
>> Implementation seems like it should be fairly straightforward, targeting
>> Fuseki 2 / TDB. There look to be a few bumps that need to be overcome
>> (particularly TDB's usage of ThreadLocal variables
>> in DatasetGraphTransaction), but there do not appear to be any
>> showstoppers. Although I have not started writing any code yet.
>>
>
> There is DatasetGraphTxn, direct from the StoreConnection, which is the
> actual one-time transaction object that goes in the thread local.
>
> It itself needs MRSW semantics.
>
>> I have started JENA-700 to track this work [2].
>>
>> -Stephen
>>
>> [1] http://people.apache.org/~sallen/sparql11-transaction/
>> [2] https://issues.apache.org/jira/browse/JENA-700
>
> == Protocol
>
> POST /transaction
> GET /transaction/txid
>
> Nice way to view a transaction - I'd have made the txid a large globally
> unique number so there is no risk of guessing it by a third party.
>
> A container for transactions makes sense but the other mapping to HTTP
> verbs seems odd to me.
>
> The transaction state is part of the document at /transaction/txid.
>
> So PUT/DELETE does not to me seem the right way to handle it as the txid
> does not exist (probably) after DELETE. A system may wish to see the state
> of a complete transaction though after abort.
>
> REST is state exchange so a POST or PUT of a new state document to the
> server to commit or abort the transaction (or use a query string parameter)
> would be my inclination.
>
> POST is preferable because I think it is changing part of the state of the
> transaction. Information like start of transaction time, who started
> it,... is not overwritten and can be guaranteed by the server. PUT is about
> the whole document being replaced by whatever the client claims.
>
> "authenticated sessions" (if you mean two-way certificated https) are very
> complicated to manage.
>
> == Promotable transactions
>
> Transactions that start READ and become WRITE aren't ruled out in TDB --
> they would have the effect of potentially causing an abort at the point
> when they get promoted but any system has that possibility at this point or
> commit point with conflicting updates. In RDF conflict is messier because
> RDF triples do not correspond to application conceptual entities, leading
> to unexpected conflict, or so fine-grained that the system has less than
> acceptable performance.
>
> With SPARQL operations being coarse grained read/query or write/update,
> promotability makes more sense.
>
> == Timeouts and deadlocks.
>
> The SPARQL protocol operations within a transaction still need to be
> atomic with respect to the transaction. The client may now have multiple
> threads invoking operations - or pass the transaction id to another machine.
>
> Multiple writers: For TDB, it's internally single writer with a lock for
> multiple writers so there is deadlock potential with two writers. We could
> add a "begin(WRITE)-or-bounce" operation.
>
> (If a system is multiple writer, the pain point moves to system generated
> aborts which don't happen in TDB).
>
> == ETags and optimistic transactions
>
> Have you any thoughts on how this might play with etags?
>
> Andy
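
For illustration only, here is a minimal in-memory sketch of three of the points raised above: an unguessable txid, a POST-style state change that leaves server-owned fields alone and keeps the document readable after abort, and a "begin(WRITE)-or-bounce" that refuses instead of waiting on the writer lock. It is not taken from the draft, Fuseki, or TDB. The class name, method names, field names, and state strings are all hypothetical, and Python is used only for brevity.

```python
import secrets
import threading
import time


class TransactionRegistry:
    """Hypothetical in-memory stand-in for the /transaction container."""

    def __init__(self):
        self._txns = {}
        # One writer at a time, mirroring the single-writer model described for TDB.
        # A plain Lock (not RLock) may be released by a different thread than the
        # one that acquired it, which matters if the client uses several threads.
        self._writer = threading.Lock()

    def begin(self, mode="READ"):
        """Models POST /transaction. Returns a txid, or None if bounced."""
        if mode == "WRITE" and not self._writer.acquire(blocking=False):
            return None  # bounce: another writer is active, do not wait
        txid = secrets.token_hex(16)  # 128 random bits, impractical to guess
        self._txns[txid] = {"mode": mode, "state": "active", "started": time.time()}
        return txid

    def get(self, txid):
        """Models GET /transaction/txid. Returns a copy of the state document."""
        return dict(self._txns[txid])

    def finish(self, txid, new_state):
        """Models POSTing a new state. Only the 'state' field can be changed."""
        if new_state not in ("committed", "aborted"):
            raise ValueError("state must be 'committed' or 'aborted'")
        txn = self._txns[txid]
        if txn["state"] != "active":
            raise ValueError("transaction already finished")
        txn["state"] = new_state  # 'mode' and 'started' stay server-controlled
        if txn["mode"] == "WRITE":
            self._writer.release()
        return dict(txn)
```

In this sketch a finished transaction is never removed from the registry, so a client can still read its final state after an abort. A real server would presumably expire old entries at some point.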
