Re: SPARQL Transaction

Stephen Allen Fri, 23 May 2014 10:53:00 -0700

Thanks for the feedback Rob and Andy.  I am on vacation for a bit, but will
hopefully have an updated draft with some of your suggestions next week!


-Stephen


On Tue, May 20, 2014 at 10:43 AM, Andy Seaborne <[email protected]> wrote:

> On 19/05/14 21:22, Stephen Allen wrote:
>
>> All,
>>
>> I've been working on a design for remote transactions for SPARQL (initially
>> for the query and update endpoints, but most likely for GSP as well).  My
>> initial draft is at [1].  I would appreciate any feedback, particularly in
>> the places where I have made notes (the @@ sections).
>>
>> Although I tried to design it to not preclude distributed transactions, I
>> am intentionally limiting the specification to what is necessary for a
>> single server currently.
>>
>> Implementation seems like it should be fairly straightforward, targeting
>> Fuseki 2 / TDB.  There look to be a few bumps that need to be overcome
>> (particularly TDB's usage of ThreadLocal variables
>> in DatasetGraphTransaction), but there do not appear to be any
>> showstoppers, although I have not started writing any code yet.
>>
>
> There is DatasetGraphTxn, direct from the StoreConnection, which is the
> actual one-time transaction object that goes in the thread local.
>
> It itself needs MRSW semantics.
>
>
>
>> I have started JENA-700 to track this work [2].
>>
>> -Stephen
>>
>> [1] http://people.apache.org/~sallen/sparql11-transaction/
>> [2] https://issues.apache.org/jira/browse/JENA-700
>>
>>
>
> == Protocol
>
> POST  /transaction
> GET   /transaction/txid
>
> Nice way to view a transaction - I'd have made the txid a large globally
> unique number so there is no risk of guessing it by a third party.
>
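A minimal sketch of the "large globally unique number" idea for the txid, assuming 128 random bits rendered as hex are enough to make guessing impractical. The class and method names (`TxIds`, `newTxId`) are hypothetical and are not part of the draft or of Jena.

```java
import java.security.SecureRandom;

// Hypothetical helper (not from the draft or Jena): mints unguessable transaction ids.
final class TxIds {
    // SecureRandom draws from a cryptographically strong source, unlike java.util.Random.
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns 128 random bits as a 32-character lowercase hex string. */
    static String newTxId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes still format as two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

The resulting string would take the place of `txid` in `/transaction/txid`.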
> A container for transactions makes sense but the other mapping to HTTP
> verbs seems odd to me.
>
> The transaction state is part of the document at /transaction/txid.
>
> So PUT/DELETE does not seem to me the right way to handle it, as the txid
> (probably) does not exist after DELETE. A system may wish to see the state
> of a completed transaction, though, even after abort.
>
> REST is state exchange so a POST or PUT of a new state document to the
> server to commit or abort the transaction (or use a query string parameter)
> would be my inclination.
>
> POST is preferable because I think it is changing part of the state of the
> transaction.  Information like start of transaction time, who started
> it, ... is not overwritten and can be guaranteed by the server. With PUT, the
> whole document is replaced by whatever the client claims.
>
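A rough sketch of the POST-style state change described above, assuming the server keeps one record per transaction and lets the client change only the state field. `TxRecord`, its fields, and `requestState` are hypothetical names for illustration; none of them come from the draft or from Fuseki.

```java
import java.time.Instant;

// Hypothetical server-side record backing the document at /transaction/txid.
final class TxRecord {
    enum State { ACTIVE, COMMITTED, ABORTED }

    // Server-controlled fields: set once at creation, never taken from the client.
    private final String txid;
    private final String startedBy;
    private final Instant startedAt;
    private State state = State.ACTIVE;

    TxRecord(String txid, String startedBy, Instant startedAt) {
        this.txid = txid;
        this.startedBy = startedBy;
        this.startedAt = startedAt;
    }

    /**
     * Applies a client-requested state change (the POST case).
     * Only ACTIVE -> COMMITTED or ACTIVE -> ABORTED is accepted; returns false otherwise.
     */
    synchronized boolean requestState(State requested) {
        if (state != State.ACTIVE || requested == State.ACTIVE) {
            return false;
        }
        state = requested;
        return true;
    }

    synchronized State state() { return state; }
    String txid() { return txid; }
    String startedBy() { return startedBy; }
    Instant startedAt() { return startedAt; }
}
```

Because the record is not deleted, a later GET can still report the final state of an aborted transaction along with who started it and when.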
> "authenticated sessions" (if you mean two-way certificated https) are very
> complicated to manage.
>
> == Promotable transactions
>
> Transactions that start READ and become WRITE aren't ruled out in TDB --
> they would have the effect of potentially causing an abort at the point
> when they get promoted, but any system has that possibility, at this point or
> at commit point, with conflicting updates.  In RDF, conflict is messier because
> RDF triples do not correspond to application conceptual entities, leading
> to unexpected conflicts, or so fine-grained that the system has less than
> acceptable performance.
>
> With SPARQL operations being coarse grained read/query or write/update,
> promotability makes more sense.
>
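One way to picture the abort-on-promotion case is a version check: a reader notes the committed version when it begins and may only promote if nothing has committed since. This is a sketch under that assumption, not a description of how TDB works; `PromotableStore` and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of READ -> WRITE promotion using a single dataset version counter.
final class PromotableStore {
    private final AtomicLong committedVersion = new AtomicLong();

    /** Starts a READ transaction and returns the version it observes. */
    long beginRead() {
        return committedVersion.get();
    }

    /**
     * True if a reader that began at the given version can become a writer.
     * False means another writer committed in between, so promotion must abort.
     * A real implementation would make this check while holding the writer lock.
     */
    boolean canPromote(long versionSeenAtBegin) {
        return committedVersion.get() == versionSeenAtBegin;
    }

    /** Records that a WRITE transaction has committed. */
    void commitWrite() {
        committedVersion.incrementAndGet();
    }
}
```

With coarse-grained SPARQL operations the check is per dataset rather than per triple, which sidesteps the fine-grained conflict tracking problem at the cost of more aborts.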
> == Timeouts and deadlocks
>
> The SPARQL protocol operations within a transaction still need to be
> atomic with respect to the transaction.  The client may now have multiple
> threads invoking operations - or pass the transaction id to another machine.
>
> Multiple writers: For TDB, it's internally single writer with a lock for
> multiple writers so there is deadlock potential with two writers.  We could
> add a "begin(WRITE)-or-bounce" operation.
>
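A small sketch of what "begin(WRITE)-or-bounce" could look like, assuming a single writer slot per dataset. It uses a `Semaphore` rather than a thread-owned lock because, with remote transactions, the request that ends a transaction may be served by a different thread than the one that began it. `WriterGate` and its methods are hypothetical names, not TDB API.

```java
import java.util.concurrent.Semaphore;

// Hypothetical single-writer gate: callers either get the writer slot at once or are refused.
final class WriterGate {
    // One permit = one active writer. Permits are not tied to the acquiring thread.
    private final Semaphore writer = new Semaphore(1);

    /** Returns true if the caller now holds the writer slot; false means "bounce", never block. */
    boolean tryBeginWrite() {
        return writer.tryAcquire();
    }

    /** Releases the writer slot; call exactly once per successful tryBeginWrite (commit or abort). */
    void endWrite() {
        writer.release();
    }
}
```

Since a refused writer never waits, two writers cannot deadlock on each other; the client decides whether to retry or give up.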
> (If a system is multiple writer, the pain point moves to system generated
> aborts which don't happen in TDB).
>
> == ETags and optimistic transactions
>
> Have you any thoughts on how this might play with ETags?
>
>         Andy
>
