On 14/10/2016 17:09, "Andy Seaborne" <a...@apache.org> wrote:

    I don't understand what capabilities are enabled by transaction 
    granularity if there are multiple transactions in a single patch. 
    Concrete examples of where it helps?
    
    However, I've normally been working with one transaction per patch anyway.
    
    Allowing multiple transactions per patch is for making a collection of 
    (semantically) related changes into a unit, by consolidating small 
    patches into "today's changes" (c.f. git squash).
    
    Leaving the transaction boundaries in gives internal checkpoints, not 
    just one big transaction. It also makes the consolidated patch 
    decomposable (unlike squash).
    
    Internal checkpoints are useful not just for keeping the transaction 
    manageable but also to be able to restart a very large update in case it 
    failed part way through for system reasons (server power cut, user 
    reboots laptop by accident, ...)  Imagine keeping a DBpedia copy up to date.

I think the thought is that a producer of a patch can decide whether each 
transaction being recorded should be reversible or not. For example, if you are 
importing a very large dataset into an already large database you probably don’t 
want to slow down the import process by having to check whether every triple/quad 
is already in the database as you import it. Therefore you might choose to output 
a non-reversible transaction for performance reasons.

On the other hand if you’re accepting a small change to the data then that cost 
is probably acceptable and you would output a reversible transaction.
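
To make the trade-off concrete, here is a minimal sketch in Python. It assumes an in-memory set standing in for the database, and the names (record_adds, reverse, the "A"/"D" change codes) are made up for illustration; none of this is taken from an actual patch implementation. The reversible path pays for an existence check on every triple, and the non-reversible path skips it, which is why reversing its output is unsafe.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
Change = Tuple[str, Triple]  # ("A", triple) for an add, ("D", triple) for a delete


def record_adds(store: Set[Triple], triples: Iterable[Triple],
                reversible: bool) -> List[Change]:
    """Add triples to `store` and return the changes recorded for the patch."""
    recorded: List[Change] = []
    for t in triples:
        if reversible:
            # Extra lookup per triple: only record adds that really changed
            # the store, so that undoing them restores the previous state.
            if t not in store:
                store.add(t)
                recorded.append(("A", t))
        else:
            # No lookup: record every add, including triples already present.
            store.add(t)
            recorded.append(("A", t))
    return recorded


def reverse(store: Set[Triple], recorded: List[Change]) -> None:
    """Undo recorded changes in reverse order (only safe if recorded reversibly)."""
    for op, t in reversed(recorded):
        if op == "A":
            store.discard(t)
        else:
            store.add(t)
```

With a store that already holds a triple, reversing the non-reversible recording would also delete that pre-existing triple, whereas reversing the reversible recording leaves it in place. In a real database the membership check is the expensive part; the Python set only models the behaviour, not the cost.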

I am not arguing that you shouldn’t have transaction boundaries, in fact I 
think they are essential, but simply that you may want to be able to annotate 
the properties of a transaction beyond just stating the boundaries.
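
As a rough illustration of what per-transaction annotation might look like, here is a sketch in Python. The Transaction class, the "reversible" property key and the apply_patch/undo helpers are all hypothetical names of my own, not part of any existing patch format or library. Each transaction carries its own properties, a consumer can resume a patch at a transaction boundary, and undo is refused for transactions not marked reversible.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Change = Tuple[str, Triple]  # ("A", triple) for an add, ("D", triple) for a delete


@dataclass
class Transaction:
    changes: List[Change] = field(default_factory=list)
    # Hypothetical annotations attached to this transaction by the producer.
    properties: Dict[str, str] = field(default_factory=dict)

    @property
    def reversible(self) -> bool:
        return self.properties.get("reversible") == "true"


def apply_patch(store: Set[Triple], patch: List[Transaction], start: int = 0) -> int:
    """Apply transactions from index `start`; return how many are now applied."""
    for txn in patch[start:]:
        for op, t in txn.changes:
            if op == "A":
                store.add(t)
            else:
                store.discard(t)
    return len(patch)


def undo(store: Set[Triple], txn: Transaction) -> None:
    """Reverse one transaction, refusing if it was not recorded as reversible."""
    if not txn.reversible:
        raise ValueError("transaction was not recorded as reversible")
    for op, t in reversed(txn.changes):
        if op == "A":
            store.discard(t)
        else:
            store.add(t)
```

The point of the sketch is only that the boundary and the properties are separate pieces of information: every transaction has a boundary, but only some are annotated as safe to reverse.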



