[jira] Issue Comment Edited: (JENA-41) Different policy for concurrency access in TDB supporting a single writer and multiple readers

Stephen Allen (JIRA) Thu, 17 Mar 2011 10:22:57 -0700

    [ https://issues.apache.org/jira/browse/JENA-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008009#comment-13008009 ]


Stephen Allen edited comment on JENA-41 at 3/17/11 5:20 PM:
------------------------------------------------------------

I have been working on adding transactions to Parliament (using a WAL for 
durability but with MVCC to provide nonblocking readers).  I've been thinking 
about the interface between ARQ and graph stores and have attached some code 
and interfaces I have come up with.

I see two main ways to modify the DatasetGraph interface to provide 
transactions.

1)  Add explicit transaction support to the interface (see 
TransactionalDatasetGraph.java).  This means either breaking interface changes 
or a second codepath to deal with these new DatasetGraphs.  But it has the 
benefit of cleaner syntax if a single thread is using more than one 
transaction.  Some uses for this capability would be: a) eliminating 
memory/disk buffers in the SPARQL Update code by obtaining sequential 
transactions for stores that support snapshot isolation; b) simplifying 
transaction handling if queries were changed not to run on a single thread 
(either parallel execution or NIO-style asynchronous execution).  See [1] for 
sample usage with multiple transactions.

2)  Tie transaction information to the current thread (see 
TransactionHandler.java).  This works well with the current ARQ query execution 
but becomes more unwieldy when working with multiple transactions on a single 
thread.  See [2] for example usage.



[1] 
   public void MultipleTransactions1()
    {
        TransactionalDatasetGraph tdsg = ...
        Quad quadToAdd = ...
        
        // Omitted error handling for clarity

        // A read-only transaction
        Transaction t1 = tdsg.getTransactionManager().begin(
                IsolationLevel.SERIALIZABLE, AccessMode.READ_ONLY);
        
        // A read-write transaction
        Transaction t2 = tdsg.getTransactionManager().begin(
                IsolationLevel.SERIALIZABLE, AccessMode.READ_WRITE);
        
        System.out.println("t1.size()= " + tdsg.size(t1));
        
        System.out.println("Adding statement in t2");
        tdsg.add(quadToAdd, t2);
        System.out.println("t2.size()= " + tdsg.size(t2));
        t2.commit();
        
        System.out.println("t1.size()= " + tdsg.size(t1));
        t1.commit();
        
        // Output
        // ------
        // t1.size()= 0
        // Adding statement in t2
        // t2.size()= 1
        // t1.size()= 0
    }
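
The snapshot behaviour shown in the output of [1] can be reproduced with a 
small in-memory model.  What follows is only a sketch under stated assumptions, 
not the attached code: SketchStore and SketchTransaction are hypothetical 
names, quads are plain strings, the isolation level argument is dropped, at 
most one read-write transaction may be open at a time, and there is no WAL.  
It uses copy-on-write as a simple stand-in for MVCC.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a transaction object; not the attached Transaction.java. */
final class SketchTransaction {
    private final SketchStore store;
    final boolean readOnly;
    Set<String> view;       // committed snapshot seen at begin(); private copy after first write
    boolean dirty;
    boolean active = true;

    SketchTransaction(SketchStore store, boolean readOnly, Set<String> snapshot) {
        this.store = store;
        this.readOnly = readOnly;
        this.view = snapshot;
    }

    void commit() { store.commit(this); }
}

/** Hypothetical copy-on-write store: readers never block and never see later commits. */
final class SketchStore {
    private Set<String> committed = Collections.emptySet();
    private boolean writerOpen;   // assumption: single writer, multiple readers

    synchronized SketchTransaction begin(boolean readOnly) {
        if (!readOnly) {
            if (writerOpen) throw new IllegalStateException("a writer is already open");
            writerOpen = true;
        }
        return new SketchTransaction(this, readOnly, committed);
    }

    int size(SketchTransaction tx) {
        requireActive(tx);
        return tx.view.size();
    }

    void add(String quad, SketchTransaction tx) {
        requireActive(tx);
        if (tx.readOnly) throw new IllegalStateException("read-only transaction");
        if (!tx.dirty) {                       // first write: take a private copy
            tx.view = new HashSet<>(tx.view);
            tx.dirty = true;
        }
        tx.view.add(quad);
    }

    synchronized void commit(SketchTransaction tx) {
        requireActive(tx);
        if (tx.dirty) committed = Collections.unmodifiableSet(tx.view);  // publish new version
        if (!tx.readOnly) writerOpen = false;
        tx.active = false;
    }

    private static void requireActive(SketchTransaction tx) {
        if (!tx.active) throw new IllegalStateException("transaction is finished");
    }
}

final class SketchDemo {
    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        SketchTransaction t1 = store.begin(true);    // read-only
        SketchTransaction t2 = store.begin(false);   // read-write
        System.out.println("t1.size()= " + store.size(t1));
        store.add("quad", t2);
        System.out.println("t2.size()= " + store.size(t2));
        t2.commit();
        System.out.println("t1.size()= " + store.size(t1));
        t1.commit();
    }
}
```

Because t1 keeps the reference to the set that was committed when it began, 
the commit of t2 does not change what t1 sees; a transaction begun after 
t2.commit() would see the added quad.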

[2] 
   public void MultipleTransactions2()
    {
        // Error handling omitted for clarity
        
        DatasetGraph dsg = ...
        Quad quadToAdd = ...
        
        TransactionHandler th = dsg.getTransactionHandler();
        
        // A read-only transaction
        th.begin(IsolationLevel.SERIALIZABLE, AccessMode.READ_ONLY);
        TransactionHandle t1 = th.suspend();
        
        // A read-write transaction
        th.begin(IsolationLevel.SERIALIZABLE, AccessMode.READ_WRITE);
        TransactionHandle t2 = th.suspend();
        
        th.resume(t1);
        System.out.println("t1.size()= " + dsg.size());
        
        t1 = th.suspend();
        th.resume(t2);
        System.out.println("Adding statement in t2");
        dsg.add(quadToAdd);
        System.out.println("t2.size()= " + dsg.size());
        th.commit();
        
        th.resume(t1);
        System.out.println("t1.size()= " + dsg.size());
        th.commit();
        
        // Output
        // ------
        // t1.size()= 0
        // Adding statement in t2
        // t2.size()= 1
        // t1.size()= 0
    }
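
One way the thread-bound style in [2] could be modelled is with a ThreadLocal 
holding the current transaction, where suspend() detaches it and resume() 
reattaches it.  This is a sketch under assumptions, not the attached 
TransactionHandler.java: ThreadBoundStore and its nested Handle are 
hypothetical names, quads are plain strings, the isolation level and access 
mode arguments are dropped, and only one writing transaction is assumed to 
exist at a time.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical store whose current transaction is bound to the calling thread. */
final class ThreadBoundStore {
    /** Opaque token for a suspended transaction (stand-in for a transaction handle). */
    static final class Handle {
        private Set<String> view;   // snapshot taken at begin(); private copy after first write
        private boolean dirty;
        private Handle(Set<String> view) { this.view = view; }
    }

    private Set<String> committed = Collections.emptySet();
    private final ThreadLocal<Handle> current = new ThreadLocal<>();

    void begin() {
        requireNone();
        synchronized (this) {
            current.set(new Handle(committed));
        }
    }

    /** Detaches the active transaction from this thread and returns it. */
    Handle suspend() {
        Handle h = requireCurrent();
        current.remove();
        return h;
    }

    /** Reattaches a previously suspended transaction to this thread. */
    void resume(Handle h) {
        requireNone();
        current.set(h);
    }

    int size() {
        return requireCurrent().view.size();
    }

    void add(String quad) {
        Handle h = requireCurrent();
        if (!h.dirty) {                      // first write: take a private copy
            h.view = new HashSet<>(h.view);
            h.dirty = true;
        }
        h.view.add(quad);
    }

    void commit() {
        Handle h = suspend();                // the thread has no transaction afterwards
        if (h.dirty) {
            synchronized (this) {
                committed = Collections.unmodifiableSet(h.view);
            }
        }
    }

    private Handle requireCurrent() {
        Handle h = current.get();
        if (h == null) throw new IllegalStateException("no active transaction on this thread");
        return h;
    }

    private void requireNone() {
        if (current.get() != null) {
            throw new IllegalStateException("a transaction is already active on this thread");
        }
    }
}

final class ThreadBoundDemo {
    public static void main(String[] args) {
        ThreadBoundStore dsg = new ThreadBoundStore();
        dsg.begin();
        ThreadBoundStore.Handle t1 = dsg.suspend();
        dsg.begin();
        ThreadBoundStore.Handle t2 = dsg.suspend();
        dsg.resume(t1);
        System.out.println("t1.size()= " + dsg.size());
        t1 = dsg.suspend();
        dsg.resume(t2);
        dsg.add("quad");
        System.out.println("t2.size()= " + dsg.size());
        dsg.commit();
        dsg.resume(t1);
        System.out.println("t1.size()= " + dsg.size());
        dsg.commit();
    }
}
```

The data calls take no transaction argument, which is what lets existing 
single-transaction callers stay unchanged, but every switch between 
transactions costs a suspend()/resume() pair.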


 

      was (Author: sallen):
    I have been working on adding transactions to Parliament (using a WAL for 
durability but with MVCC to provide nonblocking readers).  I've been thinking 
about the interface between ARQ and graph stores and have attached some code 
and interfaces I have come up with.

I see two main ways to modify the DatasetGraph interface to provide 
transactions.

1)  Add explicit transaction support to the interface (see 
TransactionalDataGraph.java).  This means breaking interface changes or a 
second codepath to deal with these new DataGraphs.  But it has the benefit of 
cleaner syntax if a single thread is using more than one transaction (some uses 
for this capability would be: a) eliminate memory/disk buffers in the SPARQL 
Update code by obtaining sequential transactions for stores that supported 
snapshot isolation; b) simplify transaction handling if queries were changed 
not to run on a single thread (either parallel execution or NIO-style 
asynchronous execution).  See [1] for sample usage with multiple transactions.

2)  Tie transaction information to the current thread (see 
TransactionHandler.java).  This works well with the current ARQ query execution 
but becomes more unwieldy when working with multiple transactions on a single 
thread.  See [2] for example usage.



[1] 
   public void MultipleTransactions1()
    {
        TransactionalDatasetGraph tdsg = ...
        Quad quadToAdd = ...
        
        // Omitted error handling for clarity

        // A read-only transaction
        Transaction t1 = tdsg.getTransactionManager().begin(
                IsolationLevel.READ_UNCOMMITTED, AccessMode.READ_ONLY);
        
        // A read-write transaction
        Transaction t2 = tdsg.getTransactionManager().begin(
                IsolationLevel.SERIALIZABLE, AccessMode.READ_WRITE);
        
        System.out.println("t1.size()= " + tdsg.size(t1));
        
        System.out.println("Adding statement in t2");
        tdsg.add(quadToAdd, t2);
        System.out.println("t2.size()= " + tdsg.size(t1));
        t2.commit();
        
        System.out.println("t1.size()= " + tdsg.size(t1));
        t1.commit();
        
        // Output
        // ------
        // t1.size()= 0
        // Adding statement in t2
        // t2.size()= 1
        // t1.size()= 0
    }

[2] 
   public void MultipleTransactions2()
    {
        // Error handling omitted for clarity
        
        DatasetGraph dsg = ...
        Quad quadToAdd = ...
        
        TransactionHandler th = dsg.getTransactionHandler();
        
        // A read-only transaction
        th.begin(IsolationLevel.READ_UNCOMMITTED, AccessMode.READ_ONLY);
        TransactionHandle t1 = th.suspend();
        
        // A read-write transaction
        th.begin(IsolationLevel.SERIALIZABLE, AccessMode.READ_WRITE);
        TransactionHandle t2 = th.suspend();
        
        th.resume(t1);
        System.out.println("t1.size()= " + dsg.size());
        
        t1 = th.suspend();
        th.resume(t2);
        System.out.println("Adding statement in t2");
        dsg.add(quadToAdd);
        System.out.println("t2.size()= " + dsg.size());
        th.commit();
        
        th.resume(t1);
        System.out.println("t1.size()= " + dsg.size());
        th.commit();
        
        // Output
        // ------
        // t1.size()= 0
        // Adding statement in t2
        // t2.size()= 1
        // t1.size()= 0
    }


 
  
> Different policy for concurrency access in TDB supporting a single writer and 
> multiple readers
> ----------------------------------------------------------------------------------------------
>
>                 Key: JENA-41
>                 URL: https://issues.apache.org/jira/browse/JENA-41
>             Project: Jena
>          Issue Type: New Feature
>          Components: Fuseki, TDB
>            Reporter: Paolo Castagna
>         Attachments: Transaction.java, TransactionHandle.java, 
> TransactionHandler.java, TransactionManager.java, 
> TransactionManagerBase.java, TransactionalDatasetGraph.java
>
>
> As a follow up to a discussion about "Concurrent updates in TDB" [1] on the 
> jena-users mailing list, I am creating this as a new feature request.
> Currently TDB requires developers to use a Multiple Reader or Single Writer 
> (MRSW) locking policy for concurrent access [2]. Not doing so can cause 
> data corruption.
> MRSW is in effect MR xor SW (i.e. while a writer holds a lock, no readers 
> are allowed and, similarly, while a reader holds a lock, no writes are 
> possible).
> This works fine in most situations, but there can be problems in the 
> presence of long writes or long reads.
> It has been suggested that a "journaled file access" could be used to solve 
> the issue regarding a long write blocking reads.
>  [1] http://markmail.org/message/jnqm6pn32df4wgte
>  [2] http://openjena.org/wiki/TDB/JavaAPI#Concurrency

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
