Rob Vesse created JENA-648:
------------------------------

             Summary: Make TDB datasets harder to corrupt
                 Key: JENA-648
                 URL: https://issues.apache.org/jira/browse/JENA-648
             Project: Apache Jena
          Issue Type: Improvement
          Components: TDB
            Reporter: Rob Vesse


This RFE comes out of discussions I had in person with Andy earlier this week.  
On the mailing lists and Q&A sites we see a steady stream of questions from 
people who have corrupted TDB databases and it would be nice if we could put in 
place features that make this harder to do.

There are two main things we should do in the long term as I see it:

# Make using TDB non-transactionally more difficult
# Put in place some mechanism to make it difficult for multiple JVMs to access 
the same TDB dataset simultaneously

Andy and I think the first could be achieved by making TDB datasets operate 
in auto-commit mode rather than non-transactional mode by default.  To allow 
this we likely need upgradeable read transactions to be supported.  As part of 
this change non-transactional mode would still be supported, but users would 
have to explicitly set some "Here be Dragons" style flag to enable it.  Users 
who aren't using transactions currently would likely just see performance 
drop, since they would suddenly be getting an auto-commit on every operation, 
but when they complain we can tell them they should be using transactions 
properly to ensure their TDB databases remain uncorrupted.
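
To make the intended behaviour concrete, here is a rough sketch of auto-commit 
by default with an explicit opt-out.  ToyStore, its methods and the 
allowNonTransactional flag are all made up for illustration and are not the 
TDB API; the real change would live inside TDB's dataset implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a transactional store; NOT the TDB API. */
final class ToyStore {
    private final List<String> committed = new ArrayList<>();
    private List<String> pending;   // non-null only while a transaction is open
    boolean allowNonTransactional;  // assumed "Here be Dragons" flag, off by default
    int commits;                    // counts commits so the overhead is visible

    void begin() {
        if (pending != null) throw new IllegalStateException("already in a transaction");
        pending = new ArrayList<>(committed);
    }

    void commit() {
        if (pending == null) throw new IllegalStateException("no transaction to commit");
        committed.clear();
        committed.addAll(pending);
        pending = null;
        commits++;
    }

    void add(String triple) {
        if (pending != null) {            // explicit transaction: just buffer the change
            pending.add(triple);
        } else if (allowNonTransactional) {
            committed.add(triple);        // unsafe path, only if the user opted in
        } else {                          // default: wrap the single operation
            begin();
            try {
                pending.add(triple);
                commit();
            } finally {
                pending = null;           // discard the buffer if the commit failed
            }
        }
    }

    List<String> snapshot() {
        return new ArrayList<>(committed);
    }
}
```

With this sketch two bare add() calls cost two commits, while the same two 
calls between begin() and commit() cost one, which is the performance drop 
described above.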

As far as the second point goes, we could likely do this the way a lot of other 
applications do, by having the code write a lock file to disk when a database 
is opened, containing the owning process's PID.  Whenever a database is opened 
the code checks for the lock file and, if it is present, compares the PID it 
contains against the current process's PID, refusing to open the database if 
they do not match.  There would likely need to be some code to cope with the 
case where the lock file gets left around and the owning PID is no longer 
alive, but that shouldn't be too complicated.
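
A minimal sketch of the lock file check, including the stale lock case, might 
look something like the following.  The class name PidLockFile and the file 
name tdb.lock are assumptions for illustration only, and ProcessHandle needs 
Java 9 or later, so older JVMs would need another way to get and check PIDs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not part of TDB. */
final class PidLockFile {
    // Assumed lock file name, chosen for illustration only.
    static final String LOCK_NAME = "tdb.lock";

    /** Claims the directory for this JVM, or throws if another live process owns it. */
    static void acquire(Path datasetDir) throws IOException {
        Path lock = datasetDir.resolve(LOCK_NAME);
        long self = ProcessHandle.current().pid();
        if (Files.exists(lock)) {
            long owner = readOwner(lock);
            boolean ownerAlive = owner > 0
                    && ProcessHandle.of(owner).map(ProcessHandle::isAlive).orElse(false);
            if (owner != self && ownerAlive) {
                throw new IllegalStateException(
                        "Dataset " + datasetDir + " is in use by process " + owner);
            }
            // Otherwise the lock is ours, unreadable, or left behind by a dead
            // process, so fall through and overwrite it.
        }
        Files.write(lock, Long.toString(self).getBytes(StandardCharsets.UTF_8));
    }

    /** Removes the lock file, but only if this JVM is the recorded owner. */
    static void release(Path datasetDir) throws IOException {
        Path lock = datasetDir.resolve(LOCK_NAME);
        if (Files.exists(lock) && readOwner(lock) == ProcessHandle.current().pid()) {
            Files.delete(lock);
        }
    }

    /** Returns the PID stored in the lock file, or -1 if it cannot be parsed. */
    private static long readOwner(Path lock) throws IOException {
        String text = new String(Files.readAllBytes(lock), StandardCharsets.UTF_8).trim();
        try {
            return Long.parseLong(text);
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

Note that the check and the write here are not atomic, so two JVMs starting at 
the same instant could both pass the check, and a recycled PID could make a 
stale lock look live.  A real implementation would need to deal with both.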

Since these may be considered substantial behavioural changes to TDB, they 
will likely go into Jena 3.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
