Many thanks; I'll work with this for a while and see where it goes ....

On Wed, Mar 28, 2012 at 1:39 PM, Paolo Castagna <castagna.li...@googlemail.com> wrote:
> Hi Bernie,
> I proposed an update for the current documentation, you can see a preview
> here:
> http://jena.staging.apache.org/jena/documentation/tdb/tdb_transactions.html
>
> It's just a small change, taking content from Andy's email, but I hope it
> will help users:
>
> - Dataset dataset = ...
> + Location location = ... ;
> + Dataset dataset = TDBFactory.create(location) ;
>
> This is how TDBFactory.create(location) is implemented:
>
>   public static Dataset createDataset(Location location)
>   { return createDataset(createDatasetGraph(location)) ; }
>
> ... which calls:
>
>   public static DataSource create(DatasetGraph dataset)
>   { return (DataSource)DatasetImpl.wrap(dataset) ; }
>
> ... which in the end currently results in this in StoreConnection:
>
>   public static synchronized StoreConnection make(Location location)
>   {
>       StoreConnection sConn = cache.get(location) ;
>       if ( sConn != null )
>           return sConn ;
>
>       DatasetGraphTDB dsg = DatasetBuilderStd.build(location) ;
>       sConn = _makeAndCache(dsg) ;
>       return sConn ;
>   }
>
> You can have a look at it yourself, starting from here:
>
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/tags/jena-tdb-0.9.0-incubating/src/main/java/com/hp/hpl/jena/tdb/TDBFactory.java
>
> By the way, the documentation is in SVN and patches for the website are
> more than welcome! ;-) Have a look here in the content/jena/ directory:
> http://svn.apache.org/repos/asf/incubator/jena/site/trunk/
>
> Hopefully, creating a new Dataset each time for each of your threads is
> the solution to your problems (and you can also measure how fast/slow
> it is to just create a new Dataset object :-)).
>
> In the meantime, thanks for your time, patience and feedback.
>
> Andy, are you reluctant to have TDBFactory.create(location) in the
> documentation and/or is there a plan (I am not aware of) to change the
> way to create Dataset objects in TDB?
> If that is the case, we could still have TDBFactory.create(location) in
> the documentation, but add a NOTE/WARNING that this can change.
>
> Thanks,
> Paolo
>
> Bernie Greenberg wrote:
> > This is really news. Then what is the right-size object to hold around for
> > an on-disk data store to represent an opened database, such that each call
> > on the server doesn't have to do Dataset dataset1 =
> > TDBFactory.create(location) anew, or is that so cheap that each call on the
> > server should do it to access the dataset?
> >
> > On Wed, Mar 28, 2012 at 12:31 PM, Paolo Castagna <
> > castagna.li...@googlemail.com> wrote:
> >
> >> Hi Andy,
> >> thanks for this reply, I find it very useful.
> >>
> >> As you said, our documentation is not that clear and perhaps we should
> >> improve it with content from this email.
> >>
> >> In particular, I think having in the documentation:
> >>
> >>   Dataset dataset = ... ;
> >>
> >> is not really helping users.
> >>
> >> Need to go off-line for a bit, but I'll propose changes to the document:
> >> http://incubator.apache.org/jena/documentation/tdb/tdb_transactions.html
> >>
> >> If the problem is just in the documentation, it's much easier to fix. ;-)
> >>
> >> Thanks again,
> >> Paolo
> >>
> >> Andy Seaborne wrote:
> >>> I can see a way it might go wrong if you are using the same dataset Java
> >>> object in "dataset.begin(READ)", "dataset.begin(WRITE)". It will
> >>> switch the first one to the second transaction, and that will trigger
> >>> the concurrency check.
> >>>
> >>> (The documentation does not explain this, and indeed, is almost
> >>> misleading on the subject.)
> >>>
> >>> Instead, use an app idiom of one dataset per thread. The "Dataset" concept
> >>> incorporates the JDBC connection concept so it's like Connection pools.
> >>>
> >>> I'll add some checking code as well.
> >>>
> >>> A way to use transactions that should work in the released system is to
> >>> do this on one thread:
> >>>
> >>>   Dataset dataset1 = TDBFactory.create(location) ;
> >>>   dataset1.begin(READ) ;
> >>>   ...
> >>>
> >>> and on the other thread:
> >>>
> >>>   Dataset dataset2 = TDBFactory.create(location) ;
> >>>   dataset2.begin(WRITE) ;
> >>>   ...
> >>>
> >>> i.e. different dataset objects (they get backed by the same safe
> >>> datastorage).
> >>>
> >>> I may be able to come up with a cleaner solution but I'd (Mildly) prefer
> >>> not to use thread local variables as it's only a partial fix.
> >>>
> >>> Datasets are quite cheap to create and TDBFactory.create(location) gets
> >>> all the caching thing right.
> >>>
> >>> If, for some reason, you are using in-memory TDB databases, then the use
> >>> of "named memory locations" should work. Location.mem("X").
> >>>
> >>> Andy
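
For anyone reading this thread later, here is a rough sketch of why "one dataset object per thread" can be cheap when the factory caches the underlying store per location, as the quoted StoreConnection.make(location) appears to do. This is a toy model and not Jena code: Store, Handle, CACHE and create(String) are hypothetical stand-ins for the store connection, the Dataset object, the connection cache and the factory call, and a plain String stands in for Location.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DatasetPerThreadSketch {

    /** Hypothetical stand-in for the shared, cached store behind one location. */
    static final class Store {
        final String location;
        Store(String location) { this.location = location; }
    }

    /** Hypothetical stand-in for a lightweight per-thread handle (the "Dataset"). */
    static final class Handle {
        final Store store;
        Handle(Store store) { this.store = store; }
    }

    // One Store per location; this models the cache lookup in the quoted make(location).
    private static final Map<String, Store> CACHE = new ConcurrentHashMap<>();

    /** Returns a new Handle on every call; the Store is built once per location. */
    static Handle create(String location) {
        Store store = CACHE.computeIfAbsent(location, Store::new);
        return new Handle(store);
    }

    public static void main(String[] args) throws InterruptedException {
        Handle[] handles = new Handle[2];

        // Each thread asks the factory for its own handle on the same location.
        Thread reader = new Thread(() -> handles[0] = create("DB1"));
        Thread writer = new Thread(() -> handles[1] = create("DB1"));
        reader.start();
        writer.start();
        reader.join();
        writer.join();

        // Distinct handle objects, one per thread ...
        System.out.println(handles[0] != handles[1]);
        // ... backed by the same cached store.
        System.out.println(handles[0].store == handles[1].store);
    }
}
```

Both lines print true: the per-call cost is one small wrapper object, while the expensive part is built once and shared. Whether the real TDB code behaves the same way in every release is something to check against the TDBFactory source linked above.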