Re: Transactions versus threads

Bernie Greenberg Wed, 28 Mar 2012 13:04:29 -0700

Many thanks; I'll work with this for a while and see where it goes ....

On Wed, Mar 28, 2012 at 1:39 PM, Paolo Castagna <
castagna.li...@googlemail.com> wrote:


> Hi Bernie,
> I proposed an update for the current documentation, you can see a preview
> here:
> http://jena.staging.apache.org/jena/documentation/tdb/tdb_transactions.html
>
> It's just a small change, taking content from Andy's email, but I hope it
> will help users:
>
> - Dataset dataset = ...
> + Location location = ... ;
> + Dataset dataset =  TDBFactory.create(location) ;
>
> This is how TDBFactory.create(location) is implemented:
>
>    public static Dataset createDataset(Location location)
>    { return createDataset(createDatasetGraph(location)) ; }
>
> ... which calls:
>
>    public static DataSource create(DatasetGraph dataset)
>    { return (DataSource)DatasetImpl.wrap(dataset) ; }
>
> ... which in the end currently leads to this in StoreConnection:
>
>    public static synchronized StoreConnection make(Location location)
>    {
>        StoreConnection sConn = cache.get(location) ;
>        if ( sConn != null )
>            return sConn ;
>
>        DatasetGraphTDB dsg = DatasetBuilderStd.build(location) ;
>        sConn = _makeAndCache(dsg) ;
>        return sConn ;
>    }
>
> You can have a look at it yourself, starting from here:
>
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/tags/jena-tdb-0.9.0-incubating/src/main/java/com/hp/hpl/jena/tdb/TDBFactory.java
>
> By the way, the documentation is in SVN and patches for the website are more
> than welcome! ;-) Have a look here in the content/jena/ directory:
> http://svn.apache.org/repos/asf/incubator/jena/site/trunk/
>
> Hopefully, creating a new Dataset each time for each of your threads is
> the solution to your problems (and you can also measure how fast/slow
> it is to just create a new Dataset object :-)).
>
> In the meantime, thanks for your time, patience and feedback.
>
> Andy, are you reluctant to have TDBFactory.create(location) in the
> documentation and/or is there a plan (that I am not aware of) to change the
> way to create Dataset objects in TDB? If that is the case, we could still
> have TDBFactory.create(location) in the documentation, but add a NOTE/WARNING
> that this can change.
>
> Thanks,
> Paolo
>
> Bernie Greenberg wrote:
> > This is really news.  Then what is the right-size object to hold around for
> > an on-disk data store to represent an opened database, such that each call
> > on the server doesn't have to run Dataset dataset1 =
> > TDBFactory.create(location) anew, or is that so cheap that each call on the
> > server should do it to access the dataset?
> >
> > On Wed, Mar 28, 2012 at 12:31 PM, Paolo Castagna <
> > castagna.li...@googlemail.com> wrote:
> >
> >> Hi Andy,
> >> thanks for this reply, I find it very useful.
> >>
> >> As you said, our documentation is not that clear and perhaps we should
> >> improve it with content from this email.
> >>
> >> In particular, I think having in the documentation:
> >>
> >>  Dataset dataset =  ... ;
> >>
> >> is not really helping users.
> >>
> >> Need to go off-line for a bit, but I'll propose changes to the document:
> >>
> >> http://incubator.apache.org/jena/documentation/tdb/tdb_transactions.html
> >>
> >> If the problem is just in the documentation, it's much easier to fix. ;-)
> >>
> >> Thanks again,
> >> Paolo
> >>
> >> Andy Seaborne wrote:
> >>> I can see a way it might go wrong if you are using the same dataset Java
> >>> object in "dataset.begin(READ)" and "dataset.begin(WRITE)".  It will
> >>> switch the first one to the second transaction, and that will trigger
> >>> the concurrency check.
> >>>
> >>> (The documentation does not explain this, and indeed, is almost
> >>> misleading on the subject.)
> >>>
> >>> Instead, use an app idiom of one dataset per thread. The "Dataset" concept
> >>> incorporates the JDBC connection concept, so it's like connection pools.
> >>>
> >>> I'll add some checking code as well.
> >>>
> >>>
> >>> A way to use transactions that should work in the released system is to
> >>> do this on one thread:
> >>>
> >>>    Dataset dataset1 = TDBFactory.create(location) ;
> >>>    dataset1.begin(READ) ;
> >>>    ...
> >>>
> >>> and on the other thread:
> >>>
> >>>    Dataset dataset2 = TDBFactory.create(location) ;
> >>>    dataset2.begin(WRITE) ;
> >>>    ...
> >>>
> >>> i.e. different dataset objects (they are safely backed by the same
> >>> data storage).
> >>>
> >>> I may be able to come up with a cleaner solution but I'd (mildly) prefer
> >>> not to use thread local variables as it's only a partial fix.
> >>>
> >>> Datasets are quite cheap to create and TDBFactory.create(location) gets
> >>> all the caching right.
> >>>
> >>> If, for some reason, you are using in-memory TDB databases, then the use
> >>> of "named memory locations" should work:  Location.mem("X").
> >>>
> >>>     Andy
> >>
> >
>
>
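Editorial sketch of the idiom discussed in this thread: one dataset object per thread, all backed by a single cached store per location. This is a minimal model under stated assumptions, not the Jena/TDB code. SharedStore, DatasetHandle, OneHandlePerThreadDemo and the "/tmp/DB1" key are hypothetical stand-ins introduced for illustration; only the cache-or-create shape of make() follows the StoreConnection snippet quoted above. It uses plain Java collections so it runs without Jena on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the shared store (the role StoreConnection plays in the thread).
final class SharedStore {
    private static final Map<String, SharedStore> cache = new HashMap<>();
    final String location;

    private SharedStore(String location) { this.location = location; }

    // Same shape as the quoted make(): return the cached store, or build and cache it.
    static synchronized SharedStore make(String location) {
        SharedStore store = cache.get(location);
        if (store != null)
            return store;
        store = new SharedStore(location);
        cache.put(location, store);
        return store;
    }
}

// Hypothetical stand-in for a Dataset: a cheap handle onto the shared store.
final class DatasetHandle {
    final SharedStore store;
    private boolean inTransaction = false;

    private DatasetHandle(SharedStore store) { this.store = store; }

    static DatasetHandle create(String location) {
        return new DatasetHandle(SharedStore.make(location));
    }

    // Assumption of this model: transaction state is kept on the handle,
    // so a handle should not be shared between threads.
    void begin() {
        if (inTransaction) throw new IllegalStateException("already in a transaction");
        inTransaction = true;
    }

    void end() { inTransaction = false; }
}

final class OneHandlePerThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            // Each thread creates its own handle; the location key is illustrative only.
            DatasetHandle ds = DatasetHandle.create("/tmp/DB1");
            ds.begin();
            try { /* read or write here */ } finally { ds.end(); }
        };
        Thread reader = new Thread(work);
        Thread writer = new Thread(work);
        reader.start(); writer.start();
        reader.join(); writer.join();
    }
}
```

In this model, creating a handle costs one synchronized map lookup once the store is cached, which is one way to read Andy's remark that datasets are cheap to create. How cheap the real TDBFactory call is should be measured, as Paolo suggests.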
