RE: SDB to TDB transition

Lebling, David (US SSA) Tue, 15 Apr 2014 08:20:13 -0700

Rob,

Thanks, that is very helpful. I have successfully installed and run Fuseki and 
populated a simple in-memory database via the Fuseki control center page. I 
have also sketched out the necessary modifications to my existing 
implementations, to the point where the code almost compiles.

I have a couple of questions, though:

1. The DatasetAccessor you get from the factory seems to have methods for get, 
put and delete. Is there any reason not to use them instead of your 
recommendation of UpdateExecutionFactory.createRemote()? The way my services 
are used, they "never" (well, hardly ever) do anything but replace whole named 
graphs. Would it be more efficient to do things piecemeal, which I seem to 
recall is also possible?

2. Given a DatasetAccessor is there any way to list the named graphs? That's a 
method available through Dataset.

3. Is the Model begin/commit/abort transaction pattern needed when dealing with 
these classes? It's necessary with the old SDB version. If it's needed, what 
Model is the one that is controlled? DatasetAccessor.putModel() seems to take 
an in-memory model and stuff it in the TDB store, so there's nothing available 
to do the transaction on. What's the pattern here?

4. What is the best way to "reset" a TDB database? (In SDB there was a 
truncate() method that did that.)

5. I haven't really gotten into how Assemblers and such are set up. It looks 
very complicated. I will probably have more questions when I get that far.

Thanks,

Dave

-----Original Message-----
From: Rob Vesse [mailto:[email protected]] 
Sent: Thursday, April 10, 2014 6:56 PM
To: [email protected]
Subject: Re: SDB to TDB transition

Hi David

So the first thing to point out, which may make things tricky for you, is that 
TDB can only be accessed from a single JVM at a time.  This is because it uses 
memory mapped files, journaling and caching, so if you try to use it from 
multiple JVMs you are almost guaranteed to corrupt your data.

However the workaround for this is to introduce Fuseki into your architecture 
as your database server.  Fuseki can serve up TDB datasets and clients then 
access them over HTTP.

If you use the standard APIs for remote query 
(QueryExecutionFactory.sparqlService()), remote update
(UpdateExecutionFactory.createRemote()) and remote graph access
(DatasetAccessorFactory.createHTTP()) then your application can actually be 
refactored to be agnostic of the backend and you can always switch out
Fuseki+TDB for another SPARQL compliant server if necessary.
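
As a rough sketch of what the remote graph access route looks like on the 
wire: DatasetAccessorFactory.createHTTP() talks to a SPARQL 1.1 Graph Store 
Protocol endpoint, where a named graph is addressed through a "graph" query 
parameter. The helper below uses only the JDK; the class name GraphStoreUrls 
and the endpoint http://localhost:3030/ds/data in the usage note are 
assumptions for illustration, not something taken from this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not part of Jena): builds Graph Store Protocol URLs.
final class GraphStoreUrls {
    private GraphStoreUrls() {}

    /** URL addressing one named graph: endpoint?graph=percent-encoded-URI. */
    static String namedGraph(String endpoint, String graphUri) {
        try {
            return endpoint + "?graph=" + URLEncoder.encode(graphUri, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every JVM is required to support.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    /** URL addressing the default graph of the dataset. */
    static String defaultGraph(String endpoint) {
        return endpoint + "?default";
    }
}
```

For example, GraphStoreUrls.namedGraph("http://localhost:3030/ds/data", 
"http://example.org/g1") addresses the graph http://example.org/g1. Under the 
protocol, a GET on that URL fetches the graph, a PUT replaces it wholesale, a 
POST merges into it and a DELETE removes it.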

With Fuseki+TDB, querying the union graph can be done either at the server 
configuration level or by specifying a magic graph URI in your queries, 
<urn:x-arq:UnionGraph>. Bear in mind that continuing to use this feature will 
make it harder to migrate off Fuseki+TDB should you ever need to, since it 
goes beyond standard SPARQL.
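
To make the magic graph URI concrete, here is a minimal sketch that wraps a 
basic graph pattern so it is matched against the union graph. The class and 
method names are hypothetical; only the URI itself comes from the text above. 
The resulting string is ordinary SPARQL text of the kind you would hand to 
QueryExecutionFactory.sparqlService().

```java
// Hypothetical helper (not part of Jena): builds a query over the union graph.
final class UnionGraphQuery {
    /** The TDB-specific graph name that stands for the union of all named graphs. */
    static final String UNION_GRAPH = "urn:x-arq:UnionGraph";

    private UnionGraphQuery() {}

    /** Wraps the given graph pattern in a GRAPH clause naming the union graph. */
    static String selectOverUnion(String pattern) {
        return "SELECT * WHERE { GRAPH <" + UNION_GRAPH + "> { " + pattern + " } }";
    }
}
```

Keeping the URI in a single constant like this also confines the non-standard 
part to one place, which should ease a later move to another server.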

I don't see why you can't keep your interaction roughly the same as you have it 
now; you would merely need to change the underlying implementation. Of course, 
if your operations are mostly coarse-grained, i.e. in terms of graphs, then the 
DatasetAccessor API may actually cover much of what you need, which would make 
your implementation fairly trivial.

Hope this is enough to get you started, please feel free to ask further 
questions or for clarifications as you explore this,

Cheers,

Rob

On 10/04/2014 12:09, "Lebling, David (US SSA)"
<[email protected]> wrote:

>I am looking at finally biting the bullet and transitioning from SDB to 
>TDB. The first step is to come up with a level-of-effort estimate to 
>see if this fits in our budget.
>
>We are using Jena 2.11.0 and SDB 1.4.0. We have a set of five web 
>services (which can be in separate JVMs) that use SDB as an OWL storage 
>device. The items stored are named graphs. These are read, modified in 
>memory (sometimes through inference, sometimes through pure Java code 
>adding and removing and modifying statements), and written back out. We 
>also use SPARQL queries on the union graph to find graphs of interest.
>Although currently there are a fairly small number of these named 
>graphs, we want to be able to expand the system to hold a much larger 
>number. One of the stumbling blocks with SDB was a bug in its multi-JVM 
>concurrency code that wasn't fixed due to lack of SDB support.
>
>All interactions with the SDB database are through a single class which 
>implements an interface with read, write, delete, find, etc. on named 
>graphs and open and close on the database itself.
>
>Any advice on how to go about architecting and implementing a TDB 
>version of the above would be appreciated. More details can be supplied 
>if needed, of course.
>
>Thanks,
>
>Dave Lebling
>
