Re: Jena TDB and Jena Inference cooperation underlying mechanics

Dave Reynolds Wed, 11 Dec 2013 04:30:10 -0800

On 11/12/13 10:23, Daniel Maatari Okouya wrote:

Many thanks for taking the time to provide such a precise answer.


Meanwhile, I’m afraid I need some clarification.


1- Could you explain the difference between 1 and 2? To me the solutions are exactly the 
same. With my current knowledge of Jena, I don’t see two different pieces of code coming out of 
"Construct an InfModel over a TDB" and "Load the TDB model into memory". In 
both cases I would go with dataset.getNamedModel or getDefaultModel, which to me would 
result in an in-memory model that would then be a parameter of createInfModel…

No, a call like getNamedModel doesn't load the model into memory; it gives you an interface onto the TDB store. When you query that model the queries are sent to TDB.


To create an in memory copy you would do something like:

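   // tdbModel is the TDB-backed Model, e.g. from dataset.getNamedModel(...)
   Model memCopy = ModelFactory.createDefaultModel();
   // add(Model) copies every statement from the TDB view into memory
   memCopy.add( tdbModel );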

2- In (3) I’m not familiar with the term "inference closure". My guess would be 
that you mean the ontology schema… Is there another term for that? Could you please 
clarify in simpler terms?


Maybe a trivial example would help:

Suppose you have an ontology, in a TDB model or a file, which states:

     foaf:knows rdfs:domain foaf:Person; rdfs:range foaf:Person .

This is a small fragment of the FOAF vocabulary specification [1].

Now suppose you have some instance data, again a TDB model or file, which only states:


     :dave foaf:knows :bob .

Then a query to list the types of :dave and :bob on that instance data will by itself return nothing.

However, if you construct an RDFS inference model which combines the ontology and the instance data with a knowledge of RDFS semantics, then a query to that for type statements would yield, among other things:


     :dave rdf:type foaf:Person .
     :bob rdf:type foaf:Person .

These plus the ontology plus the instance data are the inference closure. I.e. the closure is what you get by adding back in all the new statements you can derive as a result of inference [2].

The inference engine has had to do work to generate those additional inferred statements. All the internal reasoner state for doing so is in memory. So if you kill your application and ask the same query again the inference engine has to do that work over again. For more complex examples that work can be expensive, particularly over a persistent store.

So in option 3 you take those inferred triples and store them, along with the original triples. If you list the rdf:type statements on that model now you see the same answers you would have seen if you queried via an inference engine but didn't have to do inference to get them (you did it ahead of time).
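
To make the closure idea concrete, here is a minimal sketch in plain Java. It deliberately does not use Jena: a triple is just a list of three strings, and the names RdfsClosureSketch, triple and closure are made up for this illustration. It applies only the rdfs:domain and rdfs:range rules, repeating until nothing new appears, which is the "do it ahead of time" step of option 3 in miniature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: no Jena classes are used here.
class RdfsClosureSketch {
    static final String TYPE = "rdf:type";
    static final String DOMAIN = "rdfs:domain";
    static final String RANGE = "rdfs:range";

    // Hypothetical helper: a triple is a list of [subject, predicate, object].
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // Returns schema + data + everything derivable from the
    // rdfs:domain and rdfs:range rules (a tiny subset of RDFS).
    static Set<List<String>> closure(List<List<String>> schema,
                                     List<List<String>> data) {
        Set<List<String>> result = new LinkedHashSet<>(schema);
        result.addAll(data);
        boolean changed = true;
        while (changed) {                      // stop when a pass adds nothing
            changed = false;
            for (List<String> t : new ArrayList<>(result)) {
                for (List<String> rule : schema) {
                    // a rule applies when its subject is the triple's predicate
                    if (!rule.get(0).equals(t.get(1))) continue;
                    if (rule.get(1).equals(DOMAIN)) {
                        changed |= result.add(triple(t.get(0), TYPE, rule.get(2)));
                    } else if (rule.get(1).equals(RANGE)) {
                        changed |= result.add(triple(t.get(2), TYPE, rule.get(2)));
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> schema = Arrays.asList(
            triple("foaf:knows", DOMAIN, "foaf:Person"),
            triple("foaf:knows", RANGE, "foaf:Person"));
        List<List<String>> data = Arrays.asList(
            triple(":dave", "foaf:knows", ":bob"));

        // Computed once, ahead of time; afterwards it is just a plain set.
        Set<List<String>> stored = closure(schema, data);
        for (List<String> t : stored) {
            if (t.get(1).equals(TYPE)) System.out.println(String.join(" ", t));
        }
    }
}
```

With real data you would let a Jena reasoner produce the inferred statements and write them to a TDB model rather than a Java Set. The sketch only shows why the stored result can then be queried without any reasoner.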


Dave

[1] http://xmlns.com/foaf/spec/#term_knows

[2] Sometimes that complete closure can be infinite, so you can't infer and store everything. But in practice the fragments implemented by the rule engine are generally finite.

Moreover, I’m not sure I understand the full logic here. Why would I store the 
result of some query and then query it again? Is this the way to store all 
the inferred triples back in the store? Do you mean that you take every triple 
that is returned and assert it back in the InfModel or the base model? By 
the way, does that mean that in general, in Jena, there is no way to generate a 
model that is the combination of the asserted and all the inferred triples (or a 
selected set of inferred triples, e.g. all class axioms)?

As you can see I’m quite confused ;) . I would much appreciate it if you could 
clarify and detail (3) a bit further.

Also, if you have good pointers for me to read, please do not hesitate ;)


Many thanks,

-M-

--
Daniel Maatari Okouya
Sent with Airmail
From: Dave Reynolds
Reply: [email protected]
Date: December 11, 2013 at 9:47:51 AM
To: [email protected]
Subject: Re: Jena TDB and Jena Inference cooperation underlying mechanics

There's no specific integration of TDB and inference.

The rule-based inference engines themselves run in memory but can
operate over any model, however it is stored. So there are several options.

1. Construct an InfModel over a TDB based model. When you query the
InfModel you will see both the TDB model and any inferences.

2. Load the TDB model into memory then construct an InfModel over that.
Then query the InfModel.

3. Prepare an inference closure and store that. Load the data, e.g. into
memory, construct the InfModel, query the InfModel for the patterns you
are interested in (which might be every triple) store all those results
in a TDB model. Then at run time open the closure TDB as a plain model
and query it.

4. As 3 but use TDB's high performance RDFS-subset closure.

#1 is easy to do but the inference results are stored in memory so it
doesn't enable you to scale to models that wouldn't fit in memory anyway.

#2 can be faster. Inference involves a lot of queries, and querying an
in-memory model is naturally faster than querying TDB. Whether the cost of
the initial load outweighs the speed of inference depends on how caching
works out, your data and your queries.

#3 gives you good query performance at the cost of an expensive slow
cycle to prepare the data. It's not suited to mutating data and requires
the preparation phase to be run on a machine with enough memory to
compute the closure, or closure subset, that you want.

#4 can cope with much larger data sets than #3 at the expense of a more
limited range of inference.

Dave



On 10/12/13 21:58, Daniel Maatari Okouya wrote:

Dear All,

If they can work together, I would like to understand a bit better the 
underlying mechanics of making Jena TDB and the Jena Inference infrastructure 
work together.


Hence, I have the following question:


1- Can one make Jena TDB act like Stardog, in the sense that a query 
would include inferred triples as well?

If that is possible, can someone explain to me the underlying mechanics that it 
would imply, given that I understood the following from the documentation?

Inferred triples are situated in an InfGraph/InfModel (wrapper), which is 
obtained by binding a reasoner to a base Graph/Model (understood as the one 
containing the asserted triples). One can query that InfGraph using the query 
engine (however, this happens on in-memory models).

My guess here is that if a model is in TDB, then for our query to include 
the inferred triples, the query to the model would actually have to be 
run against the InfModel/InfGraph. However, I don’t know if that is 
possible and how exactly TDB would do that. Indeed, that would mean choosing a 
reasoner, creating the InfModel, and exposing it in lieu of the base model.

Can someone explain these mechanics a bit, that is, how it works with Jena TDB 
if one wants some inference? For instance, if the actual model is RDFS or OWL, 
how does one query it and obtain answers that include the inferred knowledge? Many 
thanks

Best,

-M-



--
Daniel Maatari Okouya
Sent with Airmail
