On 05/04/13 16:01, David Jordan wrote:
I am testing under several scenarios. For some static cases, I do precompute
the inferences and store them. For this case, I have one open question. If
one wants to later combine multiple ontologies and data, each with their own
precomputed inferences, is there ever an issue where the original non-inferenced
OWL specifications are needed, because of their interactions with the inferencing
of the other ontologies being combined? Will one lose some triples that would
have been inferred if one had run inferencing over the original OWL code?
If I follow the question correctly then no, that's safe. The OWL and
RDFS semantics are defined to be monotonic: adding new statements can
only ever make further statements deducible; it can never make
previously inferred statements invalid. The deductive closure of a
model is always a superset of the original model, so no information
is lost.
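To make that concrete, here is a rough sketch of the combination case (the
file names are made up, and the package names assume the Jena 2.x
com.hp.hpl.jena.* layout):

import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

// closureA holds the previously materialised inferences from ontology A
Model closureA = ModelFactory.createDefaultModel();
closureA.read("file:ontologyA-closure.ttl", "TTL");   // hypothetical file

// ontologyB is the second ontology being combined later
Model ontologyB = ModelFactory.createDefaultModel();
ontologyB.read("file:ontologyB.ttl", "TTL");          // hypothetical file

// union the two and run the same reasoner over the combination
Model combined = ModelFactory.createDefaultModel();
combined.add(closureA);
combined.add(ontologyB);

OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, combined);

// monotonicity: everything in the old closure is still entailed
System.out.println("closure preserved: " + omodel.containsAll(closureA));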
Given
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
You are saying that with the following code:
Model memmodel = ModelFactory.createDefaultModel();
memmodel.add( dbmodel );
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
This will cause my "database model" to be completely pulled into memory and
placed in the memmodel, so that the OntModel can run much more efficiently?
Yes. The cost of that "pulling into memory" step may be quite high but
once it's there all the little queries made by the reasoner will perform
better. In most cases you should come out ahead.
Whereas with the following omodel, it will always go to the database?
Model dbmodel = SDBFactory.connectNamedModel(store, name);
OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
Depends what you mean by "always".
The point is that the reasoners have to do a *lot* of queries over the
data to do their job. In the above setup, each of those queries
goes to the database. This can be very, very slow. By pulling the data
into memory you take the hit once (with a simple, efficient query).
Some of the results of that reasoning are then stored in internal state
in the reasoner, so future queries to the omodel may be partially
answered by that internal state and may not trigger further database
queries.
Exactly what "partial" means in this case is complex. The rule reasoners
employ a mix of forward and backward chaining. The forward parts will
all run to completion and store their results in memory. The backward
reasoning is only triggered by the particular query goal and may invoke
further queries to the underlying model (and thus the database).
However, some parts of the backward queries are "tabled" (to stop
infinite loops as much as for performance reasons) and those tables
are in memory as well. So over time more and more of a given query can
be answered out of the in-memory state but that never reaches 100% of
all queries.
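In practice that looks something like the sketch below (reusing the memmodel
and spec variables from above; Individual and ExtendedIterator come from
com.hp.hpl.jena.ontology and com.hp.hpl.jena.util.iterator). Calling prepare()
runs the forward rules eagerly; later calls are then answered from the cached
forward results plus whatever backward goals get tabled along the way:

OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
omodel.prepare();   // run the forward rules to completion; results held in memory

// backward-chained goals fired by this call may still consult the underlying
// model, but tabled sub-goals are answered from memory on later calls
ExtendedIterator<Individual> it = omodel.listIndividuals();
while (it.hasNext()) {
    System.out.println(it.next());
}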
Dave
-----Original Message-----
From: Dave Reynolds [mailto:[email protected]]
Sent: Friday, April 05, 2013 10:39 AM
To: [email protected]
Subject: Re: Persisting OWL in Jena
On 05/04/13 15:09, David Jordan wrote:
Dave,
I have been getting "less than stellar" performance in my benchmarking. I would
just like to be sure that the way I am using Jena IS performing inference over in-memory
models. I have stored Models in the database. When I access them and create an OntModel,
I do it in the following manner:
Store store; // assume this is initialized
Model model = SDBFactory.connectNamedModel(store, name);
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, model);
omodel.prepare();
Does this result in an in-memory model as you recommend?
No, that's an inference model running over the database.
If not, could you show the necessary code.
Depends on what you are trying to do. Whether your data is static. What
inferences you want (all or just some interesting ones). Whether the source
data is large. Whether it is available as a file or only as a database model. Etc.
In the simple case your data is essentially fixed and you can precompute and
store the inferences.
Model memmodel = ModelFactory.createDefaultModel();
// read data into memmodel, or use FileManager.get().loadModel instead
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
// dbmodel is the SDB-backed model, e.g. SDBFactory.connectNamedModel(store, name)
dbmodel.add( omodel );
If there are only some inferences you need then you might be more selective in what the
final "add" phase puts into the database model.
Then you access that data in future uses via a non-inference model:
Model dbmodel = SDBFactory.connectNamedModel(store, name);
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);
OntModel omodel = ModelFactory.createOntologyModel(spec, dbmodel);
If your data is already in the database and you want to dynamically compute the
inferences over its current state then do something more like:
Model memmodel = ModelFactory.createDefaultModel();
memmodel.add( dbmodel );
OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
OntModel omodel = ModelFactory.createOntologyModel(spec, memmodel);
// use omodel
Any updates to the data will need to be reflected into the omodel. If those
updates are done in the same VM that might be OK; if they are done by other
database clients then it's problematic.
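If the updates do happen in the same VM, the sketch below is roughly what
"reflected into the omodel" means: write each change to both the database
model and the in-memory inference model (the URIs are made up; Statement and
ResourceFactory are from com.hp.hpl.jena.rdf.model):

Statement update = ResourceFactory.createStatement(
        ResourceFactory.createResource("http://example.org/thing1"),
        RDF.type,
        ResourceFactory.createResource("http://example.org/SomeClass"));

dbmodel.add(update);   // persist the change
omodel.add(update);    // make it visible to the reasoner's in-memory state too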
Fundamentally databases and Jena's rule-based inference do not mix well.
Depending on what you need from inference you may be able to achieve the same
effects by query rewriting, or query rewriting plus some simpler pre-computed
closure. In the worst case you need a full deductive database.
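As an illustration of the query-rewriting idea (a sketch only; the
example.org URI is invented and dbmodel is the plain SDB-backed model, with
the query classes coming from com.hp.hpl.jena.query): a class-membership
query can often be rewritten with a SPARQL 1.1 property path so that a
non-inference model answers it directly:

String q =
    "PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
    "SELECT ?x WHERE { ?x rdf:type/rdfs:subClassOf* <http://example.org/Person> }";

QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(q), dbmodel);
try {
    ResultSet rs = qe.execSelect();
    while (rs.hasNext()) {
        System.out.println(rs.next().get("x"));
    }
} finally {
    qe.close();
}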
For minimal RDFS inference there is some support in the TDB loader for
computing that more efficiently at load time than the full in-memory rule
systems do.
Dave