fast model copy

Andrew Hunter Wed, 24 Jan 2018 21:28:43 -0800

Hi,

I have an OntModel using simple RDF reasoning, but I have found that
preparing the model and then copying it into a default, non-reasoning model
is much faster for querying. My immediate question: Is there an efficient
way of copying an OntModel into a default, non-reasoning model without
copying all statements?


Some more details on my motivation, with a sketch of the code found below:
My OntModel is held in memory and is originally constructed from a set of
OWL files. I am matching a stream of data items, each with a small set of
RDF triples attached, against a few long standing queries. Conceptually, I
wish to insert a data item into the OntModel and then execute the queries
to see if that one item matches, then remove it from the model and wait for
the next item. Since there is only one item of interest at a time, I also
have the option of extracting a sub-model rooted at a resource identifying
the item, and using this much smaller model to query.

What I have found is that by far the fastest way to complete query
execution is to perform all the reasoning on the ontology and data in the
OntModel, and then to make a copy of all its statements into a default
model, and query over the default model. Since I have several queries, I
actually extract a relevant sub-model using reachableClosure from
ResourceUtils and use this "sub" model. Both the querying and the closure
extraction are dramatically faster on the default-model copy than on the
OntModel (as I'm sure you all know).

However, making the copy itself now takes over 95% of the total
item-matching time. Hence my question: is there a faster way to make a copy,
or to reference the statements in an OntModel as if they were part of a
default model?
Alternatively, is there a way of "turning off" the inference functions when
querying (I see no API for this)? My limited understanding of the InfModel
and InfGraph indicates the statements are actually held in multiple graphs
(e.g., the base graph and the deductions graph) so it probably can't be as
easy as "get a reference to the list of statements/triples".
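
The base/deductions split mentioned above can be inspected through accessors
on Jena's InfModel interface, which OntModel extends. The following is only a
minimal sketch, assuming Jena 3.x package names; the class InferenceViews and
the method rawPlusDeductions are hypothetical names used for illustration. It
is not known to be a full substitute for the copy: statements derived by
backward-chaining rules at query time may not appear in the deductions model,
and for an OntModel the raw model may not include imported sub-models.

```java
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

// Hypothetical helper class, for illustration only.
public class InferenceViews {

    /**
     * Builds a non-reasoning view over the asserted statements plus any
     * forward-chained deductions, without copying either set.
     * Assumption: forward deductions are enough for the queries at hand.
     */
    static Model rawPlusDeductions(OntModel fullModel) {
        fullModel.prepare();                                // run pending inference
        Model raw = fullModel.getRawModel();                // asserted statements only
        Model deductions = fullModel.getDeductionsModel();  // may be null
        if (deductions == null) {
            return raw;
        }
        // A dynamic union: reads see both models, nothing is copied.
        return ModelFactory.createUnion(raw, deductions);
    }

    public static void main(String[] args) {
        OntModel fullModel =
            ModelFactory.createOntologyModel(OntModelSpec.OWL_LITE_MEM_RDFS_INF);
        Model view = rawPlusDeductions(fullModel);
        System.out.println("statements visible in view: " + view.size());
    }
}
```

Whether querying such a view returns the same answers as querying the full
copy would need to be checked against the actual ontology and queries.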

I also tried implementing ModelChangedListener and GraphListener to extract
the statements being added by the reasoner and add those to the default
model, but these seem to be called back only when new data is added from
outside. The idea was that I could add the large set of ontology-based
inferences once and for all at startup and then apply only the small deltas
of new inferences from the incoming data, but these small deltas were not
reported to me.
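
For concreteness, the listener approach described above can be sketched
roughly as follows. This is a minimal illustration assuming Jena 3.x package
names; AdditionRecorder and attachTo are hypothetical names, and the sketch
attaches to a plain default model only to show the registration mechanics.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.jena.rdf.listeners.StatementListener;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.vocabulary.RDFS;

// Hypothetical listener that records every statement reported as added.
public class AdditionRecorder extends StatementListener {

    private final List<Statement> added = new ArrayList<>();

    @Override
    public void addedStatement(Statement s) {
        // Called once per statement the model reports as added.
        added.add(s);
    }

    List<Statement> getAdded() {
        return added;
    }

    /** Registers a new recorder on the given model and returns it. */
    static AdditionRecorder attachTo(Model model) {
        AdditionRecorder recorder = new AdditionRecorder();
        model.register(recorder);
        return recorder;
    }

    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        AdditionRecorder recorder = attachTo(model);

        // An addition made from outside the model is reported to the listener.
        Resource item = model.createResource("http://example.org/item");
        model.add(item, RDFS.label, "example item");

        System.out.println("additions seen: " + recorder.getAdded().size());
    }
}
```

Attaching the same recorder to the OntModel is the variant that reported only
the externally added statements and none of the reasoner's deductions.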

I would be most grateful for any ideas or pointers. In any case, best of
luck and thanks!

Andy


A sketch of my current code:

// Block executed once at startup
OntModel fullModel = ModelFactory.createOntologyModel(
    OntModelSpec.OWL_LITE_MEM_RDFS_INF);
// load ontology files into fullModel with something like
for (each owl file) { fullModel.add( ... ); }
// perform inference over ontology
fullModel.prepare();
// end startup block

...

// new data arrives
fullModel.add( [[model from data string]] );
// perform inference on new data + ontology
fullModel.prepare();
// make a copy into a default model
Model copyModel = ModelFactory.createDefaultModel();
copyModel.add(fullModel);
// extract relevant sub-model
Model subModel = ResourceUtils.reachableClosure(
    copyModel.getResource("some:root:data:id:uri"));

// query uses relevant default model
QueryExecution qe = QueryExecutionFactory.create(queryString, subModel);

// execSelect, process results, cleanup, wait for new data, etc.
