Re: Overhead for instantiating small models and reasoners?

Dave Reynolds Wed, 04 May 2016 00:47:54 -0700

Hi David,

On 04/05/16 00:49, Martin, David wrote:

Hi all,


I'm building a system that instantiates small, in-memory rule reasoners, and 
uses each reasoner very briefly to get a small set of inferences. It does this 
repeatedly and frequently. In other words, it instantiates many different 
reasoners over time, and uses each one to handle a distinct problem.

Each reasoner involves a default model of only a few dozen triples, used with a 
GenericRuleReasoner that loads 50-100 forward rules, created in a very normal 
way:

    // Load the per-problem context data (a few dozen triples).
    Model contextData = FileManager.get().loadModel(contextPath, "N3");
    // Parse the rules and build the reasoner (this part is reused across problems).
    List<Rule> rules = Rule.rulesFromURL("file:C:/test.rule");
    Reasoner reasoner = new GenericRuleReasoner(rules);
    reasoner.setParameter(ReasonerVocabulary.PROPruleMode, "forward");
    InfModel inf = ModelFactory.createInfModel(reasoner, contextData);

Typically, the reasoning will add roughly 10 triples and sometimes will delete 
a few triples. I use about 10 calls to inf.listStatements to retrieve the 
inferred triples, and then I'm done with that reasoner.

These reasoning problems are all independent of one another, so there is no 
opportunity for reuse of the contextData. However, the GenericRuleReasoner is reused.

Naturally, since we want to support many and frequent reasoning processes like 
the above, we are concerned about the overhead (speed and memory, with speed 
the more important of the two). I don't really have any idea how much 
bookkeeping and infrastructure is instantiated with the creation of a model or 
InfModel. I would appreciate comments or pointers that may shed light on this.

I've read through the various comments on performance in 
https://jena.apache.org/documentation/inference, but they don't address my 
situation. I'm concerned primarily about the overhead associated with *creating 
new models and reasoners*, since my system does that so often.

The system wasn't really designed with this kind of use case in mind so there aren't many hooks to control this.

Roughly there are three chunks of work involved: parsing the rules, building the internal rule data structure and running the rules. You want to try to save the first two.

Reusing the GenericRuleReasoner instance will mean you don't reparse the rules each time, so that's easy.

The internal data structure is actually part of the InfGraph (or rather the engine instance associated with the InfGraph). However, the reasoner implementation plays tricks by creating an internal dummy InfGraph which acts as a cache of the built engine. So I *think* just reusing the same GenericRuleReasoner instance is also enough to reuse the engine data structure.

The overhead of creating a wrapping InfGraph is small so creating a new InfGraph each time should be fine.
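
A minimal sketch of that pattern, assuming Jena 3.x package names (org.apache.jena.*) and a single toy rule in a made-up http://example.org/ namespace standing in for the real rule file. The class and method names are hypothetical. The reasoner is built once; each problem gets its own data Model and its own InfModel wrapper:

```java
import java.util.List;
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.reasoner.Reasoner;
import org.apache.jena.reasoner.rulesys.GenericRuleReasoner;
import org.apache.jena.reasoner.rulesys.Rule;
import org.apache.jena.vocabulary.ReasonerVocabulary;

final class SharedReasonerSketch {
    static final String NS = "http://example.org/"; // hypothetical namespace

    // Built once: the rules are parsed a single time and the instance is shared.
    static final Reasoner REASONER = buildReasoner();

    static Reasoner buildReasoner() {
        // One toy rule standing in for the 50-100 real ones.
        List<Rule> rules = Rule.parseRules(
            "[inv: (?p <" + NS + "parentOf> ?c) -> (?c <" + NS + "childOf> ?p)]");
        Reasoner r = new GenericRuleReasoner(rules);
        r.setParameter(ReasonerVocabulary.PROPruleMode, "forward");
        return r;
    }

    // Per problem: fresh data, fresh InfModel wrapper, same reasoner.
    static boolean infersChildOf(String parentName, String childName) {
        Model contextData = ModelFactory.createDefaultModel();
        Resource parent = contextData.createResource(NS + parentName);
        Resource child = contextData.createResource(NS + childName);
        Property parentOf = contextData.createProperty(NS, "parentOf");
        Property childOf = contextData.createProperty(NS, "childOf");
        contextData.add(parent, parentOf, child);

        InfModel inf = ModelFactory.createInfModel(REASONER, contextData);
        return inf.contains(child, childOf, parent);
    }

    public static void main(String[] args) {
        // Two independent problems handled by the same reasoner instance.
        System.out.println(infersChildOf("alice", "bob"));
        System.out.println(infersChildOf("carol", "dan"));
    }
}
```

If the problems are handled on several threads, whether one shared reasoner instance is safe to use concurrently is something to check separately; the sketch only covers sequential use.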

It would be possible to avoid that small overhead by keeping one instance of the contextData model and one InfModel built over the top, then injecting new data "behind the reasoner's back": clear the contextData model, add the new data to it, and call inf.rebuild() to tell the reasoner what you've done. My guess is that the savings from doing that would be negligible, but the only way to be sure is to measure them.
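
A rough sketch of the rebuild() variant, under the same assumptions as before (Jena 3.x package names, one toy rule, a made-up http://example.org/ namespace, hypothetical class and helper names). It keeps one contextData Model and one InfModel alive and swaps the data underneath:

```java
import java.util.List;
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.ResourceFactory;
import org.apache.jena.reasoner.Reasoner;
import org.apache.jena.reasoner.rulesys.GenericRuleReasoner;
import org.apache.jena.reasoner.rulesys.Rule;
import org.apache.jena.vocabulary.ReasonerVocabulary;

final class RebuildSketch {
    static final String NS = "http://example.org/"; // hypothetical namespace

    // Replace the base data, then tell the reasoner to reconsult it.
    static void reload(Model contextData, InfModel inf, Model newData) {
        contextData.removeAll();
        contextData.add(newData);
        inf.rebuild();
    }

    // Helper: a one-triple data set "parentName parentOf childName".
    static Model parentFact(String parentName, String childName) {
        Model m = ModelFactory.createDefaultModel();
        m.add(m.createResource(NS + parentName),
              m.createProperty(NS, "parentOf"),
              m.createResource(NS + childName));
        return m;
    }

    // Runs two problems through one long-lived InfModel and checks that the
    // second problem sees its own inference and not the first one's.
    static boolean demo() {
        List<Rule> rules = Rule.parseRules(
            "[inv: (?p <" + NS + "parentOf> ?c) -> (?c <" + NS + "childOf> ?p)]");
        Reasoner reasoner = new GenericRuleReasoner(rules);
        reasoner.setParameter(ReasonerVocabulary.PROPruleMode, "forward");

        Model contextData = ModelFactory.createDefaultModel();
        InfModel inf = ModelFactory.createInfModel(reasoner, contextData);

        Property childOf = ResourceFactory.createProperty(NS, "childOf");
        Resource alice = ResourceFactory.createResource(NS + "alice");
        Resource bob = ResourceFactory.createResource(NS + "bob");
        Resource carol = ResourceFactory.createResource(NS + "carol");
        Resource dan = ResourceFactory.createResource(NS + "dan");

        reload(contextData, inf, parentFact("alice", "bob"));
        boolean first = inf.contains(bob, childOf, alice);

        reload(contextData, inf, parentFact("carol", "dan"));
        boolean second = inf.contains(dan, childOf, carol);
        boolean stale = inf.contains(bob, childOf, alice);

        return first && second && !stale;
    }

    public static void main(String[] args) {
        System.out.println("rebuild variant behaves as expected: " + demo());
    }
}
```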

If performance is an issue for you then the other thing to consider is the overhead of working with Models vs. Graphs. Internally Jena stores data as Triples in Graphs, both of which provide minimal interfaces. The more convenient interface of Statements and Models is implemented by creating wrapper objects. So the reasoners work at the Triple level, but if you use an InfModel and do listStatements a new Statement object will be created for each Triple. In practice object creation (or rather, the associated GC overhead) in modern Java is so good that the cost of this is highly likely to be trivial compared to the cost of doing any reasoning in the first place. However, again this is something you could measure.
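
For the measuring itself, a crude timing harness using only the JDK might look like the following. It is a sketch, not a substitute for a proper benchmarking tool, and the class and method names are hypothetical. The Runnable would be one of the variants being compared, for example iterating inf.listStatements() versus iterating inf.getGraph().find(Node.ANY, Node.ANY, Node.ANY); the stand-in workload in main is only there so the sketch runs without Jena:

```java
final class TimingSketch {
    /** Average wall-clock nanoseconds per call of task, measured after warm-up runs. */
    static long averageNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        // Warm-up calls give the JIT a chance to compile the hot path first.
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the code path under test.
        Runnable standIn = () -> new StringBuilder("x").append(42).toString();
        long avg = averageNanos(standIn, 1_000, 10_000);
        System.out.println("average ns per call: " + avg);
    }
}
```

Because the work per reasoner here is so small, it is worth timing the whole create-infer-query cycle many times over rather than a single run.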


Dave
