Hi David,
On 28/03/13 19:47, David Jordan wrote:
I would like to confirm my understanding about reasoning in Jena, as well as
ask whether the Pellet reasoner does things differently.
When I create an OntModel there is essentially no overhead; this is very fast.
With the very first use of the OntModel, it takes considerable time to produce
a response. Once I get that response, has ALL the reasoning been completed for
the entire OntModel? Subsequent calls are faster, suggesting that there is much
less work being done. Is ALL reasoning done with the first call, or is there
additional lookup/reasoning done with subsequent calls?
The latter.
The rule engines use a mix of forward and backward reasoning.
The forward reasoning can be run any time you like by manually calling
prepare() but if you issue a query to an inference model which is not
yet in a "prepared" state then the prepare() will be triggered
automatically. This forward phase just depends on the rules and the
data, not any query.
The results of forward reasoning (and indeed all the intermediate state)
are stored in memory.
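As a sketch of driving that forward phase explicitly: the namespace and class names below are made up for illustration, and a small RDFS-rule inference model stands in for whatever rule configuration you are using.

```java
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDFS;

public class PrepareDemo {
    public static void main(String[] args) {
        // Hypothetical tiny data set: RedGrape < Grape < Plant.
        String NS = "http://example.org/wine#";
        Model base = ModelFactory.createDefaultModel();
        Resource red = base.createResource(NS + "RedGrape");
        Resource grape = base.createResource(NS + "Grape");
        Resource plant = base.createResource(NS + "Plant");
        base.add(red, RDFS.subClassOf, grape);
        base.add(grape, RDFS.subClassOf, plant);

        InfModel inf = ModelFactory.createRDFSModel(base);
        // Run the forward phase now, at a time of our choosing, rather than
        // letting the first query trigger it implicitly.
        inf.prepare();
        // Queries after this point pay only the per-query backward cost.
        // RedGrape subClassOf Plant is inferred by subClassOf transitivity.
        System.out.println(inf.contains(red, RDFS.subClassOf, plant));
    }
}
```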
Then for each query a number of backward rules might match to generate
additional inferences. Sometimes queries can reuse partial goal results
from earlier queries (through a sort of caching mechanism called
tabling) but not always. Typically there will always be some
backward reasoning run for every query.
Does Pellet operate the same way, when it is being used with Jena? Does it do
all inferencing at once, or in a more lazy eval fashion?
You would have to ask on the Pellet list but I would expect that a
substantial part of the reasoning will be done in the equivalent of a
prepare phase, i.e. eager.
I am doing some benchmarking; there is one piece of code that is running orders
of magnitude slower than everything else. The code is as follows (this is
using SDB with the latest version of Postgres):
OntClass oclass = omodel.getOntClass(GRAPE);
ExtendedIterator<OntProperty> properties = oclass.listDeclaredProperties(true);
while (properties.hasNext()) {
    OntProperty property = properties.next();
    System.out.println(property.getLocalName());
}
I have run the command to create the indexes on the data. Is this expected to
be really slow?
Yes.
Running a reasoner over a database is always slow and does you no good at
all. The reasoners typically need to touch all the data and need to keep
intermediate results in memory so there is essentially no benefit to
having the base model in a DB, much better to load it into memory and
*then* do reasoning.
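The load-then-reason pattern can be sketched as below. An in-memory model stands in for the SDB-backed one (in a real setup that would come from SDB, e.g. via SDBFactory.connectDefaultModel); the namespace is made up for illustration.

```java
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.vocabulary.RDFS;

public class ReasonInMemory {
    public static void main(String[] args) {
        // Stand-in for the SDB-backed base model.
        Model dbModel = ModelFactory.createDefaultModel();
        dbModel.add(dbModel.createResource("http://example.org/wine#RedGrape"),
                    RDFS.subClassOf,
                    dbModel.createResource("http://example.org/wine#Grape"));

        // Copy the whole base model into memory first...
        Model memModel = ModelFactory.createDefaultModel();
        memModel.add(dbModel);

        // ...and only *then* attach the reasoner, so its many small lookups
        // hit memory rather than the database.
        InfModel inf = ModelFactory.createRDFSModel(memModel);
        inf.prepare();
        System.out.println("base size: " + memModel.size());
    }
}
```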
Compounded with this, SDB is slower than TDB in general, and Postgres is
not necessarily the fastest back-end for SDB.
Compounding this, listDeclaredProperties has to do a lot of walking over
classes and properties, asking about domains and ranges, and it does all
that using separate Jena API calls, which translates into a lot of small
queries. SDB (as the S suggests) has better performance trade-offs when
doing fewer, more substantial SPARQL queries rather than lots of little
API calls.
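For comparison, one substantial SPARQL query can replace many small API calls. The sketch below only finds properties declared via rdfs:domain, so it is a rough approximation of listDeclaredProperties rather than a drop-in replacement; the namespace and property names are made up.

```java
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDFS;

public class DeclaredPropsViaSparql {
    public static void main(String[] args) {
        String NS = "http://example.org/wine#";  // hypothetical namespace
        Model m = ModelFactory.createDefaultModel();
        Property skin = m.createProperty(NS + "skinColour");
        Resource grape = m.createResource(NS + "Grape");
        m.add(skin, RDFS.domain, grape);

        // One query over the store instead of many per-resource API calls.
        String q =
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
            "SELECT ?p WHERE { ?p rdfs:domain <" + NS + "Grape> }";
        QueryExecution qe = QueryExecutionFactory.create(q, m);
        try {
            ResultSet rs = qe.execSelect();
            while (rs.hasNext()) {
                System.out.println(rs.next().getResource("p").getLocalName());
            }
        } finally {
            qe.close();
        }
    }
}
```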
If you can arrange it, the best way to use reasoning in combination with
a database is to load all the data into memory (maybe on a temporary big
machine), compute all the inferences you are interested in, then store to a
database a merge of the data plus the inferences (probably as two
separate graphs in a union-default store). Then, in your application,
query this closure using a non-inference model.
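The compute-then-store step might look like the following sketch: materialise the closure in memory, then separate the inferred triples from the asserted data so the two can be stored as distinct graphs (the namespace and class names are made up for illustration).

```java
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDFS;

public class StoreClosure {
    public static void main(String[] args) {
        String NS = "http://example.org/wine#";  // hypothetical namespace
        Model data = ModelFactory.createDefaultModel();
        Resource red = data.createResource(NS + "RedGrape");
        Resource grape = data.createResource(NS + "Grape");
        Resource plant = data.createResource(NS + "Plant");
        data.add(red, RDFS.subClassOf, grape);
        data.add(grape, RDFS.subClassOf, plant);

        // Compute the closure in memory.
        InfModel inf = ModelFactory.createRDFSModel(data);
        inf.prepare();

        // Snapshot every triple (asserted + inferred) into a plain model.
        Model closure = ModelFactory.createDefaultModel();
        closure.add(inf);

        // The inferred remainder and 'data' would be stored as two named
        // graphs in a union-default store, then queried with a plain,
        // non-inference model.
        Model inferredOnly = closure.difference(data);
        System.out.println(inferredOnly.contains(red, RDFS.subClassOf, plant));
    }
}
```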
Of course, if your application updates the data and you need to see the
inferred consequences of that update, then this strategy doesn't work.
Dave