On 25/09/12 19:56, Igor Brussilowski wrote:
Hi Dave,

I modified the rule (removed the triples (?a a eg:D) (?b a eg:D)) and the
performance increased significantly (approximately 0 s in the described
test). Thank you for your reply.

Interesting. Not sure what to make of that. You still have the problem patterns (?a ?e1 ?o1) (?b ?e1 ?o1), so you still have a large cross-product queue, but you just aren't doing as many joins against it.

One more question:

A SELECT query for {?s ?p ?o} on the triple store (ca. 800 triples) takes ca.
20 sec, whereas a DESCRIBE query causes an OutOfMemory exception. Is this
behaviour to be expected?

I wouldn't go as far as to say "expected", but it is conceivable.

A DESCRIBE will follow the bNode closure. Depending on the nature of your data and the generated inferences, this may involve a lot of separate calls to the InfModel. Even though the returned results may be smaller, there may be a lot of work done on the way.

The backward chaining rule results are partially cached, but only for tabled predicates, and the caches depend on the query patterns (subsumption of query patterns is not currently tested). This means that the results from the first query are not necessarily reused in later queries.
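
For reference, a predicate can be marked as tabled from Java via GenericRuleReasoner.tablePredicate. A minimal sketch, assuming "reasoner" is the GenericRuleReasoner behind your InfModel and using owl:sameAs purely as an illustrative predicate:

   import com.hp.hpl.jena.graph.Node;
   import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;

   // Goals on a tabled predicate are cached (memoised) by the backward
   // engine, so later queries on that predicate can reuse earlier results.
   reasoner.tablePredicate(Node.createURI("http://www.w3.org/2002/07/owl#sameAs"));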

If there is more work going on in queries than in updates then materialise the inference model into a plain model:

   model.add( infModel );

then issue your queries, such as the DESCRIBE, against the materialised model.
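
As a minimal sketch of that approach (the DESCRIBE target URI is illustrative, and infModel is the inference model from above):

   import com.hp.hpl.jena.query.QueryExecution;
   import com.hp.hpl.jena.query.QueryExecutionFactory;
   import com.hp.hpl.jena.rdf.model.Model;
   import com.hp.hpl.jena.rdf.model.ModelFactory;

   // Snapshot the inference closure into a plain in-memory model once,
   // then run the expensive read queries against that snapshot.
   Model snapshot = ModelFactory.createDefaultModel();
   snapshot.add(infModel);   // materialise asserted and inferred statements

   QueryExecution qe = QueryExecutionFactory.create(
           "DESCRIBE <http://example.org/ns#someResource>", snapshot);
   Model description = qe.execDescribe();
   description.write(System.out, "TURTLE");
   qe.close();

The snapshot will not track later changes to the underlying data, so it has to be rebuilt after updates.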

Dave


----- Original Message -----
From: Dave Reynolds <[email protected]>
To: [email protected]
CC:
Sent: 13:43 Tuesday, 25 September 2012
Subject: Re: InfModel performance problem

On 25/09/12 11:58, Igor Brussilowski wrote:
  Hi all,

  I create an InfModel based on Jena's builtin OWLMicroReasoner, which is
  extended with my own rules for evaluating the equality of resources based
  on their owl:hasKey properties, like this one:

  # assuming eg:K is an owl:hasKey property from domain eg:D

  (?a a eg:D) (?b a eg:D) notEqual(?a ?b)
  (eg:K owl:equivalentProperty ?e1)
  (?a ?e1 ?o1) (?b ?e1 ?o1)
  -> (?a owl:sameAs ?b)

  as well as the rules equality1, equality2 and equality3 taken from the
  ruleset owl-fb-mini.rules.
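
  For context, a minimal sketch of how such a reasoner can be wired up with a
  custom rule (the rule name and the eg: namespace URI are illustrative, and
  combining these rules with the OWL micro rule set is elided here):

     import java.util.List;

     import com.hp.hpl.jena.rdf.model.InfModel;
     import com.hp.hpl.jena.rdf.model.Model;
     import com.hp.hpl.jena.rdf.model.ModelFactory;
     import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
     import com.hp.hpl.jena.reasoner.rulesys.Rule;
     import com.hp.hpl.jena.util.PrintUtil;

     // Register the example namespace so the rule parser understands "eg:".
     PrintUtil.registerPrefix("eg", "http://example.org/ns#");

     String ruleSrc =
         "[keyEquality: (?a rdf:type eg:D) (?b rdf:type eg:D) notEqual(?a, ?b) " +
         " (eg:K owl:equivalentProperty ?e1) " +
         " (?a ?e1 ?o1) (?b ?e1 ?o1) " +
         " -> (?a owl:sameAs ?b) ]";
     List<Rule> rules = Rule.parseRules(ruleSrc);

     // Hybrid mode runs the forward and backward rule engines together,
     // as the builtin OWL rule reasoners do.
     GenericRuleReasoner reasoner = new GenericRuleReasoner(rules);
     reasoner.setMode(GenericRuleReasoner.HYBRID);

     Model base = ModelFactory.createDefaultModel();
     InfModel infModel = ModelFactory.createInfModel(reasoner, base);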

  This model is populated repeatedly with a list of triples (ca. 800):

  System.out.println("triples: " + triples.size());
  for (;;) {
      long start = System.currentTimeMillis();
      graph.getBulkUpdateHandler().add(triples);
      // print the duration of the add in seconds and the resulting graph size
      System.out.println("add: " + (System.currentTimeMillis() - start) / 1000 + " s");
      System.out.println("graph size: " + graph.size());
  }

  Could you please explain the following output? Why does the add call take
  so much time?

  triples: 842
  add: 0 s
  graph size: 1408
  add: 10 s
  graph size: 1408
  add: 11 s
  graph size: 1408
  add: 10 s
  graph size: 1408
  add: 10 s
  graph size: 1408
  ...

Hard to tell, but I suspect it is just the performance limitations of the
current forward chainer.

Because the rules work at the triple level and your rule has two
completely ungrounded triple patterns in it, it effectively creates
a cross product of the entire data set with itself (i.e. 842 x 842,
roughly 0.7M tuples), and those tuples then have to be joined with the
other clauses.

It would be better if the rule chainer treated ungrounded triple patterns
separately and had indexing of the tuples in the join nodes.
Indeed, there is lots that could be done to improve the performance of the
rule engine if anyone had the time and support to do so.

Dave

