Re: Reasoning profiles and performance

Dave Reynolds Tue, 20 Oct 2020 00:43:13 -0700

tl;dr I'm afraid there's no support for pre-materialized but incrementally updated OWL reasoning over persistent models in Jena.


On 20/10/2020 02:00, Zalan Kemenczy wrote:

Hi there,


I've been experimenting with various reasoning profiles to get better
performance in my project, and I'm looking for guidance on how to push
materialization towards data creation time, rather than query time. I
believe this would suit my use-case: data mutations are infrequent (they do
happen), but queries need to be fast.

Some additional configuration context:

1. Load some OWL ontologies (~80k triples) into an in-memory model
2. Bind it to an OWLMicro reasoner using bindSchema
3. Bind the reasoner to a TDB2-backed model

With just the ontologies loaded, even before I loaded any instance data,
this led to fairly slow queries (> 1 min). To confirm the issue was with
the inference layer, I serialized all the ground and inferred triples to a
second tdb2 backed dataset, with no inference layer, and my reference
queries were much faster (~100ms) and returned identical results.

Reading the docs, I gathered I could try to push materialization towards
data mutation time with a combination of forward reasoning and prepare. The
inference doc had this to say about the GenericRuleReasoner:

"When run in forward mode all rules are treated as forward even if they
were written in backward ("<-") syntax. This allows the same rule set to be
used in different modes to explore the performance tradeoffs."

If you just have a plain rule set with simple A -> B or B <- A rules then you can indeed run the rules in either direction. However, if you have a hybrid rule set which mixes directions, and in particular uses forward rules to create instantiated backward rules, that won't work - those intrinsically need the hybrid engine.
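To make the "either direction" case concrete, here is a minimal sketch of one plain rule written both ways. The rule names are made up for illustration and are not taken from the OWLMicro rule set; the `#` comment lines assume the rules are loaded from a rule file.

```
# Forward form: body -> head.
[subClassFwd: (?a rdf:type ?x), (?x rdfs:subClassOf ?y) -> (?a rdf:type ?y)]

# The same rule in backward form: head <- body.
[subClassBwd: (?a rdf:type ?y) <- (?x rdfs:subClassOf ?y), (?a rdf:type ?x)]
```

Neither form generates new rules at run time, which is why a rule like this can be handed to either engine.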

However, I've run into a couple of issues:

1. If I setMode on the OWLMicroReasoner to FORWARD, I get the following
exception when I try to bind the reasoner to a graph:

org.apache.jena.reasoner.rulesys.BasicForwardRuleInfGraph cannot be cast to
org.apache.jena.reasoner.rulesys.FBRuleInfGraph

Due to the following line:
https://github.com/apache/jena/blob/bfce1741cb12f9cf544235d32fba6598bc7341b5/jena-core/src/main/java/org/apache/jena/reasoner/rulesys/OWLMicroReasoner.java#L94

Yes, the OWLMicroReasoner is intrinsically a hybrid ("FB") rule reasoner and that can't be changed.

2. If I use a GenericRuleReasoner, loaded with OWLMicro rules set to
FORWARD mode, I can bind the reasoner, but then I get the following
execution error at query execution time:

Forward reasoner does not support hybrid rules - [ (?x owl:intersectionOf
?y) -> (?x rdf:type owl:Class) ]

Which I don't understand because that does not seem like a backward rule.

As noted above, the OWLMicro rules are hybrid and can't be run via a pure forward engine.

That particular example looks like a plain fact and forward mode should support that, possibly a bug. However, those cases are the least of your worries; it's true hybrid rules like:


[inverseOf2: (?P owl:inverseOf ?Q)
    -> table(?P), table(?Q), [inverseOf2b: (?X ?P ?Y) <- (?Y ?Q ?X)] ]

that simply have no pure forward equivalent.

So to sum up, I have two questions:

1. What would be your recommended approach to pushing materialization to
data creation time


I'm afraid there's no good support for this in Jena.

If the rate of data updates is very low compared to the rate of queries then you could re-run the entire materialization from scratch each time the data changes. Unsubtle and slow at materialization time but queries would then be faster.

If the rate of data updates is high but your data fits in memory then use the in-memory reasoner and let its (limited) incremental reasoning handle the changes.

But I'm afraid Jena has no support for incrementally updating inference results when the data is beyond memory limits and persisted to e.g. TDB.

2. How would you create a forward rules reasoner that implements OWLMicro, or
the closest to it

You would need to write a custom pure-forward ruleset to implement the axioms you want, perhaps starting from etc/rdfs-noresource.rules and adding the relevant OWL axioms. Depending on which axioms you want, performance may or may not be problematic, and the pure forward engine will still hold all its data in memory, so that won't scale any better.
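As a rough illustration of what such a custom ruleset might look like, here is a small pure-forward fragment. It is only a sketch: the rule names and the choice of axioms are assumptions made for illustration, it is not derived from the OWLMicro rules, and it covers far less than OWLMicro does.

```
# Hypothetical pure-forward fragment; every rule is a plain body -> head rule.
[subClassTrans: (?a rdfs:subClassOf ?b), (?b rdfs:subClassOf ?c) -> (?a rdfs:subClassOf ?c)]
[typeInherit:   (?x rdf:type ?a), (?a rdfs:subClassOf ?b) -> (?x rdf:type ?b)]

# inverseOf written as ordinary forward rules instead of a forward rule
# that instantiates a backward rule.
[inverseSym: (?P owl:inverseOf ?Q) -> (?Q owl:inverseOf ?P)]
[inverseFwd: (?P owl:inverseOf ?Q), (?X ?P ?Y) -> (?Y ?Q ?X)]
```

The trade-off is that rules like inverseFwd eagerly materialize every inverse triple rather than answering on demand, so the in-memory deductions grow with the data.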


Dave
