On 01/07/15 14:33, Paul Tyson wrote:
Hi Andy, further questions below.
On Fri, 2015-06-26 at 18:47 +0100, Andy Seaborne wrote:
On 25/06/15 17:35, Paul Tyson wrote:
Hi Andy, no joy yet.
On Wed, 2015-06-24 at 22:36 +0100, Andy Seaborne wrote:
On 24/06/15 21:37, Paul Tyson wrote:
Before working through the configuration of an ontology model in
fuseki2, I wanted to ask if anyone has experience with large models.
I estimate there will be 250K class definitions, about 40M triples.
That's not that large :-)
(There are installations I've worked with that have 10x that many triples).
What hardware are you running on?
I just took a small sample (500 class definitions), and am running on a
Windows laptop with 4 GB memory.
My queries will be for instance checking:
select ?class
where {
_:a rdf:type ?class;
ex:p1 "v1";
ex:p2 <R1>;
Well, depending on the frequencies of
? ex:p1 "v1"
and
? ex:p2 <R1>
that query will be quite fast, i.e. if there is a property-object pair
that selects only a few resources (100s), then that's not much of a
stress test and should work fine.
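
To illustrate why the frequencies of the two property-object patterns
matter, here is a minimal Python sketch. It is not how TDB or any Jena
component is implemented; the index layout and the names build_po_index
and match_all are assumptions made up for this example. The idea is only
that the rarest pattern yields a small candidate set, and every other
pattern then just filters that set.

```python
from collections import defaultdict


def build_po_index(triples):
    """Map each (predicate, object) pair to the set of subjects that have it."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[(p, o)].add(s)
    return index


def match_all(index, patterns):
    """Return the subjects that match every (predicate, object) pattern."""
    # Smallest candidate set first, so the most selective pattern drives the join.
    candidate_sets = sorted((index.get(po, set()) for po in patterns), key=len)
    if not candidate_sets:
        return set()
    result = set(candidate_sets[0])
    for subjects in candidate_sets[1:]:
        result &= subjects
        if not result:
            break  # nothing can match any more, stop early
    return result


# Toy data standing in for the ex:p1 / ex:p2 patterns in the query.
triples = [
    ("s1", "ex:p1", "v1"),
    ("s1", "ex:p2", "R1"),
    ("s2", "ex:p1", "v1"),
    ("s3", "ex:p2", "R1"),
]
index = build_po_index(triples)
matches = match_all(index, [("ex:p1", "v1"), ("ex:p2", "R1")])
```

If either pattern selects only a few hundred subjects, the work done is
proportional to that small set rather than to the 40M triples.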
I specified OWL_MEM_MICRO_RULE_INF for the OntModel. Fuseki grinds on
this query for a long time (more than 30 minutes) and consumes all
memory and cpu.
That's because there is inference as well as storage. Ontology models
don't have to have inference.
The rule engine and TDB can interact badly, especially as the database
grows relative to the available RAM.
Is there a configuration to avoid this bad behavior?
Not really. Rules tend to walk the data in what looks, to the database,
like a random access pattern. It really needs a rules implementation
that is DB-aware.
If it's just rdfs-isms, then a query engine can do quite a good job.
Which of the micro rules do you have a requirement for?
At this point I only need someValuesFrom, allValuesFrom, intersectionOf,
unionOf, hasValue, equivalentClass. I see a comment in one of the rules
files that intersectionOf is implemented procedurally. What triggers the
intersectionOf functionality, in case of a customized ruleset?
Sorry - don't know.
I also need datatype restrictions from OWL2. I started to write rules
for it, but the owl:withRestrictions facet list seems to present the
same problem as intersectionOf, as it requires iterating through a list.
I could brute-force it by writing explicit rules to cover all the
possible patterns in my data.
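
To show why the facet list needs iteration, here is a minimal Python
sketch of walking an rdf:first/rdf:rest chain and checking each facet.
It is not Jena rule syntax and does not use any Jena API. The dictionary
graph representation, the function names, and the restriction to
xsd:minInclusive and xsd:maxInclusive are assumptions for illustration.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def list_members(graph, head):
    """Yield the members of the RDF collection that starts at `head`.

    `graph` maps (subject, predicate) to a single object, which is enough
    for well-formed lists where each cell has one rdf:first and one rdf:rest.
    """
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError("cyclic rdf:rest chain")
        seen.add(node)
        yield graph[(node, RDF_FIRST)]
        node = graph[(node, RDF_REST)]


def satisfies_facets(graph, head, value):
    """Check `value` against every facet in an owl:withRestrictions list."""
    checks = {
        "xsd:minInclusive": lambda v, bound: v >= bound,
        "xsd:maxInclusive": lambda v, bound: v <= bound,
    }
    for facet_node in list_members(graph, head):
        for facet, check in checks.items():
            bound = graph.get((facet_node, facet))
            if bound is not None and not check(value, bound):
                return False
    return True


# A two-element facet list: ( [xsd:minInclusive 10] [xsd:maxInclusive 20] )
graph = {
    ("_:l1", RDF_FIRST): "_:f1", ("_:l1", RDF_REST): "_:l2",
    ("_:l2", RDF_FIRST): "_:f2", ("_:l2", RDF_REST): RDF_NIL,
    ("_:f1", "xsd:minInclusive"): 10,
    ("_:f2", "xsd:maxInclusive"): 20,
}
in_range = satisfies_facets(graph, "_:l1", 15)
```

The loop handles a list of any length. A fixed set of rules can only
cover list lengths that were written out in advance, which is the
brute-force option mentioned above.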
Regards,
--Paul
Andy
# ...all properties of _:a
.
}
Is there hope of fast performance in this scenario?
Yes.
To be clear, my class definitions are like:
<Class1> owl:equivalentClass [owl:intersectionOf (
[owl:unionOf (<A1> <A2> ...)]
[owl:unionOf (<B1> <B2> ...)]
)].
<A1> owl:equivalentClass [a owl:Restriction;
owl:onProperty ex:p1;
owl:hasValue "v1"].
<B1> owl:equivalentClass [a owl:Restriction;
owl:onProperty ex:p2;
owl:hasValue "v2"].
I expect to find <Class1>, <A1>, <B1> as rdf:types of [ex:p1 "v1"; ex:p2
"v2"].
Eventually I need to use OWL2 datatype restrictions, so it doesn't
appear the current Jena OWL reasoner will get me there, but I wanted to
explore the capabilities.
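
To make the expected result for the class definitions above concrete,
here is a minimal Python sketch that evaluates hasValue, unionOf and
intersectionOf expressions directly against one individual. It is not
the Jena reasoner and uses no Jena API. The tuple encoding of class
expressions and the names holds and classify are assumptions for
illustration, and owl:equivalentClass is only used in the direction
"matches the definition, therefore is a member".

```python
def holds(expr, individual, definitions, visiting=frozenset()):
    """Return True if `individual` satisfies the class expression `expr`."""
    if isinstance(expr, str):
        # Named class: fall back to its equivalentClass definition, if any.
        if expr in visiting or expr not in definitions:
            return False
        return holds(definitions[expr], individual, definitions, visiting | {expr})
    kind = expr[0]
    if kind == "hasValue":
        _, prop, value = expr
        return value in individual.get(prop, ())
    if kind == "unionOf":
        return any(holds(e, individual, definitions, visiting) for e in expr[1])
    if kind == "intersectionOf":
        return all(holds(e, individual, definitions, visiting) for e in expr[1])
    raise ValueError(f"unsupported class expression: {kind}")


def classify(individual, definitions):
    """List every named class whose definition the individual satisfies."""
    return sorted(c for c in definitions if holds(c, individual, definitions))


# The definitions from the message; A2 and B2 are left undefined here.
definitions = {
    "Class1": ("intersectionOf", [("unionOf", ["A1", "A2"]),
                                  ("unionOf", ["B1", "B2"])]),
    "A1": ("hasValue", "ex:p1", "v1"),
    "B1": ("hasValue", "ex:p2", "v2"),
}
individual = {"ex:p1": {"v1"}, "ex:p2": {"v2"}}
types = classify(individual, definitions)
```

For this input, classify returns A1, B1 and Class1. Evaluating
definitions per individual like this avoids forward-chaining over the
whole store, at the cost of supporting only the constructs listed.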
I need to get some other tools to validate the ontology to make sure
there's nothing pathological, but on inspection it looks OK.
Regards,
--Paul
Any other approaches for better performance?
Thanks,
--Paul