Hello Fabian,
Sorry for the delay; we have been very busy getting ready for OWLIM 5.0.
Let me answer your questions inline:
On 15/03/12 08:42, Fabian Cretton wrote:
Hello,
I've been moving from OWLIM Lite 4.3 to OWLIM-SE 4.3 lately, and
started to play with the WordNet ontology.
I am doing those first trials on my desktop computer, Windows 7, 8GB
Ram, 3GB allowed to TomCat, hoping this small configuration is ok for
tests.
About repository configuration:
When doing some pretty heavy SPARQL queries on an ontology such as
WordNet, with a ?p variable for properties which could be any of the
WordNet properties, is the "Use predicate indices" option of any help?
The documentation says "One should consider using this index for
datasets that contain a very large number (~1000) different
predicates." WordNet has fewer than 100 predicates.
If it does not help, could it make things worse?
Not really, the only downside will be a slight increase in load time and
an increase in storage space required. I suggest you try with predicate
lists turned on to see if this helps. It will make the most difference
when you have queries that have a single triple pattern with an unbound
predicate, e.g. SELECT * { ?s ?p "some_object_value" }
About the rule-set:
I first created the repository with no rule-set. I did load WordNet
with, of course, no inference. Then, using SPARQL Update, I changed the
repository rule-set to "owl-horst-optimized". As expected, nothing
happened as the inferences are carried out at load time. Then I
reloaded all the files but couldn't find any inferred triples. I had to
delete/recreate the repository with the rule-set from the start, and
then the inferences were done. Any idea if I might have done something
wrong or if there is something to be careful about?
Changing the rule-set is always troublesome and I recommend avoiding it
if at all possible. A problem with using a Tomcat/HTTP repository is
that dropping a repository does not actually delete the storage files.
So if you then change the configuration and reload with the same
repository id, it will confuse the inferencer, because it will see that
each statement is already loaded and assume that no inference needs to
be done. I suspect this is what is happening in your case, so an easy
check would be to drop the repository, recreate it with the new config
and then immediately clear it. After that I would expect things to work
correctly.
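As a rough sketch of the "clear" step, assuming the recreated
repository accepts SPARQL 1.1 Update requests (as it did for the
rule-set change described above), the standard CLEAR operation could be
sent before anything is reloaded:

```sparql
# Standard SPARQL 1.1 Update: removes all statements from the default
# graph and from every named graph.
# Assumption: run this right after recreating the repository with the
# new rule-set, and before loading any files.
CLEAR ALL
```

The same step can be done from whichever client or admin interface is
used to manage the repository; the SPARQL form is shown only because it
is independent of the tool.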
About inferred triples:
If I remember well, OWLIM 3.5 had an 'implicit' context where I could
find all inferred triples. Is it correct that this no longer exists in
version 4.3? (I saw in the user guide how to query the explicit/implicit
triples, it is just a question for understanding).
This feature is still present in 4.3:
http://owlim.ontotext.com/display/OWLIMv43/OWLIM-SE+Query+Behaviour#OWLIM-SEQueryBehaviour-ManagingExplicitandImplicitStatements
In reality, inferred statements belong to the database's default graph.
However, they can be filtered by using some special graph names. We call
these pseudo-graphs, because they are not really graphs, rather just a
means to switch on/off certain query answering behaviour.
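A minimal sketch of such a query, assuming the pseudo-graph names are
http://www.ontotext.com/implicit and http://www.ontotext.com/explicit
as described on the page linked above (please check them against the
documentation for your version):

```sparql
# Return only inferred (implicit) statements.
# The pseudo-graph IRI below is an assumption taken from the OWLIM-SE
# documentation; it is not a real named graph, only a switch.
SELECT ?s ?p ?o
FROM <http://www.ontotext.com/implicit>
WHERE { ?s ?p ?o }
LIMIT 100
```

Swapping the IRI for the "explicit" one should restrict the results to
asserted statements only.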
Then I have a question about what happens on disk:
I had my repository loaded with WordNet T-Box + A-Box. Then I changed
the ontology with Protégé, adding a new property as a sub-property of
all the WordNet properties. Then, to update my T-Box, I simply
loaded the new file in the same context, without first removing the old
one. Is this the correct way to do it?
It is probably best to delete the T-Box first, otherwise there will be
some overlap and possible consistency problems.
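A sketch of that sequence in SPARQL 1.1 Update, assuming the T-Box
lives in its own context; the graph IRI and the file location below are
hypothetical placeholders, not values from your setup:

```sparql
# Hypothetical context IRI and file URL; substitute your own.
# Step 1: remove the old T-Box statements from their context.
CLEAR GRAPH <http://example.org/wordnet/tbox> ;
# Step 2: load the edited ontology file into the same context.
LOAD <file:///path/to/wordnet-tbox.owl>
  INTO GRAPH <http://example.org/wordnet/tbox>
```

Whether LOAD can read a file: URL depends on the server configuration;
uploading the file through your usual client after the CLEAR works just
as well.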
My size on disk did of course increase. My triple count rose from
about 5 million to 6 million.
Then I removed the T-Box, which took 2 hours on my desktop computer (I
guess it is something to expect), and the triple count dropped from
6 million to 2 million (no more inference). But what is unexpected here
is that I would think the hard drive storage folder would also come
back to the original size without inference, but on the contrary, it
still increased a little. So a repository with no inference was about
150MB. With inferences it is around 600MB. And after doing those
manipulations and removing the inferences it was around 900MB (but I
was expecting around 150MB). At first glance, I think "pso" and "pos"
were much bigger than the "clean" ones.
I think in this case you just have a lot of unused pages in your index
files. I'm not sure if there is any advantage for OWLIM to offer a
clean-up function, i.e. compact pages in use to the beginning of the
index files and truncate the rest.
At least it is not something that anyone has asked for before.
Thank you once again for any help
Fabian
I hope that helps. All the best,
barry