Dear Steve,
Thanks for your feedback, and sorry for not getting back to you earlier;
I was on vacation until just the other day.
I have committed an update to OntoNet that should address your inquiries:
- addOntology() on spaces and sessions now returns the String that you
can use as a key to identify the ontology in the OntologyProvider (or
the graph in the TcManager if you create a UriRef from it).
- you can export scopes, spaces and sessions as Clerezza objects if
needed. This does not give you the OWL-oriented view on the graph, but it
can save a lot of computing power. I will probably use it in the REST API.
- you can supply the TcProvider to the GraphContentInputSource. If it is
the same as the TcManager singleton instance, we skip copying all the
triples to yet another Graph. This should take considerably less time; on
the other hand, it prevents you from using this method to *update* graphs. Note
that there are protected binding methods in OntologyInputSource
implementations for triple providers, physical IRIs etc.
- other minor optimizations
It would be great to share a benchmarking method to assess network
scalability. So far I have managed to load a 200MB RDF/XML graph using a
256MB VM without out-of-memory errors.
Also thanks for the post on the IKS blog (I am telling you here because
I don't know if you and Martin are following an IKS mailing list)! I am
working on an adopter-oriented one, and it would be great to include an
overview of the Acuity experience with Stanbol-Fedora - what it does and
what benefit it gets from Stanbol. Would you like to share?
Unfortunately, I have been able to tell only my side of the story so
far, as the link at [1] keeps timing out on me :(
Thanks a lot, keep up the good work!
Alessandro
[1] fedora-stanbol.acuityunlimited.net:18080/orbeon/stanbol-fedora/data-browser
On 12/30/11 6:08 PM, Stephen Bayliss wrote:
Hi Alessandro
Thanks very much for your responses.
Dear Steve,
On 12/19/11 6:22 PM, Stephen Bayliss wrote:
Our use-case is thus:
1) Create OntologyContentInputSource(stream)
Perhaps you're better off with a GraphContentInputSource(InputStream), so
it won't have to go through the burden of converting from OWLOntology to
Graph just in order to store it (everything is stored as Clerezza graphs
anyhow). Note that OWLOntology exports of scopes, spaces and the ontologies
within them are possible regardless of the input source (although this is
THE bottleneck of the current implementation, see my comment on
STANBOL-433).
I'm now adding the possibility to specify the TcProvider in the
GraphContentInputSource constructor. This should also save the burden of
copying the triples from the in-memory SimpleGraph to the Graph stored
in the TcManager (IF you pass the TcManager singleton as TcProvider).
Great, we'll take a look at the GraphContentInputSource and the TcProvider
constructor argument.
- as our content is behind authentication, the stream is provided
by an HTTP client
- the content has an identifier (URI) assigned by the external
system (independent of the contents of the stream/ontology)
2) Load OntologyInputSource into the space with
CustomOntologySpace.addOntology(...)
3) When updated content comes along:
- remove the original (from the store as well as the space)
- add the updated content
As the OntologyInputSource was created from a stream, it doesn't have
a physical IRI (I think?),
correct
Actually logically it does have a physical IRI - the one that our HTTP
client sourced the input stream from - so if there was an option to specify
the physical IRI somehow, maybe this would in fact do the job?
so at (2) we don't have a "KReS identifier" for it
- so if we want to replace the ontology in the future with an updated
version I can't see currently an easy way of determining which
ontology to remove from the space and then delete it prior to adding
the updated content.
if the ontology is named (i.e. it does have a logical IRI even if not a
physical one), you could simply call
OntologyProvider#getKey(logicalIRI), but if you would like something
simpler... see my next comment below.
I can list the graph keys through the OntologyProvider; but I think
what I need is to know (or be able to set?) the key when adding it?
Would it be enough if this key were the return value of addOntology()?
If there's no logical way of passing in an identifier that we wish to use
for the graph, then I think this would do the job; we can maintain our own
map/index of the graph keys vs the content provider's URIs for these graphs.
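The map/index idea above, combined with the remove-then-add update from step 3, could be sketched roughly as follows. This is only an illustration: OntologyStore, InMemoryOntologyStore and GraphKeyIndex are hypothetical stand-ins, not OntoNet or Clerezza classes, and the "graph-N" key format and the example URI are made up. In a real setup the removal would presumably have to cover both the space and the underlying store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the add/remove calls discussed in this thread. */
interface OntologyStore {
    /** Stores the content and returns the key assigned to its graph. */
    String addOntology(String content);

    /** Removes the graph stored under the given key. */
    void removeOntology(String graphKey);
}

/** Toy in-memory store so the sketch runs without Stanbol on the classpath. */
class InMemoryOntologyStore implements OntologyStore {
    private final Map<String, String> graphs = new HashMap<>();
    private int counter = 0;

    @Override
    public String addOntology(String content) {
        String key = "graph-" + (++counter); // illustrative key format only
        graphs.put(key, content);
        return key;
    }

    @Override
    public void removeOntology(String graphKey) {
        graphs.remove(graphKey);
    }

    int size() { return graphs.size(); }

    String get(String key) { return graphs.get(key); }
}

/** Maps the content provider's URIs to the graph keys returned on add. */
class GraphKeyIndex {
    private final OntologyStore store;
    private final Map<String, String> keyByContentUri = new HashMap<>();

    GraphKeyIndex(OntologyStore store) { this.store = store; }

    /** Adds new content, or replaces the graph previously loaded for the same URI. */
    String addOrReplace(String contentUri, String content) {
        String oldKey = keyByContentUri.get(contentUri);
        if (oldKey != null) {
            store.removeOntology(oldKey); // remove the original first
        }
        String newKey = store.addOntology(content); // then add the updated content
        keyByContentUri.put(contentUri, newKey);
        return newKey;
    }

    /** Looks up the graph key currently associated with a content URI, if any. */
    Optional<String> keyFor(String contentUri) {
        return Optional.ofNullable(keyByContentUri.get(contentUri));
    }
}

class GraphKeyIndexDemo {
    public static void main(String[] args) {
        InMemoryOntologyStore store = new InMemoryOntologyStore();
        GraphKeyIndex index = new GraphKeyIndex(store);
        String uri = "http://example.org/objects/demo:1"; // made-up content URI
        index.addOrReplace(uri, "version 1");
        String key = index.addOrReplace(uri, "version 2");
        System.out.println(key + " -> " + store.get(key));
    }
}
```

The index only lives in memory here; whether it needs to be persisted alongside the content provider's own records depends on how the graphs are reloaded after a restart.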
Also I can see that if I get the TcProvider I can do a
.deleteTripleCollection(UriRef ref) - how would this UriRef link in
with the above (when I look at the identifiers of the ontologies
retrieved using the keys from listGraphs, these are
"Anonymous-xyz" and don't have an IRI).
I'll have to look into this one; fortunately I've still got some time on it.
Great, thanks!
All the best,
Alessandro
--
M.Sc. Alessandro Adamou
Alma Mater Studiorum - Università di Bologna
Department of Computer Science
Mura Anteo Zamboni 7, 40127 Bologna - Italy
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, 00161 Rome - Italy
"As for the charges against me, I am unconcerned. I am beyond
their timid, lying morality, and so I am beyond caring."
(Col. Walter E. Kurtz)