On 06/02/13 20:34, David Jordan wrote:
Since we are on the topic of models and graphs, I have some questions about
this. Reading in the docs online last night, I read that a Model is a graph,
but not the class Graph. I was trying to understand the distinction between
these and the cardinality and containment of the following items: Models,
submodels, (named) graphs, datasets.
In Jena then ...
o A Model holds a set of RDF statements; it is the normal programmer's
interface. The things in it are Statements, which contain RDFNodes such
as Resources. A Resource has a pointer to its model, so you can do things
like ask what properties it has. That is what makes the convenience API
possible.
o A Graph is a lower level SPI intended to be easier to work with for
people implementing backends like stores or inference engines. Graphs are
not very convenient for application programming and are not intended to be.
Graphs hold Triples which are made up of Nodes. The Nodes *don't* have
any pointer to a containing graph. Graphs are a little generalized in
that they do not enforce the RDF syntactic constraints (such as
properties always being URI resources).
o In Jena it's possible to create a Model which appears to be the union
of a set of other models but doesn't copy data (you can also copy data if
you want to). That's done with MultiUnion internally, which in turn is
used to implement OntModels so that an OntModel acts as if it contains
the union of a base model and a set of import models. The term "sub
model" is used there.
o Dataset implements the SPARQL notion of a collection of named graphs
(lower case 'g' here, graph the mathematical concept, not Graph the Jena
data structure). A Dataset has a default model and a set of named models
which are named by URIs. There is *no* requirement that the default
model be a union of the named models. With some stores like TDB it is
possible to make that so but that behaviour is orthogonal to the notion
of a Dataset.
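
As a rough illustration of the points above, here is a minimal sketch.
It assumes a current Jena release with the org.apache.jena.* packages
(the releases from the time of this mail used com.hp.hpl.jena.*). The
namespace, the class name and the helper method names are made up for
the example. It shows a Model acting as a view over a Graph, a union
Model that does not copy data, and a Dataset whose default model stays
empty even though it has named models.

```java
import org.apache.jena.graph.Graph;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class ModelGraphDatasetSketch {
    // Hypothetical namespace, used only for illustration.
    static final String NS = "http://example.org/";

    // Build a Model holding a single statement about NS + localName.
    static Model oneStatement(String localName, String label) {
        Model m = ModelFactory.createDefaultModel();
        m.createResource(NS + localName)
         .addProperty(m.createProperty(NS + "label"), label);
        return m;
    }

    // A Model is the API view over a Graph. Wrapping the same Graph in a
    // second Model gives another view of the same triples, not a copy.
    static long sizeSeenThroughSecondView() {
        Model a = oneStatement("a", "first");
        Graph g = a.getGraph();
        Model view = ModelFactory.createModelForGraph(g);
        view.createResource(NS + "a2")
            .addProperty(view.createProperty(NS + "label"), "added via view");
        return a.size(); // includes the statement added through the view
    }

    // A union Model presents two Models as one without copying the data,
    // so a statement added to a member later is visible in the union.
    static long unionSizeAfterLateAdd() {
        Model a = oneStatement("a", "first");
        Model b = oneStatement("b", "second");
        Model union = ModelFactory.createUnion(a, b);
        b.createResource(NS + "c")
         .addProperty(b.createProperty(NS + "label"), "third");
        return union.size();
    }

    // A Dataset has a default model plus named models. Here the default
    // model is left alone and is not the union of the named models.
    static long defaultModelSizeWithNamedModels() {
        Dataset ds = DatasetFactory.create();
        ds.addNamedModel(NS + "graphA", oneStatement("a", "first"));
        ds.addNamedModel(NS + "graphB", oneStatement("b", "second"));
        return ds.getDefaultModel().size();
    }

    public static void main(String[] args) {
        System.out.println("size via second view: " + sizeSeenThroughSecondView());
        System.out.println("union size: " + unionSizeAfterLateAdd());
        System.out.println("default model size: " + defaultModelSizeWithNamedModels());
    }
}
```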
A related question is when one indicates that one model contains another.
That's up to the application. Jena doesn't have a notion of model
containment other than the related concepts above.
It was suggested at one time that if you need to distinguish between the
specified versus inferred triples of an ontology or set of RDF statements, that
you should first store an ontology without any inferencing, presumably in a
model by itself (call it A). Then you could create a new model (call it B) that
does have inferencing, make the original non-inferenced model be a submodel,
then store the new model, which would have the inferences.
Not sure I follow why you would do this. An inference model gives you
access to the raw model and acts as if it contains all the triples in
the raw model. It doesn't make any sense to me to also have the
non-inferred model as a sub-model as well as the base model.
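
To make that concrete, here is a small sketch of telling asserted
triples from inferred ones without a second stored model. It assumes a
current Jena release (org.apache.jena.* packages) and uses the built-in
RDFS reasoner purely as an example. The namespace, class and resource
names are invented for illustration.

```java
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class AssertedVsInferredSketch {
    // Hypothetical namespace, used only for illustration.
    static final String NS = "http://example.org/";

    // Returns { visible through the inference model, present in the raw model }
    // for the triple (rex rdf:type Animal), which is never asserted directly.
    static boolean[] check() {
        Model raw = ModelFactory.createDefaultModel();
        Resource animal = raw.createResource(NS + "Animal");
        Resource dog = raw.createResource(NS + "Dog");
        Resource rex = raw.createResource(NS + "rex");
        raw.add(dog, RDFS.subClassOf, animal);
        raw.add(rex, RDF.type, dog);

        // The inference model wraps the raw model; nothing is copied.
        InfModel inf = ModelFactory.createRDFSModel(raw);

        boolean seenWithInference = inf.contains(rex, RDF.type, animal);
        // getRawModel() gives back the asserted triples only.
        boolean assertedInRaw = inf.getRawModel().contains(rex, RDF.type, animal);
        return new boolean[] { seenWithInference, assertedInRaw };
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println("visible with inference: " + r[0]);
        System.out.println("asserted in raw model:  " + r[1]);
    }
}
```

A triple that shows up in the inference model but not in getRawModel()
is an inferred one, so the asserted data only needs to be stored once.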
A few questions about this.
1. Would this result in only a single copy of each triple? i.e. would the
non-inferenced data only be stored once for model A?
If you are using OntModels as outlined above then yes only one copy.
2. Is the relationship between model A and B stored somewhere (please answer
relative to SDB)?
Depends on your code. If you create an OntModel programmatically over
some models stored in SDB then it is only the runtime OntModel data
structure that knows anything about the union.
If you do it by creating an OWL model with import statements and storing
that in the database then you have some record of the relationships that
you could access and persist.
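
A sketch of the programmatic case, using in-memory models in place of
SDB-backed ones to keep it self-contained. It assumes a Jena release
that ships the org.apache.jena.ontology package. The namespace, class
and resource names are hypothetical. The union exists only in the
runtime OntModel; neither underlying model records it.

```java
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

public class SubModelSketch {
    // Hypothetical namespace, used only for illustration.
    static final String NS = "http://example.org/";

    // Returns { visible through the OntModel, present in the base model }
    // for a statement that lives only in the sub-model.
    static boolean[] check() {
        Model base = ModelFactory.createDefaultModel();
        Model imported = ModelFactory.createDefaultModel();

        Resource thing = imported.createResource(NS + "thing");
        Property label = imported.createProperty(NS + "label");
        thing.addProperty(label, "lives in the sub-model");

        // OWL_MEM: in-memory, no reasoner attached.
        OntModel ont = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, base);
        ont.addSubModel(imported); // adds a view of the model, not a copy

        boolean seenInUnion = ont.contains(thing, label);
        boolean storedInBase = ont.getBaseModel().contains(thing, label);
        return new boolean[] { seenInUnion, storedInBase };
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println("visible through OntModel: " + r[0]);
        System.out.println("stored in base model:     " + r[1]);
    }
}
```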
3. Is this a common usage pattern?
Not as described.
4. SPARQL lets you query a named graph. Is a model considered a named graph?
Do you just use the name of the model in the SPARQL graph statement?
See above and see lots of online resources explaining how to use named
models and SPARQL.
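
For what it's worth, a minimal sketch of querying a named model with
the SPARQL GRAPH keyword. It assumes a current Jena release
(org.apache.jena.* packages) and an in-memory Dataset. The graph URI,
namespace and class name are made up for the example.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class NamedGraphQuerySketch {
    // Hypothetical names, used only for illustration.
    static final String NS = "http://example.org/";
    static final String GRAPH_URI = NS + "graphA";

    // A Dataset with one named model holding one statement.
    static Dataset buildDataset() {
        Model m = ModelFactory.createDefaultModel();
        m.createResource(NS + "a")
         .addProperty(m.createProperty(NS + "label"), "first");
        Dataset ds = DatasetFactory.create();
        ds.addNamedModel(GRAPH_URI, m); // the model is named by this URI
        return ds;
    }

    // Run a SELECT query against the dataset and count the result rows.
    static int countRows(Dataset ds, String sparql) {
        int rows = 0;
        try (QueryExecution qe = QueryExecutionFactory.create(sparql, ds)) {
            ResultSet rs = qe.execSelect();
            while (rs.hasNext()) {
                rs.next();
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Dataset ds = buildDataset();
        // The name given to addNamedModel is the name used in GRAPH <...>.
        String named = "SELECT ?s WHERE { GRAPH <" + GRAPH_URI + "> { ?s ?p ?o } }";
        // Without GRAPH the pattern is matched against the default model.
        String dflt = "SELECT ?s WHERE { ?s ?p ?o }";
        System.out.println("rows from named graph:   " + countRows(ds, named));
        System.out.println("rows from default graph: " + countRows(ds, dflt));
    }
}
```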
Dave