Dear All,
I propose adding the following paragraph to the implementation
guidelines for RDFS:
"*About Implementing Multiple Instantiation*
Knowledge representation models and, more generally, semantic
networks differ fundamentally in one aspect from data
structures such as XML, relational database schemata and the
data structures of all programming languages, including
object-oriented ones:
· Knowledge representation starts with an item in the real
world, regardless of its nature, assigns an identifier to it in
order to be able to make assertions about it, and then
accumulates statements (assertions, propositions) about it.
· Data structures start with a set of templates, that is, sets
of foreseen kinds of statements, each dedicated to a particular
category (class, entity), to be filled in by a user.
Consequently, knowledge representation may assign multiple
classes to a given identifier without any problem. The
associated processing software will then allow asserting, for
this identifier, all properties applicable to each assigned
class. This process is called “multiple instantiation”. For
instance, a “weapon” with all its characteristics may also be
a “ceremonial object”.
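The mechanism can be sketched as follows. This is only a minimal
illustration, assuming a toy in-memory triple store in Python; the
identifier ex:item1, the classes ex:Weapon and ex:CeremonialObject
and all property names are hypothetical and not CRM terms.

```python
# Toy triple store: statements accumulate around one identifier.
RDF_TYPE = "rdf:type"

# Hypothetical schema fragment: properties applicable to each class.
PROPERTIES_BY_CLASS = {
    "ex:Weapon": {"ex:blade_length", "ex:material"},
    "ex:CeremonialObject": {"ex:used_in_ceremony"},
}

triples = set()


def assert_statement(subject, predicate, obj):
    """Add one statement about an identifier."""
    triples.add((subject, predicate, obj))


def classes_of(subject):
    """All classes currently assigned to the identifier."""
    return {o for s, p, o in triples if s == subject and p == RDF_TYPE}


def applicable_properties(subject):
    """Union of the properties of every class assigned to the identifier."""
    props = set()
    for cls in classes_of(subject):
        props |= PROPERTIES_BY_CLASS.get(cls, set())
    return props


# One identifier, two classes: no second record is needed.
assert_statement("ex:item1", RDF_TYPE, "ex:Weapon")
assert_statement("ex:item1", RDF_TYPE, "ex:CeremonialObject")
```

Calling applicable_properties("ex:item1") then yields the
properties of both classes for the single identifier.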
A system based on data structures must create a different
instance of the respective templates for each class an item
belongs to. It may later link the different instances
describing aspects of the same thing, in order to simulate the
mechanism. In particular, the very successful “encapsulation
principle” of object-oriented programming languages requires
dedicated data structures and constitutes a fundamental mismatch
with the Open-World modeling of semantic relationships (see, for
instance, Schnase 1993). Superproperties are also fundamental
to semantic data integration, and data structures do not
provide them either.
The CRM as an ontology relies heavily on multiple instantiation:
Classes that tend to co-occur on things only “incidentally”,
without being associated with properties applicable only to the
combination of such classes, are not modelled individually as
subclasses of multiple parent classes. The latter would be
called “multiple IsA”. Avoiding multiple IsA in such cases is
an important normalization principle that keeps the ontology
compact and unambiguous.
Most implementations on top of RDF still use RDF as if it were a
fixed schema and repeat the entire schema in the UI code.
Therefore, the promise of RDF and other semantic models to
accommodate new properties dynamically often does not hold in
practice. It is still as if they were using relational systems.
Generic XML editors already adapt to the schema, but the
rendering paradigms they usually employ are, without additional
parameters, too poor for good UI code. One can, however, write
code that reads the RDF schema in use at run-time and extends
data entry and display by the actual properties found. This
functionality is foreseen by SPARQL, but most programmers still
do not appreciate the utility of querying the schema. Even if
fixed templates are used, the data entry system should allow
the same thing to be described by multiple templates, relatively
freely selectable by the user.
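As a rough sketch of such schema-driven code, assuming the schema
and the data are available as one set of triples in Python (in a
real system they would be fetched from a triple store, for
instance with a SPARQL query over rdfs:domain and
rdfs:subClassOf). All ex: names and the function names are
hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_SUBCLASS = "rdfs:subClassOf"


def superclasses(graph, cls):
    """Return cls plus all of its direct and indirect superclasses."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(
            o for s, p, o in graph if s == current and p == RDFS_SUBCLASS
        )
    return seen


def form_fields(graph, instance):
    """Properties whose domain is a class (or superclass) of the instance."""
    classes = set()
    for s, p, o in graph:
        if s == instance and p == RDF_TYPE:
            classes |= superclasses(graph, o)
    return sorted(s for s, p, o in graph if p == RDFS_DOMAIN and o in classes)


# Hypothetical schema and data held in the same graph.
graph = {
    ("ex:Weapon", RDFS_SUBCLASS, "ex:Thing"),
    ("ex:name", RDFS_DOMAIN, "ex:Thing"),
    ("ex:blade_length", RDFS_DOMAIN, "ex:Weapon"),
    ("ex:used_in_ceremony", RDFS_DOMAIN, "ex:CeremonialObject"),
    ("ex:item1", RDF_TYPE, "ex:Weapon"),
    ("ex:item1", RDF_TYPE, "ex:CeremonialObject"),
}

fields_before = form_fields(graph, "ex:item1")
# A property added to the schema later shows up without any code change.
graph.add(("ex:owner", RDFS_DOMAIN, "ex:Thing"))
fields_after = form_fields(graph, "ex:item1")
```

The point of the design is that the list of entry fields is
computed from the schema at run-time, so neither a new property
nor an additional class on the instance requires a change to the
UI code.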
In the specification modules of mapping software used to
transform data into a CRM-compatible form, care must be taken to
foresee and allow the user to combine RDF classes
systematically. It may be useful to develop tools for specific
guidance that show users how a valid path from a given domain
class to a certain range class can be created by using multiple
instantiation (and, by the way, also by using subclasses of the
domain class), such as combining /E41 Appellation/ with /E33
Linguistic Object/ in order to reach /E56 Language/ via /P72 has
language/.
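Such a guidance tool could be sketched as follows, under the
assumption that the schema is available as simple Python
dictionaries. The function names and the suggestion labels are
hypothetical, and only the fragment of the CRM needed for the
example above is listed.

```python
# Fragment of the class hierarchy (child -> direct parents).
SUBCLASS_OF = {
    "E33_Linguistic_Object": {"E73_Information_Object"},
}

# Fragment of the property declarations: property -> (domain, range).
PROPERTIES = {
    "P72_has_language": ("E33_Linguistic_Object", "E56_Language"),
}


def ancestors(cls):
    """Return cls plus all of its direct and indirect superclasses."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, set()))
    return seen


def suggest_paths(start, target):
    """Suggest how an instance of `start` can reach the class `target`."""
    suggestions = []
    for prop, (domain, rng) in PROPERTIES.items():
        if rng != target:
            continue
        if domain in ancestors(start):
            # The property is already applicable to the start class.
            suggestions.append(("direct", prop, start))
        elif start in ancestors(domain):
            # The domain is a subclass of the start class.
            suggestions.append(("use subclass", prop, domain))
        else:
            # Multiple instantiation: assign the domain class as well.
            suggestions.append(("add class", prop, domain))
    return suggestions
```

For the example above, suggest_paths("E41_Appellation",
"E56_Language") proposes to add E33_Linguistic_Object to the
instance so that P72_has_language becomes applicable.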
In a local system, another workaround for multiple instantiation
can be the creation of classes that replace all candidate cases
for multiple instantiation by subclasses using multiple IsA. For
good reasons, compatibility with the CRM is defined at the
import/export/query level and not at the level of system
internals. Therefore, such internal workarounds do not affect
interoperability: Whereas the query compatibility of this
solution with the standard is immediate, the respective
import/export system simply needs to make the trivial
replacements of the respective class combinations with their
multiple IsA counterparts and vice-versa.
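The replacement step can be sketched as follows, assuming the
local system keeps a table of its combined classes. The class
local:LinguisticAppellation and the function names are
hypothetical.

```python
# Hypothetical local multiple-IsA classes, keyed by the CRM class
# combination each one stands for.
COMBINED = {
    frozenset({"E41_Appellation", "E33_Linguistic_Object"}):
        "local:LinguisticAppellation",
}
EXPANDED = {local: combo for combo, local in COMBINED.items()}


def export_types(local_types):
    """Replace local multiple-IsA classes by the CRM classes they combine."""
    result = set()
    for cls in local_types:
        result |= EXPANDED.get(cls, {cls})
    return result


def import_types(crm_types):
    """Replace known CRM class combinations by their local class."""
    result = set(crm_types)
    for combo, local in COMBINED.items():
        if combo <= result:
            result = (result - combo) | {local}
    return result
```

In this sketch import_types and export_types are inverse to each
other as long as the registered combinations do not overlap;
overlapping combinations would need an explicit priority rule.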
So, in part, problems with multiple instantiation are a
question of programming practice. On the other hand, they are
also a question of user training and extended good practice.
Users may provide feedback about frequent cases where multiple
instantiation is used, in order to guide other users to these
modelling cases. These could systematically be entered into the
CRM RDF implementation, without requiring the CRM standard
itself to repeat them."
John L. Schnase (1993). "Semantic Data Modelling of Hypermedia
Associations", in: ACM Transactions on Information Systems,
Vol. 11, No. 1, January 1993, p. 45.
Comments welcome!
Best,
Martin
--
------------------------------------
Dr. Martin Doerr
Honorary Head of the
Center for Cultural Informatics
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)
N.Plastira 100, Vassilika Vouton,
GR70013 Heraklion,Crete,Greece
Vox:+30(2810)391625
Email: [email protected]
Web-site:http://www.ics.forth.gr/isl