[Crm-sig] Using multiple instantiation

Martin Doerr Wed, 5 Dec 2018 18:05:20 +0200

Dear All,

I propose this paragraph to be added to the implementation guidelines for RDFS:


"*About implementing multiple Instantiation*

Knowledge representation models, and more generally semantic networks, differ fundamentally in one aspect from data structures such as XML, relational database schemata and the data structures of all programming languages, including the object-oriented ones:

· Knowledge representation starts with an item in the real world, regardless of its nature, assigns an identifier to it in order to be able to make assertions about it, and then accumulates statements (assertions, propositions) about it.

· Data structures start with a set of templates, each a set of foreseen kinds of statements dedicated to a particular category (class, entity), to be filled in by a user.

Consequently, knowledge representation may assign multiple classes to a given identifier without any problem. The associated processing software will then allow for asserting for this identifier all properties applicable to each assigned class. This process is called “multiple instantiation”. For instance, the “weapon” with all its characteristics may also be a “ceremonial object”.
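To make the weapon example concrete, here is a minimal sketch in plain Python, with statements held as (subject, predicate, object) tuples. The class and property names and the identifier item:sword_17 are invented for illustration and are not CRM terms. The point is only that the properties offered for an identifier are the union of those applicable to each class assigned to it.

```python
# Hypothetical mini-schema: property -> class it applies to (its domain).
# The names are illustrative only and are not taken from the CRM.
PROPERTY_DOMAIN = {
    "has_blade_length": "Weapon",
    "was_used_in_combat": "Weapon",
    "used_in_ceremony": "Ceremonial_Object",
}

# Instance data as (subject, predicate, object) statements.
# One identifier, two classes: this is multiple instantiation.
statements = {
    ("item:sword_17", "rdf:type", "Weapon"),
    ("item:sword_17", "rdf:type", "Ceremonial_Object"),
}


def classes_of(item, triples):
    """Return every class asserted for the identifier."""
    return {o for s, p, o in triples if s == item and p == "rdf:type"}


def applicable_properties(item, triples):
    """Return the properties allowed by any class assigned to the identifier."""
    assigned = classes_of(item, triples)
    return sorted(p for p, d in PROPERTY_DOMAIN.items() if d in assigned)


print(applicable_properties("item:sword_17", statements))
```

Removing one of the two rdf:type statements narrows the result to the properties of the remaining class. No new record or template instance is involved.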

A system based on data structures must create a different instance of the respective templates for each class an item belongs to. It may later link the different instances describing aspects of the same thing, in order to simulate the mechanism. In particular, the very successful “encapsulation principle” of object-oriented programming languages requires dedicated data structures and constitutes a fundamental mismatch with the Open-World modeling of semantic relationships (see, for instance, Schnase 1993). Also fundamental to semantic data integration are superproperties, which are not provided by data structures either.

The CRM as an ontology relies heavily on multiple instantiation: Classes that tend to co-occur on things “incidentally”, without being associated with properties only applicable to the combination of such classes, are not modelled individually as subclasses of multiple parent classes. The latter would be called “multiple IsA”. Avoiding multiple IsA in such cases is an important normalization principle that keeps the ontology very compact and unambiguous.

Most implementations on top of RDF still use RDF as if it were a fixed schema and repeat the whole schema in the UI code. Therefore, the promise of RDF and other semantic models to accommodate new properties dynamically often does not work. It is still as if they were using relational systems. Generic XML editors do already adapt to the schema, but the rendering paradigms they usually employ, without additional parameters, are too poor for good UI code. One can, however, write code that reads the RDF schema in use at run-time and extends data entry and display by the actual properties found. This functionality is foreseen by SPARQL, but most programmers still do not appreciate the utility of querying the schema. Even if fixed templates are used, the data entry system should foresee that the same thing can be described by multiple templates, relatively freely selectable by the user.
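As a rough illustration of reading the schema at run-time, the following sketch keeps schema and data in one set of statements and derives the fields of a data entry form from rdfs:domain and rdfs:subClassOf. It is plain Python rather than SPARQL against a real store, and all ex: names are made up. A SPARQL endpoint could answer the same question with a query over the same two schema properties.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"

# Schema and data live in the same statement set, as they do in RDF.
# All ex: names are made up for this sketch.
graph = {
    ("ex:Weapon", SUBCLASS_OF, "ex:Physical_Object"),
    ("ex:Ceremonial_Object", SUBCLASS_OF, "ex:Physical_Object"),
    ("ex:has_current_location", DOMAIN, "ex:Physical_Object"),
    ("ex:has_blade_length", DOMAIN, "ex:Weapon"),
    ("ex:used_in_ceremony", DOMAIN, "ex:Ceremonial_Object"),
    ("ex:sword_17", RDF_TYPE, "ex:Weapon"),
    ("ex:sword_17", RDF_TYPE, "ex:Ceremonial_Object"),
}


def superclasses(cls, g):
    """Return cls together with all of its direct and indirect superclasses."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(o for s, p, o in g if s == current and p == SUBCLASS_OF)
    return seen


def form_fields(item, g):
    """Properties a data entry form should offer for item, read from the schema."""
    classes = set()
    for s, p, o in g:
        if s == item and p == RDF_TYPE:
            classes |= superclasses(o, g)
    return sorted(s for s, p, o in g if p == DOMAIN and o in classes)


print(form_fields("ex:sword_17", graph))

# A property added to the schema later shows up without any change to the code.
graph.add(("ex:has_owner", DOMAIN, "ex:Physical_Object"))
print(form_fields("ex:sword_17", graph))
```

Nothing in form_fields names a particular class or property, so the UI code does not repeat the schema.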

In the specification modules of mapping software used to transform data into a CRM-compatible form, care must be taken to foresee and allow the user to combine RDF classes systematically. It may be useful to develop tools for specific guidance that show users how a valid path from a given domain class to a certain range class can be created by using multiple instantiation (and, by the way, also by using subclasses of the domain class), such as combining /E41 Appellation/ with /E33 Linguistic Object/ in order to reach /E56 Language/ via /P72 has language/.
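A guidance tool of this kind could start from something as small as the sketch below. It holds a deliberately partial excerpt of CRM class and property declarations in Python dictionaries (a real tool would load them from the RDFS file) and reports, for a start class and a target class, which property reaches the target and which class, if any, must be added by multiple instantiation. The function names are hypothetical.

```python
# Partial excerpt of CRM declarations, reduced to what this sketch needs.
SUPERCLASSES = {
    "E41 Appellation": {"E90 Symbolic Object"},
    "E33 Linguistic Object": {"E73 Information Object"},
    "E73 Information Object": {"E90 Symbolic Object"},
}

# (property, domain, range)
PROPERTIES = [
    ("P72 has language", "E33 Linguistic Object", "E56 Language"),
    ("P106 is composed of", "E90 Symbolic Object", "E90 Symbolic Object"),
]


def ancestors(cls):
    """Return cls and all its superclasses known to the excerpt."""
    result = {cls}
    for parent in SUPERCLASSES.get(cls, ()):
        result |= ancestors(parent)
    return result


def suggest_paths(start, target):
    """List (property, extra_class) pairs that lead from start to target.

    extra_class is None when start already inherits the property, otherwise
    it names the class to add to the instance by multiple instantiation.
    """
    inherited = ancestors(start)
    suggestions = []
    for name, domain, rng in PROPERTIES:
        if target not in ancestors(rng):
            continue  # this property does not lead to the target class
        suggestions.append((name, None if domain in inherited else domain))
    return suggestions


for prop, extra in suggest_paths("E41 Appellation", "E56 Language"):
    if extra is None:
        print(f"use {prop} directly")
    else:
        print(f"also declare the instance as {extra}, then use {prop}")
```

The sketch does not check whether the suggested class combination makes sense for the thing described. That judgment stays with the user or with additional rules in the tool.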

In a local system, another workaround for multiple instantiation can be the creation of classes that replace all candidate cases for multiple instantiation by subclasses using multiple IsA. For good reasons, the compatibility with the CRM is defined at the import/export/query level and not at the system internals. Therefore, such internal workarounds do not affect the interoperability: Whereas the query compatibility of this solution with the standard is immediate, the respective import/export system simply needs to make the trivial replacements of the respective class combinations with their multiple IsA counterparts and vice-versa.
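The replacement at the import/export boundary could look roughly like the following sketch. The local class local:Linguistic_Appellation is a hypothetical name for a class that the local system declares as a subclass of both CRM classes, and the prefixed CRM identifiers are written in an assumed local spelling.

```python
# Hypothetical local class declared as a subclass of both CRM classes.
COMBINED = {
    frozenset({"crm:E41_Appellation", "crm:E33_Linguistic_Object"}):
        "local:Linguistic_Appellation",
}
EXPANDED = {local: combo for combo, local in COMBINED.items()}


def types_by_subject(triples):
    """Group the asserted classes by the identifier they are asserted for."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return types


def to_local(triples):
    """Import: replace each known class combination by its multiple-IsA class."""
    result = set(triples)
    for subject, classes in types_by_subject(triples).items():
        for combo, local in COMBINED.items():
            if combo <= classes:
                result -= {(subject, "rdf:type", c) for c in combo}
                result.add((subject, "rdf:type", local))
    return result


def to_crm(triples):
    """Export: replace each local multiple-IsA class by the CRM combination."""
    result = set()
    for s, p, o in triples:
        if p == "rdf:type" and o in EXPANDED:
            result |= {(s, p, c) for c in EXPANDED[o]}
        else:
            result.add((s, p, o))
    return result


data = {
    ("ex:name_1", "rdf:type", "crm:E41_Appellation"),
    ("ex:name_1", "rdf:type", "crm:E33_Linguistic_Object"),
    ("ex:name_1", "crm:P72_has_language", "ex:greek"),
}
print(to_crm(to_local(data)) == data)
```

The sketch assumes that the registered combinations do not share classes with one another. A system with overlapping combinations would need a rule for which replacement takes precedence.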

So, partially, problems with multiple instantiation are a question of programming practice. On the other hand, they are also a question of user training and extended good practice. Users may provide feedback about frequent cases where multiple instantiation is used, in order to guide other users to these modelling cases. These could systematically be entered into the CRM RDF implementation, without requiring the CRM standard itself to repeat them."

John L. Schnase (1993). "Semantic Data Modelling of Hypermedia Associations", in: ACM Transactions on Information Systems, Vol. 11, No. 1, January 1993, p. 45.


Comments welcome!

Best,


Martin

--
------------------------------------
 Dr. Martin Doerr

 Honorary Head of the
 Center for Cultural Informatics

 Information Systems Laboratory
 Institute of Computer Science
 Foundation for Research and Technology - Hellas (FORTH)

 N.Plastira 100, Vassilika Vouton,
 GR70013 Heraklion,Crete,Greece

 Vox:+30(2810)391625
 Email: [email protected]
 Web-site: http://www.ics.forth.gr/isl
