Re: [Crm-sig] clarification needed

martin Thu, 10 Jul 2008 15:47:07 +0300

Guenther Goerz wrote:

Dear Martin,


Because your last comment is shown somewhere in the text below, let me
answer here.  It's good to hear that my interpretation seems to match
with your intention, but then I don't understand the question in your
second paragraph.

1. Yes, "constant" is the term I used --- as opposed to "variable" ---
to explain the term "nominal", which may not be familiar to people
unfamiliar with description logics, to indicate that I talk about a
linguistic object which is used as an identifier for an individual, or
a "particular", if you prefer that.


In this case, an instance of Type cannot be a "constant". Instances of Type
are the concepts meant by the terms themselves, not the identifiers. Identifiers
in the CRM are instances of Appellation. This conforms with SKOS. Compatibility
of the CRM with SKOS is a desirable feature.

Nor are there "variables" in the CRM.

Yes, we prefer "particular".

For general terms of logic, my
terminological references are works such as the (online) Stanford
Encyclopedia of Philosophy or the Routledge Encyclopedia, not some
coding experience you may associate with the terms.

I wonder if this Encyclopedia regards "constants" and "variables" as
elements of an ontology. Are you sure? Please cite!

Just a side remark: If you
suggest "particular" as a central CRM term, there should be a separate
entry for it in the glossary.  In 4.2.4, it is just mentioned in the
entry on "universal".

Good point. This is an ISSUE!


2. Perhaps you could be so kind to explain in more detail what you
mean by a mediation system where some instance of a table "agent"
would have type "artist"... etc. It seems to me that you are referring
to a relational data base with a schema which is not clear to me by
your short remark. I'm not sure whether you are asking to introduce
redundancy, which could be done, of course, but I don't see why.


By "mediation system" I mean what Gio Wiederhold described in general,
and here in particular the problem of integrating typical relational systems,
of which we have seen dozens or hundreds, under a global model in a knowledge
representation system similar to RDFS. The relational schema has a table
"agent" with a field "type", and the value of "type" in some record is "artist".
I refer in particular to Local-As-View (LAV) integration.


My scenario was the following:

We associate a (domain ontology) class Artist with the CRM as a
subclass of E21 Person (just as FRBR classes are associated to CRM
classes) and we can then generate instances of this class, e.g. the
Artist "Vincent". We could modify that, of course, by instead making the
class Artist a subclass of E39 Actor.  If this is your argument, I
agree --- it's an improvement.  So, the Artist "Vincent" is an E39
Actor and as such an E21 Person.  Furthermore, we generate an instance
"Artist" (notice the string quotes) of E55 Type which is a term taken
from a thesaurus (let it still be WordNet).  So, the example has to be
modified in that now E39 Actor "Vincent" P2 has type E55 Type
"Artist". I don't see the need for any further changes, in particular
not to express something IN the thesaurus --- did you want to say
BETWEEN the thesaurus and the CRM??   The subsumption relation in CRM
in combination with the property P2 should be sufficient: From
"Vincent" we could get by the inverse relation of P2 is type of to E55
"Artist" and by the CRM subsumption hierarchy to E39 Actor and E21
Person, etc. So, there is no redundancy, but, as I said, to keep
semantic integrity --- i.e. that the meaning of the domain class
Artist and the E55 Type "Artist" coincide --- is the business of the
user who decides to use the E55 Type machinery.
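
The scenario above can be sketched as follows. This is only a minimal
illustration, assuming a simplified three-class fragment of the hierarchy;
the dictionary names and helper functions are hypothetical and not part of
the CRM or of any implementation discussed in this thread.

```python
# Hypothetical, simplified fragment of the class hierarchy (assumption:
# only the classes named in the thread, with Artist as a domain subclass).
SUPERCLASS = {
    "Artist": "E21 Person",
    "E21 Person": "E39 Actor",
}

# "Vincent" is an instance of the domain class Artist ...
INSTANCE_OF = {"Vincent": "Artist"}
# ... and, independently, P2 has type points to the E55 Type "Artist",
# which is a thesaurus term here, not a class.
P2_HAS_TYPE = {"Vincent": ["Artist"]}


def superclasses(cls):
    """Return all superclasses of cls by following SUPERCLASS upward."""
    result = []
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        result.append(cls)
    return result


def classes_of(individual):
    """Direct class of the individual plus everything it is subsumed by."""
    direct = INSTANCE_OF[individual]
    return [direct] + superclasses(direct)


def is_type_of(term):
    """Inverse of P2: all individuals typed with the given E55 Type term."""
    return sorted(i for i, terms in P2_HAS_TYPE.items() if term in terms)
```

Nothing in this sketch ties the class name Artist to the term "Artist";
keeping the two in step is exactly the semantic-integrity burden that is
left to the user.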


Good. In that case, we have a duplication of all terms in the thesaurus with
corresponding classes specializing the CRM. If you suggest this as an
implementation, then for UMLS, for instance, you have to create 5 million
classes with some IsA cycles in them. Why create the subclass "artist" when
you have the thesaurus?


Now you brought in a new term, "agent", as the identifier of some
(relational DBS) table.  Assuming that "agent" is also in the
thesaurus, you could of course generate another instance of E55 Type:
"Agent".

If "Artist" in the thesaurus is a narrower term than "Agent", we can
say that the thesaurus term "Agent" also applies to "Vincent" because it
is a more general term than "Artist" to which Vincent is already
related by P2-inverse is type of: Referring to the thesaurus, whatever
we call an artist can also be called an agent. This should at least
cover a part of your question, but maybe I missed something due to the
brevity of your remark.  If the term "Agent" is not in the thesaurus
(it certainly is in WordNet...), we have to find another solution; in
this case we could look whether there is a means to express that it is
a synonym to some term in the thesaurus --- then the machinery works
as well.
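
The broader-term argument above can be sketched as follows. The thesaurus
content and the function names are assumptions made for illustration only.
The traversal keeps a set of visited terms, so it terminates even if the
thesaurus contains cycles.

```python
# Hypothetical broader-term links of a thesaurus (assumption for illustration).
BROADER = {
    "Painter": ["Artist"],
    "Artist": ["Agent"],
}


def broader_closure(term):
    """All terms reachable via broader-term links; terminates on cycles."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in BROADER.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def applicable_terms(p2_types):
    """Terms assigned via P2 has type plus all of their broader terms."""
    terms = set(p2_types)
    for t in p2_types:
        terms |= broader_closure(t)
    return terms
```

With "Vincent" typed as "Artist", applicable_terms(["Artist"]) also yields
"Agent", without any new class being created.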

I actually meant that the mediator performs a mapping process, and
that "agent" maps to "E39 Actor". Then, "type" is either mapped to E55 Type,
or interpreted as a subclass statement for E39 Actor. There is a large body
of literature about mediation systems.
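
The two readings of the "type" field can be sketched as follows. This is
not a real mediator, only a minimal illustration: the row layout, the
triple notation and the function name are assumptions.

```python
def map_agent_row(row, type_as_subclass=False):
    """Map one record of the hypothetical 'agent' table to simple triples."""
    subject = row["name"]
    # The table "agent" itself maps to E39 Actor in both readings.
    triples = [(subject, "instance of", "E39 Actor")]
    if type_as_subclass:
        # Reading 2: the field value names a subclass of E39 Actor.
        triples.append((row["type"], "subclass of", "E39 Actor"))
        triples.append((subject, "instance of", row["type"]))
    else:
        # Reading 1: the field value becomes an instance of E55 Type.
        triples.append((row["type"], "instance of", "E55 Type"))
        triples.append((subject, "P2 has type", row["type"]))
    return triples
```

For the record {"name": "Vincent", "type": "artist"} the first reading
yields a P2 has type statement, the second a subclass statement; the
mediator has to choose one of them per source schema.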


3. Decidability: No, sorry, you are wrong. There is of course an
equivalent of undecidability in procedural languages (and Turing
machines --- this was Turing's point in his 1936 paper): endless loops
or infinite recursion, resp.


Of course. Sorry for my shorthand. I have 20 years of programming practice.
No reasonable programmer would leave a procedure in an endless loop.
This is regarded as a bug in any good practice. My argument was that
you can control the cases of undecidability in procedural code, and I still
insist on that. In that case, it is a question of the middleware, and not a
problem I have to circumvent by compromising a semantically correct representation.

For further details, see David Harel's
nice book "Computers Ltd.: What They Really Can't Do" or some other
textbook in Theoretical Computer Science on the halting problem.  That
is the reason why I have been insisting on description logics (or
OWL-DL), because for any sentence to prove or query to answer the
reasoner stops and comes up with an answer. Provably, this is not the
case for full first-order logic and of course for any more powerful
languages (like those with metaclasses and any other higher order
constructs).  And, although it seems to be just a theoretical issue,
it is of utmost and immediate importance for practice (and
practitioners): What does it help if someone can express whatever he
wants but there are (provably) no means to decide whether it holds or
not?

Of course it helps to know that there is no answer. Since when are questions
forbidden that have no answer? Most historical questions have no answer.
I thought the argument was that the reasoner
crashes. Still, I would like to see a real-life example. The source Vladimir
pointed to actually describes narrower necessary conditions for undecidability
than the set of necessary conditions you describe. What I am looking
for is whether a specific set of arguments we want to make on a metaclass level
leads to undecidability. Obviously, not all questions on metaclasses do that.

Best,

martin


Best regards,
-- Guenther

On 7/1/08, martin <[email protected]> wrote:

Dear Guenther,


 Guenther Goerz wrote:

Dear Christian-Emil,

it's pretty hard to get to your point because your text is truncated in
the middle of a sentence at the end of the first paragraph.

In your first sentence, you are referring to my "first stroke
paragraph" which is, I think, the one in the <quotation> starting with
"The usual way to attach concepts..."  What I am talking about here is
to generate an "extension" of the CRM --- if at all --- only in the
sense of attaching a domain ontology, i.e., concepts of a domain
ontology to the (reference ontology) CRM as it is common in "ontology
engineering".  It's just in the same fashion as FRBR concepts are
connected to CRM concepts.

For the following, there is no need to refer to OWL-DL at all --- it's
only because the quotation was taken from a paper about the OWL-DL
implementation of CRM.  Just think of First-Order Logic, or
preferably, a decidable subset of it.

Now, for the "E55 type hierarchy": In my copy of the CRM document,
v.4.2.4, it is mentioned in the introductory paragraph "On Types" ---
the one we are discussing --- on pp. 17f. and furthermore on
pp. 51 (in the section on E55), 65, 74, 75.  In none of these places I
found that a type hierarchy under E55 Type *SHOULD* have a subclass
for each of the classes of the CRM.  Maybe I missed something, so
please direct me to the proper text location.  It could have, of
course, in particular cases.  Whether there are subclasses (for
each??) of the classes of the CRM will in my opinion depend on the
purpose of the particular modelling activity.  (Just a side remark
w.r.t. infinite recursion: If this is serious, I think it is a
conceptual bug and not a feature and it should be fixed as soon as
possible.  This is another argument in favor of my remark that the
text proposed to vote on is not technically mature.  And, by the way,
referring to your earlier remarks about dissemination of the CRM: How
would you explain to a practitioner that this makes sense, and if it
does, what he could gain from it? I think my proposal would not allow
infinite recursion in the sense that the "has-lexconcept" property
mentioned below would have an inverse property pointing back to the
one and only exemplar of the CRM.)

Representing domain classes as subclasses of CRM-classes was my first
idea as well.  And it will work fine regarding subsumption and
inheritance.  In fact, reasoning can be done intensionally (i.e., on
the formal expressions by which the classes are defined) as well as
extensionally (i.e., on the set of all individuals belonging to a
class).  But we have to be careful with E55 Type because the E55 class
has in my view been conceived to provide a weak form of reification.
If we want to keep a decidable version of the CRM we may not allow
full reification (as in unrestricted RDF), because otherwise we could
generate paradoxes, owing to its ability to express
self-reference.  Now, for my mentioning of "contradiction" in the last
sentence of the resp. paragraph which I think is what you refer to by
"why this causes inconsistency": If we allow "artist" to be a subclass
of E21 Person and at the same time to be a subclass of E55 Type, we
are in trouble: E55 Type is a subclass of E28 Conceptual Object which
is something immaterial, whereas an E21 Person is something material.
So, we would have the artist Vincent, being material and immaterial at
the same time.  This is in fact Martin's argument in the discussion of
the last day of the Heraklion workshop in May (which you missed, if I
remember correctly)
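
The contradiction can be made concrete with a small sketch. The hierarchy
below is deliberately simplified, and "Material" and "Immaterial" are
hypothetical stand-ins for the material/immaterial distinction, not CRM
class names; the disjointness between them is an assumption taken from the
argument above.

```python
# Simplified superclass links (assumption: only what the thread mentions).
SUPERCLASSES = {
    "Artist": ["E21 Person", "E55 Type"],  # the problematic double subclassing
    "E21 Person": ["Material"],
    "E55 Type": ["E28 Conceptual Object"],
    "E28 Conceptual Object": ["Immaterial"],
}
# Pairs of classes assumed to have no common instances.
DISJOINT = [("Material", "Immaterial")]


def ancestors(cls):
    """All direct and indirect superclasses of cls."""
    found = set()
    stack = [cls]
    while stack:
        for parent in SUPERCLASSES.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def conflicts(cls):
    """Disjoint pairs that cls falls under on both sides."""
    up = ancestors(cls) | {cls}
    return [pair for pair in DISJOINT if pair[0] in up and pair[1] in up]
```

Under these assumptions conflicts("Artist") reports the pair, so no
instance such as Vincent could consistently belong to Artist, whereas
E21 Person and E55 Type on their own raise no conflict.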

 To my understanding, the solution you describe below is exactly the current
 state of the CRM. I do not prefer the term "constant", because it does not
 exist in the CRM, and comes from an encoding perspective, but rather use the
 term "particular".

 Note that in a mediation system, the fact that an instance of a table "agent"
 would have type "artist" may be used to decide that the instance is not only
 an instance of Actor but also of E21 Person. In order to do so, we must
 express a relationship in the respective thesaurus between "artist" and
 E21 Person. How would you do that?

 About the issue of decidability: To my understanding this issue occurs only
 if a declarative language is used. Procedural implementations would not have
 this problem. The CRM however does not make any assumptions about the
 implementation method.

Therefore, another solution has to be provided, which I did in my
second "stroke paragraph", beginning with "Instead, a constant
"Artist" may be used..."  So we have the E21 Person Vincent which P2
has type E55 Type "Artist" (a nominal, i.e. a constant, not a
subclass).  In this representation, we can reason with and on terms
without problems, using the term hierarchy (which may be called an
"E55 Type hierarchy").

To avoid misunderstanding, let me point out what I would understand by
a "E55 Type hierarchy": Take WordNet as a thesaurus --- cum grano
salis, just for the sake of its easy availability --- , and take
furthermore my "artist" example.  The hyperonymy in WordNet provides
"creator" as a broader term to "artist" and "person" as a broader term
to "creator".  Narrower terms to "artist" are "painter", "sculptor",
etc.  In general, I would not claim that such a term hierarchy is a
priori a class hierarchy in the sense of CRM and therefore I would
hesitate to merge both.  I am aware that some people tried to turn
WordNet itself into a formal ontology, but this is another story.
Here I will hold up the claim that the super-/sub-class relation in
the CRM is not *IDENTICAL* to the broader/narrower term relation in
WordNet.  Now we could "navigate" in the CRM class hierarchy
(i.e. perform subsumption inferences) and we could navigate in the
WordNet hierarchy, but to mix both needs further justification.  First
of all I think that a naive combination of both would make problems.

But there is a more sophisticated way to combine both, namely the one
I described in the third paragraph starting with "both representations
are not mutually exclusive..."  Let me repeat that in this case the
semantic integrity is within the user's responsibility.  In the last
paragraph of my last mail I referred to a technical solution we found
to do such a hybrid navigation by introducing a special property
"has-lexconcept" (and its inverse).  With CRM, we would instead have
E55 Type as the "interface" between CRM classes and terms from some
thesaurus related by the property P2 has type (is type of).  Instances
of CRM classes as E21 Person represent (domain) objects, whereas the
WordNet entries just represent words (terms) and their use.

With this background, let me try to understand your T21 example: Let's
connect E21 Person via P2 has-type E55 Type to the WordNet term
"person".  Making the transition to WordNet, we could then navigate
to the subconcepts I mentioned, like "artist" (or do you mean
something different by a "person thesaurus"??).  Then we would find
that the term ("lexical concept" as we usually call it, because in
general it represents a synonym set which is an equivalence class)
"artist"/WordNet is a narrower term of T21 "person"/WordNet. We might
call the former T21-1 and we could use it to say by means of P2 that
some E21 Person instance P2 has type E55 Type (T21-1)
"artist"/WordNet.  Did I get you right???

Best regards,
-- Guenther


On 6/22/08, Christian-Emil Ore <[email protected]> wrote:

Dear Günther,
 One does not need to extend the CRM to get your first stroke paragraph. The
CRM states that the class hierarchy under E55 Type should have a subclass
for each of the classes in the CRM.  This, of course, opens the door to
endless recursion, but that is not the point here. So for the class E21 Person
there exists a sub...subclass of E55 Type, let us call it T21. The intention,
at least according to my understanding, is that the terms in a Person
thesaurus should be mapped to a type (in the CRM sense) hierarchy under this
type, T21, corresponding to the formal structure of the thesaurus. An instance
of E21 Person can then be connected via P2 to an instance of a subclass in
this hierarchy or to an instance of T21 itself. Due to the generalisation
mechanism in the CRM all P2 insta

 The latter case corresponds to your artist case. Could you please explain
to me, without referring to rdf/owl, why this causes inconsistency? I just
don't understand your line of argument.

 Regards,
 Christian-Emil



 On 13.06.2008 15:24, Guenther Goerz wrote:

Dear all,

As an attempt to clarify the problem of modelling alternatives with
CRM-types --- this is the term I will use to distinguish it from other
uses of "type", as e.g. in computer science --- let me start with
quoting a section from the paper I submitted to the CIDOC 2008
conference.  In order to avoid higher-order (logic) constructs which
in my view are probably hard to comprehend for practitioners anyway,
without excluding a weak form of reification completely, I suggested
two ways of representation:

<quotation>
``... E55 Type has been implemented as a class which ---
for the purpose of reasoning on the conceptual level --- may serve as
an interface to external concepts of formal domain ontologies (or
thesauri) as subclasses or as constants.  In fact, at least two
different representations are possible:

- The usual way to attach concepts of a domain ontology to the CRM is
 direct subclassing, e.g., the (application domain) class Artist as a
 subclass of E21 Person.  So, ``Vincent van Gogh'' would be an
 instance of Artist and inherit all properties of E21 Person.  In
 that case to represent Artist also as a subclass of E55 Type would
 lead to contradictions.

- Instead, a constant ``Artist'' may be used; in general, it will be a
 term of a domain-specific thesaurus.  Such constants
 (``individuals'') are admitted in T-Boxes by means of the ``one-of''
 OWL-DL language construct, i.e. an enumeration datatype.  They
 correspond to classes with singleton extensions.  So, we could
 represent ``Vincent van Gogh'' as an immediate instance of E21
 Person and relate it by P2 has type to E55 Type with value
 ``Artist''.  In this case, of course, the constants cannot have
 instances in turn.

Both representations are not mutually exclusive; in our example the
name of the class Artist (case 1) could additionally be used as a
constant which is assigned as a value to E55 Type (case 2), but then
it is up to the user to guarantee for semantic integrity.  In the
second case the intention expressed in the CRM document is supported
that it shall be possible to deal with domain concepts --- such as
Artist --- as objects of discourse.  Which of these representations
will be chosen for a particular application will depend on the
intended use of the domain model.''
</quotation>

What I am proposing here is to provide possibilities to argue with and
about terms, i.e. use terms in reasoning in a ``de re'' and a ``de
dicto'' mode.

De re corresponds to the first way of representation: We introduce
domain level classes as subclasses of CRM-classes.  The domain level
classes may have subclasses in turn as, e.g., Painter and Sculptor as
subclasses of Artist; so all of our instances from, e.g., a museum data
base are instances of the domain ontology which in turn is connected
to CRM as a reference ontology.  In Description Logic (OWL-DL),
reasoning with classes (concepts) is possible
- intensionally, i.e. in terms of their defining expressions (T-Box),
 or
- extensionally, i.e. in terms of their instances (A-Box = extension,
 i.e. the set of instances).

It is important to keep in mind that we have an ``open world''
semantics, i.e. if we have as a necessary condition for a FATHER that
he is a PERSON and that Exists some CHILD, which is a PERSON,
(``existential restriction''), we can represent an instance of a
PERSON who claims to be a father without representing explicitly the
particular CHILD --- in open world semantics ``I don't know'' is a
legitimate answer to the question for a child.  On the other hand,
there may be two PERSONS claiming to be the father of some particular
CHILD (probably a rare case in the real world...), if we do not
combine the existential restriction with a cardinality restriction,
i.e., that there must be exactly one father for any CHILD.

De dicto corresponds to the second way of representation: We introduce
a constant as the value of E55 Type.  In this case, we can reason
about the term itself, and not about its denotation.  If we use a
thesaurus where a broader term for ``Artist'' were ``Person'' and
narrower terms were ``painter'', ``sculptor'', etc., we can reason in
the narrower-term--broader-term thesaurus hierarchy as opposed to the
class hierarchy in the domain.  Formally, in both cases we have a
subsumption hierarchy; in our example on the one hand a class
hierarchy which contains CRM classes with integrated domain classes
which have a denotation in the domain (de re), in the second case in a
thesaurus hierarchy of terms which don't have instances (de dicto).
As mentioned above, there may be combinations where the terms are
related to domain classes by a particular property such as (an
extended version of) P2 has type.

The latter situation reminds me of a technique we applied in our NLP
work where we have a domain ontology representing a rich domain
semantics on the one hand and a hierarchical lexicon --- WordNet in
our case --- on the other hand.  The ``terms'' above would correspond
to WordNet's ``synsets'', i.e. sets of synonymous words.  As synonymy
is an equivalence relation, we have a typical case of an abstraction
from word to (lexical) concepts: Each word in the synset may represent
the equivalence class.  Then we introduced a property
``has-lexconcept'' into the ontology which relates domain concepts to
the lexical concepts, i.e. words by which they are expressed.  But
this relation has to be maintained by the system implementors and must
be handled with care: Of course, it's up to them to care for semantic
integrity.  We can reason within the domain class hierarchy, as well
as within the lexical concept hierarchy (WordNet), but combined
inferences are possible as well.  In the case of subsumption
inferences, e.g., given a certain (domain) class and, by virtue of
has-lexconcept, the corresponding lexical concept(s), we could ask for
its superclass and the words it is related to.  Furthermore, we can
stay in the WordNet hierarchy, look for the broader lexical concept
(synset), and ask for the domain concept it corresponds to --- if there
is one --- by virtue of the inverse relation of has-lexconcept.
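
The combined navigation described above can be sketched as follows. This
is not the system mentioned in the text, only a minimal illustration: the
three dictionaries, their content and the function names are assumptions,
and each synset is represented by a single word.

```python
# Hypothetical domain class hierarchy.
SUPERCLASS = {"Painter": "Artist", "Artist": "E21 Person"}
# Hypothetical has-lexconcept links from domain concepts to lexical concepts.
HAS_LEXCONCEPT = {"Painter": "painter", "Artist": "artist",
                  "E21 Person": "person"}
# Broader lexical concepts, following the WordNet example in this thread.
BROADER = {"painter": "artist", "artist": "creator", "creator": "person"}


def words_for_superclass(cls):
    """Go up in the class hierarchy, then cross over to the lexicon."""
    parent = SUPERCLASS.get(cls)
    return parent, HAS_LEXCONCEPT.get(parent)


def domain_concept_for_broader(cls):
    """Cross over to the lexicon, go up there, then come back if possible."""
    broader = BROADER.get(HAS_LEXCONCEPT[cls])
    inverse = {lex: concept for concept, lex in HAS_LEXCONCEPT.items()}
    # None signals that the broader synset has no domain concept attached.
    return broader, inverse.get(broader)
```

Starting from Artist, the lexical route leads to "creator", for which this
sketch has no domain concept; the two hierarchies need not line up, which
is why the links have to be maintained with care.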

Best,
-- Guenther

On 6/9/08, Vladimir Ivanov <[email protected]> wrote:

---------- Forwarded message ----------
 From: Vladimir Ivanov <[email protected]>
 Date: 2008/6/9
 Subject: Re: [Crm-sig] About Types: ISSUE PLEASE VOTE
 To: Guenther Goerz <[email protected]>


 Dear Guenther,

 > Section 9: I don't understand it at all.  Could you please explain ---
 > and perhaps also the colleagues who already voted for the text as
 > a whole what they understand?  As a side remark, I cannot make any sense
 > out of the last sentence.

 The only sense I could make of the last sentence was its correspondence
 to the OWL Full language.
 If one allows "E55.Types" to be treated both as classes and as instances,
 one may face problems with reasoning.

 Excerpt from OWL spec. (http://www.w3.org/TR/owl-ref/):
 "... However, use of the OWL Full features means that one loses some
 guarantees that OWL DL and OWL Lite can provide for reasoning systems."

 These "guarantees" are related to decidability of reasoning.

 "Inference in OWL Full is clearly undecidable as OWL Full does not
 include restrictions on the use of transitive properties which are
 required in order to maintain decidability." from
 (http://www.cs.man.ac.uk/~horrocks/Publications/download/2005/Horr05c.pdf,
 p. 2)

 As for sect. 9 as a whole,
 I think the main idea was that
 "you may implement a system of user-defined types (subclasses of E55
 and properties)
 at the necessary (in your application) level of granularity, but it should
 correspond to the CRM notion of type".

 Best regards,
 Vladimir.

 >
 > Best,
 > -- Guenther
 >
 >
 > On 6/4/08, martin <[email protected]> wrote:
 >> Dear All,
 >>
 >>  Following the decision in the last meeting, we have to decide via e-mail
 >>  vote on the updated attached text about types in the CRM document. I have
 >>  desperately tried to describe as exactly as possible what the CRM does,
 >>  and to avoid the metaclass question, since this is a philosophical rather
 >>  than an applied question in the current form the CRM describes.
 >>
 >>  Please VOTE:
 >>
 >>  ACCEPT [ ]
 >>
 >>  REQUEST MODIFICATION: [....]
 >>
 >>  by June 12.
 >>
 >>  Best,
 >>
 >>  Martin
 >>  --
 >>
 >>

--------------------------------------------------------------

 >>   Dr. Martin Doerr              |  Vox:+30(2810)391625        |
 >>   Principle Researcher          |  Fax:+30(2810)391638        |
 >>                                |  Email: [email protected] |
 >>                                                              |
 >>                Center for Cultural Informatics               |
 >>                Information Systems Laboratory                |
 >>                 Institute of Computer Science                |
 >>    Foundation for Research and Technology - Hellas (FORTH)   |
 >>                                                              |
 >>   Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
 >>                                                              |
 >>          Web-site: http://www.ics.forth.gr/isl               |
 >>

--------------------------------------------------------------

 >>
 >>
 >> _______________________________________________
 >>  Crm-sig mailing list
 >>  [email protected]
 >>

http://lists.ics.forth.gr/mailman/listinfo/crm-sig
