Dear Bernhard,

Thank you very much for your rich input!


Some quick answers:

Bernhard Schiemann wrote:
Dear all,
Regarding the compatibility document (V4), we have the following questions:
-I found no time slot for this issue during the London meeting. Shouldn't
the SIG discuss and/or vote on that issue in London?

See: http://cidoc.ics.forth.gr/agentas/18th_sig_agenda+13th_frbr_crm.htm
Thursday November 6, 11:30-13:00, "Final text for compatibility"

-The term "Compatibility" is used for this document and most of the
contents is about the compatibility between technical systems. (first
sentence: "of their data structures"). Maybe this could be formulated
precisely by changing the headline to e.g.
"CRM compatibility of technical systems and data structures" instead of
"Compatibility"

This is an "Issue". Is there any other compatibility you think of?

A general remark: As the goal seems to be to assert
compatibility with the CRM as defined in the CRM document, it is hard
to impossible to achieve this goal as long as we refer to a text which
is open for interpretation.  Compatibility in a rigid sense can only
be proved with a formal definition.  One could propose a weak concept
of compliance if there is a test suite against which a system claiming
that can be tested.  However, what is easier is to check for
incompatibility, i.e. it is far easier to say if something is
incompatible, i.e. in contradiction, with the CRM.

The CRM definition is and will be text based. The notions of subclass,
superclass etc., are clear enough to be transferred to a formal language.
This has good reasons we have discussed in the past.
I vote against any change of this.

The details of spelling out in a formal language are a question of good practice
and certification authorities.



-"In other words, it does not aim to provide more structure
than users have previously provided." (Page 1, section 1.1,
3. paragraph) Does this address the workflow to produce compatible data?
If you have a structure you can translate it to CRM structures, and if
you just have unstructured information, you have to structure it first?

Yes, the workflow of transforming data from a legacy format into a
CRM-compatible form. Many users had the impression that they should structure
their texts in order to do so. This is not intended.


-"exhaustively in terms of CRM concepts" (Page 2, section
1.2, use case no. 4), All instances of (a queried) CRM concept?

-***Nick*** means that Nick provides an appropriate section about use
cases? Or Examples?
Obsolete. Ignore.


-The next paragraph is the central point of this document (Page 2, 2nd
before 1.3, beginning with: "In the context"): the definition of
"without loss of meaning". Regarding the email Vladimir Ivanov already
sent, we want to add:
+"By virtue of this classification" Do you mean: "Using this
classification"?
What would the word "use" clarify? The fact that there is this classification,
once it has been used, lets the user understand.

+"expert conversant" How do you define an expert conversant? Is a field
expert e.g. an  art historian an expert for the classification of a
painting?

Sure. Obvious to our audience.


-The first paragraph of 1.3: "A CRM compatible form should
not implement the quantifiers ..." If the CRM contains
quantifiers, why shouldn't they be implemented by cardinality
restrictions? What do you mean by "form"? Do you mean a "profile"?
I try this at an example from the scope note:
"Quantification: many to one, necessary, dependent (1,1:1,n)
Scope note: ... A temporal entity can have in reality only one
Time-Span, but there may exist alternative opinions about it,
which we would express by assigning multiple Time-Spans.
Related temporal entities may share a Time-Span..." coded as
e.g. E2.Temporal_Entity P4.has_timespan minimal one
E52.Time-Span. Where is the problem?

I propose to change the 2nd phrase in 1.3 to:

"We call any encoding of such CRM instances in a formal language that preserves the relations between the CRM classes, properties and inheritance rules a “CRM-compatible form”".

The CRM suggests using a monotonic schema for information integration.
Cardinality constraints violate monotonicity. That is the problem, and the
solution we have agreed on so far. Do you suggest, as an issue, to drop
monotonicity?
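
As an illustration of the monotonicity argument, here is a minimal sketch in
Python. All names (the function, the triples, the identifiers) are
hypothetical and not part of the CRM, and it assumes that a cardinality
constraint is enforced as a validity check on the integrated data. Two
sources each record one opinion about the Time-Span of the same temporal
entity; each is acceptable alone, but their union is rejected.

```python
# Hypothetical sketch: statements are (subject, property, object) triples.
# Assumption: the quantifier "at most one Time-Span" is enforced as a
# validity check on whatever set of statements is currently held.

def violates_max_one(statements, prop):
    """Return the subjects that have more than one value for `prop`."""
    values = {}
    for subject, p, obj in statements:
        if p == prop:
            values.setdefault(subject, set()).add(obj)
    return sorted(s for s, objs in values.items() if len(objs) > 1)

source_a = {("battle_1", "P4_has_time-span", "timespan_opinion_a")}
source_b = {("battle_1", "P4_has_time-span", "timespan_opinion_b")}

# Each source passes the check on its own ...
ok_a = violates_max_one(source_a, "P4_has_time-span")
ok_b = violates_max_one(source_b, "P4_has_time-span")

# ... but their union, i.e. the integrated knowledge, fails it.
merged = source_a | source_b
conflicts = violates_max_one(merged, "P4_has_time-span")
print(ok_a, ok_b, conflicts)
```

Without the check, adding statements never invalidates what was accepted
before, which is the monotonic behaviour wanted for integration.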


-second paragraph of 1.3: A subset of a consistent set is consistent by
definition.

No. Cutting out IsA relations can create inconsistent models.
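
A minimal sketch of how this can happen, in Python. The helper functions and
instance names are hypothetical; the class and property labels follow the CRM
but serve only as examples. The reduced model keeps the same classes and the
same property, but one IsA link has been cut out, so a statement that the
full model accepts no longer fits the declared range.

```python
# Hypothetical sketch: a model is a set of IsA links plus property ranges.

def superclasses(cls, isa):
    """Return `cls` and all classes reachable from it via IsA links."""
    seen, todo = set(), [cls]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parent for child, parent in isa if child == c)
    return seen

def is_valid(statement, instance_of, isa, ranges):
    """Check that the object of a statement belongs to the property's range."""
    _subject, prop, obj = statement
    return ranges[prop] in superclasses(instance_of[obj], isa)

full_isa = {("E21_Person", "E39_Actor")}
reduced_isa = set()  # same classes kept, but the IsA link was cut out
ranges = {"P14_carried_out_by": "E39_Actor"}
instance_of = {"some_person": "E21_Person"}
statement = ("some_activity", "P14_carried_out_by", "some_person")

valid_in_full = is_valid(statement, instance_of, full_isa, ranges)
valid_in_reduced = is_valid(statement, instance_of, reduced_isa, ranges)
print(valid_in_full, valid_in_reduced)
```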

I propose to add the following 4th condition:
 •      any instance of the reduced CRM-compatible form is also a valid instance of a (full) CRM compatible form

Is there an implicit claim that the superset is
inconsistent?
No.

> Or do you mean "proper subset" ???
Our audience are not mathematicians...


Section 1.3
- The definition of single concepts that crm-compatible data structure
(or something else) must contain is problematic.
For example: Think of an address management system. You
could build such a system compliant to CRM concepts like E21.Person,
E53.Place etc. and with regard to the necessary properties and
inheritance rules. There would be no need of a E12.Production Event, so such
a system will never be crm-compliant, no matter how exact the other
concepts, properties and restrictions are taken into account.
Is this desirable?

The definition of an export-compatible system does not require any minimal
concepts.
You confuse the sender system with the receiver data structure. The receiver
data structure is required to have a minimal set of elements.


-Section 1.4 first paragraph, second sentence:
+"in in"

Thank you!

+What are "implicit concepts"? If "concepts" are present in elements of
a data structure they are not implicit, but explicit. Maybe it would be
easier to understand if one could provide an example of such a concept.
The whole story about formal ontologies for automatic processing is
that everything (concepts, properties) has to be defined explicitly.
Of course, you may infer by inheritance that e.g. an instance has some
(previously implicit) properties which are now made explicit by
virtue of the reasoning step. Nevertheless, the concepts and
properties are defined explicitly.

The source data structures are not instances of a formal ontology. Nothing like
that has been said. So, their concepts are not all made explicit. I do not see a
particular reason HERE to provide an example.

Neither has anyone said that the schema matching process would be automatic!
We require only automatic DATA transformation.
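
A minimal sketch of this division of labour, in Python. The field names, the
mapping table and the function are hypothetical assumptions for illustration
only. The schema mapping is written by hand, once; the transformation of each
record then runs automatically.

```python
# Hypothetical sketch: the schema mapping is written by hand (not inferred),
# while the transformation of each data record is fully automatic.
# Field names and target property labels are illustrative assumptions.

FIELD_MAPPING = {
    "maker": "P14_carried_out_by",
    "made_at": "P7_took_place_at",
}

def transform(record_id, record, mapping):
    """Turn one legacy record into (subject, property, value) statements."""
    statements = []
    for field, value in record.items():
        if field in mapping:  # unmapped fields are simply not exported
            statements.append((record_id, mapping[field], value))
    return sorted(statements)

legacy = {"maker": "some_person", "made_at": "some_place", "shelf": "B-12"}
exported = transform("production_1", legacy, FIELD_MAPPING)
print(exported)
```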


Section 1.4 / 1. Paragraph / 3. Sentence ("As long as these concepts can
 be encoded as instances of E55 Type (i.e. as terminology) ..."):
If a concept is encoded as instances of E55 Type in the sense of a term,
it has the notion of a lexical concept. As such it can not have the same
(suitable) properties as the original data item. Shouldn't we update
this to the definition of E55 as actually updated (in London)?

You misinterpret the English text. The "suitable properties" are used to
connect the term to the data item (such as "has type", or "in the role of").
For me, the text is unambiguous. Any other opinions?
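
A minimal sketch of this reading, in Python. The class names, the field
`has_type` and the sample values are hypothetical and invented for
illustration. The term is a separate object; the data item keeps its own
properties and is merely connected to the term.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: `Type` stands in for an instance of E55 Type (a term),
# `Person` for an instance of E21 Person. All values are invented.

@dataclass(frozen=True)
class Type:
    label: str  # the term itself carries no properties of the data item

@dataclass
class Person:
    name: str
    birth_date: str
    has_type: list = field(default_factory=list)  # link from item to term

artist = Type("Artist")
person = Person("example person", "1900-01-01", has_type=[artist])

# The birth date belongs to the person, not to the term "Artist".
print(person.birth_date, [t.label for t in person.has_type])
print(hasattr(artist, "birth_date"))
```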

Example: The term Artist (i.e. the propositional form Artist(x) in
Cristian-Emil's proposal) in a thesaurus does not have a birth date,
but instances of the subclass ARTIST of E21.Person (or whatever) of
course inherit this property if the superclass has it.

See above.



-Section 1.4 second paragraph: The first sentence could be easily
misinterpreted in a way that you have to consider at least only one
CRM-concept to build an export-compatible data structure. A reference to
the reduced CRM-compatible form defined in 1.3 should be added at this
point.
No backward references to definitions are needed in such a text. The reader
should not sleep...

I chose the term "represented" for "being mapped to". Of course, what sense
would it have otherwise?

What about:

"Note that not all CRM concepts may need to be matched with some elements of an export-compatible data structure."

I fear, that this is more arcane...


-Section 1.4 third paragraph: First occurrence of the term
"record". Is this a "data record" from a database?
Or just "record" in everyday language, i.e. a written account of
something.
Please let us know what reasonable alternative explanation would exist.

This points to a major problem in the whole document: It is
not clear whether the terms are used as in commonsense language or
whether they are used terminologically, i.e. in a scientific way.

OK, "data record". Any other ambiguities in the "whole document" ?


-Section 1.4 5. paragraph: How does the reduction work, if
we declare that a classification must be implemented "without
loss of meaning"? (relation to Page 2, 2nd before 1.3, beginning with:
"In the context")
Obviously, by a controlled loss as described here.

"Loss of meaning" can be stated only between formal systems --- if we
refer to text, "meaning" and "loss of meaning" can be argued about,
but there may be the situation that no agreement will be found.

This is no AI text. There exist scholars that understand natural language,
and their domain, and can say if a translation captures sufficiently
what they wanted to express.

We do not share your concept of meaning. For us, a formal system has
no meaning on its own. Only a human can associate meaning with formally
stated concepts. Constraints are not meaning.


-Section 1.5 first sentence: "all user data". How does this information
system work if "not all CRM concepts may be represented by elements of
an export-compatible data structure"? (Section 1.4, 2nd paragraph)

I don't understand the association. These two sentences have nothing to
do with each other.


-Section 1.5 2. paragraph: A "partially import-compatible data
structure" is not yet defined by this document.
This IS the definition. It says: "An information system is partially export compatible if ..."

-Section 1.5 3. paragraph: What are "generic data"?
The data that the system is designed for. The point is that someone may devise
a container, a flat file e.g., to import and export CRM data parallel to the
generic data.


-Section 1.5 5. paragraph: What is a "semantic reduction"?

I introduced "semantic" in 1.4, 5th paragraph:
"...been exported into a CRM compatible form by semantic reduction to CRM concepts"

to make the reference more clear.
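
One possible reading of such a reduction, as a minimal Python sketch. The
local classes, the lookup tables and the function are hypothetical, and
single inheritance is assumed for brevity. A local class that the CRM does
not know is replaced by its nearest superclass that the CRM does know.

```python
# Hypothetical sketch: reduce a local class to its nearest CRM superclass.
# Single inheritance is assumed; the local class names are invented.

CRM_CLASSES = {"E39_Actor", "E21_Person"}
LOCAL_ISA = {
    "Court_Painter": "Painter",
    "Painter": "E21_Person",
    "E21_Person": "E39_Actor",
}

def reduce_to_crm(cls):
    """Walk up the IsA chain until a CRM class is reached."""
    while cls not in CRM_CLASSES:
        cls = LOCAL_ISA[cls]  # raises KeyError if no CRM superclass exists
    return cls

reduced = reduce_to_crm("Court_Painter")
print(reduced)
```

The specialisation is lost in the export, but the reduced statement remains
true, which is the sense of a controlled loss.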

-Section 1.5 6. paragraph, last sentence: Do they choose on CRM basis?
What should that mean?

I propose to change the phrase to:
Note that local information system providers may choose to make their systems import-compatible with the CRM

-Section 1.5 figure XXX: The meaning of figure XXX is not really clear
to us. E.g. What does the data export arrow to the left side mean? The
paragraph beneath the figure claims that it shows only *some* of the
data flow patterns. It is not clear which patterns are shown and which
not (and why). If we really like to have figures in this document, there
should be one figure which shows all the data flows mentioned
(export-compatible, import-compatible, access-compatible). This figure
can easily become too complex, so that three figures, one for each
defined pattern, would be necessary.

I regarded the overview far better. The figure is only a help, not a formal
part. Please provide an alternative we can vote on.

Section 1.6, For export-compatible, a.: "other than" Does this mean all
other concepts except E1 and E77?

This is how my understanding of English is.

We additionally found some questions (sent by email) that are still left
open in the V4 document. E.g.:
-"Obviously there is also an implicit research issue: How to define a
mapping that proves incompatibility. Help from the computer science
community appreciated."
-"Should we distinguish notions of intensional/extensional meaning?
Should we introduce relationships that preserve meaning (equivalence,
subsumption)?"
-"Dear Nick,
I think only you and Patrick can answer this question: What do other
standards do about verification?"
-Some questions raised at earlier stages of the document (V2) by Prof. Görz.

Sure, we deliberately do not want to resolve all the implementation
questions. The document must make clear the effect of the compatibility,
not all the possible means to show it.

Ultimately, the user decides if the data are correctly interpreted and
handled, and not the programmer. The programmer has to find out how to
satisfy the user. The ISO text must describe the goal and effect, not the
how.


As a whole there seem to be many points that need to be further
clarified and discussed so that it is not ready to be forwarded
to ISO. Nevertheless I see the benefit of having a standardized and
detailed definition of crm-compatibility inside the iso-standard.
A suggestion: Would it be possible to send the updated crm-definition
with the old compatibility part to ISO this year and send a new
compatibility part as an update of the ISO-standard next year? That
would lower the deadline pressure of the discussion and could lead to a
compatibility-definition we all agree on.

So far I have had overwhelmingly positive feedback on the document.
It does not help anybody to delay this text, in favour of details that
could be added equally in the following years, or be part of a good
practice guide. Most of your comments pertain to rephrasing, which I hope I
have already resolved.

I do not see anything in your concerns that pertains
to the substance of the thing, except for the idea to define
"loss of meaning" by formal methods rather than by expert opinion,
which I vehemently advocate against, since this is not what the CRM was
made for.

I thank you very much for the scrutiny with which you read the text,
and all your effort, that will lead to improvements.

Best,

Martin




kind regards, on behalf of the authors
Bernhard



--

--------------------------------------------------------------
 Dr. Martin Doerr              |  Vox:+30(2810)391625        |
 Principle Researcher          |  Fax:+30(2810)391638        |
                               |  Email: [email protected] |
                                                             |
               Center for Cultural Informatics               |
               Information Systems Laboratory                |
                Institute of Computer Science                |
   Foundation for Research and Technology - Hellas (FORTH)   |
                                                             |
 Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
                                                             |
         Web-site: http://www.ics.forth.gr/isl               |
--------------------------------------------------------------