Re: Merging Databases

Pat Hayes Fri, 24 Jul 2009 09:10:11 -0700


On Jul 23, 2009, at 4:24 AM, Antoine Isaac wrote:

Hello,
Trying to add some explanation wrt. the SKOS vocabulary, hoping not to conflict with Pat's clarifications ;-)
For skos:exactMatch, the SKOS reference says [1]:
The property skos:exactMatch is used to link two concepts, indicating a high degree of confidence that the concepts can be used interchangeably across a wide range of information retrieval applications
So there can be substitution. But, contrary to what happens with owl:sameAs, this substitution is not automatic for *all* RDF triples the concept would be involved in. Actually it is left to implementers or ontology providers to define in which "context" two exactly equivalent concepts may be substituted.

To me, that makes this almost completely useless, and actually harmful to interoperation: for I have (the code I write has) no way to know what "context" is intended when reading such an assertion. Here I am with some RDF, and I am told that A skos:exactMatch B. Can I substitute B for A in my content? Nobody knows. The SKOS documentation doesn't know. Nothing in my RDF tells me the answer. I have no way to know where to go to discover the answer. If I decide to go with my high degree of confidence and make the inference, and in fact it wasn't appropriate, how will I ever discover this? What kind of formally detectable inconsistency would lead me (my code) to conclude that a mistake had been made, and where? Without some answers to some of these (obvious) questions, skos:exactMatch might as well be called skos:SpongeBobSquarePants.

As a (fictitious!) result, one concept may be substituted by another if it is the object of a dc:subject statement, but not if it is the subject of a dc:creator statement. The idea was really to be able to state some form of semantic equivalence that would be less committing than the RDF/OWL one.

And the problem, which this conspicuously fails to address, is how to make semantic sense of this idea of "semantic equivalence which is less committed". Until one does, calling it "semantic" is just hot air. Here's the central issue. The actual semantics of owl:sameAs is defined as co-reference. It does not mean that there are two things, A and B, which are equivalent in a strong sense (which can be weakened, no doubt, if we only think about it for a bit.) A sameAs B means that the two names 'A' and 'B' denote the very same thing. So if A sameAs B, there is ONE THING with two names. That is what the semantics specifies. But **being the very same thing as** is not something that admits of weakening or can be given degrees of commitment. Two very similar things can be more or less similar, but one thing cannot be slightly-not-the-same as itself. You can't be partially or approximately the same as yourself, because any relation other than identity requires the relationship to hold between TWO similar but non-identical things, and it is the very point of identity - sameAs - that when it holds, there is only ONE thing there. Now, turn this around. What other semantically defined circumstances would sanction substituting the name A for the name B in some assertion, **except** that A and B denote the same thing? Remember, to count as 'semantic', the criterion has to be stated in terms of what the names denote, not in terms of the names themselves. So 'A' denotes some thing, A, and 'B' denotes B, and we want some conditions on A and B, not on 'A' and 'B', that are enough to justify taking something said using the name 'A' and treating it as being said using the name 'B' instead. If A and B are different, what would give anyone a licence to transfer truths asserted about one of them to similar-but-different truths asserted about the other? What kind of "context" would this be, that permits such blatantly invalid inferences?
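To make the contrast concrete, here is a minimal sketch (not taken from any specification) of the two behaviours being discussed: unrestricted substitution, as co-reference licenses, versus substitution limited to a "context". The triples, the ex: names and the allowed_predicates parameter are all hypothetical; the last is exactly the piece of information that has to come from somewhere outside the RDF.

```python
# Hypothetical data: the ex: names are illustrative only.
TRIPLES = {
    ("ex:doc1", "dc:subject", "ex:A"),
    ("ex:A", "dc:creator", "ex:alice"),
}


def same_as_closure(triples, a, b):
    """Co-reference: a and b name one thing, so either name may replace
    the other in subject or object position of every triple."""
    def variants(term):
        return {a, b} if term in (a, b) else {term}

    closed = set()
    for s, p, o in triples:
        for s2 in variants(s):
            for o2 in variants(o):
                closed.add((s2, p, o2))
    return closed


def contextual_substitution(triples, a, b, allowed_predicates):
    """'Context'-restricted reading: swap a and b only when they occur as
    the object of a predicate the caller has declared safe (an assumption
    supplied out of band, not found in the data)."""
    closed = set(triples)
    for s, p, o in triples:
        if p in allowed_predicates and o in (a, b):
            closed.add((s, p, a))
            closed.add((s, p, b))
    return closed


if __name__ == "__main__":
    for t in sorted(same_as_closure(TRIPLES, "ex:A", "ex:B")):
        print("sameAs     ", t)
    for t in sorted(contextual_substitution(TRIPLES, "ex:A", "ex:B", {"dc:subject"})):
        print("contextual ", t)
```

Under the first function the dc:creator statement transfers to ex:B along with everything else; under the second it does not, but only because the caller chose to pass {"dc:subject"}.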

Hopefully you can see that a genuinely semantic analysis of this situation is somewhat difficult. You don't get it by inventing plausible-sounding relationship names, providing no axioms for them, and then being deliberately vague about what they mean and calling this 'reference documentation'.

Now, skos:exactMatch is transitive, which means that if a concept X in one vocabulary has been mapped to Y in a second vocabulary, and Y is connected to Z in a third vocabulary, then X and Z can be substituted.

Is it symmetric (A skos:exactMatch B implies B skos:exactMatch A) and reflexive (A skos:exactMatch A)? Because if it is all three of symmetric, reflexive and transitive then it is an equivalence relation, and if it is also substitutive, then it IS equality (owl:sameAs), whether the "reference documentation" admits this or not.
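The three properties in question can be checked mechanically for any finite set of asserted pairs. The following is only an illustrative sketch: the relation is a plain set of (x, y) tuples, and the names A, B, C are made up.

```python
from itertools import product


def is_reflexive(rel, domain):
    """Every element of the domain is related to itself."""
    return all((x, x) in rel for x in domain)


def is_symmetric(rel):
    """Whenever x is related to y, y is related to x."""
    return all((y, x) in rel for x, y in rel)


def is_transitive(rel):
    """Whenever x~y and y~z are both present, x~z is present too."""
    return all(
        (x, z) in rel
        for (x, y), (y2, z) in product(rel, rel)
        if y == y2
    )


def is_equivalence(rel, domain):
    return is_reflexive(rel, domain) and is_symmetric(rel) and is_transitive(rel)


if __name__ == "__main__":
    domain = {"A", "B", "C"}
    # Hypothetical exactMatch assertions: A and B match each other, C stands alone.
    exact_match = {(x, y) for x in "AB" for y in "AB"} | {("C", "C")}
    print(is_equivalence(exact_match, domain))
```

Passing this check only establishes an equivalence relation. Whether it amounts to equality depends on the further question of substitutivity, which the check does not and cannot settle.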

This can (and will) be useful, but may also be harmful if the two mappings were created with different application concerns in mind,


IF?? Of COURSE this will happen.

and that negligible semantic differences over a one-step mapping add up to a bigger lap over a longer path.

and OF COURSE that will happen. This effect has been noted and written about for many years. It is often called the heap paradox. It is the bête noire of all attempts to give a precise semantics for concepts with 'fuzzy' boundaries. There are probably around a dozen precise suggestions, none of which are universally accepted, for getting around it.
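A toy illustration of how this accumulates, under loudly stated assumptions: "concepts" are modelled as integers (grains of sand in a pile), and two of them count as a match when they differ by at most one grain. Both the modelling and the tolerance are invented for the example.

```python
def close_match(a, b, tolerance=1):
    """Hypothetical one-step match: the two piles differ by at most
    `tolerance` grains."""
    return abs(a - b) <= tolerance


def reachable(start, items, related):
    """Everything linked to `start` once `related` is treated as
    transitive (a simple graph search over the one-step matches)."""
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for other in items:
            if other not in seen and related(current, other):
                seen.add(other)
                frontier.append(other)
    return seen


if __name__ == "__main__":
    piles = range(1, 201)
    linked = reachable(200, piles, close_match)
    print(close_match(200, 1))  # direct comparison of the two extremes
    print(1 in linked)          # after chaining the one-step matches
```

Every individual step is a negligible difference, yet treating the relation as transitive links a 200-grain pile to a single grain, which no direct comparison would accept.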

closeMatch is meant to deal with a lesser level of commitment. As written in the SKOS Primer,
skos:closeMatch is not defined as transitive, which prevents such similarity assessments to propagate beyond these two schemes.
Imagine that for a specific application you are creating mappings between two vocabularies. You can thus create these mappings without bothering with the possibility that these links may cause dubious substitutions for a different application.

And also without the bother of the links creating any new entailments or inferences for your applications. What do you even mean by a "link", if this has no semantics to support any entailments?

SKOS also has other mapping properties; in particular, skos:relatedMatch could be the anchor for the more specialized properties that Alan has in OBO. This skos:relatedMatch really has not much semantics (even informal ones) but I feel it would still be better than rdfs:seeAlso.

Why do you feel that? Two properties which are both normatively declared to have no semantics are equally meaningless.

According to RDFS spec [3]
The property rdfs:seeAlso specifies a resource that might provide additional information about the subject resource.
As far as I understand it (and keeping to the distinctions made e.g. in [4]) this means that rdfs:seeAlso can connect a non-document resource to an information resource

And what is to prevent this being the case for two resources connected by skos:relatedMatch? You said it has no semantics....

Pat

, which I feel is a bit too broad for Alan's case...

Cheers,

Antoine

[1] http://www.w3.org/TR/skos-reference/#mapping
[2] http://www.w3.org/TR/skos-primer/#secmapping
[3] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/#s2.3.4
[4] http://www.w3.org/TR/cooluris/#distinguishing
On Jul 21, 2009, at 7:26 PM, Alan Ruttenberg wrote:
On Tue, Jul 21, 2009 at 1:23 PM, Toby Inkster <[email protected]> wrote:
On Tue, 2009-07-21 at 19:52 +0300, Bernhard Schandl wrote:
I would say: Never assert sameAs. It's just too big a hammer.
Instead use a wider palette of relationships to connect entities
to other ones.
which ones would you recommend?
skos:exactMatch = asserts that the two resources represent the same
concept
Say, refer to the same thing.
, but does not assert that all triples containing the first
resource are necessarily true when the second resource is substituted
in.
I'm having trouble parsing this one. I don't know what concepts are,
but they are an odd sort of thing if they can be the same, but can't
be substituted.
This is exactly what is needed in many cases. Philosophical terminology is that they have the same referent but not the same sense, and lack of substitutability reflects the unfortunate but inevitable fact that the Web as a whole is not referentially transparent (yet). More mundane example, the same person might need to be referred to in one way in one context and differently in another, just because the two social contexts require different forms of address. (That example from Lynn Stein.)
In any case, this isn't much better when the issue I point out is that
there is a specific relation between e.g. the intervention and the
drug - that relation is no where near equivalence in any form.
True, but in cases like this, it is simply a basic conceptual mistake to be using any kind of loose-sameAs property. rdfs:seeAlso would be more like what is needed for linking a drug to an intervention. I agree with you about having a selection of better-thought-out relations rather than just using sameAs as a kind of all-purpose knee-jerk connecting link. Maybe this "Linked Data" slogan has a rather dumbing-down effect, as it suggests that 'link' is a simple uniform notion that works in all cases.
skos:closeMatch = same as exact match, but slightly woolier.
Seems harmless, assuming one doesn't mind whatever one is dealing with
typed a concept.
Ditto the broader and narrower relations, which although not to my
taste (I don't know how to tell when they hold) are certainly better than
using sameAs.
owl:equivalentProperty = if {X equivalentProperty Y} and {A X B} then
{A Y B}. In other words, the properties can be used completely
interchangeably. But perhaps there are other important differences
between X and Y, such as their rdfs:label or rdfs:isDefinedBy.
Still near equivalence.
owl:equivalentClass = if {X equivalentClass Y} then all Xs are Ys and
vice versa. Same dealy with owl:equivalentProperty really.
Ditto.
ovterms:similarTo = a general, all-purpose wimps' predicate. I use this
extensively.
Under the principle "first do no harm", this seems to work, although I
note that the intervention (something that happens) isn't similar to
the drug used in it (something that is consumed when the intervention
happens).

seeAlso seems pretty harmless and noncommittal.

But better is probably to look more closely at what the entities are
and then choose a relationship that better expresses how they relate. In the case of the intervention, one plausible interpretation is that
the "intervention" names a class of processes, and that there is a
subclass of such processes in which the drug participates. (The other subclass consists of those in which a placebo is the participant.) This can be
modeled in OWL.
(My real advice for clinical trial resource is to collaborate with the
OBI project and use terminology that is being developed for exactly
that purpose)

In my line of work I start with the OBO Relation ontology,
http://www.obofoundry.org/ro/ which provides a basic set of well
documented relations, such as the has_participant relationship.

OWL also provides some relations beyond equivalence - subclass
relations are an option, when appropriate, as well as making
statements that classes overlap - by expressing that the intersection
of the two is not empty.
That ontology is undergoing some reform, as it should in time. Some of the new candidate relations are documented in links from that page. In
addition it is proposed that there be class level and instance
level versions of the relations - the class level relations might
better suit a modeling style that would rather avoid using OWL
restrictions, and fits well with OWL 2 which allows a name (URI) to be
used as both a class and an instance.

Finally, for those cases where there are more than one URI and they
*really* mean the same thing - why not try to get the parties who
minted them to collaborate and retire one of the URIs. If they really mean the same thing there should be no harm in either party using the
other's URI.
It's not that simple, unfortunately. I'm going to make this issue the center of my invited talk at ISWC later this year :-)
Pat
-Alan
--
Toby A Inkster
<mailto:[email protected]>
<http://tobyinkster.co.uk>
------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

