Dan Brickley wrote:
On 23/2/09 22:24, Mike Bergman wrote:
David Baxter wrote:
We at Cycorp have been publishing owl:sameAs links from our OpenCyc
concepts to WordNet synsets, e.g.
<http://sw.opencyc.org/2008/06/10/concept/en/India> owl:sameAs
<http://www.w3.org/2006/03/wn/wn20/instances/synset-India-noun-1>
We've done so with the idea that the WordNet synset represents the
same concept as the OpenCyc term (i.e. the South Asian country in this
case), and contains further relevant information that complements what
is available in OpenCyc, e.g.
"is a member of OPEC" (OK, this one's of dubious value, but it might
be useful if it were true)
"is a member of the British Commonwealth"
"is a part of Asia"
However, WordNet also contains assertions about the "India" synset
that seem strange to assert about the country, e.g.
"is an instance of NounSynset"
"contains WordSense 'Republic of India 1'"
We'd like to know what the general feeling in the LOD community is
about these links. Is there any precedent or consensus about the best
way to link from ontologies such as OpenCyc's to WordNet? Is anyone
finding these links useful and/or harmful?
Thanks for any input.
I've rolled back to your starting message since intervening comments
have unfortunately snipped out the essence of your question about
owl:sameAs.
Maybe we lack agreement on what the essence was!
Let me again add this link from over the weekend, which I think is
also germane:
http://i9606.blogspot.com/2009/02/semantic-dissonance-in-uniprot.html
As I understand the current OWL, "an owl:sameAs statement indicates that
two URI references actually refer to the same thing: the individuals
have the same 'identity.'" [1]. In logical terms, I understand this to
represent complete and total identity, equivalent to the '='
relationship, or something pretty doggone close to it. I also understand
this property to perhaps have the strongest entailment of any OWL
property.
Yup, owl:sameAs is for when there's only one thing, not two similar or
related things.
The inference from your use case and the similar issue with Ben's
uniprot example are all too typical of sameAs problems once disparate
datasets actually get pulled together.
I appreciate the rdfs:seeAlso suggestion; it is the most common fallback.
But the issue with that one, which is why you went to sameAs in the
first place, is that seeAlso is way too weak to convey the nature of the
relationship. Sure, we could do a subPropertyOf but we could at best
capture only the very weak semantics that seeAlso presently provides; we
could not strengthen it.
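
To make the subPropertyOf point concrete, here is a minimal Python sketch of the RDFS subproperty rule (rdfs7). It is an illustration only: the property ex:closelyRelatedTo is hypothetical, the prefixes are shortened stand-ins, and triples are plain tuples rather than a real RDF store. Under these assumptions, all a plain RDFS reasoner derives from the new property is the weaker rdfs:seeAlso statement.

```python
# Sketch of RDFS rule rdfs7: (p subPropertyOf q) and (s p o) entail (s q o).
# "ex:closelyRelatedTo" is a hypothetical property, not an existing term.
SUBPROP = "rdfs:subPropertyOf"

def apply_rdfs7(triples):
    """Return the triples plus everything entailed by the subproperty rule."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect (child, parent) property pairs declared so far.
        sub = {(s, o) for (s, p, o) in inferred if p == SUBPROP}
        for (s, p, o) in list(inferred):
            for (child, parent) in sub:
                if p == child and (s, parent, o) not in inferred:
                    inferred.add((s, parent, o))
                    changed = True
    return inferred

graph = {
    ("ex:closelyRelatedTo", SUBPROP, "rdfs:seeAlso"),
    ("opencyc:India", "ex:closelyRelatedTo", "wn20:synset-India-noun-1"),
}
closure = apply_rdfs7(graph)
new_triples = closure - graph
print(new_triples)
```

The only new triple links the two resources with rdfs:seeAlso; no assertions are carried from the synset to the OpenCyc term.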
I think the real issue is that we don't have a readily available (or at
least accepted) predicate. I would suggest, though, that the issue at
hand is very much captured by the concept of "relative identity":
http://plato.stanford.edu/entries/identity-relative/
esp. Section 3 (though there are some wonderful paradoxes throughout).
What I like about 'relative identity' is that we can still infer and
reason over the relationship (though *how*, and whether weakly or
strongly, is still up for grabs).
I think the considerable experience of Cycorp in such matters could be
invaluable in severing this Gordian knot. Care to stroll deeper into the
den?
A hasRelativeIdentity B ??
Interesting, but I think in this case we're talking about modeling some
lightweight linguistics data, and linking it to the classes the natural
language words are words for. Talk of identity is a bit of a distraction
here. This is due to the modeling style chosen for the W3C Wordnet RDF
representation, nothing more. If it were a class-centric projection of
Wordnet into RDF, we'd be having quite a different discussion.
Au contraire! The issue here is the predicate, sameAs, which itself has
an identity assumption in its semantics.
While your earlier point was absolutely true about WordNet and its
purpose as a language and linguistics model (and, thus, clearly not a
knowledge base in the same vein as OpenCyc), the identity issue arises
as soon as the OpenCyc world view attempts to establish an "identity"
relationship with the conceptual linguistic view within WordNet.
The broader issue, still, is what happens when the simplest inference
engines out there trace sameAs links as if they were identity and
entail every assertion across the sameAs linkages.
This is the fragile foundation that is creaking mightily as anyone tries
to do any meaningful work with any of this linked data.
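
As a rough illustration of that behavior, the following Python sketch mimics a naive reasoner that treats sameAs as full identity and copies every assertion to every member of a sameAs group. The prefixes, the word-sense URI, and the opencyc:Country class are shortened or invented stand-ins, and only subject-position substitution is shown; a real OWL reasoner does more.

```python
# Sketch of a naive "sameAs as identity" reasoner: every resource in a
# sameAs group receives every assertion made about any member of the group.
# Prefixes and several names below are illustrative stand-ins.
SAME_AS = "owl:sameAs"

def same_as_groups(triples):
    """Group resources linked by sameAs (treated as symmetric and transitive)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)
    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return groups

def smush(triples):
    """Copy each non-sameAs assertion to all members of the subject's group."""
    member_to_group = {}
    for members in same_as_groups(triples).values():
        for m in members:
            member_to_group[m] = members
    result = set(triples)
    for s, p, o in triples:
        if p != SAME_AS:
            for alias in member_to_group.get(s, {s}):
                result.add((alias, p, o))
    return result

graph = {
    ("opencyc:India", SAME_AS, "wn20:synset-India-noun-1"),
    ("opencyc:India", "rdf:type", "opencyc:Country"),
    ("wn20:synset-India-noun-1", "rdf:type", "wn20schema:NounSynset"),
    ("wn20:synset-India-noun-1", "wn20schema:containsWordSense",
     "wn20:wordsense-Republic_of_India-noun-1"),
}
about_india = {(p, o) for (s, p, o) in smush(graph)
               if s == "opencyc:India" and p != SAME_AS}
print(sorted(about_india))
```

After smushing, the country is typed as a NounSynset and contains a word sense, while the synset is typed as a country: exactly the kind of entailment that seems strange to assert.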
Do we just want to browse links for things that might be related (even
there with no consensus), or do we want to do real stuff with this
information?
Mike