Hi Aldo,
thanks for the references - I will definitely look into them. I know about the
systematic polysemy phenomenon - e.g. sentences where you use two meanings of
one polysemous word are semantically valid. But I am not saying that you have
to make a commitment to one classification of a certain "object" that is
described in Wikipedia, e.g. church as a building vs church as a community.
What I am saying is that when you use the DBpedia ontology, where the classes
are vaguely defined, you are indeed making some commitments (as Peter pointed
out) by interpreting churches as buildings, while in Cyc there are such
distinctions, but there are also relations that keep these interpretations
together.
What I am mostly concerned with is the process of developing the DBpedia
ontology. I have pointed out inconsistencies in Wikipedia which are a
by-product of its community-based development. Having 5 or 6 means of
expressing one piece of information (e.g. administrative categories) is not a
problem of polysemy, but a problem of a community-driven process. Even assuming
the DBpedia ontology creators agree that a church is a systematically
polysemous concept, I don't expect that all systematically polysemous concepts
will be treated as such. My belief is based on my careful examination of
Wikipedia's (incoherent) structure and the problems discovered in the DBpedia
ontology that have been discussed so far.
But you may be right in yet another sense - there is not much semantics in the
Semantic Web. At least, if you look at DBpedia from God's perspective you are
making a mistake. There are only portions of the data that are useful in
certain applications and you have to adjust them to your "ontology" before you
start using it.
Kind regards,
Aleksander
---- On Wed, 16 Apr 2014 15:44:08 +0200 Aldo Gangemi
<aldo.gang...@gmail.com> wrote ----
Hi, the kind of ambiguity we are discussing is well known in lexical semantics:
it is called “systematic polysemy”. In practice, certain terms are used to
express several related meanings. Patterns emerge out of these phenomena, e.g.
Place/Building/Community, which the Cyc concepts noticed by Aleksander are an
occurrence of. Systematic polysemy was originally described by James
Pustejovsky, and studied in detail within WordNet by Paul Buitelaar.
Trying to resolve that ambiguity is a typical formal ontology task, but this
task is sometimes in conflict with common sense, and especially with linguistic
and informal data. Pat Hayes (as far as I remember) used to call the excess of
distinction “ontological overcommitment”. In fact there are even cognitive
experiments that prove the importance of “controlled” ambiguity for fast and
efficient thinking [6].
As an ontology designer who has spent quite some time trying to understand how
and when it is useful to make distinctions, I would recommend following the
crowdsourced value of Wikipedia: there is structure there, but often it is not
what we would like to find as formal ontologists or logicians. And this is not
a “problem” in my view: we need to understand what is relevant, and to derive
requirements from it. If there are structures in Wikipedia that produce
exceptions in DBpedia, and derivatively in potential formal orderings of
classes and properties, such exceptions should be studied as empirical
phenomena, not as “dirtiness” to be corrected.
From this perspective, the DBpedia ontology is simply not informed by either
criterion: it does not try to be formally correct, but it does not try to
follow empirical science rules either. That’s why I was recommending to look
at properties and mappings first.
As for Kingsley’s recommendation to “triangulate” with other ontologies,
that’s a good one, but it should be accompanied by an understanding of what
the actual data-oriented ontology of DBpedia-Wikipedia is. Let me reference
some examples of this empirical research conducted by my lab: [1][2][3], and
there is more of course in the literature, e.g. [4][5].
Best
Aldo
[1] Nuzzolese A.G., Gangemi A., Presutti V. Encyclopedic Knowledge Patterns
from Wikipedia Page Links. Proceedings of ISWC2011, the Ninth International
Semantic Web Conference, Springer, 2011.
[2] Aldo Gangemi, Andrea Giovanni Nuzzolese, Valentina Presutti, Francesco
Draicchio, Alberto Musetti and Paolo Ciancarini. Automatic Typing of DBpedia
entities. Proceedings of ISWC2012, the Tenth International Semantic Web
Conference, LNCS, Springer, 2012.
[3] Presutti V., Aroyo L., Adamou A., Schopman B., Gangemi A., Schreiber G.
Extracting Core Knowledge from Linked Data. Proceedings of the Second Workshop
on Consuming Linked Data, COLD2011, Workshop in conjunction with the 10th
International Semantic Web Conference 2011 (ISWC 2011),
http://ceur-ws.org/Vol-781/, 2011.
[4] http://www.heikopaulheim.com/docs/iswc2013.pdf
[5] Johanna Völker, Mathias Niepert. Statistical Schema Induction. Proceedings
of The Semantic Web: Research and Applications - 8th Extended Semantic Web
Conference, ESWC 2011.
[6] Steven T. Piantadosi, Harry Tily, Edward Gibson. The communicative function
of ambiguity in language. Cognition, 2012; 122 (3): 280
On Apr 16, 2014, at 2:26:51 PM, apoh...@o2.pl wrote:
I have somewhat mixed feelings regarding the organization of Wikipedia. It is
true that if you use it as a human, you will find what you need in a reasonable
time (usually using the search box and following direct category links). But
from the POV of a KB engineer, its organization is very far from being perfect.
Let me just give you an example: administrative categories in Wikipedia. These
are categories that should not normally be displayed to the end-user. You would
expect that there is one, maybe two means of expressing that a given category
is administrative (e.g. a container category and a template). But there are at
least several ways of stating that a given category is administrative -
sometimes this information is only stated in the contents of the category.
Another example - eponymous categories. You have the Cat_main template, which
is used to state that there is a corresponding article for a given category.
You also have the Main template, which has a similar meaning. But in the
majority of cases the link between the category and the article is only
provided as a link at the beginning of the category contents or as the first
item on the list of category articles.
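
To make the eponymous-category case concrete, here is a minimal sketch of what
an extractor has to do when the same link can be stated in several ways. The
regular expressions, the function name main_article and the sample wikitext
are assumptions made for illustration only; real category pages are more
varied, and the "first item on the list of category articles" case is not
visible in wikitext at all, so only the in-text link is approximated here.

```python
import re

# Patterns for the ways of stating the main article described above.
# These regexes are simplified assumptions, not a complete wikitext parser.
CAT_MAIN = re.compile(r"\{\{\s*cat[ _]main\s*\|\s*([^}|]+)", re.IGNORECASE)
MAIN = re.compile(r"\{\{\s*main\s*\|\s*([^}|]+)", re.IGNORECASE)
WIKILINK = re.compile(r"\[\[\s*([^\]|#]+)")


def main_article(wikitext):
    """Return (article_title, how_found) for a category page, or (None, None)."""
    # Explicit templates first: they are the least ambiguous signal.
    for how, pattern in (("cat_main", CAT_MAIN), ("main", MAIN)):
        match = pattern.search(wikitext)
        if match:
            return match.group(1).strip(), how
    # Fallback: first plain link that is not itself a category link.
    for match in WIKILINK.finditer(wikitext):
        title = match.group(1).strip()
        if not title.lower().startswith("category:"):
            return title, "first_link"
    return None, None


if __name__ == "__main__":
    # Hypothetical category page contents.
    print(main_article("{{Cat main|Chess}}\n[[Category:Games]]"))
    print(main_article("The main article is [[Poker]].\n[[Category:Games]]"))
```

Every additional convention means another branch like these, and the fallback
branch is a heuristic that can easily pick the wrong link.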
So I have doubts regarding the construction of a well-structured ontology that
would emerge e.g. from the usage of infobox templates in Wikipedia. I am not
saying that Cyc or Umbel are perfect (they are not); there are also
duplications and unnecessary classes there. But Cyc has been under construction
for more than 20 years and I believe that many of the problems we are
discussing here were already discussed during Cyc's creation. Let me just say
that you have 3 concepts in Cyc that correspond to "church":
* #$ChurchService
* #$Church-Building
* #$Church-LocalCongregation
I am not saying that the mapping we will produce in May or June will provide
the correct classes in all cases (this is just not feasible), but I am
saying that the necessary concepts are already present in Cyc (and Umbel). And
if they are really missing we will provide them and attach them to the rich
structure of Cyc/Umbel.
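
As a rough illustration of keeping several senses of one word apart while
still linking them, consider the sketch below. The three constant names are
the Cyc concepts listed above; the relation names "meetsIn" and "holds" are
hypothetical and are not claimed to be real Cyc predicates.

```python
# Senses of one polysemous term, kept as separate concepts.
# The concept names are the Cyc constants quoted in the message above.
SENSES = {
    "church": ["ChurchService", "Church-Building", "Church-LocalCongregation"],
}

# Hypothetical relations tying the senses together (NOT real Cyc predicates).
RELATIONS = [
    ("Church-LocalCongregation", "meetsIn", "Church-Building"),
    ("Church-LocalCongregation", "holds", "ChurchService"),
]


def related_senses(concept):
    """Return the concepts linked to `concept` by one relation, in either direction."""
    linked = set()
    for subject, _predicate, obj in RELATIONS:
        if subject == concept:
            linked.add(obj)
        elif obj == concept:
            linked.add(subject)
    return sorted(linked)


if __name__ == "__main__":
    for sense in SENSES["church"]:
        print(sense, "->", related_senses(sense))
```

A mapping from a Wikipedia article about a church can then pick whichever
sense fits the data, and the other senses remain reachable through the
relations.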
Kind regards,
Aleksander
---- On Wed, 16 Apr 2014 06:22:44 +0200 Peter F. Patel-Schneider
<peter.patel-schnei...@nuance.com> wrote ----
I agree that it seems harder to crowdsource ontologies. However, Wikipedia
seems to have a half-decent organization, so maybe it is possible. My view is
that Wikipedia is succeeding, and not just in overall organization, because
there are Wikipedia editors that challenge and remove incorrect and incoherent
information and also spend time on cleanup tasks. I think that any
crowdsourced artifact needs to have participants that spend considerable time
on these tasks.
It appears that the DBpedia ontology has not been subject to much of this kind
of activity. Just today I went through the DBpedia class taxonomy and marked
classes that were arguably misplaced. I found over 50 out of about 500
non-sport classes (and the sport classes probably all need some attention).
The misplacement of DBpedia classes is not just a problem with the DBpedia
ontology, of course, as the DBpedia taxonomy is used to generate type
statements for all DBpedia resources. The only aspect of DBpedia that makes
this not quite so severe a problem is that many of the misplaced classes have
few or no instances. However, some of the misplaced classes, e.g.,
FictionalCharacter, ChessPlayer, PokerPlayer, Saint, Religious, Monarch,
Medician, Galaxy, Restaurant, Country, Grape, and Venue, have a significant
number of instances.
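
The way a misplaced class leaks into instance data can be sketched as follows.
This is a toy model under stated assumptions: the taxonomy fragment, the
instance identifier and the function names are hypothetical, the only link
taken from this thread is that churches are placed under buildings, and the
taxonomy is assumed to be acyclic.

```python
# Hypothetical taxonomy fragment: child class -> parent class.
# "Church" under "Building" mirrors the example discussed in this thread;
# everything else here is made up for illustration.
SUBCLASS_OF = {
    "Church": "Building",
    "Building": "Thing",
}


def superclasses(cls):
    """Return `cls` followed by all of its ancestors (assumes no cycles)."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def type_statements(direct_types):
    """Expand {instance: direct class} into the set of (instance, class) pairs."""
    return {
        (instance, cls)
        for instance, direct in direct_types.items()
        for cls in superclasses(direct)
    }


if __name__ == "__main__":
    # A congregation with no building still ends up typed as a Building.
    for statement in sorted(type_statements({"ex:SomeCongregation": "Church"})):
        print(statement)
```

Every instance of a misplaced class inherits the questionable type, which is
why misplaced classes with many instances matter most.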
peter
On 04/15/2014 07:16 PM, Mike Bergman wrote:
> Hi Peter,
>
> My observation is that crowdsourced knowledge bases (namely, Wikipedia,
> DBpedia, schema.org, Freebase, etc.) can be excellent sources for the
> description and characterization of things and entities, but the structures
> that may be derived from them will by definition be incoherent at the TBox
> level.
>
> Exhortations to many contributors to be more coherent at a structural level
> are not likely, I believe, to meet with much success. The motivations of
> contributors and editors are most often local within the KB space. Thus, in
> microcosm, many parts of these KBs can look pretty good, but when the scope
> extends more broadly across the KB, the coherence breaks down. There aren't
> many advocates for structure-wide coherence.
>
> As an advocate for structure-wide coherence and one who is not afraid to
> wade into the fray, perhaps you can work some useful magic. I'm dubious, but
> I truly wish you luck.
>
> Our approach, which we have been working on for some years episodically,
> with another episode due shortly, is to use a coherent structure (UMBEL, in
> our approach, which is a faithful, simplified subset of Cyc) to provide the
> TBox, and then to find defensible ways to map the entity and concept
> information in crowdsourced KBs to that structure.
>
> We have been talking about this for so long that it is time for us to
> complete our initial development and put something forward that you and
> others can similarly scrutinize. We hope to have something useful by this
> summer.
>
> Thanks, Mike
>
> On 4/15/2014 6:55 PM, Patel-Schneider, Peter wrote:
>> Hmm.
>>
>> Well, perhaps one could argue that there should be no hierarchy at all.
>>
>> Using the DBpedia ontology does commit you to a lot of things, many of them
>> quite questionable. For example, in the DBpedia ontology churches are
>> buildings, which is not true for many churches, and not even true for the
>> physical location associated with many churches. This is one of the things
>> that I think needs to be changed.
>>
>> peter
>>
>>
>> On Apr 15, 2014, at 4:57 PM, Paul Houle <ontolo...@gmail.com> wrote:
>>
>>> I try not to get hung up on the idea of having one right hierarchy but
>>> assume most end users will need to interpret the types that exist in
>>> the way that makes sense for what they are doing.
>>>
>>> The idea of foaf:Agent, which is a superclass of both person and
>>> organization, is a powerful concept because of properties shared by
>>> these two "things"; for instance, either can be a party to a
>>> lawsuit. Even in music you could say a brand like "Michael Jackson"
>>> is a team effort.
>>>
>>> On the other hand some people want :Flutist to be a subclass of
>>> :Person and it makes sense to say one person who plays the flute is a
>>> :Flutist but you can't say that a trio that all plays the flute is a
>>> :Flutist. Everybody has some theorem they expect the system to prove
>>> and they won't accept your axiom set unless you can prove their
>>> theorem with it.
>>>
>>> On Tue, Apr 15, 2014 at 6:51 PM, Patel-Schneider, Peter
>>> <peter.patel-schnei...@nuance.com> wrote:
>>>> schema.org is controlled by the schema.org partners, Google, Yahoo!,
>>>> Bing, and Yandex. Contributions from the community are accepted, but are
>>>> vetted before being added to schema.org.
>>>>
>>>> See http://schema.org/ for more information.
>>>>
>>>>
>>>> One problem with alignment to schema.org is that the formal meaning of
>>>> the schema.org ontology is unusual and not fully explained.
>>>>
>>>> peter
>>>>
>>>>
>>>>
>>>> On Apr 14, 2014, at 10:08 PM, 小出 誠二 <seijikoi...@gmail.com> wrote:
>>>>
>>>>> Dear Peter
>>>>>
>>>>> In the eyes of ontologists, it is well known that the DBpedia ontology
>>>>> is incorrect, but I have never checked the contents of schema.org. Thank
>>>>> you for the info.
>>>>>
>>>>> I am planning to correct trustable ontologies like schema.org, but I do
>>>>> not know how to revise or advise on the contents of schema.org.
>>>>>
>>>>> Does anyone know? Or does anyone have an interest in portal sites of
>>>>> trustable ontologies?
>>>>>
>>>>> Seiji Koide
>>>>>
>>>>> -----Original Message-----
>>>>> From: Patel-Schneider, Peter [mailto:peter.patel-schnei...@nuance.com]
>>>>> Sent: Tuesday, April 15, 2014 1:50 PM
>>>>> To: dbpedia-discussion@lists.sourceforge.net
>>>>> Subject: [Dbpedia-discussion] probably incorrect mapping to schema.org
>>>>> from MusicalArtist
>>>>>
>>>>> The ontology says that MusicalArtist is a subclass of
>>>>> schema.org:MusicGroup.
>>>>>
>>>>> This seemed very odd to me, but then I looked at schema.org and noticed
>>>>> that schema.org:MusicGroup can also be for *solo* artists. However,
>>>>> MusicalArtist is for any musical artist, not just soloists. So this
>>>>> mapping still looks incorrect.
>>>>>
>>>>> peter
>>
>>
>> ------------------------------------------------------------------------------
>> Learn Graph Databases - Download FREE O'Reilly Book
>> "Graph Databases" is the definitive new guide to graph databases and their
>> applications. Written by three acclaimed leaders in the field,
>> this first edition is now available. Download your free book today!
>> http://p.sf.net/sfu/NeoTech
>> _______________________________________________
>> Dbpedia-discussion mailing list
>> Dbpedia-discussion@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>