Hi Peter,

I agree with Aleksander about having mixed feelings about the Wikipedia 
category structure (I also agree with the imperfection of Cyc and UMBEL 
:) ).

Besides the category issues Aleksander notes, there is another source of 
issues with the Wikipedia structure: compound categories. By "compound 
categories" we mean categories that combine a main subject with 
additional attributes or characteristics. Some examples for English are 
shown under [1]. Excluding administrative categories in the English 
Wikipedia, about half of categories are of this compound type. They 
often have prepositions, certain articles, or certain heads in the titles.
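
A minimal sketch of how such title markers might be checked, in Python. 
The preposition list, the four-digit pattern, and the function names are 
illustrative assumptions only; this is not the procedure behind the 
"about half" estimate above, and a crude surface test like this will 
produce false positives.

```python
import re

# Hypothetical marker set, chosen for illustration only.
PREPOSITIONS = {"of", "in", "from", "to", "by", "for", "at", "with"}
FOUR_DIGITS = re.compile(r"\b\d{4}\b")

def compound_markers(title):
    """Return surface markers suggesting that a title is a compound."""
    text = title.replace("_", " ")  # accept URL-style titles as well
    words = re.findall(r"[A-Za-z]+", text.lower())
    # Prepositions found anywhere in the title, in alphabetical order.
    markers = sorted(set(words) & PREPOSITIONS)
    if text.startswith("List of "):
        markers.append("list-prefix")
    if FOUR_DIGITS.search(text):
        markers.append("date-like-number")
    return markers

def is_compound(title):
    """True if at least one compound marker is present in the title."""
    return bool(compound_markers(title))

if __name__ == "__main__":
    for t in ["List_of_tallest_buildings_in_Metro_Manila",
              "1991_Japanese_Formula_3000_season",
              "Churches"]:
        print(t, compound_markers(t))
```

Titles with no markers, such as a bare "Churches", come back with an 
empty list, while the examples under [1] each trigger at least one marker.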

(I will say that some of the compound categories are being converted 
over time to list categories 'List of XXX', which is another challenge.)

These kinds of category problems exhibit themselves in how Wikipedia is 
used. First, my understanding is that Wikipedia's own structured effort, 
Wikidata, has chosen not to use the Wikipedia categories or the DBpedia 
ontology for their organizing structure [2]. Second, I think it can be 
fairly argued that most users of Wikipedia use it for lookup of specific 
references and related relationships. I know of very few examples where 
Wikipedia is used for casual discovery or navigation, uses which would 
rely on the Wikipedia structure.

That you are willing, Peter, to work to improve the DBpedia ontology is 
fantastic. We ourselves have just spent tens of hours doing a manual 
mapping of the 650 or so classes in the DBpedia ontology to Cyc and 
UMBEL. I hope to post a link to that shortly; it may aid in some of your 
own efforts. Despite my earlier cynicism, I do wish your effort well.

We are convinced to our bones of the value of the Wikipedia content, 
unique in human history, and also believe there is much, much latent and 
decipherable structure within it, including its infoboxes, as DBpedia 
has shown. But we are also convinced that the Wikipedia structure, as it 
stands, is significantly flawed. We'd like to be able to infer and 
discover across the entire Wikipedia structure, as well as to get to 
valuable content for specific things.

Thanks, Mike

[1] http://en.wikipedia.org/wiki/List_of_Toronto_Maple_Leafs_award_winners
http://en.wikipedia.org/wiki/List_of_tallest_buildings_in_Metro_Manila
http://en.wikipedia.org/wiki/Ambassadors_and_envoys_from_Russia_to_Poland_%281763%E2%80%931794%29
http://en.wikipedia.org/wiki/1991_Japanese_Formula_3000_season

[2] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Migrating_away_from_GND_main_type
http://ultimategerardm.blogspot.co.uk/2013/09/some-answers-about-heady-stuff-of.html

On 4/16/2014 6:26 AM, apoh...@o2.pl wrote:
> I have a somewhat mixed feeling regarding the organization of Wikipedia.
> It is true that if you use it as a human, you will find what you need in
> a reasonable time (usually using search box and following direct
> category links). But from the POV of a KB engineer, its organization is
> very far from being perfect. Let me just give you an example:
> administrative categories in Wikipedia. These are categories that should
> not be normally displayed to the end-user. You would expect that there
> is one, maybe two means of expressing that a given category is
> administrative (e.g. a container category and a template). But there are
> at least several ways of stating that a given category is administrative
> - sometimes this information is only stated in the contents of the category.
>
> Another example - eponymous categories. You have Cat_main template,
> which is used to state that there is a corresponding article for a given
> category. You also have Main template, which has similar meaning. But in
> the majority of situations the link between the cat. and article is only
> provided as a link at the beginning of the category contents or as the
> first item on the list of category articles.
>
> So I have doubts regarding construction of a well-structured ontology
> that would emerge e.g. from the usage of infobox templates in Wikipedia.
> I am not saying that Cyc or Umbel are perfect (they are not), there are
> also duplications and unnecessary classes there. But Cyc has been
> constructed for more than 20 years and I believe that many of the
> problems we are discussing here were already discussed during Cyc
> creation. Let me just say that you have 3 concepts in Cyc that
> correspond to "church":
> * #$ChurchService
> * #$Church-Building
> * #$Church-LocalCongregation
>
> I am not saying that the mapping we will produce in May or June will
> provide the correct classes in all the cases (this is just not
> feasible), but I am saying that the necessary concepts are already
> present in Cyc (and Umbel). And if they are really missing we will
> provide them and attach them to the rich structure of Cyc/Umbel.
>
> Kind regards,
> Aleksander
>
> ---- On Wed, 16 Apr 2014 06:22:44 +0200 *Peter F.
> Patel-Schneider<peter.patel-schnei...@nuance.com>* wrote ----
>
>     I agree that it seems harder to crowdsource ontologies. However,
>     Wikipedia seems to have a half-decent organization, so maybe it is
>     possible. My view is that Wikipedia is succeeding, and not just in
>     overall organization, because there are Wikipedia editors that
>     challenge and remove incorrect and incoherent information and also
>     spend time on cleanup tasks. I think that any crowdsourced artifact
>     needs to have participants that spend considerable time on these
>     tasks.
>
>     It appears that the DBpedia ontology has not been subject to much of
>     this kind of activity. Just today I went through the DBpedia class
>     taxonomy and marked classes that were arguably misplaced. I found
>     over 50 out of about 500 non-sport classes (and the sport classes
>     probably all need some attention).
>
>     The misplacement of DBpedia classes is not just a problem with the
>     DBpedia ontology, of course, as the DBpedia taxonomy is used to
>     generate type statements for all DBpedia resources. The only aspect
>     of DBpedia that makes this not quite so severe a problem is that
>     many of the misplaced classes have few or no instances. However,
>     some of the misplaced classes, e.g., FictionalCharacter, ChessPlayer,
>     PokerPlayer, Saint, Religious, Monarch, Medician, Galaxy, Restaurant,
>     Country, Grape, and Venue, have a significant number of instances.
>
>     peter
>
>     On 04/15/2014 07:16 PM, Mike Bergman wrote:
>      > Hi Peter,
>      >
>      > My observation is that crowdsourced knowledge bases (namely,
>      > Wikipedia, DBpedia, schema.org, Freebase, etc.) can be excellent
>      > sources for the description and characterization of things and
>      > entities, but the structures that may be derived from them will by
>      > definition be incoherent at the TBox level.
>      >
>      > Exhortations to many contributors to be more coherent at a
>      > structural level are not likely, I believe, to meet with much
>      > success. The motivations of contributors and editors are most often
>      > local within the KB space. Thus, in microcosm, many parts of these
>      > KBs can look pretty good, but when the scope extends more broadly
>      > across the KB, the coherence breaks down. There aren't many
>      > advocates for structure-wide coherence.
>      >
>      > As an advocate for structure-wide coherence and one who is not
>      > afraid to wade into the fray, perhaps you can work some useful
>      > magic. I'm dubious, but I truly wish you luck.
>      >
>      > Our approach, which we have been working on for some years
>      > episodically, with another episode due shortly, is to use a
>      > coherent structure (UMBEL, in our approach, which is a faithful,
>      > simplified subset of Cyc) to provide the TBox, and then to find
>      > defensible ways to map the entity and concept information in
>      > crowdsourced KBs to that structure.
>      >
>      > We have been talking about this for so long that it is time for us
>      > to complete our initial development and put something forward that
>      > you and others can similarly scrutinize. We hope to have something
>      > useful by this summer.
>      >
>      > Thanks, Mike
>      >
>      > On 4/15/2014 6:55 PM, Patel-Schneider, Peter wrote:
>      >> Hmm.
>      >>
>      >> Well, perhaps one could argue that there should be no hierarchy
>      >> at all.
>      >>
>      >> Using the DBpedia ontology does commit you to a lot of things,
>      >> many of them quite questionable. For example, in the DBpedia
>      >> ontology churches are buildings, which is not true for many
>      >> churches, and not even true for the physical location associated
>      >> with many churches. This is one of the things that I think needs
>      >> to be changed.
>      >>
>      >> peter
>      >>
>      >>
>      >> On Apr 15, 2014, at 4:57 PM, Paul Houle <ontolo...@gmail.com>
>      >> wrote:
>      >>
>      >>> I try not to get hung up on the idea of having one right
>      >>> hierarchy but assume most end users will need to interpret the
>      >>> types that exist in the way that makes sense for what they are
>      >>> doing.
>      >>>
>      >>> The idea of foaf:Agent, which is a superclass of both person and
>      >>> organization, is a powerful concept because of properties shared by
>      >>> these two "things"; for instance, either can be a party to a
>      >>> lawsuit. Even in music you could say a brand like "Michael Jackson"
>      >>> is a team effort.
>      >>>
>      >>> On the other hand some people want :Flutist to be a subclass of
>      >>> :Person and it makes sense to say one person who plays the flute
>      >>> is a :Flutist but you can't say that a trio that all plays the
>      >>> flute is a :Flutist. Everybody has some theorem they expect the
>      >>> system to prove and they won't accept your axiom set unless you
>      >>> can prove their theorem with it.
>      >>>
>      >>> On Tue, Apr 15, 2014 at 6:51 PM, Patel-Schneider, Peter
>      >>> <peter.patel-schnei...@nuance.com> wrote:
>      >>>> schema.org is controlled by the schema.org partners, Google,
>      >>>> Yahoo!, Bing, and Yandex. Contributions from the community are
>      >>>> accepted, but are vetted before being added to schema.org.
>      >>>>
>      >>>> See http://schema.org for more information.
>      >>>>
>      >>>>
>      >>>> One problem with alignment to schema.org is that the formal
>      >>>> meaning of the schema.org ontology is unusual and not fully
>      >>>> explained.
>      >>>>
>      >>>> peter
>      >>>>
>      >>>>
>      >>>>
>      >>>> On Apr 14, 2014, at 10:08 PM, 小出 誠二 <seijikoi...@gmail.com> wrote:
>      >>>>
>      >>>>> Dear Peter
>      >>>>>
>      >>>>> To the eyes of ontologists, it is well known that the DBpedia
>      >>>>> ontology is incorrect, but I have never checked the contents of
>      >>>>> schema.org. Thank you for the info.
>      >>>>>
>      >>>>> I am planning to correct trustable ontologies like schema.org,
>      >>>>> but I do not know how to revise or advise on the contents of
>      >>>>> schema.org.
>      >>>>>
>      >>>>> Does anyone know how? Or does anyone have interest in the portal
>      >>>>> sites of trustable ontologies?
>      >>>>>
>      >>>>> Seiji Koide
>      >>>>>
>      >>>>> -----Original Message-----
>      >>>>> From: Patel-Schneider, Peter
>      >>>>> [mailto:peter.patel-schnei...@nuance.com]
>      >>>>> Sent: Tuesday, April 15, 2014 1:50 PM
>      >>>>> To: dbpedia-discussion@lists.sourceforge.net
>      >>>>> Subject: [Dbpedia-discussion] probably incorrect mapping to
>      >>>>> schema.org from MusicalArtist
>      >>>>>
>      >>>>> The ontology says that MusicalArtist is a subclass of
>      >>>>> schema.org:MusicGroup.
>      >>>>>
>      >>>>> This seemed very odd to me, but then I looked at schema.org and
>      >>>>> noticed that schema.org:MusicGroup can also be for *solo*
>      >>>>> artists. However, MusicalArtist is for any musical artist, not
>      >>>>> just soloists. So this mapping still looks incorrect.
>      >>>>>
>      >>>>> peter
>      >>
>      >>
>     
> ------------------------------------------------------------------------------
>      >> Learn Graph Databases - Download FREE O'Reilly Book
>      >> "Graph Databases" is the definitive new guide to graph databases
>     and their
>      >> applications. Written by three acclaimed leaders in the field,
>      >> this first edition is now available. Download your free book today!
>      >> http://p.sf.net/sfu/NeoTech
>      >> _______________________________________________
>      >> Dbpedia-discussion mailing list
>      >> Dbpedia-discussion@lists.sourceforge.net
>     <mailto:Dbpedia-discussion@lists.sourceforge.net>
>      >> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>      >>
>
>
>
>

-- 
__________________________________________

Michael K. Bergman
CEO  Structured Dynamics LLC
319.621.5225
skype:michaelkbergman
http://structureddynamics.com
http://mkbergman.com
http://www.linkedin.com/in/mkbergman
__________________________________________
