Hi Valentina,
I am not sure whether I understand you correctly. There might be cases
of metonymy in DBpedia, but as far as I can see, Wikipedia is usually
quite good at separating them via disambiguation pages, so I am not sure
whether there are many examples.
The problem with the degrees, as far as I can tell, is not a metonymy
one (degrees are just degrees, I have never seen them used to refer to a
university), but simply a series of shortcomings in DBpedia. What
happens here inside DBpedia is the following:
* First, we find an infobox which says that someone's almaMater is, say,
"Princeton University (B.A.)". Both Princeton and B.A. are linked to the
respective Wikipedia pages.
* The extraction framework extracts two statements from that:
PersonX almaMater Princeton_University, and
PersonX almaMater Bachelor_of_Arts
(the second one being an error, which is very hard to avoid in the
general case)
* Since that happens a few times, we infer that Bachelor_of_Arts is a
University.
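
To illustrate the first two steps, here is a rough sketch in Python. It is
not the actual extraction framework code; the function name, the regular
expression, and the wikitext value are assumptions made up for this example.
It only shows why emitting one statement per wiki link produces the
erroneous second triple.

```python
import re

# Matches [[Target]] and [[Target|label]]; captures only the link target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

def extract_object_statements(subject, prop, raw_value):
    """Naively emit one (subject, property, object) triple per wiki link.

    Hypothetical helper: it cannot tell that a link inside parentheses is a
    qualifier (the degree) rather than a value of the property.
    """
    triples = []
    for target in WIKILINK.findall(raw_value):
        resource = target.strip().replace(" ", "_")
        triples.append((subject, prop, resource))
    return triples

# Assumed wikitext for an infobox value rendered as "Princeton University (B.A.)"
value = "[[Princeton University]] ([[Bachelor of Arts|B.A.]])"
triples = extract_object_statements("PersonX", "almaMater", value)
for triple in triples:
    print(triple)
```

Both links end up as objects of almaMater, which is the error described
above.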
So in that case, I think it's purely a DBpedia problem. If you are aware
of any actual cases of metonymy, however, I am curious to hear about that.
All the best,
Heiko
On 13.10.2014 16:33, Valentina Presutti wrote:
Hi Heiko,
thanks for the prompt reply and the explanation.
However, the interesting thing is that these entities are clearly used
with more than one sense (at least in US culture), so in my opinion the
issue originally stems from this fact.
I mentioned two cases here, but if you check you can see that all
these types of entities (Degrees) have the same problem.
My suggestion (if that can help) is to identify such cases of metonymy
and handle them specially: create as many distinct entities as there are
senses.
However, the Wikipedia page of such entities defines them as
degrees... not sure whether this is useful for you to know.
Valentina
On 13 Oct 2014, at 09:03, Heiko Paulheim wrote:
Hi Valentina,
(and CCing the DBpedia discussion list)
this is an effect of the heuristic typing we employ in DBpedia [1].
It works correctly in many cases, and sometimes it fails - as for
these examples (the classic tradeoff between coverage and precision).
To briefly explain how the error comes into existence: we look at the
distribution of types that occur for the ingoing properties of an
untyped instance. For dbpedia:Bachelor_of_Arts, there are, among
others, 208 ingoing properties with the predicate
dbpedia-owl:almaMater (which is already questionable). For that
predicate, 87.6% of the objects are of type dbpedia-owl:University.
So we have a strong pattern, with many supporting statements, and we
conclude that dbpedia:Bachelor_of_Arts is a university. That
mechanism, as I said, works reasonably well, but sometimes fails for
single instances, like this one. For dbpedia:Academic_degree, you'll
find similar questionable statements involving that instance, which
mislead the heuristic typing algorithm.
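
To make the mechanism concrete, here is a minimal sketch in Python. It is a
simplification, not the algorithm from [1]: the predicate weighting described
in the paper is left out, the function name and the threshold of 0.4 are
assumptions, and only the almaMater predicate is modelled (the other ingoing
properties of dbpedia:Bachelor_of_Arts are omitted).

```python
from collections import defaultdict

def heuristic_types(ingoing_counts, type_distribution, threshold=0.4):
    """Vote on the types of an untyped instance from its ingoing properties.

    ingoing_counts:    predicate -> number of ingoing statements
    type_distribution: predicate -> {type: share of that predicate's objects}
    threshold:         hypothetical cut-off for accepting a type
    """
    total = sum(ingoing_counts.values())
    if total == 0:
        return {}
    scores = defaultdict(float)
    for predicate, count in ingoing_counts.items():
        for rdf_type, share in type_distribution.get(predicate, {}).items():
            # Each ingoing statement votes for a type with the predicate's share.
            scores[rdf_type] += count * share
    return {t: s / total for t, s in scores.items() if s / total >= threshold}

# Figures from this mail: 208 ingoing almaMater statements, and 87.6% of all
# almaMater objects are of type dbpedia-owl:University.
ingoing = {"dbpedia-owl:almaMater": 208}
distribution = {"dbpedia-owl:almaMater": {"dbpedia-owl:University": 0.876}}

inferred = heuristic_types(ingoing, distribution)
print(inferred)
```

With many supporting statements and a strongly skewed distribution, the
University type clears the threshold, even though the underlying almaMater
statements are themselves wrong.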
With the 2014 release, we further tried to reduce errors like these
by filtering common nouns using WordNet before assigning types to
instances, but both "Academic degree" and "Bachelor of Arts" escaped
our nets here :-(
The public DBpedia endpoint loads both the infobox-based types and
the heuristic types. If you need a "clean" version, I advise you to
set up a local endpoint and load only the infobox-based types into it.
Best,
Heiko
[1] http://www.heikopaulheim.com/documents/iswc2013.pdf
On 13.10.2014 02:42, Valentina Presutti wrote:
Dear all,
I noticed that dbpedia:Bachelor_of_Arts
<http://dbpedia.org/page/Bachelor_of_Arts>, as well as other similar
entities (dbpedia:Bachelor_of_Engineering,
dbpedia:Bachelor_of_Science, etc.), is typed as dbpedia-owl:University.
I would expect a type like “Academic Degree”, but if you look at
dbpedia:Academic_Degree, its type is again dbpedia-owl:University.
However, its definition is (according to DBpedia):
"An academic degree is a college or university diploma, often
associated with a title and sometimes associated with an academic
position, which is usually awarded in recognition of the recipient
having either satisfactorily completed a prescribed course of study
or having conducted a scholarly endeavour deemed worthy of his or
her admission to the degree. The most common degrees awarded today
are associate, bachelor's, master's, and doctoral degrees."
This shows that there are at least two different meanings associated
with the term: college/university and title.
I think that the different meanings should be separated so as to allow
applications to refer to the different entities: a university or a
title.
At least for me this causes errors in automatic relation extraction...
Wdyt?
Valentina
--
Prof. Dr. Heiko Paulheim
Data and Web Science Group
University of Mannheim
Phone: +49 621 181 2646
B6, 26, Room C1.08
D-68159 Mannheim
Mail: [email protected]
Web: www.heikopaulheim.com