Hi Davide,
Thanks for the hint. I remember trying LUPedia a few months ago -- now it
has a defined API, which is a good addition. Unfortunately, the quality of
results could be improved quite a bit.
Here is a scientific statement that I would like to see annotated:
"Albizia julibrissin has anxiolytic-like effects that are mediated by the
changes of the serotonergic nervous system, especially 5-HT1A receptors."
LUPedia is unable to identify any entities in this string, although DBpedia
contains them:
http://dbpedia.org/resource/Albizia_julibrissin
http://dbpedia.org/resource/Anxiolytic
http://dbpedia.org/page/5-HT1A_receptor
et cetera.
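As an illustration of the kind of annotation expected here, the following is a minimal Python sketch of a dictionary-based spotter with greedy longest-match lookup. It is not how LUPedia or any other service works: the `LABELS` table is a hand-made stand-in for a label index built from DBpedia (URIs copied from the list above), and `tokenize` and `annotate` are hypothetical helpers.

```python
import re

# Hand-made stand-in for a label -> URI index derived from DBpedia (assumption).
LABELS = {
    "albizia julibrissin": "http://dbpedia.org/resource/Albizia_julibrissin",
    "anxiolytic": "http://dbpedia.org/resource/Anxiolytic",
    "5-ht1a receptor": "http://dbpedia.org/page/5-HT1A_receptor",
}

# Longest label, in words; bounds the lookup window.
MAX_WORDS = max(len(label.split()) for label in LABELS)


def tokenize(text):
    # Keep hyphenated tokens such as "5-HT1A" and "anxiolytic-like" intact.
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def annotate(text):
    """Return (surface form, URI) pairs using greedy longest-match lookup."""
    tokens = tokenize(text)
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, then shrink it.
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            surface = " ".join(tokens[i:i + n])
            uri = LABELS.get(surface.lower())
            if uri:
                found.append((surface, uri))
                i += n
                break
        else:
            # No label starts at this token; move on.
            i += 1
    return found
```

Run on the statement above, this exact-match sketch finds only "Albizia julibrissin"; "anxiolytic-like" and "5-HT1A receptors" are missed because their surface forms differ from the stored labels, so some normalization of surface forms is needed on top of plain lookup.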
It does seem to recognize person names: for the string "Michael Jackson", the
following URIs are returned:
# http://dbpedia.org/resource/Parademon
# http://dbpedia.org/resource/Michael_Jackson
The first result is a bit puzzling (DBpedia tells me that 'In the DC
Universe, Parademons are monstrous shock troops of Apokolips used by
Darkseid to maintain the order of Apokolips.').
LUPedia does not seem to do any kind of stemming either, as submitting the
string "Michael Jacksons" reduces the list of extracted URIs to:
# http://dbpedia.org/resource/Parademon
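The missing normalization could be as simple as a fallback that strips a possessive or plural ending when the exact surface form is unknown. Below is a minimal Python sketch under that assumption; the `LABELS` table is a hand-made stand-in for a DBpedia label index, and `candidates` and `lookup` are hypothetical helpers. A real service would presumably use a proper stemmer or DBpedia's redirect labels instead.

```python
# Hand-made stand-in for a label -> URI index derived from DBpedia (assumption).
LABELS = {
    "michael jackson": "http://dbpedia.org/resource/Michael_Jackson",
}


def candidates(surface):
    """Yield the lowercased surface form, then cruder normalized variants."""
    s = surface.lower().strip()
    yield s
    if s.endswith("'s"):
        yield s[:-2]  # possessive: "jackson's" -> "jackson"
    if s.endswith("s"):
        yield s[:-1]  # naive plural stripping: "jacksons" -> "jackson"


def lookup(surface):
    """Return the URI for the first variant found in LABELS, or None."""
    for form in candidates(surface):
        uri = LABELS.get(form)
        if uri is not None:
            return uri
    return None
```

With this fallback, `lookup("Michael Jacksons")` resolves to the same URI as `lookup("Michael Jackson")`. The suffix stripping is deliberately crude and would produce false matches for labels that legitimately end in "s".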
LUPedia in its current form will not perform too well in practical settings.
Cheers,
Matthias Samwald
--------------------------------------------------
From: "Davide Palmisano" <[email protected]>
Sent: Tuesday, February 02, 2010 2:27 PM
To: "Matthias Samwald" <[email protected]>
Cc: <[email protected]>
Subject: Re: DBpedia-based entity recognition service / tool?
Hi Matthias,
Have you tried http://lupedia.ontotext.com/ ? Perhaps it may help.
cheers,
Davide
On Tue, Feb 2, 2010 at 1:26 PM, Matthias Samwald <[email protected]> wrote:
Dear LOD community,
I would be glad to hear your advice on how to best accomplish a simple task:
extracting DBpedia entities (identified with DBpedia URIs) from a string of
text, with good accuracy and recall, and possibly with some options to
constrain the recognized entities to some subset of DBpedia, based on
categories. The tool or service should be performant enough to process large
numbers of strings in a reasonable amount of time.
Given the prolific creation of tiny tools and services in this community, I
am puzzled about my inability to find anything that accomplishes this task.
Could you point me to something like that? Are there tools/services for
Wikipedia that I could use?
Zemanta seems to be too much geared towards 'enhanced blogging', while
OpenCalais does not return Wikipedia/DBpedia identifiers. Please correct me
if I am wrong.
Cheers,
Matthias