Thanks Pablo,

I find the disambiguation subject very challenging and intriguing. By any
chance, do you (or anyone) have any pointers to presentations, background
documentation or lectures about disambiguation?

I would like to keep this topic alive, as disambiguation can truly make a
difference in our implementations.

Best regards,
David

On Mon, Sep 3, 2012 at 5:31 PM, Pablo N. Mendes <pablomen...@gmail.com> wrote:

> Hi David,
> The challenge you described is usually referred to as "(in-document)
> coreference resolution". It is closely related to the entity disambiguation
> problem, as entity disambiguation can be seen as cross-document coreference
> resolution (by using identifiers from a pre-established KB). However, I
> think it's worth treating it separately from (but in close connection
> with) the targeted entity disambiguation problem. This is because there are
> many alternatives with pros and cons, including:
> 1. clustering mentions at recognition time and then disambiguating,
> 2. clustering mentions after disambiguating, or
> 3. jointly disambiguating/clustering.
>
> In DBpedia Spotlight we use a very simple heuristic rule: if a first name
> or last name is spotted, we look backwards for a full name and assign
> everyone in the chain to the same entity [1]. It is a very crude
> assumption, but it works quite well in practice.
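
A minimal sketch of this backward-lookup heuristic in Python, applied to the
"Rossi" example from earlier in the thread. It is not the Spotlight
CoreferenceFilter code: the Mention class and the link_partial_names function
are hypothetical names, and matching on whitespace-separated tokens is an
assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    surface: str                  # text as spotted, e.g. "Rossi"
    entity: Optional[str] = None  # entity chosen so far, if any


def link_partial_names(mentions: List[Mention]) -> List[Mention]:
    """For each single-token mention, look backwards for a full name whose
    first or last token matches, and reuse that full name's entity."""
    for i, mention in enumerate(mentions):
        tokens = mention.surface.split()
        if len(tokens) != 1:
            continue  # only bare first/last names are rewritten
        for earlier in reversed(mentions[:i]):
            full = earlier.surface.split()
            if (len(full) > 1 and earlier.entity is not None
                    and tokens[0] in (full[0], full[-1])):
                mention.entity = earlier.entity
                break
    return mentions


mentions = [
    Mention("Valentino Rossi", "Valentino Rossi (racer)"),
    Mention("MotoGP", "MotoGP"),
    Mention("Rossi", "Daniele De Rossi (soccer player)"),
]
for m in link_partial_names(mentions):
    print(m.surface, "->", m.entity)
```

Here the lone "Rossi" inherits the entity already chosen for "Valentino
Rossi", whatever the ranker had proposed for it on its own.
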
>
> Cheers,
> Pablo
>
> [1] https://github.com/dbpedia-spotlight/dbpedia-spotlight/blob/master/core/src/main/scala/org/dbpedia/spotlight/filter/annotations/CoreferenceFilter.scala
>
> On Thu, Aug 23, 2012 at 5:27 PM, David Riccitelli <da...@insideout.io> wrote:
>
> > Thanks Kritarth,
> >
> > Let me discuss another case, with another example. Take a text like this:
> > "Valentino Rossi won the MotoGP. Everybody loves Rossi."
> >
> > Right now the enhancer correctly identifies "Valentino Rossi (racer)" for
> > the TextAnnotation "Valentino Rossi", but it makes several suggestions for
> > the TextAnnotation "Rossi", sorted by ranking (unfortunately Valentino
> > Rossi is not the first):
> >  - "Daniele De Rossi (soccer player)"
> >  - "Vasco Rossi (singer)"
> >  - "Valentino Rossi (racer)"
> >
> > In this case would the disambiguation engine boost the score of the
> > EntityAnnotation "Valentino Rossi (racer)"?
> >
> > BR,
> > David
> >
> > On Thu, Aug 23, 2012 at 4:43 PM, kritarth anand <kritarth.an...@gmail.com> wrote:
> >
> > > Hi David,
> > > Thanks for your interest.
> > >
> > > > What would a sentence like this yield, "Paris is not the city in United
> > > > States"?
> > > >
> > > > It would yield Paris, Texas too. Well, that is one of the reasons the
> > > > problem is very hard.
> > >
> > > Kritarth
> > >
> > > > On Thu, Aug 23, 2012 at 7:06 PM, David Riccitelli <da...@insideout.io> wrote:
> > > >
> > > > > What would a sentence like this yield, "Paris is not the city in
> > > > > United States"?
> > > >
> > > > > On Thu, Aug 23, 2012 at 4:23 PM, kritarth anand <kritarth.an...@gmail.com> wrote:
> > > >
> > > > > > Dear members of the Stanbol community,
> > > > > >
> > > > > > I would like to discuss the next few iterations of the
> > > > > > Disambiguation Engine. A few versions of the engine have been
> > > > > > prepared, and I briefly describe them below. I hope to become a
> > > > > > permanent committer for Stanbol if my contribution is considered
> > > > > > after this GSoC period. I will be committing the code versions soon
> > > > > > and attaching a patch to JIRA.
> > > > >
> > > > > > 1. How the disambiguation problem was approached.
> > > > > > For a given text annotation there may be many entity annotations
> > > > > > mapped, and they need to be ranked in order of their likelihood.
> > > > > > Take the sentence: "Paris is a small city in the United States."
> > > > > >
> > > > > > a. Without disambiguation (using DBpedia as vocabulary), "Paris" in
> > > > > > this sentence has three entity annotations mapped: 1. Paris, France;
> > > > > > 2. Paris, Texas; 3. Paris, *Something*. The entity mapped with the
> > > > > > highest fise:confidence is Paris, France.
> > > > > > b. Now, how would a human disambiguate? On reading the line, a
> > > > > > person thinks of the context the text is referring to. Since the
> > > > > > text talks about Paris and also about the United States, the Paris
> > > > > > mentioned here is more like Paris, Texas (which is in the United
> > > > > > States) and therefore must refer to it.
> > > > > > c. The implementation takes inspiration from this example and
> > > > > > roughly follows the pseudocode below.
> > > > > >     for (K : TextAnnotations)
> > > > > >     {
> > > > > >         List EntityAnnotations = getEntityAnnotationsRelated(K);
> > > > > >         Context = GetContextInformation(K);
> > > > > >
> > > > > >         List Results = QueryMLTVocabularies(K, Context);
> > > > > >         updateConfidences(Results, EntityAnnotations);
> > > > > >     }
> > > > >
> > > > > > d. My current approach to handling disambiguation involved many
> > > > > > variations; for simplicity I'll talk only about the differences in
> > > > > > obtaining the "Context".
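
To make the loop above concrete, here is a rough Python sketch for a single
text annotation. Everything in it is an assumption for illustration: the tiny
in-memory VOCABULARY stands in for DBpedia, plain token overlap stands in for
the QueryMLTVocabularies step, and the starting confidences are made up. It is
not the engine's actual code.

```python
from typing import Dict, Set

# Toy vocabulary: entity label -> description text. Purely illustrative;
# a real engine would query an index of the vocabulary instead.
VOCABULARY: Dict[str, str] = {
    "Paris, France": "capital city of France in Europe",
    "Paris, Texas": "small city in Texas in the United States",
}


def tokens(text: str) -> Set[str]:
    """Lower-cased words with surrounding punctuation stripped."""
    return {t.strip('.,"').lower() for t in text.split() if t.strip('.,"')}


def update_confidences(candidates: Dict[str, float],
                       context: str) -> Dict[str, float]:
    """Boost each candidate by the share of context tokens that also occur
    in its description. Scores are not normalised back into [0, 1]."""
    ctx = tokens(context)
    updated = {}
    for label, confidence in candidates.items():
        overlap = len(ctx & tokens(VOCABULARY[label]))
        updated[label] = confidence + overlap / max(len(ctx), 1)
    return updated


# Text annotation "Paris" with made-up initial confidences.
candidates = {"Paris, France": 0.6, "Paris, Texas": 0.4}
context = "Paris is a small city in the United States."
ranked = sorted(update_confidences(candidates, context).items(),
                key=lambda item: item[1], reverse=True)
print(ranked[0][0])
```

With this context the Texas candidate overtakes the French one, because its
description shares more words with the sentence.
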
> > > > >
> > > > > > 2. Context procurement:
> > > > > > a. All Entity Context: the context is built from all the text
> > > > > > annotations of the text. It shows good results for shorter texts,
> > > > > > but introduces a lot of redundant annotations in longer ones,
> > > > > > making the context less useful.
> > > > > > b. All Link Context: the context is built from the site or
> > > > > > reference links associated with the text annotations, which of
> > > > > > course may themselves need to be disambiguated, so it does not
> > > > > > behave very well.
> > > > > > c. Selection Context: the context basically contains the text one
> > > > > > sentence before and one sentence after the current one. Another
> > > > > > version worked with the text annotations in this region of text.
> > > > > > d. Vicinity Entity Context: the context is built from the
> > > > > > annotations in the neighborhood of the text annotation, as
> > > > > > measured by distance.
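
Two of these context variants, sketched in Python under stated assumptions:
the sentence splitter is deliberately naive, annotations are represented as
hypothetical (surface, start offset) pairs, and distance is measured in
characters with an arbitrary default threshold. None of these names or choices
come from the engine itself.

```python
import re
from typing import List, Tuple

Annotation = Tuple[str, int]  # (surface text, start offset); hypothetical shape


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def selection_context(text: str, offset: int) -> str:
    """Sentence containing `offset`, plus one sentence before and after."""
    sentences = split_sentences(text)
    position = 0
    for i, sentence in enumerate(sentences):
        start = text.index(sentence, position)
        position = start + len(sentence)
        if start <= offset < position:
            return " ".join(sentences[max(i - 1, 0):i + 2])
    return text  # offset not inside any sentence: fall back to the whole text


def vicinity_context(annotations: List[Annotation], target: Annotation,
                     max_distance: int = 50) -> List[str]:
    """Surface forms of the other annotations that start within
    `max_distance` characters of the target annotation."""
    return [surface for surface, start in annotations
            if (surface, start) != target
            and abs(start - target[1]) <= max_distance]


text = "Rome is old. Paris is a small city. It is in the United States. Bye."
print(selection_context(text, text.index("Paris")))
```

The "all entity" variant would simply use every annotation's surface form,
which is why it gets noisy on long texts.
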
> > > > >
> > > > > > 3. Future
> > > > > > a. With a running POC of this engine, it can be used as the basis
> > > > > > for a more advanced version, like the Spotlight approach or the
> > > > > > Markov Logic Networks approach discussed earlier.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > David Riccitelli
> > > >
> > > >
> > > >
> > >
> >
> ********************************************************************************
> > > > InsideOut10 s.r.l.
> > > > P.IVA: IT-11381771002
> > > > Fax: +39 0110708239
> > > > ---
> > > > LinkedIn: http://it.linkedin.com/in/riccitelli
> > > > Twitter: ziodave
> > > > ---
> > > > Layar Partner Network<
> > > >
> > >
> >
> http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1
> > > > >
> > > >
> > > >
> > >
> >
> ********************************************************************************
> > > >
> > >
> >
> >
>
>
>
> --
> ---
> Pablo N. Mendes
> http://pablomendes.com
> Events: http://wole2012.eurecom.fr
>



-- 
David Riccitelli

********************************************************************************
InsideOut10 s.r.l.
P.IVA: IT-11381771002
Fax: +39 0110708239
---
LinkedIn: http://it.linkedin.com/in/riccitelli
Twitter: ziodave
---
Layar Partner Network<http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
********************************************************************************
