[GSOC] FOAF Co-reference based Entity Disambiguation WorkFlow

Dileepa Jayakody Wed, 31 Jul 2013 14:07:36 -0700

Hi All,

As the third milestone of my project I will describe my initial design of
the FOAF Co-reference based entity disambiguation engine here.


The main disambiguation technique used here is FOAF co-reference. It aims
to merge multiple fise:EntityAnnotations, identified from different surface
mentions in the text, into a single FOAF entity by identifying co-reference
relationships between the entity labels. I would like to introduce two new
fise properties, fise:coref and fise:not-coref, to denote the co-reference
relationships between entities. I would like your thoughts on this idea.

Contextual information extracted from the ContentItem will be used to
identify the most suitable entity annotation for the selected-text context.
The co-reference calculation will initially be rule-based. Later, a machine
learning approach (SVM-based) will be added to improve the system.
The disambiguation engine will calculate a disambiguation score (ds) by
performing FOAF co-reference operations on the contextual information
extracted from the content items, and will modify the fise:confidence value
of each EntitySuggestion accordingly.

The basic co-reference rules to be used are:
{?p a owl:IFP. ?a ?p ?x. ?b ?p ?x.} => {?a :coref ?b}
{?p a owl:FP. ?a ?p ?x. ?a ?p ?y.} => {?x :coref ?y}

IFP : inverse-functional property
FP : functional property
coref : co-referent
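
As a rough illustration of the two rules above, here is a minimal Python
sketch over plain (subject, property, object) tuples. The function name
coref_pairs, the ex: identifiers and the example mailbox are hypothetical;
which properties count as IFP or FP is passed in by the caller, so nothing
here reflects an existing Stanbol API.

```python
from collections import defaultdict
from itertools import combinations


def coref_pairs(triples, ifps, fps):
    """Return unordered co-referent pairs implied by the IFP and FP rules."""
    subjects_by_value = defaultdict(set)  # (p, x) -> all ?a with (?a ?p ?x)
    values_by_subject = defaultdict(set)  # (a, p) -> all ?x with (?a ?p ?x)
    for s, p, o in triples:
        if p in ifps:
            subjects_by_value[(p, o)].add(s)
        if p in fps:
            values_by_subject[(s, p)].add(o)

    pairs = set()
    groups = list(subjects_by_value.values()) + list(values_by_subject.values())
    for group in groups:
        # Every two members of a group are co-referent with each other.
        for a, b in combinations(sorted(group), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical data: p1 and p2 share a mailbox (IFP rule);
# one document has two primary topics, p1 and p4 (FP rule).
triples = [
    ("ex:p1", "foaf:mbox", "mailto:timbl@example.org"),
    ("ex:p2", "foaf:mbox", "mailto:timbl@example.org"),
    ("ex:p3", "foaf:mbox", "mailto:someone@example.org"),
    ("ex:doc", "foaf:primaryTopic", "ex:p1"),
    ("ex:doc", "foaf:primaryTopic", "ex:p4"),
]
pairs = coref_pairs(triples, ifps={"foaf:mbox"}, fps={"foaf:primaryTopic"})
print(sorted(pairs))
```

Grouping by (property, value) and by (subject, property) avoids comparing
every pair of triples; ex:p3 stays unpaired because nothing else shares its
mailbox.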

The co-reference operations fall into 3 main types, each implemented as a
sub-module of the disambiguation engine. The
Map<TextAnnotation,Set<Suggestions>> will pass through each module in turn
(chain mode) for improved disambiguation results. The disambiguation
process is aimed mainly at People; it could also be used for
Organization-type disambiguation.
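
The chain mode described above can be pictured with the following minimal
Python sketch. It assumes the map is simplified to plain dictionaries
(text annotation id -> suggested entity -> score); run_chain and boost are
hypothetical names, and boost is only a stand-in for a real sub-module.

```python
from typing import Callable, Dict, List

# text annotation id -> {suggested entity URI -> score so far}
Scores = Dict[str, Dict[str, float]]
Module = Callable[[Scores], Scores]


def run_chain(scores: Scores, modules: List[Module]) -> Scores:
    """Pass the suggestion map through each module in order."""
    for module in modules:
        scores = module(scores)
    return scores


def boost(entity: str, amount: float) -> Module:
    """Hypothetical stand-in module: raise one entity's score, capped at 1.0."""
    def module(scores: Scores) -> Scores:
        # Build a new map so the caller's input is left untouched.
        return {
            ta: {e: (min(1.0, s + amount) if e == entity else s)
                 for e, s in suggestions.items()}
            for ta, suggestions in scores.items()
        }
    return module


initial = {"ta1": {"ex:timbl": 0.5, "ex:tim_b": 0.5}}
result = run_chain(initial, [boost("ex:timbl", 0.25), boost("ex:timbl", 0.125)])
print(result)
```

Each module receives the output of the previous one, so evidence from the
different co-reference types accumulates on the same suggestion.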

The 3 sub-modules in the engine are as follows.

1. Co-reference by FOAF field literal matching:
This module will perform similarity matching of entity-label fields against
FOAF fields such as foaf:name, givenName, firstName, familyName and nick,
and update the confidence values of co-referring entities (e.g. matching
firstName, givenName and nickname mentions in the content: 'Tim Berners-Lee'
is also identified by the nickname 'timbl').
It should also detect TextAnnotations of email addresses (if available) and
match them against the foaf:mbox and foaf:personalMailbox fields, and match
phone numbers against foaf:phone.
Only direct literal matching is performed in this module.

2. Co-reference by relationship links:
This module performs a neighborhood comparison with other People and
Organizations mentioned in the context. The relationships will be analysed
via the foaf:knows field. rdfs:seeAlso and owl:sameAs will be used as the
main co-reference fields to detect different EntityAnnotations referring to
the same Entity.
To detect relationships with organizations, foaf:schoolHomepage and
foaf:workplaceHomepage will be used.
To detect membership in groups, foaf:Group and foaf:member will be used as
keys.
To detect the gender of the person in the context, surface mentions such as
'he' or 'she' will be matched against foaf:gender.

3. Topic-based matching:
The fise:TopicAnnotations will be matched against foaf:interest (links to a
document), foaf:topic_interest (links to an agent/entity), foaf:topic and
foaf:primaryTopic.
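
To make the literal matching of module 1 more concrete, here is a minimal
Python sketch. It assumes a FOAF entity is reduced to a dictionary of field
name -> list of literals, and it uses difflib.SequenceMatcher from the
standard library as the similarity measure; that choice, the normalise
helper and the function names are all assumptions rather than part of the
design.

```python
from difflib import SequenceMatcher

NAME_FIELDS = ("foaf:name", "foaf:givenName", "foaf:firstName",
               "foaf:familyName", "foaf:nick")


def normalise(text):
    """Lower-case, treat hyphens as spaces and collapse whitespace."""
    return " ".join(text.lower().replace("-", " ").split())


def literal_match_score(mention, foaf_fields):
    """Best similarity in [0, 1] between a mention and any FOAF name field."""
    best = 0.0
    for field in NAME_FIELDS:
        for value in foaf_fields.get(field, []):
            ratio = SequenceMatcher(
                None, normalise(mention), normalise(value)).ratio()
            best = max(best, ratio)
    return best


# Hypothetical entity built from the example in the text.
timbl = {"foaf:name": ["Tim Berners-Lee"], "foaf:nick": ["timbl"]}

print(literal_match_score("timbl", timbl))            # exact nick match
print(literal_match_score("Tim Berners Lee", timbl))  # name without hyphen
print(literal_match_score("Berners", timbl))          # partial match
```

The resulting score could then serve as the disambiguation score (ds) for
that suggestion, or as one input to it.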

To combine the disambiguation score with the original confidence, I will use
the same algorithm as the SolrMLT disambiguation engine in Stanbol.

The algorithm:

    dc := (oc * cw / (cw + dw)) + (ds * dw / (cw + dw))

    oc ... original-confidence [0..1]
    ds ... disambiguation-score [0..1]
    dc ... disambiguated-confidence [0..1]
    cw ... original-confidence-weight
    dw ... disambiguation-weight
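
As a small worked example of the formula above, the following Python sketch
evaluates it directly. The function name and the sample weights are
assumptions chosen for illustration; the actual weights used by the engine
are not specified here.

```python
def disambiguated_confidence(oc, ds, cw, dw):
    """Weighted average of original confidence and disambiguation score."""
    total = cw + dw
    if total <= 0:
        raise ValueError("cw + dw must be positive")
    return (oc * cw / total) + (ds * dw / total)


# Equal weights: dc is the plain average of oc and ds.
print(disambiguated_confidence(oc=0.25, ds=0.75, cw=1.0, dw=1.0))  # 0.5

# Disambiguation weighted three times as heavily as the original confidence.
print(disambiguated_confidence(oc=0.5, ds=1.0, cw=1.0, dw=3.0))    # 0.875
```

Because dc is a weighted average, it always lies between oc and ds, so it
stays within [0..1] whenever both inputs do and the weights are
non-negative.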


Some questions I have:

   - Is it a good idea to chain other enhancement engines besides my
   foaf-site-engine, such as the NLP engines, TokenizerEngine and POSEngine,
   to provide as many Entity Suggestions as possible before executing
   disambiguation?
   - Can I use the 'topic' enhancement engine in Stanbol to provide the
   fise:TopicAnnotations required in the 3rd module?
   - Does the SentimentAnalysis engine work? If so, will
   fise:SentimentAnnotations be useful for topic-based matching?

I would welcome any suggestions and ideas to improve my FOAF co-reference
based disambiguation engine.

Below is a block diagram of the workflow.

[image: Inline image 1]

source :
http://creately.com/diagram/example/hjs4yd0e1/FOAF_Disambiguation_WorkFlow

Thanks,
Dileepa

Reference:
1. Jennifer Sleeman and Tim Finin (University of Maryland, Baltimore
County), "Computing FOAF Co-reference Relations with Rules and Machine
Learning", in Proceedings of the Third International Workshop on Social
Data on the Web, November 2010.
