Author: rwesten
Date: Fri Jun 15 07:11:08 2012
New Revision: 1350479

URL: http://svn.apache.org/viewvc?rev=1350479&view=rev
Log:
added section for Entity Tagging with Disambiguation support

Added:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
   (with props)
Modified:
    
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext

Modified: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext?rev=1350479&r1=1350478&r2=1350479&view=diff
==============================================================================
--- 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
 (original)
+++ 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
 Fri Jun 15 07:11:08 2012
@@ -191,7 +191,40 @@ In the next section "Entity Disambiguati
 
 ## Entity Disambiguation
 
-TODO: Work in progress
+Entity Disambiguation is required if an entity detected in the analyzed text 
can refer to different Entities. The following figure shows an example where 
"Bob Marley" is detected as a person in the text, but there are two possible 
matches within the controlled vocabulary.
+
+![Entity Disambiguation](es_entitydisambiguation.png "Bob Marley as spotted in 
the Text may refer to two different persons in DBpedia.org")
+
+The fact that one Entity detected in the Text (represented by a 
'fise:TextAnnotation') may have multiple suggested Entities (represented by the 
two 'fise:EntityAnnotation's) has a negative impact on an [Entity 
Tagging](#entity-tagging) interface that suggests tags based on 
'fise:EntityAnnotation's. This is because such an interface would show two 
suggestions in the above case: (1) for 
['dbpedia:Bob_Marley'](http://dbpedia.org/resource/Bob_Marley) and (2) for 
['dbpedia:Bob_Marley_(comedian)'](http://dbpedia.org/resource/Bob_Marley_%28comedian%29).
 So even if the user wants to tag this content with "Bob Marley", they will need to 
reject at least one of the two suggestions.
+
+Adding explicit support for Entity Disambiguation to an Entity Tagging user 
interface can solve this problem by grouping suggested entities by the 
'fise:TextAnnotation's they are suggested for.
+
+### Grouping suggested Entities
+
+The goal of the Entity Tagging UI with disambiguation support is to show only 
a single tag suggestion for all Entities suggested for the same section in the 
analyzed text. To achieve this we need to follow the link between 
'fise:EntityAnnotation' and 'fise:TextAnnotation'.
+
+There are several ways to achieve this. The option presented here starts by 
iterating over 'fise:EntityAnnotation's, on the assumption that one wants to 
improve an existing [Entity Tagging](#entity-tagging) interface.
+
+1. Iterate over all 'fise:EntityAnnotation' instances, that is, all 
resources matching "{entity-annotation} rdf:type fise:EntityAnnotation". 
+    * For more information on how to collect information for extracted 
Entities see the [corresponding section](#process-suggested-entities) in the simple 
[Entity Tagging](#entity-tagging) interface.
+2. Retrieve the 'fise:TextAnnotation' referenced by each processed 
'fise:EntityAnnotation'. To do so, read the value(s) of its 
'dc:relation' property.
+3. While iterating over the 'fise:EntityAnnotation's establish a mapping 
'fise:TextAnnotation' -> 'fise:EntityAnnotation', 'fise:EntityAnnotation', ...
+    * The list of 'fise:EntityAnnotation's for each 'fise:TextAnnotation' 
needs to be sorted by the value of the 'fise:confidence' property of the 
EntityAnnotation. Ensure that the EntityAnnotation with the highest confidence 
is first in the list. 'fise:confidence' values are in the range 0..1 where 
higher numbers represent a higher certainty.
+4. Suggest tags based on 'fise:TextAnnotation's - keys in the mapping created 
in step (3).
+    * Allow users to easily accept the Entity with the highest rank - 
['dbpedia:Bob_Marley'](http://dbpedia.org/resource/Bob_Marley) in the above 
example. This is especially useful if the confidence of the first suggestion 
is high (e.g. >= 0.8) and considerably higher than the confidence values of 
the other options.
+    * Provide users with the possibility to inspect further suggested options 
- to disambiguate between different options.
+
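The grouping steps above can be sketched as follows. This is a minimal illustration and not part of the Stanbol code base: it assumes the enhancement results are already available as plain `(subject, predicate, object)` tuples using the prefixed names from this section, and the function names, the `urn:` identifiers, and the `margin` value are hypothetical. Only the 0.8 threshold comes from the text.

```python
from collections import defaultdict


def group_by_text_annotation(triples):
    """Map each fise:TextAnnotation to its (EntityAnnotation, confidence)
    pairs, sorted so that the highest confidence comes first."""
    triples = list(triples)
    # Step 1: collect all fise:EntityAnnotation (and TextAnnotation) instances.
    entity_annotations = {s for s, p, o in triples
                          if p == "rdf:type" and o == "fise:EntityAnnotation"}
    text_annotations = {s for s, p, o in triples
                        if p == "rdf:type" and o == "fise:TextAnnotation"}
    confidence = {s: float(o) for s, p, o in triples
                  if p == "fise:confidence" and s in entity_annotations}
    # Steps 2 and 3: follow dc:relation and build the mapping.
    groups = defaultdict(list)
    for s, p, o in triples:
        if (p == "dc:relation" and s in entity_annotations
                and o in text_annotations):
            groups[o].append((s, confidence.get(s, 0.0)))
    for suggestions in groups.values():
        suggestions.sort(key=lambda pair: pair[1], reverse=True)
    return dict(groups)


def is_clear_winner(suggestions, threshold=0.8, margin=0.2):
    """Step 4 heuristic: the top suggestion is confident and well ahead.
    The margin of 0.2 is an illustrative assumption."""
    if not suggestions:
        return False
    best = suggestions[0][1]
    runner_up = suggestions[1][1] if len(suggestions) > 1 else 0.0
    return best >= threshold and best - runner_up >= margin


# Hypothetical data modelled on the "Bob Marley" example.
EXAMPLE = [
    ("urn:ta:1", "rdf:type", "fise:TextAnnotation"),
    ("urn:ea:comedian", "rdf:type", "fise:EntityAnnotation"),
    ("urn:ea:comedian", "dc:relation", "urn:ta:1"),
    ("urn:ea:comedian", "fise:confidence", 0.4),
    ("urn:ea:musician", "rdf:type", "fise:EntityAnnotation"),
    ("urn:ea:musician", "dc:relation", "urn:ta:1"),
    ("urn:ea:musician", "fise:confidence", 0.9),
]

groups = group_by_text_annotation(EXAMPLE)
```

A user interface could then render one tag per key of `groups`, preselect the first suggestion when `is_clear_winner` returns true, and offer the remaining entries for manual disambiguation.
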
+### Showing the extraction Context 
+
+To allow users to more easily disambiguate between the suggested Entities it 
is important to provide them with information about the extraction context of 
the suggested entities. This is of special importance if content is not 
completely visible to the user (e.g. because it is too long to fit on the screen 
or the content is of a type that cannot be rendered within the browser).
+
+Assuming the suggested Entities are grouped by 'fise:TextAnnotation' - as 
explained in the above section - one can use the information provided by the 
TextAnnotation to visualize the context and thereby help the user with 
the disambiguation task.
+
+The following information of the TextAnnotation can be used for this task:
+
+* 'fise:selection-context': This is the text surrounding the extracted Entity. 
The exact size of this context depends on the configuration and the 
EnhancementEngine. But typically it is the current sentence or about 50 
characters before and after the selection.
+* 'fise:selected-text': This is the text representing the extracted Entity - 
the section of the text the Entity was suggested for. The 'fise:selected-text' 
MUST be contained within the 'fise:selection-context', so user interfaces that 
want to highlight the selected part of the context can search the selection 
context for the selected text. In case of multiple matches it is 
typically sufficient to highlight all occurrences.
+* 'fise:start' and 'fise:end' values could also be used to determine the 
location. However, because those offsets are relative to the start of the content, 
it is typically easier to use the occurrences of the selected text within the 
selection context.
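
Highlighting the selected text inside its selection context can be sketched as below. This is only an illustration under stated assumptions: the function name `highlight_selection` and the choice of `<em>` as the highlight markup are hypothetical, and the two arguments are assumed to hold the string values of 'fise:selection-context' and 'fise:selected-text'.

```python
from html import escape


def highlight_selection(selection_context, selected_text):
    """Return the context as HTML with every occurrence of the selected
    text wrapped in <em>. All text is HTML-escaped."""
    if not selected_text:
        # Nothing to highlight; str.split() would reject an empty separator.
        return escape(selection_context)
    # Split on the raw text first, then escape each piece, so escaping
    # cannot create or hide matches.
    parts = selection_context.split(selected_text)
    marked = f"<em>{escape(selected_text)}</em>"
    return marked.join(escape(part) for part in parts)


html_snippet = highlight_selection(
    "Bob Marley & the Wailers; Bob Marley live", "Bob Marley")
```

Because all occurrences are wrapped, no offset arithmetic with 'fise:start' and 'fise:end' is needed.
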
 
 ## Occurrence based Annotation
 

Added: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png?rev=1350479&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

