Author: rwesten
Date: Fri Jun 15 07:11:08 2012
New Revision: 1350479

URL: http://svn.apache.org/viewvc?rev=1350479&view=rev
Log:
added section for Entity Tagging with Disambiguation support

Added:
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png   (with props)
Modified:
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext

Modified: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext?rev=1350479&r1=1350478&r2=1350479&view=diff
==============================================================================
--- incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext (original)
+++ incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancementstructure.mdtext Fri Jun 15 07:11:08 2012
@@ -191,7 +191,40 @@ In the next section "Entity Disambiguati
 
 ## Entity Disambiguation
 
-TODO: Work in progress
+Entity Disambiguation is required if an entity detected in the analyzed text can refer to different Entities. The following figure shows an example where "Bob Marley" is detected as a person in the text; however, there are two possible matches within the controlled vocabulary.
+
+
+
+The fact that one Entity detected in the text - represented by a 'fise:TextAnnotation' - may have multiple suggested Entities - represented by the two 'fise:EntityAnnotation's - has a negative impact on [Entity Tagging](#entity-tagging) interfaces that suggest tags based on 'fise:EntityAnnotation's. This is because such an interface would show two suggestions in the above case: (1) for ['dbpedia:Bob_Marley'](http://dbpedia.org/resource/Bob_Marley) and (2) for ['dbpedia:Bob_Marley_(comedian)'](http://dbpedia.org/resource/Bob_Marley_%28comedian%29). So even if the user wants to tag this content with "Bob Marley", they will need to reject at least one of the two suggestions.
+
+Adding explicit support for Entity Disambiguation to an Entity Tagging user interface can solve this problem by grouping suggested Entities by the 'fise:TextAnnotation's they are suggested for.
+
+### Grouping suggested Entities
+
+The goal of the Entity Tagging UI with disambiguation support is to show only a single tag suggestion for all Entities suggested for the same section in the analyzed text. To achieve this we need to follow the link between 'fise:EntityAnnotation' and 'fise:TextAnnotation'.
+
+There are several options for how to achieve this. The option presented here starts by iterating over 'fise:EntityAnnotation's, on the assumption that one wants to improve an existing [Entity Tagging](#entity-tagging) interface.
+
+1. Iterate over all 'fise:EntityAnnotation' instances. This refers to all resources matching "{entity-annotation} rdf:type fise:EntityAnnotation".
+    * For more information on how to collect information for extracted Entities see the [corresponding section](#process-suggested-entities) of the simple [Entity Tagging](#entity-tagging) interface.
+2. Retrieve the 'fise:TextAnnotation'(s) referenced by each processed 'fise:EntityAnnotation'. For that one needs to retrieve the value(s) of the 'dc:relation' property.
+3. While iterating over the 'fise:EntityAnnotation's establish a mapping 'fise:TextAnnotation' -> 'fise:EntityAnnotation', 'fise:EntityAnnotation', ...
+    * The list of 'fise:EntityAnnotation's for each 'fise:TextAnnotation' needs to be sorted by the value of the 'fise:confidence' property of the EntityAnnotation. Ensure that the EntityAnnotation with the highest confidence is first in the list. 'fise:confidence' values are in the range 0..1, where higher numbers represent a higher certainty.
+4. Suggest tags based on 'fise:TextAnnotation's - the keys in the mapping created in step (3).
+    * Allow users to easily accept the Entity with the highest rank - ['dbpedia:Bob_Marley'](http://dbpedia.org/resource/Bob_Marley) in the above example - especially if the confidence of the first suggestion is high (e.g. >= 0.8) and considerably higher than the confidence values of the other options.
+    * Provide users with the possibility to inspect further suggested options - to disambiguate between different options.
+
+### Showing the extraction Context
+
+To allow users to more easily disambiguate between the suggested Entities it is important to provide them with information about the extraction context of the suggested Entities. This is of special importance if the content is not completely visible to the user (e.g. because it is too long to fit on the screen or the content is of a type that cannot be rendered within the browser).
+
+Assuming the suggested Entities are grouped by 'fise:TextAnnotation' - as explained in the above section - one can use the information provided by the TextAnnotation to visualize the context and thereby help the user with the disambiguation task.
+
+The following information of the TextAnnotation can be used for this task:
+
+* 'fise:selection-context': This is the text surrounding the extracted Entity. The exact size of this context depends on the configuration and the EnhancementEngine, but typically it is the current sentence or about 50 characters before and after the selection.
+* 'fise:selected-text': This is the text representing the extracted Entity - the section of the text the Entity was suggested for. The 'fise:selected-text' MUST be contained within the 'fise:selection-context', so user interfaces that want to highlight the selected part of the context can search the selection context for the selected text.
In case of multiple matches it is typically sufficient to highlight all occurrences.
+* 'fise:start' and 'fise:end' values could also be used to determine the location; however, because those offsets are relative to the start of the content, it is typically easier to use the occurrences of the selected text within the selection context.
 
 ## Occurrence based Annotation
 

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png?rev=1350479&view=auto
==============================================================================
Binary file - no diff available.

Propchange: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/es_entitydisambiguation.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream
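
The grouping steps (1) to (4) and the context highlighting described in the added section can be sketched in a few lines. The Python sketch below is illustrative only and makes several assumptions: it does not use the Stanbol API or an RDF library, annotations are represented as plain dictionaries, and the identifiers (`ea-1`, `ta-1`), the confidence values, the sample sentence and the function names `group_by_text_annotation` and `highlight` are all made up for this example.

```python
from collections import defaultdict

# Hypothetical, simplified records. A real client would read these values
# from the RDF enhancement results ('dc:relation', 'fise:confidence', ...).
# The confidence values are invented for illustration.
entity_annotations = [
    {"id": "ea-1",
     "entity": "http://dbpedia.org/resource/Bob_Marley_(comedian)",
     "relation": ["ta-1"],      # values of 'dc:relation'
     "confidence": 0.41},       # value of 'fise:confidence'
    {"id": "ea-2",
     "entity": "http://dbpedia.org/resource/Bob_Marley",
     "relation": ["ta-1"],
     "confidence": 0.92},
]


def group_by_text_annotation(annotations):
    """Map each TextAnnotation id to its EntityAnnotations, best first."""
    groups = defaultdict(list)
    for annotation in annotations:                      # step 1
        for text_annotation in annotation["relation"]:  # step 2
            groups[text_annotation].append(annotation)  # step 3
    for suggestions in groups.values():
        # Highest confidence first, so suggestions[0] is the default tag.
        suggestions.sort(key=lambda a: a["confidence"], reverse=True)
    return dict(groups)


def highlight(selection_context, selected_text, start="[", end="]"):
    """Mark every occurrence of the selected text within its context."""
    if not selected_text:
        return selection_context
    marked = f"{start}{selected_text}{end}"
    return selection_context.replace(selected_text, marked)


if __name__ == "__main__":
    groups = group_by_text_annotation(entity_annotations)
    for text_annotation, suggestions in groups.items():  # step 4
        best = suggestions[0]
        print(text_annotation, "->", best["entity"], best["confidence"])
    print(highlight("He listened to Bob Marley all night.", "Bob Marley"))
```

A user interface built along these lines would offer `suggestions[0]` as the default tag for each group and list the remaining entries as alternatives. Using a plain substring search for highlighting follows the suggestion in the text to rely on the selected text rather than on the 'fise:start' and 'fise:end' offsets.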