[ 
https://issues.apache.org/jira/browse/STANBOL-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796727#comment-13796727
 ] 

Rupert Westenthaler edited comment on STANBOL-1013 at 10/16/13 12:35 PM:
-------------------------------------------------------------------------

This will also remove the dependency of the FstLinkingEngine on the 
EntityLinkingEngine.

As part of the work on the FstLinkingEngine (STANBOL-1128) the Entity Spotting 
part has been made more modular. This should further support work on this issue.


was (Author: rwesten):
This will also remove the dependency of the FstLinkingEngine to the 
EntityLinkingEngine.

As part of the work on the FstLinkingEngine (STANBOL-1128) the Entity Spotting 
part has been made more modular. This should further support this issue.

> Separate (Entity)Spotting and (Entity)Linking
> ---------------------------------------------
>
>                 Key: STANBOL-1013
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1013
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Enhancement Engines, Enhancer
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> Currently the EntityLinking engine performs two major tasks:
> (1) Spotting: detects the words in the analyzed text that should be linked to 
> the controlled Vocabulary. For that, words are categorized as "linkable", 
> "matchable" and "others". Chunks are also considered for this task.
> (2) Linking: creates searches for "linkable" words while considering 
> "matchable" words. Labels of suggested Entities are tokenized and matched 
> against "linkable" and "matchable" words in the text. The 
> EntityLinkingConfiguration is used to configure this task.
> See the documentation of the EntityLinkingEngine [1] for details.
> (1) is configured by using the TextProcessingConfiguration and implemented by 
> the ProcessingState class. (2) is configured by the 
> EntityLinkingConfiguration and implemented by the EntityLinker class.
> Proposed Workplan:
> =====
> 1. clean-up and improve the internal APIs used by the EntityLinking engine
> 2. define a public API for describing Entity Spotting results: Possibilities 
> include
>     * using the metadata of the ContentItems (e.g. fise:TextAnnotations)
>     * annotations in the AnalyzedText contentpart
>     * some additional ContentPart
> 3. Split up (1) and (2) into two separate EnhancementEngines so that
>    * (1) NlpSpottingEngine: Spots potential Entities by using NLP processing 
> results
>    * (2) EntityLinkingEngine: Links Entities of a Controlled Vocabulary based 
> on Spotting results
> 4. Integrate Named Entity Linking into the new Spotting & Linking workflow
>     * By allowing Spotters to annotate spotted Entities with additional 
> metadata (e.g. the type as suggested by NER)
>     * By extending the EntityLinkingEngine to make use of that metadata when 
> searching/matching Entities from linked Vocabularies.
> [1] 
> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
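
The separation of tasks (1) and (2) described in the issue could be sketched
roughly as below. This is only an illustration, not Stanbol code: the class
SpotLinkSketch, the Spot type, the methods spot() and link(), the part-of-speech
rules that decide what counts as "linkable" or "matchable", and the urn:ex:
identifiers are all assumptions made for this example. The real engines work on
the AnalyzedText content part and are driven by the TextProcessingConfiguration
and EntityLinkingConfiguration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these types are part of Apache Stanbol.
class SpotLinkSketch {

    enum Category { LINKABLE, MATCHABLE, OTHER }

    /** A word of the analyzed text together with its spotting category. */
    static final class Spot {
        final String word;
        final Category category;
        Spot(String word, Category category) {
            this.word = word;
            this.category = category;
        }
    }

    /** Task (1): categorize words. Assumed rule: the POS tag decides the category. */
    static List<Spot> spot(String[][] taggedWords) {
        List<Spot> spots = new ArrayList<>();
        for (String[] tw : taggedWords) {
            Category c;
            if ("PROPN".equals(tw[1])) {
                c = Category.LINKABLE;
            } else if ("NOUN".equals(tw[1]) || "ADJ".equals(tw[1])) {
                c = Category.MATCHABLE;
            } else {
                c = Category.OTHER;
            }
            spots.add(new Spot(tw[0], c));
        }
        return spots;
    }

    /**
     * Task (2): consume only the spotting results. An entity is linked when its
     * tokenized label contains at least one linkable word and every label token
     * is a linkable or matchable word of the text.
     */
    static List<String> link(List<Spot> spots, Map<String, String> vocabulary) {
        Set<String> linkable = new HashSet<>();
        Set<String> usable = new HashSet<>();
        for (Spot s : spots) {
            if (s.category == Category.OTHER) {
                continue;
            }
            String w = s.word.toLowerCase(Locale.ROOT);
            usable.add(w);
            if (s.category == Category.LINKABLE) {
                linkable.add(w);
            }
        }
        List<String> linked = new ArrayList<>();
        for (Map.Entry<String, String> e : vocabulary.entrySet()) {
            List<String> tokens =
                Arrays.asList(e.getValue().toLowerCase(Locale.ROOT).split("\\s+"));
            boolean hasLinkable = false;
            for (String t : tokens) {
                if (linkable.contains(t)) {
                    hasLinkable = true;
                }
            }
            if (hasLinkable && usable.containsAll(tokens)) {
                linked.add(e.getKey());
            }
        }
        return linked;
    }

    public static void main(String[] args) {
        String[][] tagged = {
            {"Paris", "PROPN"}, {"is", "VERB"}, {"a", "DET"},
            {"beautiful", "ADJ"}, {"city", "NOUN"}
        };
        Map<String, String> vocabulary = new LinkedHashMap<>();
        vocabulary.put("urn:ex:paris", "Paris");
        vocabulary.put("urn:ex:paris-city", "Paris City");
        vocabulary.put("urn:ex:paris-hilton", "Paris Hilton");
        vocabulary.put("urn:ex:city", "City");
        System.out.println(link(spot(tagged), vocabulary));
    }
}
```

In this sketch link() never looks at the raw text, only at the list of Spot
objects. That is the point of step 3 of the workplan: once the spotting results
have a defined representation, a different spotter (NLP based, FST based, or
NER based) could feed the same linking step.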



--
This message was sent by Atlassian JIRA
(v6.1#6144)
