[ 
https://issues.apache.org/jira/browse/STANBOL-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Christ updated STANBOL-740:
----------------------------------

    Component/s: Engine - Keyword Extractor
    
> Adopt the KeywordLinkingEngine to use the AnalyzedText content part
> -------------------------------------------------------------------
>
>                 Key: STANBOL-740
>                 URL: https://issues.apache.org/jira/browse/STANBOL-740
>             Project: Stanbol
>          Issue Type: Sub-task
>          Components: Engine - Keyword Extractor
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> The KeywordLinkingEngine currently does both NLP processing AND linking 
> against the target vocabulary. Up to now this was the only possibility as 
> separating those two things was not feasible with the limitations of the RDF 
> metadata.
> With the introduction of the AnalyzedText content part, NLP processing 
> no longer needs to be part of the KeywordLinkingEngine.
> This issue covers:
> * removal of the NLP related functionality from the KeywordLinkingEngine
> * reimplementation of the linking part on top of the API provided by the 
> AnalyzedText content part
> * support for new features of the NLP chain
>     * use lemmas - if available - for entity lookup
>     * use POS tagset mappings to the OLIA ontology to decide which tokens to 
> look up
> After this change the KeywordLinkingEngine will also be able to work in 
> combination with any NLP framework that is integrated with the Stanbol NLP 
> components (that is, any framework that writes its data to the AnalyzedText 
> content part).
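
The two new features listed in the issue (prefer lemmas for lookup, and use
POS information to decide which tokens to look up) can be illustrated with a
minimal, self-contained sketch. This is not the Stanbol API: the class
LinkingSketch, the Token type, the LexicalCategory enum and the choice to link
only nouns are all hypothetical and exist purely for illustration. In the real
engine the categories would presumably come from the POS tagset mappings to
the OLIA ontology, and the tokens from the AnalyzedText content part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class LinkingSketch {

    /** Hypothetical coarse categories, standing in for OLIA-mapped POS tags. */
    enum LexicalCategory { NOUN, VERB, ADJECTIVE, OTHER }

    /** Hypothetical token: surface text, optional lemma (may be null), category. */
    static final class Token {
        final String text;
        final String lemma;
        final LexicalCategory category;

        Token(String text, String lemma, LexicalCategory category) {
            this.text = text;
            this.lemma = lemma;
            this.category = category;
        }
    }

    /** Assumption for this sketch: only nouns are looked up in the vocabulary. */
    static final Set<LexicalCategory> LINKABLE = EnumSet.of(LexicalCategory.NOUN);

    /** Returns the terms to look up: the lemma if available, else the surface text. */
    static List<String> lookupTerms(List<Token> tokens) {
        List<String> terms = new ArrayList<>();
        for (Token token : tokens) {
            if (!LINKABLE.contains(token.category)) {
                continue; // skip tokens whose category is not configured for linking
            }
            terms.add(token.lemma != null ? token.lemma : token.text);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
                new Token("The", null, LexicalCategory.OTHER),
                new Token("engines", "engine", LexicalCategory.NOUN),
                new Token("link", "link", LexicalCategory.VERB),
                new Token("Stanbol", null, LexicalCategory.NOUN));
        System.out.println(lookupTerms(tokens)); // prints [engine, Stanbol]
    }
}
```

Because the sketch only consumes tokens, lemmas and categories, it does not
depend on which NLP framework produced them, which is the decoupling the issue
describes.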
