Rupert Westenthaler created STANBOL-740:
-------------------------------------------

             Summary: Adapt the KeywordLinkingEngine to use the AnalyzedText content part
                 Key: STANBOL-740
                 URL: https://issues.apache.org/jira/browse/STANBOL-740
             Project: Stanbol
          Issue Type: Sub-task
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


The KeywordLinkingEngine currently does both NLP processing AND linking against 
the target vocabulary. Until now this was the only option, because the 
limitations of the RDF metadata made it infeasible to separate the two.

With the introduction of the AnalyzedText content part, the NLP processing no 
longer needs to be part of the KeywordLinkingEngine.

This issue covers

* removal of the NLP related functionality from the KeywordLinkingEngine
* reimplementation of the linking part on top of the API provided by the 
AnalyzedText content part
* support for new features of the NLP chain
    * use lemmas - if available - for entity lookup
    * use POS tagset mappings to the OLIA ontology to decide which tokens to 
look up
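
The two new features listed above (lemma-based lookup and POS-based token 
selection) could be sketched roughly as follows. This is only an illustration 
under stated assumptions: the Token class, the LexicalCategory enum and the 
lookupTerms method are hypothetical names invented for this example and are 
NOT the AnalyzedText API; the enum merely stands in for categories obtained 
by mapping a POS tagset to the OLIA ontology.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class TokenSelectionSketch {

    // Hypothetical coarse categories, standing in for OLIA-mapped POS tags.
    enum LexicalCategory { NOUN, PROPER_NOUN, VERB, ADJECTIVE, OTHER }

    // Hypothetical token: surface form, optional lemma, mapped category.
    static final class Token {
        final String text;
        final String lemma; // null if the NLP chain provided no lemma
        final LexicalCategory category;

        Token(String text, String lemma, LexicalCategory category) {
            this.text = text;
            this.lemma = lemma;
            this.category = category;
        }
    }

    // Assumption: only nouns and proper nouns are looked up in the vocabulary.
    static final Set<LexicalCategory> LINKABLE =
            EnumSet.of(LexicalCategory.NOUN, LexicalCategory.PROPER_NOUN);

    // Returns the terms to look up: the lemma if available, else the surface form.
    static List<String> lookupTerms(List<Token> tokens) {
        List<String> terms = new ArrayList<>();
        for (Token token : tokens) {
            if (!LINKABLE.contains(token.category)) {
                continue; // skip tokens whose category is not linkable
            }
            terms.add(token.lemma != null ? token.lemma : token.text);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
                new Token("Universities", "university", LexicalCategory.NOUN),
                new Token("are", "be", LexicalCategory.VERB),
                new Token("Vienna", null, LexicalCategory.PROPER_NOUN));
        System.out.println(lookupTerms(tokens)); // prints [university, Vienna]
    }
}
```

In this sketch the verb is dropped before any lookup happens, and the noun is 
looked up by its lemma rather than its inflected surface form.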

After this change the KeywordLinkingEngine will also be able to work in 
combination with any NLP framework that is integrated with the Stanbol NLP 
components, i.e. that writes its data to the AnalyzedText content part.
