[ https://issues.apache.org/jira/browse/STANBOL-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rupert Westenthaler resolved STANBOL-1251.
------------------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
A first working version of the Engine is available in trunk (1.0.0-SNAPSHOT)
and in the 0.12 branch. Further improvements (see the TODO comments in the
engine) should be handled in their own issues.
> Pos tag based Phrase extraction Engine
> --------------------------------------
>
> Key: STANBOL-1251
> URL: https://issues.apache.org/jira/browse/STANBOL-1251
> Project: Stanbol
> Issue Type: New Feature
> Components: Enhancement Engines
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Fix For: 0.12.0
>
>
> Implement an Enhancement Engine that uses POS tags to extract Noun and Verb
> Phrases.
> In Stanbol, POS annotations can be aligned to concepts of the OLIA ontology
> (see the documentation at [1] for details). This alignment allows engines to
> determine the lexical categories of tokens in the text in a
> language-independent way.
> The Pos-Chunker Engine will use those lexical categories to extract Noun and
> Verb phrases according to the following rules:
> ### Noun Phrases
> * start: noun, pronoun, determiner, adjective
> * continuation: noun, adposition, adjective, punctuation
> * end: noun, pronoun, determiner, adjective
> * required: noun
> ### Verb Phrases
> * start: verb, adverb
> * continuation: verb, adverb, punctuation
> * end: verb, adverb
> * required: verb
> The engine will allow the processed languages to be configured (e.g. to
> deactivate it for languages where other chunkers are available).
> The EnhancementEngine ordering will be ServiceProperties.ORDERING_NLP_CHUNK.
> The current plan is to make this engine available in the 0.12 branch as well.
> [1]
> http://stanbol.staging.apache.org/docs/trunk/components/enhancer/nlp/nlpannotations
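
The noun and verb phrase rules quoted above can be illustrated with a small, self-contained Java sketch. It is not the engine's code and does not use the Stanbol API: the `Cat` enum, the `Rule` class and the `chunk` method are hypothetical stand-ins, and the scanning strategy (greedy extension, then trimming back to a valid end token) is only one possible reading of the rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class PosChunkerSketch {

    /** Hypothetical stand-in for the OLIA-aligned lexical categories. */
    public enum Cat { NOUN, PRONOUN, DETERMINER, ADJECTIVE, ADPOSITION,
                      PUNCTUATION, VERB, ADVERB, OTHER }

    /** One phrase rule: allowed start, continuation, end and required categories. */
    public static final class Rule {
        final Set<Cat> start, cont, end, required;
        Rule(Set<Cat> start, Set<Cat> cont, Set<Cat> end, Set<Cat> required) {
            this.start = start; this.cont = cont; this.end = end; this.required = required;
        }
    }

    // The two rule sets as listed in the issue description.
    public static final Rule NOUN_PHRASE = new Rule(
        EnumSet.of(Cat.NOUN, Cat.PRONOUN, Cat.DETERMINER, Cat.ADJECTIVE),
        EnumSet.of(Cat.NOUN, Cat.ADPOSITION, Cat.ADJECTIVE, Cat.PUNCTUATION),
        EnumSet.of(Cat.NOUN, Cat.PRONOUN, Cat.DETERMINER, Cat.ADJECTIVE),
        EnumSet.of(Cat.NOUN));

    public static final Rule VERB_PHRASE = new Rule(
        EnumSet.of(Cat.VERB, Cat.ADVERB),
        EnumSet.of(Cat.VERB, Cat.ADVERB, Cat.PUNCTUATION),
        EnumSet.of(Cat.VERB, Cat.ADVERB),
        EnumSet.of(Cat.VERB));

    /** Returns {start, endInclusive} token index pairs of phrases matching the rule. */
    public static List<int[]> chunk(List<Cat> tokens, Rule rule) {
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (!rule.start.contains(tokens.get(i))) { i++; continue; }
            // Greedily extend the candidate over continuation tokens.
            int j = i + 1;
            while (j < tokens.size() && rule.cont.contains(tokens.get(j))) j++;
            // Trim trailing tokens that may not end a phrase (e.g. punctuation).
            int last = j - 1;
            while (last > i && !rule.end.contains(tokens.get(last))) last--;
            boolean validEnd = rule.end.contains(tokens.get(last));
            // The phrase must contain at least one required category.
            boolean hasRequired = false;
            for (int k = i; k <= last; k++) {
                if (rule.required.contains(tokens.get(k))) hasRequired = true;
            }
            if (validEnd && hasRequired) {
                spans.add(new int[] { i, last });
                i = last + 1;   // continue after the accepted phrase
            } else {
                i++;            // reject the candidate and try the next token
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // "The quick fox jumps over the lazy dog ."
        List<Cat> tokens = Arrays.asList(Cat.DETERMINER, Cat.ADJECTIVE, Cat.NOUN,
            Cat.VERB, Cat.ADPOSITION, Cat.DETERMINER, Cat.ADJECTIVE, Cat.NOUN,
            Cat.PUNCTUATION);
        for (int[] s : chunk(tokens, NOUN_PHRASE)) {
            System.out.println("NP " + s[0] + "-" + s[1]);  // NP 0-2, then NP 5-7
        }
        for (int[] s : chunk(tokens, VERB_PHRASE)) {
            System.out.println("VP " + s[0] + "-" + s[1]);  // VP 3-3
        }
    }
}
```

In this reading a single noun or verb already forms a phrase. Whether the engine requires a minimum phrase length is not stated in the description.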