[jira] [Resolved] (STANBOL-1251) Pos tag based Phrase extraction Engine

Rupert Westenthaler (JIRA) Wed, 22 Jan 2014 01:58:14 -0800

     [ 
https://issues.apache.org/jira/browse/STANBOL-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rupert Westenthaler resolved STANBOL-1251.
------------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.12.0

A first working version of the Engine is available in trunk (1.0.0-SNAPSHOT) 
and the 0.12 branch. Further improvements (see TODO comments in the engine) 
should be done in their own issues.

> Pos tag based Phrase extraction Engine
> --------------------------------------
>
>                 Key: STANBOL-1251
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1251
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancement Engines
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>             Fix For: 0.12.0
>
>
> Implement an Enhancement Engine that uses POS tags to extract Noun and Verb 
> Phrases.
> In Stanbol, POS annotations can be aligned to concepts of the OLIA ontology 
> (see documentation at [1] for detailed information). This alignment allows 
> engines to determine the lexical categories of tokens in the text in a 
> language-independent way.
> The Pos-Chunker Engine will use those lexical categories of tokens to extract 
> Noun and Verb phrases by using the following rules:
> ### Noun Phrases
> * start: noun, pronoun, determiners, adjectives
> * continuation: nouns, adpositions, adjectives, punctuation
> * end: noun, pronoun, determiners, adjectives
> * required: noun
> ### Verb Phrases
> * start: verb, adverb
> * continuation: verb, adverb, punctuation
> * end: verb, adverb
> * required: verb
> This engine will allow the processed languages to be configured (e.g. to 
> deactivate it for languages where other chunkers are available).
> The EnhancementEngine ordering will be ServiceProperties.ORDERING_NLP_CHUNK.
> The current plan is to make this engine also available in the 0.12 branch.
> [1] 
> http://stanbol.staging.apache.org/docs/trunk/components/enhancer/nlp/nlpannotations
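
The start/continuation/end/required rules quoted above could be read as a 
simple scan over the lexical categories of the tokens. The following is only a 
minimal sketch of one such reading, not the engine's actual code: the class 
PosChunkerSketch, the Cat enum and the Rule type are hypothetical names made up 
for illustration, and how the real engine trims or rejects candidate phrases 
may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class PosChunkerSketch {

    /** Hypothetical lexical categories (the real engine works on OLIA-aligned ones). */
    enum Cat { NOUN, PRONOUN, DETERMINER, ADJECTIVE, ADPOSITION, VERB, ADVERB, PUNCTUATION, OTHER }

    /** One phrase type as listed in the issue: start, continuation, end, required. */
    static final class Rule {
        final Set<Cat> start, continuation, end, required;
        Rule(Set<Cat> start, Set<Cat> continuation, Set<Cat> end, Set<Cat> required) {
            this.start = start;
            this.continuation = continuation;
            this.end = end;
            this.required = required;
        }
    }

    static final Rule NOUN_PHRASE = new Rule(
            EnumSet.of(Cat.NOUN, Cat.PRONOUN, Cat.DETERMINER, Cat.ADJECTIVE),
            EnumSet.of(Cat.NOUN, Cat.ADPOSITION, Cat.ADJECTIVE, Cat.PUNCTUATION),
            EnumSet.of(Cat.NOUN, Cat.PRONOUN, Cat.DETERMINER, Cat.ADJECTIVE),
            EnumSet.of(Cat.NOUN));

    static final Rule VERB_PHRASE = new Rule(
            EnumSet.of(Cat.VERB, Cat.ADVERB),
            EnumSet.of(Cat.VERB, Cat.ADVERB, Cat.PUNCTUATION),
            EnumSet.of(Cat.VERB, Cat.ADVERB),
            EnumSet.of(Cat.VERB));

    /** Returns {startIndex, endIndexExclusive} pairs for every phrase matching the rule. */
    static List<int[]> chunk(List<Cat> tags, Rule rule) {
        List<int[]> phrases = new ArrayList<>();
        int i = 0;
        while (i < tags.size()) {
            if (!rule.start.contains(tags.get(i))) {
                i++;
                continue;
            }
            // Extend the candidate while the following tokens are valid continuations.
            int j = i + 1;
            while (j < tags.size() && rule.continuation.contains(tags.get(j))) {
                j++;
            }
            // Assumption: trailing tokens that may not end a phrase are trimmed off.
            while (j > i && !rule.end.contains(tags.get(j - 1))) {
                j--;
            }
            // Keep the candidate only if it contains a required category.
            boolean hasRequired = false;
            for (int k = i; k < j; k++) {
                if (rule.required.contains(tags.get(k))) {
                    hasRequired = true;
                }
            }
            if (hasRequired) {
                phrases.add(new int[] { i, j });
            }
            // Resume after the candidate; trimmed tokens are looked at again.
            i = Math.max(j, i + 1);
        }
        return phrases;
    }

    public static void main(String[] args) {
        // "The quick fox jumps quickly over the lazy dog ."
        List<Cat> tags = Arrays.asList(
                Cat.DETERMINER, Cat.ADJECTIVE, Cat.NOUN, Cat.VERB, Cat.ADVERB,
                Cat.ADPOSITION, Cat.DETERMINER, Cat.ADJECTIVE, Cat.NOUN, Cat.PUNCTUATION);
        for (int[] span : chunk(tags, NOUN_PHRASE)) {
            System.out.println("NP " + Arrays.toString(span));
        }
        for (int[] span : chunk(tags, VERB_PHRASE)) {
            System.out.println("VP " + Arrays.toString(span));
        }
    }
}
```

In this sketch the trailing punctuation is accepted as a continuation but then 
trimmed, because punctuation is not listed as a valid end category. Keeping the 
four category sets in one Rule object means the noun and verb cases share the 
same scan and differ only in their configuration.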



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
