UIMA POSTaggerTrainer wrongly parses token annotations
------------------------------------------------------

                 Key: OPENNLP-204
                 URL: https://issues.apache.org/jira/browse/OPENNLP-204
             Project: OpenNLP
          Issue Type: Bug
          Components: POS Tagger, UIMA Integration
    Affects Versions: tools-1.5.1-incubating
            Reporter: Nicolas Hernandez
             Fix For: tools-1.5.2-incubating


This bug affects the opennlp-uima package, in particular the
opennlp/uima/postag/POSTaggerTrainer.java class.

This analysis engine (AE) is expected to parse token annotations and build two
data structures: an array of the tokens' covered texts and an array of the
associated tags (the tags are read from a feature structure path set as a
parameter).

In practice, the tag value of the current token is wrongly added to the token
array instead of the tag array.

This is easily fixed by changing the name of the data structure from
`tokens` to `tags` at line 200.
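
To illustrate the mistake and the one-word fix, here is a minimal standalone
sketch. It is not the actual POSTaggerTrainer code: the `Token` class, the
`collectBuggy` and `collectFixed` methods and the sample sentence are all
hypothetical stand-ins, and no UIMA types are used. It only assumes that the
trainer collects covered texts and tags into two parallel lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names do not come from the OpenNLP sources.
class TagCollectionSketch {

    /** Stand-in for a UIMA token annotation carrying a tag feature. */
    static final class Token {
        final String coveredText;
        final String tag;

        Token(String coveredText, String tag) {
            this.coveredText = coveredText;
            this.tag = tag;
        }
    }

    /** Reproduces the reported behaviour: the tag lands in the token list. */
    static List<List<String>> collectBuggy(List<Token> sentence) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (Token t : sentence) {
            tokens.add(t.coveredText);
            tokens.add(t.tag); // wrong list: should be tags
        }
        return Arrays.asList(tokens, tags);
    }

    /** Same loop with the proposed rename applied. */
    static List<List<String>> collectFixed(List<Token> sentence) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (Token t : sentence) {
            tokens.add(t.coveredText);
            tags.add(t.tag); // tag goes to the tag list
        }
        return Arrays.asList(tokens, tags);
    }

    public static void main(String[] args) {
        List<Token> sentence = Arrays.asList(
                new Token("The", "DT"),
                new Token("dog", "NN"));

        // prints [[The, DT, dog, NN], []]
        System.out.println(collectBuggy(sentence));
        // prints [[The, dog], [DT, NN]]
        System.out.println(collectFixed(sentence));
    }
}
```

With the buggy variant the token list is twice as long as expected and the tag
list stays empty, so the two lists can no longer be aligned for training.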

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
