[ 
https://issues.apache.org/jira/browse/OPENNLP-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185100#comment-14185100
 ] 

Hugo Mougard commented on OPENNLP-676:
--------------------------------------

The same problem came up in uimaFIT, and it was resolved there by building an 
index (Map<Sentence, Collection<Token>>) instead of walking two lists 
(List<Sentence>, List<Token>) in parallel. I think the end result is the same 
complexity-wise, for both time and space usage.
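
To make the idea concrete, here is a rough sketch of such an index. It is not 
the uimaFIT code and not the attached patch: Span is a hypothetical stand-in 
for a UIMA annotation, CoveredIndexSketch and indexCovered are names chosen 
for this example, and the sentence/token layout in example() is invented (it 
is not read from in.xmi). It assumes both lists are sorted by begin offset and 
that sentences do not overlap.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CoveredIndexSketch {

    /** Hypothetical stand-in for a UIMA annotation: a [begin, end) character span. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Maps each sentence to the tokens it covers.
     * Assumes both lists are sorted by begin offset and sentences do not overlap.
     * Each list is traversed once, so time and space are linear in the input size.
     */
    static Map<Span, List<Span>> indexCovered(List<Span> sentences, List<Span> tokens) {
        Map<Span, List<Span>> index = new LinkedHashMap<>();
        int t = 0;
        for (Span sentence : sentences) {
            List<Span> covered = new ArrayList<>();
            // Skip tokens that start before this sentence; bounds are checked first.
            while (t < tokens.size() && tokens.get(t).begin < sentence.begin) {
                t++;
            }
            // Collect tokens that end inside this sentence; bounds are checked first.
            while (t < tokens.size() && tokens.get(t).end <= sentence.end) {
                covered.add(tokens.get(t));
                t++;
            }
            index.put(sentence, covered);
        }
        return index;
    }

    /** Invented layout: 9 characters, 2 sentences, one token per character. */
    static Map<Span, List<Span>> example() {
        List<Span> sentences = new ArrayList<>();
        sentences.add(new Span(0, 4));
        sentences.add(new Span(4, 9));
        List<Span> tokens = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            tokens.add(new Span(i, i + 1));
        }
        return indexCovered(sentences, tokens);
    }
}
```

Because the index is fully built before anything reads it, the consumer only 
iterates over each sentence's own token list and never has to advance two 
iterators in lockstep.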

> POSTagger UIMA AE broken because of AnnotationComboIterator
> -----------------------------------------------------------
>
>                 Key: OPENNLP-676
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-676
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: POS Tagger, UIMA Integration
>    Affects Versions: tools-1.5.3
>         Environment: Oracle JDK8, Debian Jessie 64b
>            Reporter: Hugo Mougard
>            Assignee: Joern Kottmann
>             Fix For: 1.6.0
>
>
> The AnnotationComboIterator helper class used by the UIMA POSTagger accesses 
> its iterators unsafely.
> The consequence is that the AE breaks even on very simple CASes such as the 
> CAS showcased on this repository (text of 9 letters, 2 sentence annotations 
> and 9 token annotations): 
> https://github.com/m09/postagger-iterator-bug/blob/master/in.xmi
> The repository linked above contains an example program that crashes on my 
> setup. It is a standard Maven 3 project, so it should be easy to build and run.
> Here is a patch that should address the issue: 
> https://raw.githubusercontent.com/m09/postagger-iterator-bug/master/iterator.patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
