[ https://issues.apache.org/jira/browse/UIMA-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Eckart de Castilho updated UIMA-3075:
---------------------------------------------
    Affects Version/s:     (was: 2.4.0C)
                       2.4.0SDK
             Assignee: Richard Eckart de Castilho

> Unambiguous non-strict subiterator may return annotations outside the given annotation's range
> ----------------------------------------------------------------------------------------------
>
>                 Key: UIMA-3075
>                 URL: https://issues.apache.org/jira/browse/UIMA-3075
>             Project: UIMA
>          Issue Type: Bug
>    Affects Versions: 2.4.0SDK
>            Reporter: Alexander N Thomas
>            Assignee: Richard Eckart de Castilho
>            Priority: Minor
>
> REPRO: using a tokenizer that matches on "[^ ]" on "aaa bbb ccc ddd" I get
> four token annotations
> "aaa" 0-3
> "bbb" 4-7
> "ccc" 8-11
> "ddd" 12-15
> I then iterate over the token annotations while printing the covered text,
> begin and end, make an unambiguous non-strict subiterator, and iterate over
> the subiterations printing out their covered text, begin and end all indented.
>     Iterator<Annotation> iter = jcas.getAnnotationIndex(Token.type).iterator();
>     while (iter.hasNext()) {
>         Annotation a = iter.next();
>         System.out.println("\"" + a.getCoveredText() + "\""
>                 + " [" + a.getBegin() + ", " + a.getEnd() + ")");
>         Iterator<Annotation> featIter =
>                 jcas.getAnnotationIndex().subiterator(a, false, false);
>         while (featIter.hasNext()) {
>             Annotation b = featIter.next();
>             System.out.println("\t\"" + b.getCoveredText() + "\""
>                     + " [" + b.getBegin() + ", " + b.getEnd() + ")");
>         }
>     }
> The output is
> "aaa" [0, 3)
>     "bbb" [4, 7)
> "bbb" [4, 7)
>     "ccc" [8, 11)
> "ccc" [8, 11)
>     "ddd" [12, 15)
> "ddd" [12, 15)
> I think this can be fixed by adding an extra check at Subiterator.java ln: 127
> NOW
>     while (it.isValid() && ((start > annot.getBegin())
>             || (strict && annot.getEnd() > end))) {
>         it.moveToNext();
>     }
> POSSIBLE FIX
>     while (it.isValid() && ((start > annot.getBegin() && annot.getBegin() <= end)
>             || (strict && annot.getEnd() > end))) {
>         it.moveToNext();
>     }
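
For reference, the behaviour the report expects can be modelled without UIMA at all. The sketch below is only an illustration under stated assumptions: `Span`, `sample()` and `within()` are hypothetical names, annotations are treated as half-open `[begin, end)` spans sorted by begin offset, and the parent annotation itself is assumed not to be returned. It is not the code in Subiterator.java and not a proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

public class SubiteratorRangeSketch {

    /** Hypothetical stand-in for an annotation: a half-open character span [begin, end). */
    public static final class Span {
        public final String text;
        public final int begin;
        public final int end;

        public Span(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }
    }

    /** The four tokens from the report, sorted by begin offset. */
    public static List<Span> sample() {
        List<Span> tokens = new ArrayList<>();
        tokens.add(new Span("aaa", 0, 3));
        tokens.add(new Span("bbb", 4, 7));
        tokens.add(new Span("ccc", 8, 11));
        tokens.add(new Span("ddd", 12, 15));
        return tokens;
    }

    /**
     * Hypothetical range filter (an assumption, not UIMA's implementation).
     * Non-strict: keep spans that begin inside the parent's span.
     * Strict: additionally require that they end inside it.
     */
    public static List<Span> within(Span parent, List<Span> sorted, boolean strict) {
        List<Span> out = new ArrayList<>();
        for (Span s : sorted) {
            if (s == parent) continue;               // assumed: the parent itself is not returned
            if (s.begin < parent.begin) continue;    // starts before the parent's range
            if (s.begin >= parent.end) break;        // sorted input: nothing later can be in range
            if (strict && s.end > parent.end) continue;
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> tokens = sample();
        for (Span a : tokens) {
            System.out.println("\"" + a.text + "\" [" + a.begin + ", " + a.end + ")");
            // Print anything found inside the token's range, indented as in the report.
            for (Span b : within(a, tokens, false)) {
                System.out.println("\t\"" + b.text + "\" [" + b.begin + ", " + b.end + ")");
            }
        }
    }
}
```

Under these assumptions no token lies inside another token's range, so `main` prints only the four unindented token lines, which is the output the reporter appears to expect from the subiterator.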