[ 
https://issues.apache.org/jira/browse/UIMA-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655521#comment-13655521
 ] 

Richard Eckart de Castilho edited comment on UIMA-2808 at 5/12/13 12:06 PM:
----------------------------------------------------------------------------

The bug was in the implementation of Subiterator.initUnambiguousSubiterator(). 
It internally maintains a list of the annotations within the boundary 
annotation. Before the fix, this list was built as follows:

# the iterator skips over all index annotations that occur before the boundary 
annotation
# if the iterator is still valid, the current annotation is added to the list 
*without checking if it is within the boundary*
# the iterator continues adding all the other annotations within the boundary

Step 2 was done to get a reference annotation for checking overlap. I changed 
the logic so that the reference annotation may be null at the very beginning, 
in which case the overlap check is skipped.
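
The change can be illustrated with a minimal sketch. This is a simplified model and not the actual UIMA source: the names SubiteratorSketch, Span, collectBuggy and collectFixed are hypothetical, annotations are reduced to half-open [begin, end) offsets, the index is assumed to be sorted by begin, and the non-strict boundary test is assumed to be "begin is smaller than the end of the boundary".

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the list-building logic; NOT the real UIMA code.
final class SubiteratorSketch {

    /** Hypothetical stand-in for an annotation: a half-open [begin, end) span. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Old behaviour as described: the first candidate is added unchecked. */
    static List<Span> collectBuggy(List<Span> index, Span boundary) {
        List<Span> result = new ArrayList<>();
        int i = 0;
        // Step 1: skip everything that starts before the boundary.
        while (i < index.size() && index.get(i).begin < boundary.begin) {
            i++;
        }
        if (i >= index.size()) {
            return result;
        }
        // Step 2: take the current annotation as the overlap reference
        // WITHOUT checking that it lies within the boundary.
        Span reference = index.get(i++);
        result.add(reference);
        // Step 3: add the remaining non-overlapping annotations in the boundary.
        while (i < index.size() && index.get(i).begin < boundary.end) {
            Span current = index.get(i++);
            if (current.begin >= reference.end) {
                result.add(current);
                reference = current;
            }
        }
        return result;
    }

    /** Changed behaviour: the reference starts as null, so every candidate is boundary-checked. */
    static List<Span> collectFixed(List<Span> index, Span boundary) {
        List<Span> result = new ArrayList<>();
        int i = 0;
        while (i < index.size() && index.get(i).begin < boundary.begin) {
            i++;
        }
        Span reference = null;
        while (i < index.size() && index.get(i).begin < boundary.end) {
            Span current = index.get(i++);
            // The overlap check is skipped while there is no reference yet.
            if (reference == null || current.begin >= reference.end) {
                result.add(current);
                reference = current;
            }
        }
        return result;
    }
}
```

With the example text from the issue description, the first sentence covers offsets [0, 33) and the value "377" covers [64, 67). In this model, collectBuggy returns the value for the first sentence because step 2 never compares it against the boundary, while collectFixed returns an empty list.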

As far as I can tell, this affects all Apache UIMA releases so far, so I set 
the Affects Version to 2.1.

Test case added.

                
> JCasUtil Subiterator returns annotations which are not within borders of the 
> container (parent) annotation if parameter "strict" is set to "false"
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: UIMA-2808
>                 URL: https://issues.apache.org/jira/browse/UIMA-2808
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.1
>            Reporter: Thomas G.
>            Assignee: Richard Eckart de Castilho
>             Fix For: 2.4.1SDK
>
>         Attachments: JCasUtilSubiteratorUIMATest.zip
>
>
> * JCasUtil Subiterator returns annotations which are not within the border of 
> the container (parent) annotation if parameter "strict" is set to "false"
> * See attached maven project for test setup, java classes, a SIMPLIFIED 
> typesystem and the test CAS xml-file.
> * We have two annotations, "SentenceAnnotation" and "ValueAnnotation". A 
> "SentenceAnnotation" covers a sentence and the "ValueAnnotation" covers a 
> numerical value. 
> * We have the following example plain text:
> ** "This is sentence A with no value. This is sentence B with value 377."
> ** Creates two sentence annotations ("This is sentence A with no value." and 
> "This is sentence B with value 377.") and one value annotation ("377").
> ** Now, if I want to get all "ValueAnnotation" instances within a 
> "SentenceAnnotation", I iterate over each "SentenceAnnotation" and use 
> JCasUtil.iterator(...) to get the ValueAnnotations with the following 
> parameters: 
> JCasUtil.iterator(currentSentence, ValueAnnotation.class, false, false);
> ** As a result, for the first sentence I also get the value of the second 
> sentence. This looks wrong because, even if "strict" is set to "false", 
> the begin of the "ValueAnnotation" should be smaller than the end of the 
> "SentenceAnnotation". In the example given, however, the begin of the 
> "ValueAnnotation" is after the end of the FIRST "SentenceAnnotation".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
