Marshall Schor wrote:
> 
> Thilo Goetz wrote:
>> Adam Lally wrote:
>>   
>>> On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schor<m...@schor.com> wrote:
>>>     
>>>> Adam Lally wrote:
>>>>       
>>>>> On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmann<kottm...@gmail.com> wrote:
>>>>>
>>>>>         
>>>>>> Marshall Schor wrote:
>>>>>>
>>>>>>           
>>>>>>> The generification of FSIndex currently specifies one type, <T extends
>>>>>>> FeatureStructure> that is the type of item being returned.
>>>>>>>
>>>>>>>
>>>>>>> The contains and find methods have arguments of type FeatureStructure.
>>>>>>> These could be changed to take type "T".
>>>>>>>
>>>>>>>
>>>>>>>             
>>>>>> No I do not think that they could be changed to take type T.
>>>>>> Lets take the case of the contains method.
>>>>>> The javadoc says:
>>>>>> "Check if the index contains an element equal to the given feature 
>>>>>> structure
>>>>>> according to the
>>>>>> ordering of the index. Note that this is in general not the same as 
>>>>>> feature
>>>>>> structure identity."
>>>>>> and it for the param fs it says "The FS we're looking for.". There is no
>>>>>> place where
>>>>>> it says that contains can only be called for FSes which have the type of 
>>>>>> the
>>>>>> index.
>>>>>>
>>>>>> The change of the parameter from FeatureStructure to T would also change
>>>>>> the contract of the method a little, because then it would not be 
>>>>>> possible
>>>>>> anymore
>>>>>> to pass a FeatureStructure which has not type T.
>>>>>>
>>>>>>
>>>>>>           
>>>>> I agree.  It's sometimes useful to call FSIterator.moveTo method and
>>>>> pass an FS of a Type other than the one that the index was defined
>>>>> over, as part of implementing something like a subiterator.
>>>>>
>>>>>         
>>>> I agree with the case where it's the "bag" index being used, because
>>>> that uses a test which works on all feature structures.
>>>>
>>>> However, for Set and Sorted, the implication of passing a FS which is
>>>> not in the type hierarchy is, according to the JavaDocs, "undefined".
>>>> This is because the code assumes the layout of the features and their
>>>> values is appropriate for the type.  In other words, if the type of some
>>>> key was a string, it might take a value from the main int heap and use
>>>> it as an index into the string array - and if the int heap object was
>>>> not the right type, it could pull an arbitrary value from that slot, and
>>>> end up throwing an array index out of bounds exception.  When I looked,
>>>> it didn't appear to me that the code checked for any kind of type
>>>> subsumption before proceeding... (but I may have missed it...)
>>>>
>>>> It could turn out that the data (whatever is being pulled) would just
>>>> happen to "match", even though the types are different.
>>>>
>>>> Because this is an undefined operation that could throw various kinds of
>>>> runtime errors, or return an equal match where none really exists, I
>>>> think it should not be allowed, for set and sorted indexes.
>>>>
>>>>       
>>> In the case of the AnnotationIndex, the object that you pass to
>>> FSIterator.moveTo must be a subtype of Annotation (else you would get
>>> all the weird effects that you describe).  But it is still valid for a
>>> user to do:
>>> FSIndex<AnnotType1> index = cas.getAnnotationIndex(annotType1);
>>> index.moveTo(annotType2);
>>>
>>> where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.
>>>
>>> In general, the object that you pass to moveTo() must be a subtype of
>>> the type that was in the index definition (in the user's descriptor,
>>> or for the case of the built-in AnnotationIndex,
>>> uima.tcas.Annotation).
>>>
>>>   -Adam
>>>     
>> One concrete example of Adam's point: suppose you have a sentenceFS
>> and a tokenIterator.  Then tokenIterator.moveTo(sentenceFS) will
>> position the token iterator at the first word in the sentence (modulo
>> some subtleties that are beside the point here).  Very useful.
>>   
> 
> Yes.  This works because the "index" being iterated over is the general
> annotation index (the one that's built-it) - and the presumeption is
> that "token" and "sentenceFS" are both subtypes of AnnotationFS.
> 
> Is there any reason *not* to add a check to the various methods in
> indexing that take one or more Feature Structures, to see if they are
> being passed a subtype of the type being indexed (except for bag indexes)?

Sure.  Suppose B < A, and we have an index on B, but the ordering
relation is defined in terms of features that B inherits from
           A.
Then I can use an A to position iterators on the B index.  This
works now, and I don't see why we should prohibit it.


> -Marshall
> 
> 
>> --Thilo
>>
>>
>>
>>   

Reply via email to