Re: [Fwd: Re: Iterators: problem when using standard methods in combination with moveTo*]

Thilo Goetz Thu, 12 Jul 2007 06:59:11 -0700

What is the expected behavior?  Here's the impl:

  public boolean hasNext() {
    return isValid();
  }


  public Object next() {
    Object result = get();
    moveToNext();
    return result;
  }

Perfectly reasonable, but has some consequences
that may not be obvious.  For example, when you
do

FS fs1 = it.next();
FS fs2 = it.get();

then fs1 != fs2.  Is that intuitive?  I don't know.
Is it fixable?  Not easily, no.

We also have this more subtle behavior:

FS fs1 = it.next();
it.moveToPrevious();
FS fs2 = it.next();

The last line may throw a NoSuchElementException.  Why?
Because the first line may invalidate the iterator, and
then moveToPrevious() will not normally make the iterator
valid again (sometimes it will, depending on the iterator
implementation).
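
To make both cases concrete, here is a minimal sketch. SketchIterator is
a made-up stand-in over a plain list, not the UIMA implementation. It
assumes that moveToNext() and moveToPrevious() do nothing once the
iterator is invalid, which is the "normal" case described above; a real
iterator implementation may behave differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for an FSIterator over a fixed list; not UIMA code. */
class SketchIterator<T> {
    private final List<T> items;
    private int pos = 0; // the iterator is valid iff 0 <= pos < items.size()

    SketchIterator(List<T> items) { this.items = items; }

    boolean isValid() { return pos >= 0 && pos < items.size(); }

    T get() {
        if (!isValid()) throw new NoSuchElementException();
        return items.get(pos);
    }

    // Assumption: moving an invalid iterator is a no-op, so it stays invalid.
    void moveToNext() { if (isValid()) pos++; }
    void moveToPrevious() { if (isValid()) pos--; }

    // The java.util.Iterator-style methods, written as in the impl above.
    boolean hasNext() { return isValid(); }

    T next() {
        T result = get();
        moveToNext();
        return result;
    }

    public static void main(String[] args) {
        SketchIterator<String> it = new SketchIterator<>(Arrays.asList("a", "b"));
        String fs1 = it.next(); // returns "a" and advances to "b"
        String fs2 = it.get();  // returns "b", so fs1 != fs2
        System.out.println(fs1 + " " + fs2);

        SketchIterator<String> one = new SketchIterator<>(Arrays.asList("only"));
        one.next();             // steps past the last element: now invalid
        one.moveToPrevious();   // no-op under the assumption above
        System.out.println(one.hasNext()); // false, although an element exists
        try {
            one.next();
        } catch (NoSuchElementException e) {
            System.out.println("NoSuchElementException");
        }
    }
}
```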

So in terms of what is reasonable, the iterators behave
as expected.  Still, because the interactions are so
subtle, it is not a good idea to mix the paradigms.  I
never do, even though I think I understand what's going
on.
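
For a loop like Julien's (quoted below), sticking to the cursor-style
methods could look roughly like this. It is only a sketch: Span and
Cursor are hypothetical stand-ins for the annotation type and for
FSIterator, and collectMatching is a name I made up. The point is that
get() does not advance, so when the first non-matching element is seen
the cursor is still on it and no step back is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CursorOnlySketch {
    /** Hypothetical annotation stand-in: just a begin/end offset pair. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Hypothetical cursor offering FSIterator-like isValid/get/moveToNext. */
    static final class Cursor {
        private final List<Span> items;
        private int pos = 0;
        Cursor(List<Span> items) { this.items = items; }
        boolean isValid() { return pos >= 0 && pos < items.size(); }
        Span get() { return items.get(pos); }
        void moveToNext() { if (isValid()) pos++; }
    }

    /** Collects consecutive spans matching the token's offsets and
     *  leaves the cursor on the first span that does not match. */
    static List<Span> collectMatching(Cursor it, Span token) {
        List<Span> result = new ArrayList<>();
        for (; it.isValid(); it.moveToNext()) {
            Span wf = it.get();
            if (wf.begin != token.begin || wf.end != token.end) {
                break; // cursor still points at wf
            }
            result.add(wf);
        }
        return result;
    }

    public static void main(String[] args) {
        Cursor it = new Cursor(Arrays.asList(
            new Span(0, 3), new Span(0, 3), new Span(4, 9)));
        List<Span> first = collectMatching(it, new Span(0, 3));
        List<Span> second = collectMatching(it, new Span(4, 9));
        // The last element is reached by the second call.
        System.out.println(first.size() + " " + second.size());
    }
}
```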

I'm pretty sure I've documented this before.  I don't know
where that text went.  Maybe I dreamed it.

--Thilo

Marshall Schor wrote:
> Thilo - is this "fixable" - so it just works as users expect?
> 
> -Marshall
> 
> -------- Original Message --------
> Subject:     Re: Iterators: problem when using standard methods in
> combination with moveTo*
> Date:     Thu, 12 Jul 2007 13:33:31 +0200
> From:     Thilo Goetz <[EMAIL PROTECTED]>
> Reply-To:     [EMAIL PROTECTED]
> To:     [EMAIL PROTECTED]
> 
> 
> 
> Hi Julien,
> 
> Julien Nioche wrote:
>> Thilo and Marshall,
>>
>> Thanks for sharing the tip. Indeed it would be a good idea to add this
>> little example to the documentation.
>>
>> A quick comment about the Iterator methods. I had a problem with the
>> following piece of code:
>>
>> while (wordFormIterator.hasNext()) {
>>   WordForm wf = (WordForm) wordFormIterator.next();
>>   if (wf.getBegin() == token.getBegin() && wf.getEnd() == token.getEnd()) {
>>     liste.add(wf);
>>   } else {
>>     // move back
>>     wordFormIterator.moveToPrevious();
>>     return liste;
>>   }
>> }
>>
>> The last element of the iterator was never accessible because
>> /hasNext()/ returned false despite the fact that there WAS an element
>> left in there. /moveToPrevious /had been previously called on this
>> iterator.
>>
>> Should not /hasNext() /return true even if the cursor has been moved
>> forward or backward within the iterator? Or is the use of the legacy
>> methods (hasNext(), next()) incompatible with the /moveTo* /methods?
> 
> hm, I thought this was in our documentation, but couldn't find it myself.
> You should not mix the use of next()/hasNext() with the methods defined
> in the FSIterator interface.  They do not work well together.  If you use
> the FSIterator APIs, you should use them exclusively.  Sorry about that.
> I'll add a comment to the javadocs.
> 
>>
>> Thanks
>>
>> Julien
>>> To be a bit more explicit, here's some code that will determine how
>>> many tokens the longest sentence in the document contains.  It's a
>>> silly example, but it illustrates the concept.  Maybe this should go
>>> in the docs.  Note: I have not actually run this code, it may not
>>> work immediately ;-)
>>>
>>>     CAS cas = ...;
>>>     Type sentenceType = cas.getTypeSystem().getType("yourSentenceTypeName");
>>>     Type tokenType = cas.getTypeSystem().getType("yourTokenTypeName");
>>>     FSIterator sentenceIt = cas.getAnnotationIndex(sentenceType).iterator();
>>>     AnnotationIndex tokenIndex = cas.getAnnotationIndex(tokenType);
>>>     FSIterator tokenIt;
>>>     int maxLen = 0;
>>>     int currentLen;
>>>     for (sentenceIt.moveToFirst(); sentenceIt.isValid(); sentenceIt.moveToNext()) {
>>>       tokenIt = tokenIndex.subiterator((AnnotationFS) sentenceIt.get());
>>>       currentLen = 0;
>>>       for (tokenIt.moveToFirst(); tokenIt.isValid(); tokenIt.moveToNext()) {
>>>         ++currentLen;
>>>       }
>>>       maxLen = ((maxLen < currentLen) ? currentLen : maxLen);
>>>     }
>>>     System.out.println("Longest sentence contains " + maxLen + " tokens.");
>>>
>>> --Thilo
>>>
>>> Marshall Schor wrote:
>>>  
>>>> Did you consider using subIterators?  These are (briefly) described in
>>>> section 4.7.4 of the Apache UIMA Reference book, and may include exactly
>>>> what you're trying to get at - an iterator over elements that are
>>>> "contained" in the span of other elements.
>>>>
>>>> -Marshall
>>>>
>>>> Julien Nioche wrote:
>>>>    
>>>>> Hi,
>>>>>
>>>>> Sorry if someone already asked the question.
>>>>> Is there a direct way to obtain from a Cas all the annotations of a
>>>>> given type located between two positions in the text? Something like
>>>>> getContained(String type,int start,int end)?
>>>>> I am trying to get all the Tokens contained within a specific
>>>>> Sentence. I have used iterators for doing that and compared the offset
>>>>> with those of the Sentence but it is a bit tedious. Have I missed
>>>>> something obvious?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Julien
>>>>>
>>>>>
>>>>>       
>>
> 
> 
>
