On Aug 9, 2012, at 9:18 AM, Chen, Pei wrote:
> To get all the BaseTokens for a particular sentence, if we use the 
> .subiterator, the types have to be stored in the FS indexes in a certain 
> order; otherwise it could just return an empty list.  This would require the 
> users of annotators to understand the ordering of types and have it 
> preconfigured.
> 
> FSIterator<Annotation> tokensInSentenceIterator = 
> jcas.getAnnotationIndex(BaseToken.type).subiterator(sentence);
> 
> uimaFIT already created a convenience method that seems to do something 
> similar which will always return the expected tokens.  Does anyone know if 
> this was part of the motivation?

Yes, avoiding subiterators was exactly the motivation. Our experience in 
uimaFIT was that subiterators never did what you wanted them to do.
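
For the case above, the uimaFIT equivalent would be something like 
JCasUtil.selectCovered(jcas, BaseToken.class, sentence), which returns a 
List<BaseToken>. The idea behind it can be sketched without UIMA at all: keep 
every candidate whose offsets fall inside the covering span, so the result 
depends only on begin/end offsets and not on any configured ordering of types. 
The class and method names below (CoveredSelection, Span, selectCovered) are 
made up for illustration; this is not uimaFIT's actual implementation, which 
works against the CAS annotation index.

```java
import java.util.ArrayList;
import java.util.List;

public class CoveredSelection {

    /** Hypothetical stand-in for an annotation: just a pair of character offsets. */
    public static final class Span {
        public final int begin;
        public final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Returns the candidates that lie entirely within the covering span,
     * in the order they were given. Only offsets are compared, so no
     * type ordering has to be configured.
     */
    public static List<Span> selectCovered(List<Span> candidates, Span cover) {
        List<Span> covered = new ArrayList<>();
        for (Span s : candidates) {
            if (s.begin >= cover.begin && s.end <= cover.end) {
                covered.add(s);
            }
        }
        return covered;
    }
}
```

With tokens at offsets (0,4), (5,9) and (10,15) and a sentence spanning (0,9), 
this sketch keeps the first two tokens and drops the third.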

> Is the performance hit (if any) worth the ease of use?

I doubt there's a performance hit. Take a look at the source for 
JCasUtil.selectCovered vs. org.apache.uima.cas.impl.Subiterator. If anything, 
selectCovered is probably doing less.

But of course you could time it and find out for sure.

Steve
