Re: Iterators in CAS

Christian Mauceri Sat, 13 Oct 2007 16:42:22 -0700

Hi Katja,

I'm afraid my answer comes too late, but the nice thing is precisely that. It is what I wanted to show in the chunk of code I sent you: you iterate over all the annotations, and once you detect an NP, all the tokens you detect afterwards belong to this NP until you reach another one. The iterators in UIMA are ordered by position but also by the inclusion hierarchy, so if you iterate over the generic iterator and test the class each element belongs to, you are sure to detect the NP (0,10) in your example first, and the following elements will be of type Token until you find another NP. If your NPs are not contiguous, you can always check token.end <= np.end, but in any case the order is guaranteed.
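
The ordering argument above can be tried out without a UIMA installation. The following is only a sketch in plain Java, not UIMA code: `Span`, `INDEX_ORDER`, `sortLikeIndex` and `tokensByNp` are hypothetical names invented for this illustration. It assumes the annotation index sorts by begin ascending, then end descending, and that a type priority placing NP before Token breaks ties on identical spans (in UIMA, as far as I know, that last tie-break depends on the type priorities declared in the descriptor).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AnnotationOrderSketch {

    // Hypothetical stand-in for a UIMA annotation: a type name plus offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "(" + begin + "," + end + ")";
        }
    }

    // Assumed ordering: begin ascending, then end descending, then NP before Token.
    static final Comparator<Span> INDEX_ORDER =
            Comparator.comparingInt((Span s) -> s.begin)
                    .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed())
                    .thenComparingInt(s -> s.type.equals("NP") ? 0 : 1);

    // Returns a sorted copy; the input list is left untouched.
    static List<Span> sortLikeIndex(List<Span> annotations) {
        List<Span> sorted = new ArrayList<>(annotations);
        sorted.sort(INDEX_ORDER);
        return sorted;
    }

    // Walks the sorted list once and attaches each token to the most
    // recently seen NP, provided the token ends inside that NP.
    static Map<Span, List<Span>> tokensByNp(List<Span> annotations) {
        Map<Span, List<Span>> result = new LinkedHashMap<>();
        Span currentNp = null;
        for (Span a : sortLikeIndex(annotations)) {
            if (a.type.equals("NP")) {
                currentNp = a;
                result.put(a, new ArrayList<>());
            } else if (currentNp != null && a.end <= currentNp.end) {
                result.get(currentNp).add(a);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Span> input = List.of(
                new Span("Token", 6, 10),
                new Span("Token", 0, 5),
                new Span("NP", 0, 10),
                new Span("Token", 11, 14));
        System.out.println(sortLikeIndex(input));
        System.out.println(tokensByNp(input));
    }
}
```

Under these assumptions the NP (0,10) comes out before both of its tokens, and the token (11,14) is left unassigned because it ends after the NP does, which is the token.end <= np.end check mentioned above.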


Ekaterina Buyko wrote:

Hi Christian,

Thank you very much.

What I originally had in mind was a method in UIMA such as:
Sentence[] sentences = token.getOverlapAnnotation(Sentence.type);

But I still have some questions about your proposal:

If you get an iterator over all annotations, it is ok.
Do you know what order the annotations are in?
If I have, for example, the annotations (numbers are the respective begin and end)
NP np (0,10)
Token token1(0,5), token2(6, 10)

Then I get the index. How are they ordered?
np, token1, token2?

And what happens if they have the same span?
NP np (0,5)
Token token1(0,5)

With best regards

Katja



Christian Mauceri wrote:
Hi Ekaterina,
If I understood your question, it is possible, and it is even a nice feature of UIMA. I have more or less the same problem: I have two types of annotations, contexts and forms (sentences and tokens for you). So I have TAEs which mark contexts and forms, and then another TAE (a CAS consumer in my very simple case) which does the following:
       // A context
       TCollocation tc = null;
       // A form
       TForm f = null;

       // I first iterate over all the annotations
       Iterator annot = jcas.getJFSIndexRepository().getAnnotationIndex().iterator();
       while (annot.hasNext()) {
           Annotation a = (Annotation) annot.next();
           // then I test if it is a context (TCollocation) or a form (TForm)
           if (a instanceof TCollocation) {
               tc = (TCollocation) a;
               //System.out.println(tc.getMatch());
           } else if (a instanceof TForm) {
               f = (TForm) a;
           }
       }
That's all. The nice thing is that the iterator respects the position order in the text and the inclusion hierarchy, so you are sure the current form belongs to the current context.
I hope this is helpful and that I did not talk baloney; at least it works fine for me.
Regards.
Christian.


Ekaterina Buyko wrote:
Hi all!
In UIMA 2.1 it is possible to create a sub-iterator in order to iterate over annotations which are within the begin-end span of the selected type.
For example:
AnnotationIndex sentenceIndex = (AnnotationIndex) aJCas
        .getJFSIndexRepository().getAnnotationIndex(Sentence.type);
AnnotationIndex tokenIndex = (AnnotationIndex) aJCas
        .getJFSIndexRepository().getAnnotationIndex(Token.type);

// iterate over Sentences
FSIterator sentenceIterator = sentenceIndex.iterator();
while (sentenceIterator.hasNext()) {
    Sentence sentence = (Sentence) sentenceIterator.next();

    // iterate over the Tokens within this Sentence
    FSIterator tokenIterator = tokenIndex.subiterator(sentence);
    // ...
}
I would like to have more extended functionality. I need to know the annotations which are in the begin-end span of the selected annotation type. These annotations can overlap the span of the selected type.
For example, noun phrases: if I iterate over tokens, I would like to know whether a token is inside a noun phrase or not. For now I am working with Hashtables, but I am looking for another solution.
How could I solve this problem?
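
One way to picture the overlap question, again as a plain-Java sketch and not as a UIMA API: `Span`, `overlaps` and `overlapping` are hypothetical helpers, and the sketch assumes the usual convention that begin is inclusive and end is exclusive. It checks every candidate in turn, so it makes no use of any index ordering.

```java
import java.util.ArrayList;
import java.util.List;

class OverlapSketch {

    // Hypothetical stand-in for an annotation with begin/end offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "(" + begin + "," + end + ")";
        }
    }

    // Two half-open spans overlap when each one starts before the other ends.
    static boolean overlaps(Span a, Span b) {
        return a.begin < b.end && b.begin < a.end;
    }

    // Returns every candidate that overlaps the target, in input order.
    static List<Span> overlapping(Span target, List<Span> candidates) {
        List<Span> hits = new ArrayList<>();
        for (Span c : candidates) {
            if (overlaps(target, c)) {
                hits.add(c);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Span token = new Span("Token", 3, 8);
        List<Span> nounPhrases = List.of(
                new Span("NP", 0, 5),
                new Span("NP", 5, 10),
                new Span("NP", 8, 12));
        System.out.println(overlapping(token, nounPhrases));
    }
}
```

Here the token (3,8) overlaps the first two noun phrases but not (8,12), since that one begins exactly where the token ends.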

Best regards

Ekaterina


--
Cordialement/Regards
Christian Mauceri
http://hermeneute.com/Christian
