Hi Katja,
I'm afraid my answer comes too late, but the nice thing is precisely
that, and it is what I wanted to show in the chunk of code I sent you.
You iterate over all the annotations; once you detect an NP, all the
tokens you detect afterwards belong to this NP until you reach another
one. The iterators in UIMA are ordered by position but also by the
inclusion hierarchy, so if you iterate over the generic annotation
iterator and test the class each element belongs to, you are sure to
detect np (0,10) first in your example, and the following elements will
be of type Token until you find another NP. If your NPs are not
contiguous, you can always check token.end <= np.end, but in any case
the order is guaranteed.
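
A minimal sketch of this single pass in plain Java, without any UIMA
dependency. `Span`, `NpContainmentSketch` and `enclosingNp` are
hypothetical names invented for the illustration, and the input list is
assumed to arrive already in index order (begin ascending, longer span
first), as the annotation iterator would deliver it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NpContainmentSketch {

    // Hypothetical stand-in for an annotation: a type name plus a span.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "(" + begin + "," + end + ")";
        }
    }

    // Walks spans that are ALREADY in index order and maps every token to
    // the NP that contains it, or to null when no NP contains it.
    static Map<Span, Span> enclosingNp(List<Span> ordered) {
        Map<Span, Span> tokenToNp = new LinkedHashMap<>();
        Span currentNp = null;
        for (Span s : ordered) {
            if (s.type.equals("NP")) {
                // A new NP starts: later tokens are candidates for it.
                currentNp = s;
            } else if (s.type.equals("Token")) {
                // The end check covers NPs that are not contiguous.
                boolean inside = currentNp != null
                        && s.begin >= currentNp.begin
                        && s.end <= currentNp.end;
                tokenToNp.put(s, inside ? currentNp : null);
            }
        }
        return tokenToNp;
    }

    public static void main(String[] args) {
        List<Span> ordered = Arrays.asList(
                new Span("NP", 0, 10),
                new Span("Token", 0, 5),
                new Span("Token", 6, 10),
                new Span("Token", 11, 14));
        System.out.println(enclosingNp(ordered));
    }
}
```

With real UIMA types the two string tests would become `instanceof`
checks, as in the code quoted further down.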
Ekaterina Buyko wrote:
Hi Christian,
Thank you very much.
What I originally had in mind was a method in UIMA such as:
Sentence [] sentence = token.getOverlapAnnotation (Sentence.type);
But I still have some questions about your proposal:
Getting an iterator over all annotations is fine.
Do you know in what order the annotations are returned?
If I have, for example, the annotations (numbers are the respective
begin and end)
NP np (0,10)
Token token1(0,5), token2(6, 10)
Then I get the index. How are they ordered?
np, token1, token2?
And what happens if they have the same span?
NP np (0,5)
Token token1(0,5)
With best regards
Katja
Christian Mauceri schrieb:
Hi Ekaterina,
If I understood your question correctly, it is possible and is even a
nice feature of UIMA. I have more or less the same problem: I have two
types of annotations, contexts and forms (sentences and tokens for
you). So I have TAEs which mark contexts and forms, then I have another
TAE (a CAS consumer in my very simple case) which does the following:
    // A context
    TCollocation tc = null;
    // A form
    TForm f = null;
    // I first iterate over all the annotations
    Iterator annot =
            jcas.getJFSIndexRepository().getAnnotationIndex().iterator();
    while (annot.hasNext()) {
        Annotation a = (Annotation) annot.next();
        // then I test whether it is a context (TCollocation) or a
        // form (TForm)
        if (a instanceof TCollocation) {
            tc = (TCollocation) a;
            //System.out.println(tc.getMatch());
        } else if (a instanceof TForm) {
            f = (TForm) a;
        }
    }
That's all. The nice thing is that the iterator respects the position
order in the text and the inclusion hierarchy, so you are sure the
current form belongs to the current context.
I hope this is helpful and that I did not talk baloney; at least it
works fine for me.
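
As an illustration of that ordering, here is a small sketch in plain
Java (no UIMA classes). It assumes the sort order of UIMA's built-in
annotation index: begin ascending, then end descending so that the
longer span comes first, then type priority as the tie-breaker for
identical spans. `Anno`, `IndexOrderSketch`, `indexOrder` and the
priority list are hypothetical names for this example; in UIMA itself
type priorities are declared in the descriptor, and to my knowledge the
relative order of same-span annotations of different types is not
defined when no priority is declared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class IndexOrderSketch {

    // Hypothetical stand-in for an annotation; not a UIMA class.
    static final class Anno {
        final String type;
        final int begin;
        final int end;

        Anno(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "(" + begin + "," + end + ")";
        }
    }

    // Assumed ordering: begin ascending, end descending, then the position
    // of the type in typePriority (earlier in the list sorts first).
    // Every type used is assumed to appear in typePriority.
    static Comparator<Anno> indexOrder(List<String> typePriority) {
        return (a, b) -> {
            if (a.begin != b.begin) {
                return Integer.compare(a.begin, b.begin);
            }
            if (a.end != b.end) {
                return Integer.compare(b.end, a.end);
            }
            return Integer.compare(typePriority.indexOf(a.type),
                                   typePriority.indexOf(b.type));
        };
    }

    // Returns the labels of the annotations in the assumed index order.
    static List<String> sortedLabels(List<Anno> annos, List<String> typePriority) {
        List<Anno> copy = new ArrayList<>(annos);
        copy.sort(indexOrder(typePriority));
        List<String> labels = new ArrayList<>();
        for (Anno a : copy) {
            labels.add(a.toString());
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String> priority = Arrays.asList("NP", "Token");
        List<Anno> annos = Arrays.asList(
                new Anno("Token", 6, 10),
                new Anno("Token", 0, 5),
                new Anno("NP", 0, 10));
        System.out.println(sortedLabels(annos, priority));
    }
}
```

Under these assumptions, the same-span case np (0,5) / token1 (0,5) is
decided purely by the priority list: whichever type is listed first
comes out first.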
Regards.
Christian.
Ekaterina Buyko wrote:
Hi all!
In UIMA 2.1 it is possible to create a sub-iterator in order to
iterate over annotations which are within the begin-end span of the
selected type.
For example:
    AnnotationIndex sentenceIndex = (AnnotationIndex) aJCas
            .getJFSIndexRepository().getAnnotationIndex(Sentence.type);
    AnnotationIndex tokenIndex = (AnnotationIndex) aJCas
            .getJFSIndexRepository().getAnnotationIndex(Token.type);
    // iterate over Sentences
    FSIterator sentenceIterator = sentenceIndex.iterator();
    while (sentenceIterator.hasNext()) {
        Sentence sentence = (Sentence) sentenceIterator.next();
        // iterate over the Tokens within this Sentence
        FSIterator tokenIterator = tokenIndex.subiterator(sentence);
        while (tokenIterator.hasNext()) {
            Token token = (Token) tokenIterator.next();
            // work with the token here
        }
    }
I would like more extensive functionality: I need to know which
annotations lie within the begin-end span of a selected annotation,
including annotations that only overlap that span.
Take noun phrases, for example: when I iterate over tokens, I would
like to know whether each token is inside a noun phrase or not. At the
moment I am working with Hashtables, but I am looking for another
solution.
How could I solve this problem?
Best regards
Ekaterina
--
Cordialement/Regards
Christian Mauceri
http://hermeneute.com/Christian