The select form you're using iterates over UIMA's built-in Annotation index. This index sorts the annotations by 3 criteria:
1) the begin (ascending order)
2) the end (descending order)
3) the type priority

You can use the 3rd criterion to set a preference ordering between two annotations of different types that have the same begin and end. You specify the type priorities as part of the Analysis Engine metadata; see
http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive

-Marshall

On 11/20/2016 9:52 PM, William Colen wrote:
> Hi,
>
> In Portuguese we have contractions, that are words composed by, for
> example, a preposition + article, pronoun or an adverb.
>
> Example:
>
> Nós acreditávamos nele. (We believed him.)
>
> Where "nele" can be divided into "em" + "ele". (in + him)
>
> To properly analyze this, I created two token annotation with the same
> begin and end, but the first I associated with the POS Tag preposition, and
> the second pronoun.
>
> This is especially important when we are doing chunking, because the first
> token will be part of a prepositional phrase, while the second of a nominal
> phrase.
>
> How can I guarantee that when I call UIMAFit JCasUtil.select I will get the
> tokens ordered, first the preposition, second the pronoun?
>
> Thank you,
> William
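
For illustration, the three sort criteria in the reply above can be modelled with a plain Java comparator. This is only a sketch and not UIMA's actual implementation: the Span class, the indexOrder method, and the type names "Sentence", "Preposition" and "Pronoun" are hypothetical stand-ins, and the priority list is passed in directly rather than being read from Analysis Engine metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AnnotationOrderSketch {

    // Hypothetical stand-in for an annotation: a type name plus begin/end offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + "]";
        }
    }

    // Comparator modelling the three criteria. 'priorities' lists type names
    // from highest to lowest priority. In this sketch a type missing from the
    // list gets index -1 and therefore sorts first among equal spans.
    static Comparator<Span> indexOrder(List<String> priorities) {
        return (a, b) -> {
            if (a.begin != b.begin) {
                return Integer.compare(a.begin, b.begin); // 1) begin, ascending
            }
            if (a.end != b.end) {
                return Integer.compare(b.end, a.end);     // 2) end, descending
            }
            // 3) type priority: earlier position in the list wins
            return Integer.compare(priorities.indexOf(a.type),
                                   priorities.indexOf(b.type));
        };
    }

    public static void main(String[] args) {
        // Offsets refer to the sentence "Nós acreditávamos nele."
        List<Span> spans = new ArrayList<>(Arrays.asList(
            new Span("Pronoun", 18, 22),     // "ele" part of "nele"
            new Span("Preposition", 18, 22), // "em" part of "nele"
            new Span("Pronoun", 0, 3),       // "Nós"
            new Span("Sentence", 0, 23)));

        spans.sort(indexOrder(Arrays.asList("Sentence", "Preposition", "Pronoun")));
        System.out.println(spans);
    }
}
```

In this model the two spans covering "nele" tie on begin and end, so only the priority list decides that the preposition comes before the pronoun. This works here because the two spans carry different type names; two spans of the same type with identical offsets would compare as equal.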
