The form of select you're using iterates over UIMA's built-in Annotation index.
This index sorts annotations by three criteria, applied in this order:

1) the begin (ascending order)

2) the end (descending order)

3) the type priority

You can use the third criterion to set a preferred ordering between two
annotations of different types that have the same begin and end.
You specify type priorities as part of the Analysis Engine metadata; see
http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive
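
To make the ordering concrete, here is a small plain-Java sketch of the three
criteria. It is only a model of the comparison described above and does not
use UIMA itself: the Span class, the type names (Sentence, Token, Preposition,
Pronoun) and the character offsets are all made up for illustration, and it
assumes the preposition and the pronoun are annotated with two different
types, since type priorities only distinguish annotations of different types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AnnotationOrderSketch {

    // Hypothetical stand-in for an annotation: a type name plus offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Compares two spans by begin (ascending), then end (descending),
    // then by position in the priority list (earlier = higher priority).
    // Assumption of this sketch: every type appears in the priority list.
    static int compare(Span a, Span b, List<String> priority) {
        if (a.begin != b.begin) {
            return Integer.compare(a.begin, b.begin);
        }
        if (a.end != b.end) {
            return Integer.compare(b.end, a.end);
        }
        return Integer.compare(priority.indexOf(a.type), priority.indexOf(b.type));
    }

    // Returns the type names of the spans in sorted order.
    static List<String> order(List<Span> spans, List<String> priority) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort((a, b) -> compare(a, b, priority));
        List<String> names = new ArrayList<>();
        for (Span s : sorted) {
            names.add(s.type);
        }
        return names;
    }

    public static void main(String[] args) {
        // "Nós acreditávamos nele." with "nele" at offsets 18..22.
        List<Span> spans = Arrays.asList(
            new Span("Pronoun", 18, 22),
            new Span("Token", 0, 3),
            new Span("Preposition", 18, 22),
            new Span("Sentence", 0, 23));
        List<String> priority =
            Arrays.asList("Sentence", "Token", "Preposition", "Pronoun");
        // Prints [Sentence, Token, Preposition, Pronoun]
        System.out.println(order(spans, priority));
    }
}
```

Note that Sentence comes before the first Token because of the end
(descending) rule, whatever the priorities are; the priority list only decides
between Preposition and Pronoun, which share the same begin and end.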

-Marshall

On 11/20/2016 9:52 PM, William Colen wrote:
> Hi,
>
> In Portuguese we have contractions, that are words composed by, for
> example, a preposition + article, pronoun or an adverb.
>
> Example:
>
> Nós acreditávamos nele. (We believed him.)
>
> Where "nele" can be divided into "em" + "ele". (in + him)
>
> To properly analyze this, I created two token annotations with the same
> begin and end, but the first I associated with the POS tag preposition, and
> the second with pronoun.
>
> This is especially important when we are doing chunking, because the first
> token will be part of a prepositional phrase, while the second of a nominal
> phrase.
>
> How can I guarantee that when I call UIMAFit JCasUtil.select I will get the
> tokens ordered, first the preposition, second the pronoun?
>
> Thank you,
> William
>
