Great! Thank you!
2016-11-23 12:33 GMT-02:00 Marshall Schor <[email protected]>:

> UIMA allows you to define custom indexes. So you can define a new sorted
> index (for example, let's name it "nameOfYourNewIndex") that is like the
> annotation index, except that its keys are 1) the begin feature, ascending,
> 2) the end feature, descending, and 3) the special extra feature you have
> to sort otherwise equal annotations. You would define this index to be over
> the most specific type that is the type or supertype of all Feature
> Structures you want this index to apply to (let's say you have a JCas class
> for this, called JCasClassOfTheType).
>
> Then, instead of uimaFIT's select, you can use your own index (see docs),
> which includes your extra feature. You would use a form such as this:
>
> // get the index instance from the JCas
> FSIndex<JCasClassOfTheType> index =
>     jcas.getIndex("nameOfYourNewIndex", JCasClassOfTheType.class);
>
> // get an iterator from the index
> FSIterator<JCasClassOfTheType> iterator = index.iterator();
>
> With this, there is no need to have the user first collect all the
> instances and then sort them; UIMA does this for you.
>
> Hope this helps! -Marshall
>
> On 11/21/2016 8:05 PM, William Colen wrote:
> > Thank you, Marshall.
> > What if they are of the same type?
> > The workaround for me was to add a feature in which I can store an
> > integer that I use to sort the annotations. It is not a good approach
> > because the user will need to remember to sort them before use.
> >
> > Thank you
> > William
> >
> > 2016-11-21 20:10 GMT-02:00 Marshall Schor <[email protected]>:
> >
> >> The select form you're using iterates using UIMA's built-in Annotation
> >> index. This index sorts the annotations based on 3 criteria:
> >>
> >> 1) the begin (ascending order)
> >>
> >> 2) the end (descending order)
> >>
> >> 3) the type priority
> >>
> >> You can use the 3rd criterion to set a preference ordering between two
> >> annotations of different types that have the same begin / end.
> >> You specify the type priorities as part of the Analysis Engine metadata,
> >> see
> >> http://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.primitive
> >>
> >> -Marshall
> >>
> >> On 11/20/2016 9:52 PM, William Colen wrote:
> >>> Hi,
> >>>
> >>> In Portuguese we have contractions, which are words composed of, for
> >>> example, a preposition + an article, a pronoun or an adverb.
> >>>
> >>> Example:
> >>>
> >>> Nós acreditávamos nele. (We believed him.)
> >>>
> >>> Where "nele" can be divided into "em" + "ele". (in + him)
> >>>
> >>> To properly analyze this, I created two token annotations with the same
> >>> begin and end, but I associated the first with the POS tag preposition,
> >>> and the second with pronoun.
> >>>
> >>> This is especially important when we are doing chunking, because the
> >>> first token will be part of a prepositional phrase, while the second
> >>> will be part of a nominal phrase.
> >>>
> >>> How can I guarantee that when I call uimaFIT's JCasUtil.select I will
> >>> get the tokens ordered, first the preposition, second the pronoun?
> >>>
> >>> Thank you,
> >>> William
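
For readers of the archive, the key order proposed in the 2016-11-23 reply (begin ascending, end descending, then an extra integer feature) can be modelled with a plain Java Comparator. This is only a sketch of the ordering, not UIMA code: Tok, IndexOrderSketch, the "order" field and the character offsets are hypothetical names and values made up for illustration, and a real custom index would be declared in the descriptor rather than sorted by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a token annotation; this is NOT a UIMA class.
final class Tok {
    final int begin;
    final int end;
    final int order;   // assumed extra integer feature used as a tie-breaker
    final String pos;

    Tok(int begin, int end, int order, String pos) {
        this.begin = begin;
        this.end = end;
        this.order = order;
        this.pos = pos;
    }
}

final class IndexOrderSketch {
    // Same key sequence as the proposed index:
    // 1) begin ascending, 2) end descending, 3) extra feature ascending.
    static final Comparator<Tok> KEYS =
        Comparator.comparingInt((Tok t) -> t.begin)
            .thenComparing(Comparator.comparingInt((Tok t) -> t.end).reversed())
            .thenComparingInt(t -> t.order);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Tok> sorted(List<Tok> tokens) {
        List<Tok> copy = new ArrayList<>(tokens);
        copy.sort(KEYS);
        return copy;
    }

    public static void main(String[] args) {
        // "Nós acreditávamos nele." with "nele" split into "em" + "ele";
        // both halves share the same begin/end and differ only in "order".
        List<Tok> tokens = Arrays.asList(
            new Tok(18, 22, 1, "pronoun"),      // ele
            new Tok(22, 23, 0, "punctuation"),  // .
            new Tok(18, 22, 0, "preposition"),  // em
            new Tok(4, 17, 0, "verb"),          // acreditávamos
            new Tok(0, 3, 0, "pronoun"));       // Nós
        for (Tok t : sorted(tokens)) {
            System.out.println(t.begin + "-" + t.end + " " + t.pos);
        }
    }
}
```

With these keys the two tokens covering "nele" compare equal on begin and end, so the extra feature alone decides that the preposition comes before the pronoun, which is the behaviour the custom index is meant to provide without a manual sort.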
