Did you get inspired by the uimaFIT JCasUtil? ;) Do you have any experience yet with the Java 8 Stream API? It might be worth taking it into account already and designing such an API change in UIMA so that it plays nicely with the Stream API, e.g. for filtering annotations with certain properties, or for mapping annotations into a collection of feature values, covered texts, etc.
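To make the Stream API idea concrete, here is a minimal sketch of filtering by a property and mapping to covered texts. It assumes only that the index is an Iterable, as proposed in the quoted mail below. TokenStub and StreamSketch are hypothetical stand-ins written for this illustration, not UIMA or uimaFIT classes; the only real API used is java.util.stream.StreamSupport.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical stand-in for a JCas annotation type (not a UIMA class).
final class TokenStub {
    private final String coveredText;
    private final String pos;

    TokenStub(String coveredText, String pos) {
        this.coveredText = coveredText;
        this.pos = pos;
    }

    String getCoveredText() { return coveredText; }
    String getPos() { return pos; }
}

final class StreamSketch {
    // Adapts any Iterable to a sequential Stream via its spliterator.
    static <T> Stream<T> stream(Iterable<T> source) {
        return StreamSupport.stream(source.spliterator(), false);
    }

    // Filters by a feature value, then maps each match to its covered text.
    static List<String> coveredTextsWithPos(Iterable<TokenStub> index, String pos) {
        return stream(index)
                .filter(t -> pos.equals(t.getPos()))
                .map(TokenStub::getCoveredText)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A plain list plays the role of the index here (assumption).
        Iterable<TokenStub> index = Arrays.asList(
                new TokenStub("UIMA", "NN"),
                new TokenStub("supports", "VB"),
                new TokenStub("generics", "NN"));
        System.out.println(coveredTextsWithPos(index, "NN")); // prints [UIMA, generics]
    }
}
```

If the index type offered a stream() method directly, the adapter would not be needed; whether that belongs in the API is an open design question.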

Cheers,

-- Richard

On 20.03.2015, at 22:11, Marshall Schor <[email protected]> wrote:

> In UIMA, we support a Java-friendly style for accessing Feature Structures,
> typically via the JCas which supplies a Java Class for each UIMA type.
>
> A typical use pattern for indexes is to get a named index from the index
> repository for a CAS view, and then use that index to get an iterator over
> instances of that (or use the index as an "Iterable" in a for loop).
>
> For iterators returning JCas instances, it would be nice to have those take
> advantage of generics. So, for example, it would be nice to be able to write
> without casts, etc.:
>
>   AnnotationIndex<SourceDocumentInformation> anIndexOverSourceDocumentInformationType =
>       cas.getAnnotationIndex(the_UIMA_type_for_SourceDocumentInformation);
>
> or
>
>   AnnotationIndex<SourceDocumentInformation> anIndexOverSourceDocumentInformationType =
>       cas.getAnnotationIndex(SourceDocumentInformation.class); // << NEW!
>
> and then
>
>   Iterator<SourceDocumentInformation> iter =
>       anIndexOverSourceDocumentInformationType.iterator();
>
> or using Iterable:
>
>   for (SourceDocumentInformation sdi : anIndexOverSourceDocumentInformationType) {
>     ... // use as an iterable
>   }
>
> It would be nice to be able to combine these two things (getting the index,
> then getting the iterator over the index) by chaining these, like this:
>
>   Iterator<SourceDocumentInformation> iter =
>       cas.getAnnotationIndex(SourceDocumentInformation.class).iterator();
>
> I've found a way to update the way generics are used to have the Index and
> FSIterator APIs work like this. One thing I couldn't discover how to do was
> to have this form work:
>
>   Iterator<SourceDocumentInformation> iter =
>       cas.getAnnotationIndex(the_UIMA_type_for_SourceDocumentInformation).iterator();
>
> This variant (which inserts a Java Class) works, though (I didn't realize Java
> even supported this syntax until recently):
>
>   Iterator<SourceDocumentInformation> iter =
>       cas.<SourceDocumentInformation>getAnnotationIndex(the_UIMA_type_for_SourceDocumentInformation).iterator();
>
> I plan to add the support for making generics useful in this way, plus the
> alternative which allows passing the class instance of a UIMA type (e.g.,
> SourceDocumentInformation.class), and welcome suggestions for improvements :-);
> I hope the support for "iterables" (like the "for" loop example above) will
> be especially useful.
>
> -Marshall
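
The chained form without a type argument probably fails for a general Java reason: the receiver of .iterator() has no target type, so the compiler infers T from its bound alone. A minimal sketch of that behaviour follows. CasStub, IndexStub, AnnotationStub and SdiStub are hypothetical stand-ins written for this illustration (a String plays the role of the UIMA Type object); they are not the UIMA API or Marshall's implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for Annotation and SourceDocumentInformation.
class AnnotationStub { }
class SdiStub extends AnnotationStub { }

// Hypothetical stand-in for a generic index that is also an Iterable.
final class IndexStub<T extends AnnotationStub> implements Iterable<T> {
    private final List<T> items;
    IndexStub(List<T> items) { this.items = items; }
    @Override public Iterator<T> iterator() { return items.iterator(); }
}

final class CasStub {
    private final List<AnnotationStub> all = new ArrayList<>();

    void add(AnnotationStub a) { all.add(a); }

    // Type given as a non-generic value: T can only come from the call site.
    @SuppressWarnings("unchecked")
    <T extends AnnotationStub> IndexStub<T> getAnnotationIndex(String typeName) {
        List<T> out = new ArrayList<>();
        for (AnnotationStub a : all) {
            if (a.getClass().getSimpleName().equals(typeName)) {
                out.add((T) a); // unchecked: the caller vouches for T
            }
        }
        return new IndexStub<>(out);
    }

    // Type given as a class token: T is inferred from the argument.
    <T extends AnnotationStub> IndexStub<T> getAnnotationIndex(Class<T> clazz) {
        List<T> out = new ArrayList<>();
        for (AnnotationStub a : all) {
            if (clazz.isInstance(a)) {
                out.add(clazz.cast(a));
            }
        }
        return new IndexStub<>(out);
    }
}

final class TypeWitnessSketch {
    static int count(Iterator<?> it) {
        int n = 0;
        while (it.hasNext()) { it.next(); n++; }
        return n;
    }

    public static void main(String[] args) {
        CasStub cas = new CasStub();
        cas.add(new SdiStub());
        cas.add(new AnnotationStub());

        // Class token: chaining works, T is inferred from SdiStub.class.
        Iterator<SdiStub> byClass = cas.getAnnotationIndex(SdiStub.class).iterator();

        // Explicit type argument: chaining works with the non-generic argument.
        Iterator<SdiStub> byWitness = cas.<SdiStub>getAnnotationIndex("SdiStub").iterator();

        // Assignment supplies the target type, so no type argument is needed.
        IndexStub<SdiStub> index = cas.getAnnotationIndex("SdiStub");

        // Does not compile: T is inferred as AnnotationStub before .iterator().
        // Iterator<SdiStub> bad = cas.getAnnotationIndex("SdiStub").iterator();

        System.out.println(count(byClass) + " " + count(byWitness) + " " + count(index.iterator()));
    }
}
```

In this sketch the class-token overload is the only chained form that is both cast-free and checked at runtime, which may be an argument for making it the recommended entry point.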
