[ https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499438#comment-15499438 ]
Richard Eckart de Castilho commented on UIMA-1524:
--------------------------------------------------

I think we are experimenting here a bit in the range between positional style, builder style (declarative), and functional style (imperative), and our preferences among them keep fluctuating. Consider the below as me thinking out loud as I try to follow Marshall's thoughts as rendered in the wiki.

Maybe we need to pick apart the kinds of statements that we can build with this new API. Marshall actually did that quite nicely in the Gliffy diagram in the wiki.

So let's try an example...

{code}
cas.select()
  .type(Token.class)
  .within(sentence)
{code}

The above looks like builder code for an iterator or collection, but it lacks a terminal statement like asList(), stream(), iterator(), etc. (all of which the Gliffy provides). My understanding of the Gliffy is that Marshall imagines that this builder-like API is not only a builder but at the same time implements the Java stream API. That means the builder does not have to be terminated explicitly but can be terminated at any time simply by calling any of the Stream API methods. But for the sake of clarity, I'll just add a terminal builder step now (fsIterator()).

{code}
cas.select()
  .type(Token.class) // result type of fsIterator
  .within(sentence)  // location condition
  .fsIterator()      // result
{code}

Now I would argue that any such statement can only have one result type, so only one *type* call in the builder. So a statement like the following would *not* make much sense:

{code}
cas.select()
  .type(Token.class) // result type of fsIterator
  .type(Lemma.class) // whoops?
  .within(sentence)  // location condition
  .fsIterator()      // result
{code}

So it seems quite sensible and economical to drop the *type* builder call and fold it into the *select* call:

{code}
cas.select(Token.class) // result type of fsIterator
  .within(sentence)     // location condition
  .fsIterator()         // result
{code}

*Decision requirement:* Should we drop the *type* call entirely? If we keep it, should we throw an exception if it is called twice?

There are multiple ways that users want to specify types, e.g. as class, type, string, or even nothing (i.e. not making a type restriction):

{code}
select(Token.class)
select(Token.type)
select("my.type.Token")
select()
{code}

As for location conditions (covered, covering, following, preceding, relative, between, at, ...), there are some cases where multiple conditions *could* be sensible. Note that I include "at" in the location conditions here, whereas the Gliffy in the wiki currently seems to consider "at" to have a different quality from e.g. "covering" or "following".

{code}
cas.select(Token.class)     // result type of fsIterator
  .within(sentence)         // location condition 1
  .following(predicateVerb) // location condition 2
  .fsIterator()             // result
{code}

It is also sensible to be able to configure some additional behaviors for the builder, e.g.:

{code}
cas.select(Token.class)     // result type of fsIterator
  .within(sentence)         // location condition 1
  .following(predicateVerb) // location condition 2
  .typePriorities()
  .strict()
  .fsIterator()             // result
{code}

However, if we allow multiple conditions, then the question is whether the behaviors should apply to the whole builder or only locally to individual conditions.

*Decision requirement:* We need to decide whether we want to allow multiple location conditions (as above) or not. If not, should we throw an exception if a location condition is specified twice?

I tend towards liking the idea of multiple location conditions (although not all combinations are sensible) if that is not too hard to implement.
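
To make the "multiple location conditions" option concrete, here is a minimal sketch in plain Java. It is not the UIMA or uimaFIT API: `Span`, `SelectSketch`, and the exact boundary semantics of `within` and `following` are assumptions made for illustration only. Each condition is modeled as a predicate, and successive calls are combined with a logical AND. A real implementation would presumably position index iterators instead of filtering a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for an annotation: only begin/end offsets.
class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

// Hypothetical builder, NOT the UIMA API. The result type is fixed once,
// in select(...), so there is no separate type(...) step to call twice.
class SelectSketch<T extends Span> {
    private final List<T> source;
    private Predicate<Span> condition = s -> true;

    private SelectSketch(List<T> source) {
        this.source = source;
    }

    static <T extends Span> SelectSketch<T> select(List<T> source) {
        return new SelectSketch<>(source);
    }

    // Location condition (assumed semantics): candidate lies inside 'outer'.
    SelectSketch<T> within(Span outer) {
        condition = condition.and(s -> s.begin >= outer.begin && s.end <= outer.end);
        return this;
    }

    // Location condition (assumed semantics): candidate starts at or after
    // the end of 'anchor'.
    SelectSketch<T> following(Span anchor) {
        condition = condition.and(s -> s.begin >= anchor.end);
        return this;
    }

    // Terminal step: apply all accumulated conditions, keeping source order.
    List<T> asList() {
        List<T> result = new ArrayList<>();
        for (T s : source) {
            if (condition.test(s)) {
                result.add(s);
            }
        }
        return result;
    }
}
```

With this sketch, `SelectSketch.select(tokens).within(sentence).following(predicateVerb).asList()` keeps only the tokens that satisfy both conditions. Filtering makes the combination trivial; whether index-based lookups can be combined as easily is exactly the open question.
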
The code for the different select methods in uimaFIT is very tightly tuned to particular location conditions, and I am unsure how straightforward it would be to combine them dynamically.

Normally, results are delivered in index order. It appears as if the reverse() behavior simply changes that to reverse index order, i.e. it is a declarative reverse, for which there is also a signature that includes a boolean parameter:

{code}
cas.select(Token.class) // result type of fsIterator
  .following(predicateVerb)
  .reverse(true)
  .typePriorities()
  .strict()
  .fsIterator()         // result
{code}

Each location condition could be augmented by secondary conditions, e.g. a "displacement" (which Marshall calls offset). E.g. here we retrieve all Tokens following the Token 3 positions to the right of the predicateVerb token.

{code}
Token predicateVerb = ...
cas.select(Token.class) // result type of fsIterator
  .following(predicateVerb, 3)
  .fsIterator()         // result
{code}

The case above could also be simulated without the displacement, e.g.

{code}
Token predicateVerb = ...
cas.select(Token.class) // result type of stream
  .following(predicateVerb)
  .stream()
  .skip(3)              // result
{code}

... but that might not always work. E.g. here we retrieve all Tokens following the Verb 3 positions to the right of the predicateVerb Verb. So here the offset does not apply to the Token index but to the Verb index.

{code}
Verb predicateVerb = ...
cas.select(Token.class) // result type of fsIterator
  .following(predicateVerb, 3)
  .fsIterator()         // result
{code}

But I am actually unsure as to what the semantics of the displacement are.

*Decision requirement:* When a displacement is specified in a location condition, does it operate on the index of the selected type (here Token) or on the index of the condition type (here Verb)?

Another afterthought on the exercise: the stream API does not work with the enhanced for loop.
If the builder implements its builder API + the stream API, then it would be nice if it could also implement the Iterable API:

{code}
for (Token t : cas.select(Token.class).following(predicateVerb)) {
  // do something...
}
{code}

Omitted here are thoughts on index() and limit(), which are included in the wiki description and seem to fit in nicely with the builder API. I have not yet considered some aspects, such as unordered and nonoverlapping.

> JFSIndexRepository should be enhanced with new generic methods
> --------------------------------------------------------------
>
>                 Key: UIMA-1524
>                 URL: https://issues.apache.org/jira/browse/UIMA-1524
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.3
>            Reporter: Joern Kottmann
>
> Existing methods should be overloaded with an additional Class argument to
> specify the exact return type. This change makes downcasting of returned
> objects unnecessary.
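
As a rough illustration of the request quoted above (an additional Class argument that fixes the return type) combined with the Iterable afterthought, the following plain-Java sketch may help. It is not the UIMA API: `DemoAnnotation`, `DemoToken`, `DemoSentence`, and `TypedSelect` are hypothetical names, and a flat list stands in for the index repository.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical annotation types, for illustration only.
class DemoAnnotation { }

class DemoToken extends DemoAnnotation {
    final String text;

    DemoToken(String text) {
        this.text = text;
    }
}

class DemoSentence extends DemoAnnotation { }

// Hypothetical typed selection, NOT the UIMA API. The Class argument
// determines the element type, so callers never downcast, and implementing
// Iterable makes the result usable in an enhanced for loop.
class TypedSelect<T extends DemoAnnotation> implements Iterable<T> {
    private final List<T> matches = new ArrayList<>();

    private TypedSelect(List<? extends DemoAnnotation> index, Class<T> type) {
        for (DemoAnnotation a : index) {
            if (type.isInstance(a)) {
                matches.add(type.cast(a)); // the only cast, checked at runtime
            }
        }
    }

    static <T extends DemoAnnotation> TypedSelect<T> select(
            List<? extends DemoAnnotation> index, Class<T> type) {
        return new TypedSelect<>(index, type);
    }

    @Override
    public Iterator<T> iterator() {
        return matches.iterator();
    }
}
```

A caller can then write `for (DemoToken t : TypedSelect.select(index, DemoToken.class)) { ... }` and use `t.text` directly. Because `Iterable` also provides `spliterator()`, a `stream()` method could be layered on top with `StreamSupport.stream(...)`.
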