The XML descriptor for AEs is already serving two purposes: it describes a component, and it configures it. I do not think it should be further watered down by evolving into a DSL for workflows, in particular if that means that workflow logic leaks outside the flow part of the descriptor. Combining your proposal with our other proposal regarding the embedding of an expression language in the external configuration mechanism further waters down the responsibility of the flow controller. What are currently clearly assigned responsibilities and a clear design would become quite a muddy patch.

I think it is a good idea to explore new and improved possibilities for flow descriptions. But since a thing, once it is in the core, is unlikely to ever get out, I do not think research on how such a thing should be done should happen in the core. In particular, as long as changes to the core are under the heading of "must remain compatible" (cf. UIMA-2670, or other recent discussions on this list), I suggest that experimental extensions should be developed outside the core. There are already enough things in the core that should be addressed, and cruft that could be removed or fixed up and better integrated.

Since the modification you suggest appears to be possible only with modifications to the core, and since it ignores the encapsulation of flow logic within the flow descriptor, I suggest trying to approach the user requirements from a different angle that does not have these problems.

The motivation for the suggested changes appears to be related to using UIMA setups for experimentation, which requires extensive parametrization. There are many people who already do that. E.g. two of the publications presented during the UIMA workshop last Monday were highlighting different approaches to building high-level experimental workflows for UIMA pipelines:

  CSE Framework: A UIMA-based Distributed System for Configuration Space Exploration
  pp. 14-17
  Elmer Garduno, Zi Yang, Avner Maiberg, Collin McCormack, Yan Fang, Eric Nyberg
  in http://ceur-ws.org/Vol-1038/

  Bluima: a UIMA-based NLP Toolkit for Neuroscience
  pp. 34-41
  Renaud Richardet, Jean-Cédric Chappelier, Martin Telefont
  in http://ceur-ws.org/Vol-1038/

and similarly, although from a different venue:

  A lightweight framework for reproducible parameter sweeping in information retrieval
  Richard Eckart de Castilho, Iryna Gurevych
  http://dl.acm.org/citation.cfm?id=2064248

There may be additional work on building experiments (with UIMA) that should be considered, e.g.
the original uimaFIT publications, since experimentation was one of the main considerations behind its development, although "E" didn't make it into the name.

The flow controller offers a clear extension point for the kind of modifications you are suggesting. You are instead trying to introduce them to the core, outside of the flow controller, with the idea that existing flow controllers would not need to be modified. That has a certain appeal, but I would argue that people who want to have more control over the flow and parametrize it should switch to a new flow controller which offers that functionality. It may be more work for the adopters of this new controller, but it would not weaken the overall architecture by supplementing existing extension points with alternative and invasive mechanisms.

-- Richard

On 27.09.2013, at 21:54, Marshall Schor <[email protected]> wrote:

> A modified proposal, with rationale from the previous discussion.
>
> 1) add one attribute, not 2. Per Thilo's suggestion, make it the run= rather
> than the skip=.
> Current external override file syntax supports simple negation computing that
> is somewhat readable, to allow changing just one value to pick an alternative.
>
> 2) Have the attribute associated with the delegate element, rather than with
> the flow constraint. Rationale: flow constraint is optional, and relates to
> flow; the meaning of this specification is more concerned with choosing an
> alternative or skipping, regardless of how many places the delegate might
> appear in a flow.
>
> 3) Preserve a substantial amount of backwards compatibility. This includes
> having previously written flow controllers continue to work unmodified. For
> skipping, this means the delegate will still be entered into internal tables,
> given to the flow controller, etc., but the dispatch of the CAS will be as if
> it was done with a no-op annotator; this applies also to initialization. This
> allows flow controllers that had a hard-coded flow to continue to work.
> For alternatives (using the same delegate "key"), have the picked one entered
> into the internal tables.
>
> 4) The CDE will need to have a way to (optionally) specify an External
> Configuration File to use when dealing with a descriptor, and have a defined
> strategy if no such file is available. It should store the path to the
> specified file in a property-local setting for subsequent use in Eclipse.
>
> 5) UIMA-AS, if managing an aggregate as an asynchronous aggregate, may need to
> recognize skipped delegates.
>
> 6) External Resource Definitions... for skipped / alternative items: These
> might or might not be referred to by other delegates. These are handled for
> backwards compatibility: for alternatives (sharing the same key), only the one
> which is picked is included. For skipping: the resource is omitted only if
> there are no references to it from non-skipped things. Otherwise it is
> included, even though its associated annotator is skipped.
>
> I've probably forgotten some things...
>
> -Marshall
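
For illustration only: the alternative argued for above (keeping the run/skip decision inside a new, parametrized flow controller rather than on the delegate element) could look roughly like the sketch below. It is plain Java and deliberately does not use the UIMA flow controller API; the class name ParameterizedFixedFlow, the runFlags map and the delegate keys "tokenizer", "tagger" and "parser" are all hypothetical. A real implementation would be a flow controller written against the UIMA API and would read the flags from its own configuration parameters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for a parametrized flow controller.
 * This is NOT the UIMA FlowController API, only a model of the idea:
 * the fixed order and the run flags both live inside the controller.
 */
final class ParameterizedFixedFlow {
    private final List<String> delegateOrder;
    private final Map<String, Boolean> runFlags;

    ParameterizedFixedFlow(List<String> delegateOrder, Map<String, Boolean> runFlags) {
        this.delegateOrder = List.copyOf(delegateOrder);
        this.runFlags = Map.copyOf(runFlags);
    }

    /** Returns the delegate keys a CAS would visit, in order. */
    List<String> route() {
        List<String> steps = new ArrayList<>();
        for (String key : delegateOrder) {
            // Delegates without an explicit flag run by default.
            if (runFlags.getOrDefault(key, Boolean.TRUE)) {
                steps.add(key);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        ParameterizedFixedFlow flow = new ParameterizedFixedFlow(
            List.of("tokenizer", "tagger", "parser"),
            Map.of("tagger", false));
        System.out.println(flow.route()); // prints [tokenizer, parser]
    }
}
```

In this model the skipped delegate never receives the CAS, and nothing outside the controller needs to know about the flag, which is the encapsulation the reply above is asking to preserve.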
