Re: A proposal for a slight augmentation of aggregate component descriptors

Burn Lewis Fri, 27 Sep 2013 06:16:26 -0700

In the GALE project we relied on descriptor editing, removing unwanted
delegates and their flow elements, but in the Watson project we have many
nested aggregates, and removing a delegate that has a parameter overridden
by its parent aggregate may require many edits.  Hence the advantage of a
solution built into UIMA.


Selectively enabling/disabling individual delegates takes away
responsibility that I believe is better left solely to the flow
controller.  It should be the master, telling the framework which delegates
to initialize and which to ignore, providing a single location to define
the complete flow.  Custom flow controllers can already read a flow
definition from a parameter, external or internal, and we could add a way
to parametrize the flow constraints, perhaps as a new type that accepts
a list of delegates or external parameters containing lists.

  <flowConstraints>
   <flowList>
     alpha, beta, ${subFlow1},
     gamma,
     ${subFlow2},
     ...
   </flowList>
  </flowConstraints>
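
A minimal sketch of how such a flowList might be expanded, assuming a syntax
this thread only hints at: keys separated by commas or whitespace, and
${name} replaced by the list held in an external setting of that name. The
function expand_flow_list and the settings dictionary are hypothetical and
not part of UIMA; Python is used only for brevity.

```python
import re

# Assumed syntax: delegate keys separated by commas or whitespace, with
# ${name} standing for an external parameter that holds another list.
_TOKEN = re.compile(r"\$\{([^}]+)\}|([^\s,]+)")


def expand_flow_list(text, settings, _active=()):
    """Return the flat, ordered list of delegate keys named by a flow list."""
    flow = []
    for match in _TOKEN.finditer(text):
        ref, key = match.groups()
        if ref is None:
            # Plain delegate key: keep it in order.
            flow.append(key)
        elif ref in _active:
            # A parameter that (indirectly) refers to itself never terminates.
            raise ValueError(f"circular flow reference: {ref}")
        elif ref not in settings:
            raise KeyError(f"undefined flow parameter: {ref}")
        else:
            # Splice the referenced sub-flow in place, expanding recursively.
            flow.extend(
                expand_flow_list(settings[ref], settings, _active + (ref,))
            )
    return flow


# Hypothetical external settings; an empty value contributes no delegates.
settings = {"subFlow1": "tokenizer, tagger", "subFlow2": ""}
flow = expand_flow_list(
    "alpha, beta, ${subFlow1}, gamma, ${subFlow2}", settings
)
print(flow)  # ['alpha', 'beta', 'tokenizer', 'tagger', 'gamma']
```

An empty sub-flow simply drops out of the result, so under these assumptions
a delegate could be skipped by changing only the external settings.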

~Burn


On Fri, Sep 27, 2013 at 8:45 AM, Marshall Schor <[email protected]> wrote:

> On 9/26/2013 5:39 PM, Richard Eckart de Castilho wrote:
> > On 26.09.2013, at 23:28, Marshall Schor <[email protected]> wrote:
> >
> >> I think there's a tradeoff when using Specifications - they're more clear
> >> when they have the information locally, and harder to understand when
> >> they point to an unknown arbitrary thing.
> > It is interesting you mention this, because the documentation clearly
> > states
> >
> > "As with the delegateAnalysisEngine element, the flowController element
> > may contain either a complete flowControllerDescription or an import, but
> > the import is recommended."
> >
> > (Source:
> > http://uima.apache.org/d/uimaj-2.4.2/references.html#ugr.ref.xml.component_descriptor.flow_controller
> > )
> >
> > Your statement also appears to contradict the idea of a configuration of
> > a specifier via external variables in the first place, as these contain
> > information that is not locally available.
> Yes, it appears this way.  This is why I see these kinds of design choices
> as shades of grey, because there are arguments on both sides of many
> issues, and the art seems to be in finding pragmatic compromise choices,
> driven by actual use cases.
>
> We initially did not have external variables; but as UIMA use became more
> widespread, the users started asking for things along these lines, with
> clear reasons.
> >
> >> Generally, the UIMA spec design philosophy has tried to encourage
> >> community and part-interoperability by leaning toward making things more
> >> transparent / obvious.
> > I don't understand this statement. How does the community come in here?
> Community means to have an active and widespread ecosystem containing
> component developers, component assemblers, experimenters of all kinds,
> commercial product developers, and users of all kinds.  These people bring
> widely different skill-sets with them, and we want to enable the wider
> community to build upon one another's work, successfully.
>
> -Marshall
> > I see how type system specifiers help interoperability, but I don't
> > actually see this too much for component specifiers. They appear to be
> > more deployment specifiers than anything else.
> >
> > -- Richard
> >
> >> -Marshall
> >
> >> On 9/26/2013 5:06 PM, Richard Eckart de Castilho wrote:
> >>> Another alternative could even be to control the import to point to the
> >>> desired flow:
> >>>
> >>> <flowController key="[String]">
> >>>    <import location="${xxx}"/>
> >>> </flowController>
> >>>
> >>> That would completely remove the need for any skipping attributes and
> >>> work without dynamically generated descriptors.
> >>>
> >>> -- Richard
> >>>
> >>> On 26.09.2013, at 23:00, Richard Eckart de Castilho <[email protected]> wrote:
> >>>
> >>>> Not the controller, but its configuration. The skipping is clearly
> >>>> affecting the flow. So why not add something to the flowConstraints,
> >>>> e.g.:
> >>>>
> >>>> <flowConstraints>
> >>>> <fixedFlow>
> >>>>   <node>[String]</node>
> >>>>   <node>[String]</node>
> >>>>   ...
> >>>> </fixedFlow>
> >>>> <skip>
> >>>>   <node>[String]</node>
> >>>>   <node>[String]</node>
> >>>>   ...
> >>>> </skip>
> >>>> </flowConstraints>
> >>>>
> >>>> or
> >>>>
> >>>> <flowConstraints>
> >>>> <fixedFlow>
> >>>>   <node skip="true">[String]</node>
> >>>>   <node>[String]</node>
> >>>>   ...
> >>>> </fixedFlow>
> >>>> </flowConstraints>
> >>>>
> >>>> Personally, I'd not make any modifications to the descriptor at all,
> >>>> but rather would just skip the delegate when programmatically creating
> >>>> the descriptor. We do that all the time in our experiments. But if that
> >>>> is for some reason not an option and the extension is a strong
> >>>> requirement, the change should at least be made at the location that
> >>>> conceptually makes most sense (imho).
> >>>>
> >>>> @Marshall: do you want to provide some more background why you do not
> >>>> simply create the descriptors programmatically and externalize this
> >>>> skipping, including, etc. into your experimental setup?
> >>>>
> >>>> -- Richard
> >>>>
> >>>> On 26.09.2013, at 22:53, Peter Klügl <[email protected]> wrote:
> >>>>
> >>>>> On 26.09.2013 22:51, Richard Eckart de Castilho wrote:
> >>>>>> I believe this is a concern of the flow controller and should not be
> >>>>>> configured on the delegates, but rather within the flow controller
> >>>>>> configuration.
> >>>>> That was also my first guess, but do you really wanna touch or change
> >>>>> the flow controller for just skipping a component?
> >>>>>
> >>>>> Peter
> >>>>>
> >>>>>> -- Richard
> >>>>>>
> >>>>>> On 26.09.2013, at 17:23, Marshall Schor <[email protected]> wrote:
> >>>>>>
> >>>>>>> To handle the use cases briefly described on the user list for
> >>>>>>> selectively skipping some annotators in an aggregate, based on some
> >>>>>>> externally supplied configuration data, I'd like to propose something
> >>>>>>> along these lines:
> >>>>>>>
> >>>>>>> * Add to the existing element <delegateAnalysisEngine key="[String]">
> >>>>>>> one or two additional attributes.  One would be "skip=${xxx}" and the
> >>>>>>> other would be its inverse (for improved readability, only, not
> >>>>>>> logically needed): "run=${xxx}", where the value of the parameter
> >>>>>>> would need to be "true" or "false" (or "yes" or "no").
> >>>>>>>
> >>>>>>> The parameter could be written literally as "true", etc., but also
> >>>>>>> could be written using the standard variable naming syntax used
> >>>>>>> elsewhere in the descriptors, and would be resolved by settings in the
> >>>>>>> now-standard "external overrides" files used by UIMA.  This means that
> >>>>>>> the external overrides would continue to be a place where all of the
> >>>>>>> specific configuration info for a particular "run" could be placed,
> >>>>>>> together.
> >>>>>>>
> >>>>>>> The implementation would do nothing new if the parameters were
> >>>>>>> indicating to run the delegate, but if they were indicating it should
> >>>>>>> be skipped or not run, then the effect would be as if the delegate had
> >>>>>>> been edited out of the xml descriptor.
> >>>>>>>
> >>>>>>> This would satisfy some pleas from some user groups for help in
> >>>>>>> managing their descriptors across various related experiments.
> >>>>>>>
> >>>>>>> An example: a user might have a delegate which came in two forms: one
> >>>>>>> to run "locally", and the other to run "remote".
> >>>>>>>
> >>>>>>> They could then include both descriptors in the aggregate, and have
> >>>>>>> only one of them "active", by coding:
> >>>>>>>
> >>>>>>> <delegateAnalysisEngine key="NE-detector" run="NE-Detector-local"> ...
> >>>>>>> </delegateAnalysisEngine>
> >>>>>>> <delegateAnalysisEngine key="NE-detector" skip="NE-Detector-local"> ...
> >>>>>>> </delegateAnalysisEngine>
> >>>>>>>
> >>>>>>> WDYT?
> >>>>>>>
> >>>>>>> -Marshall
> >
>
>
