Re: A proposal for a slight augmentation of aggregate component descriptors

Richard Eckart de Castilho Thu, 26 Sep 2013 14:39:44 -0700

On 26.09.2013, at 23:28, Marshall Schor <[email protected]> wrote:

> I think there's a tradeoff when using Specifications - they're more clear when
> they have the information locally, and harder to understand when they point to
> an unknown arbitrary thing.


It is interesting that you mention this, because the documentation clearly states:

"As with the delegateAnalysisEngine element, the flowController element may 
contain either a complete flowControllerDescription or an import, but the 
import is recommended."

(Source: 
http://uima.apache.org/d/uimaj-2.4.2/references.html#ugr.ref.xml.component_descriptor.flow_controller)

Your statement also appears to contradict the idea of configuring a specifier 
via external variables in the first place, since these variables carry 
information that is not locally available.
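
To make the non-locality concrete, here is a minimal sketch in Python (purely 
illustrative, not UIMA code). It assumes a ${name} reference is filled in from 
a separate settings map. The function name resolve_placeholders, the settings 
key flowControllerLocation and the file name MyFlowController.xml are all 
hypothetical.

```python
import re

# Matches ${name} references, mirroring the ${xxx} form used in this thread.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(text, settings):
    """Replace each ${name} in text with settings[name].

    Raises KeyError when a name has no external value: the descriptor text
    alone does not determine the result.
    """
    def lookup(match):
        name = match.group(1)
        if name not in settings:
            raise KeyError(f"no external setting for ${{{name}}}")
        return settings[name]

    return _PLACEHOLDER.sub(lookup, text)


# Hypothetical external settings, standing in for an overrides file.
settings = {"flowControllerLocation": "MyFlowController.xml"}
descriptor_line = '<import location="${flowControllerLocation}"/>'
resolved = resolve_placeholders(descriptor_line, settings)
```

The descriptor line by itself says nothing about which flow controller is 
used; that is only known once the settings are in hand.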

> Generally, the UIMA spec design philosophy has tried to encourage community
> and part-interoperability by leaning toward making things more transparent /
> obvious.

I don't understand this statement. How does the community come in here? I see 
how type system specifiers help interoperability, but I don't actually see this 
too much for component specifiers. They appear to be more deployment specifiers 
than anything else.

-- Richard

> -Marshall


> On 9/26/2013 5:06 PM, Richard Eckart de Castilho wrote:
>> Another alternative could even be to control the import so that it points to
>> the desired flow controller:
>> 
>> <flowController key="[String]">
>>    <import location="${xxx}"/>
>> </flowController>
>> 
>> That would completely remove the need for any skipping attributes and work 
>> without dynamically generated descriptors.
>> 
>> -- Richard
>> 
>> On 26.09.2013, at 23:00, Richard Eckart de Castilho <[email protected]> wrote:
>> 
>>> Not the controller, but its configuration. The skipping is clearly 
>>> affecting the flow. So why not add something to the flowConstraints, e.g.:
>>> 
>>> <flowConstraints>
>>> <fixedFlow>
>>>   <node>[String]</node>
>>>   <node>[String]</node>
>>>   ...
>>> </fixedFlow>
>>> <skip>
>>>   <node>[String]</node>
>>>   <node>[String]</node>
>>>   ...
>>> </skip>
>>> </flowConstraints>
>>> 
>>> or 
>>> 
>>> <flowConstraints>
>>> <fixedFlow>
>>>   <node skip="true">[String]</node>
>>>   <node>[String]</node>
>>>   ...
>>> </fixedFlow>
>>> </flowConstraints>
>>> 
>>> Personally, I'd not make any modifications to the descriptor at all, but 
>>> rather would just skip the delegate when programmatically creating the 
>>> descriptor. We do that all the time in our experiments. But if that is for 
>>> some reason not an option and the extension is a strong requirement, the 
>>> change should at least be made at the location that conceptually makes most 
>>> sense (imho). 
>>> 
>>> @Marshall: could you provide some more background on why you do not simply
>>> create the descriptors programmatically and externalize this skipping,
>>> including, etc. into your experimental setup?
>>> 
>>> -- Richard
>>> 
>>> On 26.09.2013, at 22:53, Peter Klügl <[email protected]> wrote:
>>> 
>>>> On 26.09.2013 22:51, Richard Eckart de Castilho wrote:
>>>>> I believe this is a concern of the flow controller and should not be 
>>>>> configured on the delegates, but rather within the flow controller 
>>>>> configuration.
>>>> That was also my first guess, but do you really wanna touch or change the 
>>>> flow controller for just skipping a component?
>>>> 
>>>> Peter
>>>> 
>>>>> -- Richard
>>>>> 
>>>>> On 26.09.2013, at 17:23, Marshall Schor <[email protected]> wrote:
>>>>> 
>>>>>> To handle the use cases briefly described on the user list for selectively
>>>>>> skipping some annotators in an aggregate, based on some externally supplied
>>>>>> configuration data, I'd like to propose something along these lines:
>>>>>> 
>>>>>> * Add to the existing element <delegateAnalysisEngine key="[String]"> one or
>>>>>> two additional attributes.  One would be "skip=${xxx}" and the other would be
>>>>>> its inverse (for improved readability only, not logically needed):
>>>>>> "run=${xxx}", where the value of the parameter would need to be "true" or
>>>>>> "false" (or "yes" or "no").
>>>>>> 
>>>>>> The parameter could be written literally as "true", etc., but also could be
>>>>>> written using the standard variable naming syntax used elsewhere in the
>>>>>> descriptors, and would be resolved by settings in the now-standard "external
>>>>>> overrides" files used by UIMA.  This means that the external overrides would
>>>>>> continue to be a place where all of the specific configuration info for a
>>>>>> particular "run" could be placed, together.
>>>>>> 
>>>>>> The implementation would do nothing new if the parameters were indicating to
>>>>>> run the delegate, but if they were indicating it should be skipped or not run,
>>>>>> then the effect would be as if the delegate had been edited out of the xml
>>>>>> descriptor.
>>>>>> 
>>>>>> This would satisfy some pleas from some user groups for help in managing
>>>>>> their descriptors across various related experiments.
>>>>>> 
>>>>>> An example: a user might have a delegate which came in two forms: one to run
>>>>>> "locally", and the other to run "remote".
>>>>>> 
>>>>>> They could then include both descriptors in the aggregate, and have only one
>>>>>> of them "active", by coding:
>>>>>> 
>>>>>> <delegateAnalysisEngine key="NE-detector"  run="${NE-Detector-local}"> ...
>>>>>> </delegateAnalysisEngine>
>>>>>> <delegateAnalysisEngine key="NE-detector" skip="${NE-Detector-local}"> ...
>>>>>> </delegateAnalysisEngine>
>>>>>> 
>>>>>> WDYT?
>>>>>> 
>>>>>> -Marshall
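
The semantics proposed in the original message (a skipped delegate behaves as 
if it had been edited out of the descriptor) can be sketched roughly as 
follows. This is an illustrative Python model only, not UIMA code and not an 
existing API. The function names, the dictionary layout and the delegate names 
tokenizer, ne-local and ne-remote are assumptions made for the example.

```python
import re

TRUE_VALUES = {"true", "yes"}
FALSE_VALUES = {"false", "no"}


def as_bool(raw, settings):
    """Parse a flag given literally or as a single ${name} reference."""
    match = re.fullmatch(r"\$\{([^}]+)\}", raw)
    value = settings[match.group(1)] if match else raw
    value = value.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean flag: {raw!r}")


def active_delegates(delegates, settings):
    """Return the names of delegates that should run, preserving order."""
    active = []
    for delegate in delegates:
        skip = False
        if "skip" in delegate:
            skip = as_bool(delegate["skip"], settings)
        elif "run" in delegate:
            # "run" is simply the inverse of "skip".
            skip = not as_bool(delegate["run"], settings)
        if not skip:
            active.append(delegate["name"])
    return active


# Hypothetical external settings and delegates.
settings = {"NE-Detector-local": "true"}
delegates = [
    {"name": "tokenizer"},
    {"name": "ne-local", "run": "${NE-Detector-local}"},
    {"name": "ne-remote", "skip": "${NE-Detector-local}"},
]
print(active_delegates(delegates, settings))  # ['tokenizer', 'ne-local']
```

Flipping the single setting to "false" would activate ne-remote instead of 
ne-local, which is the local/remote switch described in the example above.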
