[ 
https://issues.apache.org/jira/browse/UIMA-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Eckart de Castilho updated UIMA-2812:
---------------------------------------------
    Fix Version/s: 2.3.0uimaFIT

> Support ResultSpecification
> ---------------------------
>
>                 Key: UIMA-2812
>                 URL: https://issues.apache.org/jira/browse/UIMA-2812
>             Project: UIMA
>          Issue Type: New Feature
>          Components: uimaFIT
>            Reporter: Richard Eckart de Castilho
>             Fix For: 2.3.0uimaFIT
>
>
> Provide support for controlling the output of a component using a 
> ResultSpecification. Consider, e.g., the use case where a component can 
> produce a "PartOfSpeech" annotation, but should not, because another 
> component in the same pipeline has already produced it or will produce it 
> later. Here is some pseudocode:
> {noformat}
> AnalysisEngineDescription aed = createPrimitiveDescription(Parser.class);
> // Tell Parser not to produce PartOfSpeech annotations
> ResultUtil.removeType(aed, PartOfSpeech.class);
> {noformat}
> *How to "remove" a type?* UIMA requires that a ResultSpecification contains 
> _all_ the types that the component produces, which would normally require 
> adding all types except the ones that should not be produced. uimaFIT has 
> access to _capability_ annotations, which it could use to pre-fill a result 
> specification with all the types that a component could produce, allowing the 
> user to conveniently remove the ones not required.
> *How to transport the information?* Unfortunately, there appears to be no way 
> to store the ResultSpecification as part of an _AnalysisEngineDescription_. 
> As far as I can see, UIMA has two ways to control the ResultSpecification for 
> a component:
> * via the component's _capabilities_
> * via a parameter passed to the _AnalysisEngine.process_ method (or via 
> _setResultSpecification_) 
> There are two scenarios I can imagine: 
> * _at description time_: changes to the result specification are added to the 
> descriptor.
> ** Add the ResultSpecification to the component descriptor -- unfortunately, 
> this is not supported by UIMA.
> ** Change the _capabilities_. E.g. uimaFIT creates an AE descriptor with the 
> capabilities filled in, then one could add or remove types/features there.
> * _at runtime_: uimaFIT could be used to acquire an initial 
> ResultSpecification from the annotation on the AE class, which can then be 
> modified to add/remove types/features. The final specification needs to be 
> passed in some way into the pipeline execution code:
> ** _along with the component descriptor_: pairs of {descriptor, resultspec} 
> would need to be passed to the pipeline execution code (e.g. SimplePipeline), 
> making the API more complex.
> ** _as part of already instantiated components_: in case of SimplePipeline, 
> there are also non-descriptor-based methods that could be used, in which case 
> the result specifications could be set on each component individually before 
> passing them into the pipeline code.
> *Does it fit into the uimaFIT concept?* So far, it was possible to implement 
> uimaFIT in such a way that all information pertaining to the component 
> configuration could be reflected, configured, and stored in a descriptor, so 
> that any UIMA execution engine could then pick up the descriptor and execute 
> the component as it was configured. UIMA appears to be lacking the concept of 
> a ResultSpecification as part of the descriptors. In particular, that seems 
> to affect the ability to configure results within aggregate analysis engines.
> *Conclusion* Since a ResultSpecification cannot be stored in a descriptor, 
> the next best thing appears to be adding some convenience methods to change 
> the reflected capabilities in the descriptor.
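
The "pre-fill, then remove" idea from the description can be sketched in plain Java. This is only an illustration under stated assumptions: the class OutputTypes and its methods are hypothetical and are not part of uimaFIT or UIMA, and the declared output types are modelled as plain type-name strings rather than as the capability entries stored in a real descriptor.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical helper (not a uimaFIT or UIMA class). It models the output
 * types a component declares as plain type names.
 */
final class OutputTypes {
    // LinkedHashSet keeps declaration order and ignores duplicates.
    private final Set<String> types;

    private OutputTypes(Set<String> types) {
        this.types = types;
    }

    /** Pre-fill with every type the component declares it can produce. */
    static OutputTypes fromDeclared(List<String> declaredOutputs) {
        return new OutputTypes(new LinkedHashSet<>(declaredOutputs));
    }

    /**
     * Remove a type the component should not produce.
     * Returns false if the type was not declared in the first place.
     */
    boolean removeType(String typeName) {
        return types.remove(typeName);
    }

    /** Remaining output types, in declaration order. */
    List<String> toList() {
        return new ArrayList<>(types);
    }
}
```

With this model, the Parser example would start from everything the component declares and then call removeType with the name of the PartOfSpeech type. A real convenience method would have to apply the same change to the capabilities reflected in the descriptor.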



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
