Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Eddie Epstein Mon, 30 Jul 2012 08:28:25 -0700

Hi Richard,

The _InitialView is treated differently in order to maintain compatibility
with UIMA applications that predate Views. For example, such an application
creates a CAS and then operates on it without creating any Views.


Regards,
Eddie

On Mon, Jul 30, 2012 at 4:42 AM, Richard Eckart de Castilho
<[email protected]> wrote:
> Hello Eddie,
>
> I have not tried it yet, but I agree that this should work. While I
> understand your reasoning, my subjective feeling is that the naming of
> SofAs at the pipeline level and at the AE level should be completely
> independent and the mapping flexible. I think the _InitialView should not
> receive special treatment in this context.
>
> I'll get back to you if I run into really substantial problems or if your
> suggestion does not work out.
>
> Thanks!
>
> -- Richard
>
> On 17.06.2012 at 18:11, Eddie Epstein wrote:
>
>> Richard,
>>
>> Non-default views are currently created by application code, not by
>> the framework. A missing expected view is a clearer diagnostic than
>> the highly varied errors that could occur if the framework created
>> the view automatically.
>>
>> Sofa mapping is intended to solve your scenario by having the CR fill
>> the default _InitialView and then mapping view A to the _InitialView
>> for the analyzer. When the analyzer asks for view A, it would get the
>> _InitialView.
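>>
>> As a rough sketch, the mapping described above would sit in the aggregate
>> descriptor roughly like this. The component key "analyzer" and the view
>> name "A" are assumptions taken from this thread, not from a real
>> descriptor:
>>
>>   <!-- hypothetical key and view name, for illustration only -->
>>   <sofaMappings>
>>     <sofaMapping>
>>       <componentKey>analyzer</componentKey>
>>       <componentSofaName>A</componentSofaName>
>>       <aggregateSofaName>_InitialView</aggregateSofaName>
>>     </sofaMapping>
>>   </sofaMappings>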
>>
>> Did you try this?
>>
>> Eddie
>>
>>
>> On Fri, Jun 15, 2012 at 5:36 PM, Richard Eckart de Castilho
>> <[email protected]> wrote:
>>> On 11.06.2012 at 20:11, Eddie Epstein wrote:
>>>
>>>> Can you be a bit more explicit about what the failing scenario is?
>>>
>>> Take a scenario where you want to access the CASes produced by an
>>> aggregate pipeline directly, without a CAS consumer, but still want to use
>>> a reader to fill the CASes (this is what the demo below implements).
>>>
>>> Now add the need for sofa mapping to that scenario, because you want to run
>>> a complex analysis. The collection reader is not sofa aware, but you do
>>> want it to write to some view "A" instead of the "_InitialView",
>>> because "A" is what the next component will process. This is possible now,
>>> because in the AnalysisEngineDescription I can declare sofa mappings for
>>> the reader. However, I would get an exception due to UIMA-2419.
>>>
>>>> I'm definitely confused by wrapping a CR in an AE descriptor. Is it
>>>> possible to paste here an aggregate descriptor using sample components
>>>> from the UIMA SDK that demonstrates the problem?
>>>
>>> So here is the demo of wrapping a CR in an AE. There are no sofa mappings
>>> here because they would cause an exception. The SimpleReader creates a
>>> single CAS and sets the text; the SimpleAnalyzer additionally sets the
>>> document language. It's a very basic example.
>>> The full runnable sources are at
>>>
>>> http://code.google.com/p/uimafit/source/browse/trunk/uimaFIT/src/test/java/org/uimafit/factory/AggregateWithReaderTest.java
>>>
>>> /**
>>>  * Demo of disguising a reader as a CAS multiplier. This works because
>>>  * internally, UIMA wraps the reader in a CollectionReaderAdapter. The nice
>>>  * thing about this is that, in principle, it would be possible to define
>>>  * sofa mappings. However, UIMA-2419 prevents this.
>>>  */
>>> @Test
>>> public void demoAggregateWithDisguisedReader() throws UIMAException {
>>>   ResourceSpecifierFactory factory = UIMAFramework.getResourceSpecifierFactory();
>>>
>>>   // Describe the reader as a primitive AE that outputs new CASes.
>>>   AnalysisEngineDescription reader = factory.createAnalysisEngineDescription();
>>>   reader.getMetaData().setName("reader");
>>>   reader.setPrimitive(true);
>>>   reader.setImplementationName(SimpleReader.class.getName());
>>>   reader.getAnalysisEngineMetaData().getOperationalProperties()
>>>       .setOutputsNewCASes(true);
>>>
>>>   // Describe the analyzer as an ordinary primitive AE.
>>>   AnalysisEngineDescription analyzer = factory.createAnalysisEngineDescription();
>>>   analyzer.getMetaData().setName("analyzer");
>>>   analyzer.setPrimitive(true);
>>>   analyzer.setImplementationName(SimpleAnalyzer.class.getName());
>>>
>>>   // Run the reader first, then the analyzer.
>>>   FixedFlow flow = factory.createFixedFlow();
>>>   flow.setFixedFlow(new String[] { "reader", "analyzer" });
>>>
>>>   // Assemble the aggregate from the two delegates.
>>>   AnalysisEngineDescription aggregate = factory.createAnalysisEngineDescription();
>>>   aggregate.getMetaData().setName("aggregate");
>>>   aggregate.setPrimitive(false);
>>>   aggregate.getAnalysisEngineMetaData().setFlowConstraints(flow);
>>>   aggregate.getAnalysisEngineMetaData().getOperationalProperties()
>>>       .setOutputsNewCASes(true);
>>>   aggregate.getAnalysisEngineMetaData().getOperationalProperties()
>>>       .setMultipleDeploymentAllowed(false);
>>>   aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("reader", reader);
>>>   aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("analyzer", analyzer);
>>>
>>>   // Instantiate the pipeline and iterate over the CASes it produces.
>>>   AnalysisEngine pipeline = UIMAFramework.produceAnalysisEngine(aggregate);
>>>   CasIterator iterator = pipeline.processAndOutputNewCASes(pipeline.newCAS());
>>>   while (iterator.hasNext()) {
>>>     CAS cas = iterator.next();
>>>     System.out.printf("[%s] is [%s]%n", cas.getDocumentText(),
>>>         cas.getDocumentLanguage());
>>>   }
>>> }
>
> --
> -------------------------------------------------------------------
> Richard Eckart de Castilho
> Technical Lead
> Ubiquitous Knowledge Processing Lab (UKP-TUD)
> FB 20 Computer Science Department
> Technische Universität Darmstadt
> Hochschulstr. 10, D-64289 Darmstadt, Germany
> phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
> [email protected]
> www.ukp.tu-darmstadt.de
> Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
> -------------------------------------------------------------------
