On 04.02.2014, at 16:44, Luca Foppiano <l...@foppiano.org> wrote:

> On Tue, Feb 4, 2014 at 4:14 PM, Luca Foppiano <l...@foppiano.org> wrote:
>
>> I've tried to specify two different View names, or the same (input/output
>> views) but without success. It seems that either the mapping is not
>> effective or I'm doing something wrong.
>>
>> If you could quickly have a look, here is what I've changed:
>>
>> https://github.com/lfoppiano/uima-fit-sample-pipeline/commit/148ef74601d28d2c3781786160121c94dde487dd
>
> Apologize for the high amount of emails.
>
> Might be that I in the Dictionary Annotator do I have to use the
> @SofaCapability to enable it?
>
> If so, how I could possibly integrate a AE that I don't have control over
> the code, into uima fit?

Adding a @SofaCapability should only have one effect: UIMA will not try to supply the default view "_InitialView" (constant: CAS.NAME_DEFAULT_SOFA) to the AE, but rather expect that the AE fetches the views it needs from the CAS. If you do not specify a @SofaCapability, then UIMA should supply the default view CAS.NAME_DEFAULT_SOFA to the AE.

Let's look at your project:

You read a TEI file with annotations into the CAS.

Then you create a new view (SOFA_NAME_TEXT_ONLY) containing only the text from the TEI file - no markup.

    AnalysisEngineDescription whitespaceEngine = createEngineDescription(
        WhitespaceTokenizer.class,
        "SofaNames", new String[]{SimpleParserAE.SOFA_NAME_TEXT_ONLY});

It looks like the SofaNames parameter of the WhitespaceTokenizer is meant for the case where multiple views are to be processed at once. This parameter lets a single tokenizer in the pipeline process multiple views. With view mappings, the tokenizer would need to be added to the pipeline once per view. Instead of using this parameter, you could also use a mapping.

Finally, you write the result out.

Now to the views:

The reader loads the TEI data into the default view (CAS.NAME_DEFAULT_SOFA).

The SimpleParserAE fetches the data from the default view and stores it into SOFA_NAME_TEXT_ONLY.

The WhitespaceTokenizer operates on SOFA_NAME_TEXT_ONLY (currently via the parameter).

The DictionaryAnnotator knows nothing about views - thus it operates on the default view (CAS.NAME_DEFAULT_SOFA).

The consumer explicitly fetches SOFA_NAME_TEXT_ONLY in its code and works on that.

Now to the mappings. Currently you have this mapping:

    builder.add(preparationEngine);
    builder.add(whitespaceEngine);
    builder.add(dictionaryEngine,
        SimpleParserAE.SOFA_NAME_TEXT_ONLY, SimpleParserAE.SOFA_NAME_TEXT_ONLY);

This means that the view SOFA_NAME_TEXT_ONLY is renamed to SOFA_NAME_TEXT_ONLY for the dictionaryEngine (so actually this has no effect at all). All other AEs have no mappings.
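
To make the renaming idea concrete, here is a toy model in plain Java. It is only a sketch and not the UIMA or uimaFIT API: the class ViewMappingSketch, its resolve method, and the string "textOnly" (standing in for the real value of SimpleParserAE.SOFA_NAME_TEXT_ONLY, which I did not look up) are made up for illustration. The only assumption about UIMA itself is that each mapping pair reads as (view name used by the component, view name in the CAS), and that a name without a mapping is passed through unchanged.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-component view-name mapping. NOT the UIMA API. */
final class ViewMappingSketch {
    // Value of CAS.NAME_DEFAULT_SOFA, as mentioned above.
    static final String DEFAULT_VIEW = "_InitialView";
    // Hypothetical stand-in for SimpleParserAE.SOFA_NAME_TEXT_ONLY.
    static final String TEXT_ONLY = "textOnly";

    private final Map<String, String> mapping = new HashMap<>();

    /** Takes pairs: (name the component asks for, name of the view in the CAS). */
    ViewMappingSketch(String... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("view names must come in pairs");
        }
        for (int i = 0; i < pairs.length; i += 2) {
            mapping.put(pairs[i], pairs[i + 1]);
        }
    }

    /** Returns the CAS view name the component actually gets; unmapped names pass through. */
    String resolve(String componentViewName) {
        return mapping.getOrDefault(componentViewName, componentViewName);
    }

    public static void main(String[] args) {
        // A name mapped onto itself: a component asking for the default view still gets it.
        ViewMappingSketch identity = new ViewMappingSketch(TEXT_ONLY, TEXT_ONLY);
        System.out.println(identity.resolve(DEFAULT_VIEW)); // prints "_InitialView"

        // The default name mapped onto the text-only view: the component is redirected.
        ViewMappingSketch redirect = new ViewMappingSketch(DEFAULT_VIEW, TEXT_ONLY);
        System.out.println(redirect.resolve(DEFAULT_VIEW)); // prints "textOnly"
    }
}
```

A component that knows nothing about views only ever asks for the default name, so in this model a mapping helps only when the default name appears on the left-hand side of a pair.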

The correct mapping for the dictionaryEngine should be

    builder.add(dictionaryEngine,
        CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);

so the SOFA_NAME_TEXT_ONLY is supplied as the default view to the dictionaryEngine.

Similarly, it should be possible to remove the view parameter from whitespaceEngine and the getView call from the consumer and use these mappings:

    builder.add(preparationEngine);
    builder.add(whitespaceEngine,
        CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);
    builder.add(dictionaryEngine,
        CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);
    builder.add(casConsumer,
        CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);

I didn't actually try to modify your code and run this, because your code uses absolute paths.

Cheers,

-- Richard