On 10.06.2012 at 19:03, Marshall Schor wrote:

> Hmmm, it seems to me that something is wrong if a UIMA pipeline ended up
> sending a CAS to a sofa-unaware component without a default view having been
> set up. I would guess that in this situation, it would be better to throw an
> exception rather than hide this by automatically creating the view. If a
> missing view is created, its subject-of-analysis would be left unset? I'm
> guessing that most sofa-unaware annotators would not expect that, and would
> fail in mysterious ways.
>
> What would be the use cases where it would be more valuable to create the
> view, rather than signal something's amiss?

My use case is an aggregate analysis engine that uses a CollectionReader as its first component (a CasMultiplier may also work; I didn't test that). As far as I can tell, UIMA supports sofa mappings for readers only in CPEs (or I missed something). We would like to add support for sofa-mapped readers in uimaFIT, and we would like to do so with as little extra infrastructure on top of UIMA as possible. Ideally, we'd just cleverly configure UIMA to get the feature implemented.

So, to work around the fact that CollectionReaderDescriptions do not support sofa mappings, I configured an AnalysisEngineDescription for a CollectionReader. UIMA doesn't really care much which kind of processing component is declared in an AnalysisEngineDescription, because internally they are all handled the same way. I dimly remember a post to one of the UIMA mailing lists saying that the distinction between readers, analysis engines and consumers is largely arbitrary and that everything could be done with CasMultipliers as well.

So when I run the aggregate, the collection reader tries to write data to some mapped sofa, but the sofa does not yet exist. The reader is not sofa-aware, so it shouldn't have to create its initial view itself. If I use a sofa-unaware CasMultiplier instead, I suppose the same thing will happen. The reader/CasMultiplier would set the sofa of course, but since it is sofa-unaware, it wouldn't create the view.

I guess another option would be to change CollectionReaderAdapter to create any missing initial view for sofa-unaware readers. That would not have any side effects on other component types, and it would solve the problem for my use case as well. The problem is that this doesn't work, because PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess() already tries to access the mapped view and fails.

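
To make the ordering problem concrete, here is a toy model in plain Java. None of these classes are UIMA classes: ToyCas, ToyView, SofaUnawareReader and process() are hypothetical stand-ins I made up for illustration, and the real code path is more involved. The sketch only assumes what I described above, namely that the framework resolves the mapped view before the sofa-unaware component is called.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: hypothetical stand-ins, not UIMA classes. */
final class MissingViewSketch {

    /** Stand-in for a view; text plays the role of the sofa data. */
    static final class ToyView {
        final String name;
        String text; // stays null until a component sets it

        ToyView(String name) {
            this.name = name;
        }
    }

    /** Stand-in for a CAS: a collection of named views. */
    static final class ToyCas {
        private final Map<String, ToyView> views = new HashMap<>();

        boolean hasView(String name) {
            return views.containsKey(name);
        }

        ToyView getView(String name) {
            ToyView view = views.get(name);
            if (view == null) {
                throw new IllegalStateException("No view named " + name);
            }
            return view;
        }

        ToyView createView(String name) {
            if (views.containsKey(name)) {
                throw new IllegalStateException("View already exists: " + name);
            }
            ToyView view = new ToyView(name);
            views.put(name, view);
            return view;
        }
    }

    /** A sofa-unaware component only ever sees the single view handed to it. */
    interface SofaUnawareReader {
        void getNext(ToyView view);
    }

    /**
     * Stand-in for the framework side. The mapped view is resolved here,
     * before the component is called, so the component (or an adapter
     * wrapped around it) never gets a chance to create a missing view.
     */
    static void process(ToyCas cas, String mappedViewName,
                        SofaUnawareReader reader, boolean createMissingView) {
        ToyView view;
        if (createMissingView && !cas.hasView(mappedViewName)) {
            view = cas.createView(mappedViewName); // lenient: create on first request
        } else {
            view = cas.getView(mappedViewName);    // strict: throws if the view is missing
        }
        reader.getNext(view);
    }

    public static void main(String[] args) {
        SofaUnawareReader reader = view -> view.text = "document text";

        ToyCas strictCas = new ToyCas();
        try {
            process(strictCas, "mappedView", reader, false);
        } catch (IllegalStateException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }

        ToyCas lenientCas = new ToyCas();
        process(lenientCas, "mappedView", reader, true);
        System.out.println("lenient lookup: " + lenientCas.getView("mappedView").text);
    }
}
```

In the strict case the lookup fails before the reader runs at all, which is why creating the view inside the adapter comes too late. In the lenient case the view is created with its text unset and the reader fills it in afterwards.
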
Changing that method to test whether mAnalysisComponent is a sofa-unaware CollectionReaderAdapter, and creating a new view only in that case, looks rather like a hack to me, although it would probably resolve the situation. I haven't tested that yet, but if you think it reasonable, I can check it.

Actually, thinking about it, I wonder whether missing views should not simply be created on the first request in general. I have several times seen people use helper methods that try to get a view and, if an exception is thrown, create the view and return it.

Or maybe it would make sense to simply add the possibility to declare sofa mappings to the CollectionReaderDescription.

-- Richard

--
-------------------------------------------------------------------
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD)
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
[email protected]
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC)
www.werc.tu-darmstadt.de
-------------------------------------------------------------------
