Hello Alex,

you could consider using a CasMultiplier. The name is misleading in this case, since it can also function as a "merger". The principle is simple: it receives CASes and produces CASes, but it can decide to produce multiple output CASes for a single input CAS, or it can choose to produce an output CAS only in certain cases. In the latter case it can serve as a "merger". There would need to be some metadata (some dedicated FeatureStructure) in the CAS that your CasMultiplier-based component can use to decide whether an output CAS should be created.
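
To make the control flow concrete, here is a rough sketch in plain Java. It is not UIMA code: FakeCas, MergerSketch, docId and expectedParts are hypothetical stand-ins I made up for illustration. Only the process()/hasNext()/next() shape is borrowed from the CAS multiplier contract. It assumes each partial CAS carries metadata saying which document it belongs to and how many parts to expect.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a CAS: sofa text, merge metadata, annotations. */
final class FakeCas {
    final String docId;       // assumed metadata: which document this part belongs to
    final int expectedParts;  // assumed metadata: how many partial CASes exist
    final String text;        // sofa data, identical across all parts
    final List<String> annotations = new ArrayList<>();

    FakeCas(String docId, int expectedParts, String text) {
        this.docId = docId;
        this.expectedParts = expectedParts;
        this.text = text;
    }
}

/** Mimics a CAS multiplier that emits one output only after all parts arrived. */
final class MergerSketch {
    private final Map<String, FakeCas> pending = new HashMap<>();
    private final Map<String, Integer> seen = new HashMap<>();
    private FakeCas ready; // non-null once a document is complete

    /** Called once per input; buffers its annotations in the pending output. */
    void process(FakeCas in) {
        FakeCas merged = pending.computeIfAbsent(
                in.docId, id -> new FakeCas(id, in.expectedParts, in.text));
        if (!merged.text.equals(in.text)) {
            // Offsets are only comparable if the sofa data is identical.
            throw new IllegalArgumentException("sofa data differs for " + in.docId);
        }
        merged.annotations.addAll(in.annotations);
        int count = seen.merge(in.docId, 1, Integer::sum);
        if (count == in.expectedParts) {
            ready = pending.remove(in.docId);
            seen.remove(in.docId);
        }
    }

    /** True only when the last processed input completed a document. */
    boolean hasNext() {
        return ready != null;
    }

    /** Hands out the merged result and clears it. */
    FakeCas next() {
        FakeCas out = ready;
        ready = null;
        return out;
    }
}
```

In a real component you would presumably extend one of the UIMA multiplier base classes (e.g. JCasMultiplier_ImplBase), request the output CAS from the framework instead of constructing it yourself, and read the metadata from your dedicated FeatureStructure. How the annotations are copied between CASes is left open here.
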

Cheers,

Richard

On 16.09.2011 at 10:43, Alexander Klenner wrote:

> Hello,
>
> I have a question concerning the merging of different UIMA pipelines. Say I
> have 3 different annotators that work on the same document (the CAS sofa data
> is identical for each of the pipelines). They do this in parallel and all of
> them produce different annotations, but in a sofa with the same name
> (_textView). Finally I have 3 serialized XCAS files in three different
> folders, coming from different nodes of a cluster.
>
> Is there a UIMA-conformant way to merge the corresponding XML files into one
> CAS object that has all the annotations of the three separate files? I could
> easily do this with a non-UIMA Java class that just adds all the annotation
> information into one file. Since the sofa data is the same, the offset
> information of the annotations will be correct, but I'd rather stay in the
> UIMA context.
>
> Cheers,
>
> Alex

--
-------------------------------------------------------------------
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
[email protected]
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC)
www.werc.tu-darmstadt.de
-------------------------------------------------------------------
