You could write a custom flow controller that checks whether a CAS represents an A segment or an X segment and, depending on that, forwards the CAS either to the processing components or directly to the end of the pipeline.
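To make the routing idea concrete, here is a minimal, library-free sketch in plain Java. It is only a model of the decision logic, not UIMA code: the Segment class, the 'A'/'X' kind marker, and the method names are all hypothetical stand-ins. In a real aggregate the decision would presumably live in a custom flow controller (for instance one extending CasFlowController_ImplBase), and the "pipeline" would be the delegate analysis engines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Library-free model of the routing idea; none of these types are UIMA API. */
final class SegmentRouting {

    /** Hypothetical stand-in for a child CAS: a piece of text plus a kind marker. */
    static final class Segment {
        final char kind;   // 'A' = analyse further, 'X' = of no interest
        final String text;

        Segment(char kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Plays the role of the flow controller: decide per segment where it goes next. */
    static boolean needsProcessing(Segment s) {
        return s.kind == 'A';
    }

    /** Sends A segments through the pipeline, lets X segments bypass it, and keeps the order. */
    static String processAndMerge(List<Segment> segments, UnaryOperator<String> pipeline) {
        StringBuilder merged = new StringBuilder();
        for (Segment s : segments) {
            merged.append(needsProcessing(s) ? pipeline.apply(s.text) : s.text);
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment('X', "header "));
        segments.add(new Segment('A', "body"));
        segments.add(new Segment('X', " footer"));
        // Toy "pipeline": upper-casing makes the processed part visible.
        System.out.println(processAndMerge(segments, String::toUpperCase));
        // prints: header BODY footer
    }
}
```

The point of the sketch is that the bypass is a per-segment decision made in one place, so the processing components never see the X segments, while the merge step still receives every segment in its original order.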
-- Richard

On 07.08.2013 at 08:33, <[email protected]> wrote:

> Dear Marshall,
>
> Consider an input text from which only some parts should be processed. After
> processing, the text should be there in one piece again. Let X denote parts of
> no interest and let A denote parts to analyse further. XAX is split up into
> X, A, and X. There is nothing to do for the X segments. A has to be put into
> the pipeline. I only know how to use the CAS Multiplier if every segment has
> to be processed. But in this case some segments have to be left out. Is there
> a way to bypass the pipeline for the X segments? How to do the splitting and
> combining?
>
> Cheers,
> Armin
>
>
> -----Original Message-----
> From: Marshall Schor [mailto:[email protected]]
> Sent: Wednesday, 7 August 2013 02:51
> To: [email protected]
> Subject: Re: Processing a List of Strings with UIMA Addons components
>
>
> On 8/6/2013 6:10 PM, Mathaeus Dejori wrote:
>> Hi,
>>
>> I'd like to use UIMA AS to annotate a large list of text segments.
>> Instead of passing each text segment individually to the
>> AnalysisEngine I'd like to pass the entire list at once.
>>
>> As far as I understand I can use the cas.setSofaDataArray() to pass a
>> list of Strings and get back Annotations that refer to particular segments.
>> However, in doing so I won't be able to use any of the existing
>> Annotators (e.g. Concept Mapper) as their process(cas, spec) function
>> expects the cas.getDocumentText().
>>
>> Is there a design pattern for uima to consume a list of strings, pass
>> individual elements to specific Annotators and combine all the results
>> at the end?
> If what you are trying to do is to take an input CAS which has a bunch of
> "strings" and send each one thru a pipeline, the normal UIMA design pattern
> for that is to use a CAS Multiplier at the start which gets as input the CAS
> with all the strings, and then puts each one into another CAS and sends it
> through the pipeline. If the combining you want to do is to combine all the
> results into another CAS, then you can use another CAS Multiplier at the end
> which receives the individual string CASes, and accumulates results until all
> the parts are done, and then outputs a "result" CAS with the combined result.
>
> See
> http://uima.apache.org/d/uimaj-2.4.1/tutorials_and_users_guides.html#ugr.tug.cm
>
> -Marshall
