You could write a custom flow controller that checks whether a CAS represents 
an A or an X segment and, depending on that, forwards the CAS either to the 
processing components or directly to the end of the pipeline. 
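
A minimal sketch of that routing decision, written as plain Java so it runs 
without the UIMA libraries. It is a model of the logic only, not UIMA API 
code: the names SegmentRouter, Kind, ANALYSIS_STEP and FINAL_STEP are 
hypothetical. In a real flow controller the same check would presumably sit 
in the flow object's next() method, which returns either a step naming the 
next delegate or a final step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the routing decision; not UIMA API code.
class SegmentRouter {

    // Segment kinds from the thread: A = analyse further, X = no interest.
    enum Kind { A, X }

    // Hypothetical stand-ins for a delegate key and for "end of pipeline".
    static final String ANALYSIS_STEP = "Analyzer";
    static final String FINAL_STEP = "<final>";

    // Returns the steps that a segment of the given kind should visit.
    static List<String> route(Kind kind) {
        List<String> steps = new ArrayList<>();
        if (kind == Kind.A) {
            // Only A segments are sent through the processing component.
            steps.add(ANALYSIS_STEP);
        }
        // Every segment ends at the final step; X segments go there directly.
        steps.add(FINAL_STEP);
        return steps;
    }

    public static void main(String[] args) {
        for (Kind kind : Arrays.asList(Kind.X, Kind.A, Kind.X)) {
            System.out.println(kind + " -> " + route(kind));
        }
    }
}
```

How the controller learns whether a CAS is an A or an X segment is left open 
here; one option would be a marker that the splitting component stores in 
each segment CAS.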

-- Richard

On 07.08.2013 at 08:33, <[email protected]> wrote:

> Dear Marshall,
> 
> Consider an input text of which only some parts should be processed. After 
> processing, the text should be in one piece again. Let X denote parts of 
> no interest and let A denote parts to analyse further. XAX is split up into 
> X, A, and X. There is nothing to do for the X segments; only A has to be put 
> into the pipeline. I only know how to use the CAS Multiplier if every segment 
> has to be processed, but in this case some segments have to be left out. Is 
> there a way to bypass the pipeline for the X segments? How should the 
> splitting and combining be done?
> 
> Cheers,
> Armin
> 
> 
> -----Original Message-----
> From: Marshall Schor [mailto:[email protected]] 
> Sent: Wednesday, 7 August 2013 02:51
> To: [email protected]
> Subject: Re: Processing a List of Strings with UIMA Addons components
> 
> 
> On 8/6/2013 6:10 PM, Mathaeus Dejori wrote:
>> Hi,
>> 
>> I'd like to use UIMA AS to annotate a large list of text segments. 
>> Instead of passing each text segment individually to the 
>> AnalysisEngine I'd like to pass the entire list at once.
>> 
>> As far as I understand I can use the cas.setSofaDataArray() to pass a 
>> list of Strings and get back Annotations that refer to particular segments.
>> However, in doing so I won't be able to use any of the existing 
>> Annotators (e.g. Concept Mapper) as their process(cas, spec) function 
>> expects the cas.getDocumentText().
>> 
>> Is there a design pattern for uima to consume a list of strings, pass 
>> individual elements to specific Annotators and combine all the results 
>> at the end?
> If what you are trying to do is to take an input CAS which has a bunch of 
> "strings" and send each one through a pipeline, the normal UIMA design 
> pattern for that is to use a CAS Multiplier at the start, which gets as input 
> the CAS with all the strings, puts each one into another CAS, and sends 
> it through the pipeline. If the combining you want to do is to combine all 
> the results into another CAS, then you can use another CAS Multiplier at the 
> end, which receives the individual string CASes, accumulates results until 
> all the parts are done, and then outputs a "result" CAS with the combined 
> result.
> 
> See 
> http://uima.apache.org/d/uimaj-2.4.1/tutorials_and_users_guides.html#ugr.tug.cm
> 
> -Marshall
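
The split-and-combine pattern Marshall describes can be sketched in plain 
Java as follows. This is only a model of the bookkeeping, not UIMA code: 
SegmentMerger, analyse() and process() are hypothetical names, strings stand 
in for CASes, and analyse() is a placeholder for the real pipeline. The 
merger plays the role of the second CAS Multiplier: it collects parts by 
their original index and emits the combined text once all have arrived.

```java
import java.util.List;
import java.util.TreeMap;

// Plain-Java model of splitting, bypassing and recombining; not UIMA API code.
class SegmentMerger {
    private final int expectedParts;
    // Parts keyed by original position, so arrival order does not matter.
    private final TreeMap<Integer, String> parts = new TreeMap<>();

    SegmentMerger(int expectedParts) {
        this.expectedParts = expectedParts;
    }

    // Stores one part; returns the combined text once all parts are present,
    // otherwise null.
    String accept(int index, String text) {
        parts.put(index, text);
        if (parts.size() < expectedParts) {
            return null;
        }
        StringBuilder combined = new StringBuilder();
        for (String part : parts.values()) {
            combined.append(part);
        }
        return combined.toString();
    }

    // Hypothetical placeholder for the analysis pipeline.
    static String analyse(String text) {
        return "[" + text + "]";
    }

    // Sends flagged segments through analyse(), passes the others unchanged,
    // and recombines everything. Returns null for an empty segment list.
    static String process(List<String> segments, List<Boolean> analyseFlags) {
        SegmentMerger merger = new SegmentMerger(segments.size());
        String result = null;
        for (int i = 0; i < segments.size(); i++) {
            String segment = segments.get(i);
            String output = analyseFlags.get(i) ? analyse(segment) : segment;
            result = merger.accept(i, output);
        }
        return result;
    }
}
```

Keying the parts by index rather than by arrival order is a design choice 
for the sketch: it keeps the merged text in the original order even if the 
bypassed X segments reach the end before the analysed A segment does.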
