Re: How to "push" documents into CPE/CollectionReader?

Eddie Epstein Fri, 15 Feb 2008 08:07:31 -0800

>
> However, I think this way I will not be able to benefit from some of the
> framework's features like exception handling, multithreading etc., but for
> the time being I think I can live without that,
>
> Christoph
>
>
Another approach will shortly be available with the uima-as extension. The
entire CPE can be implemented with all the exception handling of the CPM and
with much more flexible and extensive scalability. With uima-as, the top
level CPE is a standard UIMA aggregate, which may or may not include a CAS
multiplier (CM) component acting like a collection reader.


CPE without CM: Create an application that pushes documents (CASes) into the
aggregate using the uima-as client API, which supports both synchronous and
asynchronous interfaces. With the asynchronous interface a simple parameter
determines how many outstanding requests are allowed at the same time,
preventing overload of the aggregate's input queue. After processing, each
CAS is returned to the application with whatever content is left in it.
This design is good if the overhead of serializing the CAS from the
application to the aggregate is much less than the work done in the
aggregate.
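
To illustrate the "limit on outstanding requests" idea, here is a minimal
sketch in plain Java. It does not use the uima-as client API; the class
BoundedPusher, the method push, and the thread pool standing in for the
remote aggregate are all hypothetical names made up for this example. The
only point is how a single parameter can cap the number of requests in
flight so the sender blocks instead of flooding the input queue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an asynchronous client; not the uima-as API. */
class BoundedPusher {

    /**
     * Sends every document, never allowing more than maxOutstanding
     * requests in flight. Returns the highest number observed in flight.
     */
    static int push(List<String> documents, int maxOutstanding) {
        if (maxOutstanding < 1) {
            throw new IllegalArgumentException("maxOutstanding must be >= 1");
        }
        Semaphore permits = new Semaphore(maxOutstanding);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // Assumption: a local thread pool plays the role of the remote aggregate.
        ExecutorService aggregate = Executors.newFixedThreadPool(4);
        try {
            for (String doc : documents) {
                permits.acquire(); // blocks while the limit is reached
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                aggregate.submit(() -> {
                    try {
                        doc.trim(); // placeholder for the real analysis work
                    } finally {
                        // The "reply" arrives: free a slot for the next request.
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                });
            }
            aggregate.shutdown();
            aggregate.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while pushing documents", e);
        } finally {
            aggregate.shutdownNow();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc one", "doc two", "doc three", "doc four");
        System.out.println("peak outstanding requests: " + push(docs, 2));
    }
}
```

The peak can never exceed the limit because a slot is taken before each
send and given back only when the reply comes in.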

CPE with CM: To minimize framework overhead, the application could send a
CAS to the aggregate containing a pointer to a set of documents. The CM
component would act as a collection reader, creating the CASes with the
individual documents to be processed. Only the original CAS will be returned
to the application; at that point it could carry customized status on the
results of processing.
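
The flow just described can be sketched roughly as follows, again in plain
Java with no UIMA classes. Everything here is an assumption made for
illustration: the Cas class is a toy stand-in for a real CAS, the "pointer"
is just a key into an in-memory map, and the per-document "analysis" is a
whitespace token count. What it is meant to show is that the children are
created and consumed inside the aggregate, and only the parent, carrying a
status summary, goes back to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "one parent CAS in, many child CASes inside, parent out". */
class MultiplierSketch {

    /** Hypothetical stand-in for a CAS: a payload plus a status map. */
    static class Cas {
        final String content;
        final Map<String, Integer> status = new LinkedHashMap<>();
        Cas(String content) { this.content = content; }
    }

    /** Plays the CM role: one child per document the parent points at. */
    static List<Cas> multiply(Cas parent, Map<String, List<String>> store) {
        List<String> docs = store.getOrDefault(parent.content, Collections.<String>emptyList());
        List<Cas> children = new ArrayList<>();
        for (String text : docs) {
            children.add(new Cas(text));
        }
        return children;
    }

    /** Placeholder for the downstream delegates: counts whitespace-separated tokens. */
    static int analyze(Cas child) {
        String text = child.content.trim();
        return text.isEmpty() ? 0 : text.split("\\s+").length;
    }

    /** Plays the aggregate: children never leave; the parent returns with status. */
    static Cas process(Cas parent, Map<String, List<String>> store) {
        int documents = 0;
        int tokens = 0;
        for (Cas child : multiply(parent, store)) {
            tokens += analyze(child);
            documents++;
        }
        parent.status.put("documents", documents);
        parent.status.put("tokens", tokens);
        return parent;
    }

    public static void main(String[] args) {
        Map<String, List<String>> store = new HashMap<>();
        store.put("batch-1", Arrays.asList("first document", "a second short document"));
        Cas result = process(new Cas("batch-1"), store);
        System.out.println(result.status);
    }
}
```

Only the small parent object crosses the boundary between the application
and the aggregate, which is where the saving in serialization overhead
would come from.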

In both cases scale up could be at the aggregate level, with multiple
instances of the aggregate processing client requests, as well as at the
delegate level, with any delegate being replicated in the same process as
the aggregate or itself scaled out across machines.

Eddie
