Hi Chuck,

Great questions. What makes UIMA AS somewhat hard to understand is
that, although it is advertised as a scale-out framework, it lacks
life-cycle management for processes. So far it has focused on the
details of interconnecting UIMA-compliant components in
multi-threaded and multi-process configurations, and on error
handling.

On Mon, Aug 15, 2011 at 5:29 PM, Charles Bearden
<[email protected]> wrote:
> We have used UIMA as a CPE to run several fairly simple pipelines, including
> some using cTAKES components [1]. UIMA AS is billed as "the next generation
> scalability replacement for the Collection Processing Manager (CPM)", and
> I'm trying to wrap my head around it by using it for some of the tasks we
> did previously with CPEs and the CPM.
>
> Neither the Getting Started [2] nor the UIMA AS user manual [3] covers the
> practicalities of deploying asynchronous pipelines, so I'm relying on the
> README that comes with uima-as-2.3.1-bin.tar.gz. If there is a better
> document to work from, please let me know :-) UIMA is my first exposure to a
> Big Java Framework, so my knowledge & intuitions about it are not deep.
>
> It looks to me as if there are two basic patterns:
> (1) start the broker ('startBroker.sh'), and then
> (2) use 'runRemoteAsyncAE.sh' to both connect the CR with the queue via the
> '-c' argument and to deploy the AS AEs via the '-d' flag; or
>
> (1) start the broker ('startBroker.sh');
> (2) deploy one or more instances of the AS AE with 'deployAsyncService.sh',
> and then
> (3) use 'runRemoteAsyncAE.sh' to connect the CR with the queue via the
> '-c' argument.
>
> Do I have this right?

Yes. The first pattern is basically a "getting started" example, and
the second is typical of larger deployments.

RunRemoteAsyncAE.java is sample application code and a useful tool
for exercising services. UIMA_Service.java, the program called by
deployAsyncService, is likewise both a useful tool and sample code
for deploying services; for example, it can easily be adapted to run
inside a servlet container.

>
> One challenge we face is that some essential third-party components are not
> thread-safe, and so it looks to me as if I'll have to scale out instances of
> those components by deploying them in their own JVMs and not by means of a
> single deployment with
>
>  <scaleout numberOfInstances="20"/>
>
> in the deployment descriptor.

Right, components that are not thread-safe are simply scaled out as
multiple processes, all pulling from the same queue. Multi-threaded
scaling matters more for vertical scale-out of analytics that share
large in-memory objects.
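
To make the "many consumers, one queue" idea concrete, here is a
minimal plain-JDK sketch. It is not the UIMA AS API: the class
SharedQueueScaleout, the Annotator stand-in and the STOP marker are
all hypothetical names made up for illustration, and threads stand in
for what would be separate JVMs (each started with
deployAsyncService) pulling from the broker's queue. The point it
shows is that each worker owns a private instance of the
non-thread-safe component, so nothing but the queue is shared.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative model only; assumes nothing about UIMA AS internals.
class SharedQueueScaleout {
    // Hypothetical end-of-work marker; each worker consumes exactly one.
    private static final String STOP = "__STOP__";

    // Stand-in for a non-thread-safe component: unsynchronized state,
    // safe only because each worker has its own instance.
    static final class Annotator {
        private int handled = 0;
        void process(String doc) { handled++; }
        int handled() { return handled; }
    }

    // Starts `workers` consumers on one queue, feeds `documents` items,
    // and returns how many items each worker processed.
    static int[] run(int workers, int documents) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Annotator[] annotators = new Annotator[workers];
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            Annotator own = new Annotator();   // private to this worker
            annotators[i] = own;
            threads[i] = new Thread(() -> {
                try {
                    String doc = queue.take();
                    while (!doc.equals(STOP)) {
                        own.process(doc);
                        doc = queue.take();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (int d = 0; d < documents; d++) queue.put("doc-" + d);
            for (int i = 0; i < workers; i++) queue.put(STOP);
            int[] counts = new int[workers];
            for (int i = 0; i < workers; i++) {
                threads[i].join();             // join makes the counts visible
                counts[i] = annotators[i].handled();
            }
            return counts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        int total = 0;
        for (int c : run(4, 20)) total += c;
        System.out.println("processed " + total + " documents");
    }
}
```

How the documents are split between workers varies from run to run;
only the total is fixed. In a real deployment the queue lives in the
broker, so adding capacity means starting another service process
against the same queue rather than raising numberOfInstances.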

Eddie
