Thanks, Adam, for the detailed response!  The statement about the number of
processing pipelines vs. the CAS pool size is on page 44 of the UIMA References
(version 2.2.2).  Has anyone done any empirical tests to find the best
ratio of threads to CAS pool size?  Or is there any guidance on how to choose the
number of threads in a CPE?

Your comment about CAS Consumers running on a different thread is new to me.  I
thought that a CAS Consumer acts like any other CAS processor and is driven by the
same pipeline thread that drives the AEs.  If not, it would definitely cause
synchronization problems.  Are CAS Consumers also multi-threaded?  How can I
determine or configure how many threads are used to drive CAS Consumers?

Thanks a lot!

Nick

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Adam Lally
Sent: Thursday, April 16, 2009 5:04 PM
To: [email protected]
Subject: Re: Running CPE with multi-threading

Hi,

On Thu, Apr 16, 2009 at 11:22 AM, Duan, Nick <[email protected]> wrote:
> I have a set of annotators bundled as an aggregate AE and configured in
> a CPE. It runs fine with a single thread, but deadlocks with 2 or more
> threads.  The AE was developed without any consideration of
> thread-safety.  I am trying to find out the possible causes of the
> deadlocks, and hope to get answers to the following questions from this
> community:
>
> 1.  When running CPE with multiple threads (e.g. multiple pipelines),
> does each thread instantiate its own annotator objects or AE instance,
> or do all threads share the same instances?  If the former is true, I
> think I don't have to worry about changing each of the annotators to
> make them thread-safe.

Each thread instantiates its own AE instance.  So you don't have to
worry about thread-safety issues within an AE instance, but you still
have to worry about thread-safety for any static data that's shared
across instances.  Try to make sure you don't use any static fields
(other than immutable constants such as static final Strings or
primitives), and if you do absolutely need a static field, make sure
all access to it is synchronized.
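
As a rough illustration of the advice above, here is a minimal Java sketch.
CountingAnnotator is a hypothetical stand-in class, not a real UIMA type, and
the names TYPE_NAME, record, countFor and runDemo are invented for this example.
It assumes one instance per thread, with the only shared state being a static
map that is reached solely through synchronized static methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an annotator class; NOT a real UIMA type.
class CountingAnnotator {
    // Safe to share without locking: an immutable constant.
    static final String TYPE_NAME = "example.Token";

    // Mutable static state: one map shared by ALL instances, so every
    // access goes through the synchronized static methods below.
    private static final Map<String, Integer> COUNTS = new HashMap<>();

    static synchronized void record(String key) {
        COUNTS.merge(key, 1, Integer::sum);
    }

    static synchronized int countFor(String key) {
        return COUNTS.getOrDefault(key, 0);
    }

    // Each thread calls this on its own instance.
    void process(String token) {
        record(token);
    }

    // Starts `threads` workers, one instance each, and returns the final
    // count stored under `key` once all workers have finished.
    static int runDemo(String key, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            CountingAnnotator instance = new CountingAnnotator();
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    instance.process(key);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return countFor(key);
    }

    public static void main(String[] args) {
        System.out.println(runDemo("demo", 4, 1000));
    }
}
```

Without the synchronized keyword on record, concurrent updates to the plain
HashMap could be lost or could corrupt the map.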

> 2.  What's the relationship between the CAS Pool Size and the number of
> threads?  The document indicates that the number of the processing
> pipelines should be equal to or greater than CAS pool size.  I would
> think the opposite should be true.  In one of the examples bundled with
> the UIMA-2.2.2 distribution, the pool size was set to 2 while the number
> of pipes was set to 1.
>

You are right, it sounds like the documentation is wrong.  Where in
the documentation does it say that?  The pool size should be at least
as big as the number of threads, or else you would have idle threads.
I don't think this would cause a deadlock, though.  It is sometimes
useful to have 1 more CAS than you have processing threads, if your
CAS Consumers (which run in a different thread) could benefit from
running concurrently with your Analysis Engines.
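
To make the pool-size reasoning concrete, here is a small Java model. It is
only a sketch under stated assumptions: it does not use the UIMA API, and
CasPoolSketch and peakBusyWorkers are hypothetical names. The "CAS pool" is
modelled as a blocking queue of plain objects, and each worker must take an
object from the pool before it can do any work.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// A toy model of threads sharing a fixed-size pool; NOT the CPE implementation.
class CasPoolSketch {
    // Returns the largest number of workers seen holding a pooled object at once.
    static int peakBusyWorkers(int threads, int poolSize, int itemsPerThread) {
        BlockingQueue<Object> pool = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            pool.add(new Object());
        }
        AtomicInteger busy = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < itemsPerThread; j++) {
                        Object cas = pool.take();      // blocks while the pool is empty
                        int now = busy.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5);               // stand-in for analysis work
                        busy.decrementAndGet();
                        pool.put(cas);                 // hand the object back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 4 threads sharing 2 pooled objects: at most 2 can be busy at a time.
        System.out.println(peakBusyWorkers(4, 2, 5));
    }
}
```

In this model the number of busy workers can never exceed the pool size, so a
pool smaller than the thread count leaves some threads waiting, though they
still make progress rather than deadlocking.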

 -Adam
