Hi,

On Thu, Apr 16, 2009 at 11:22 AM, Duan, Nick <[email protected]> wrote:
> I have a set of annotators bundled as an aggregate AE and configured in
> a CPE. It runs fine with a single thread, but deadlocked with 2 or more
> threads. The AE was developed without any consideration of
> thread-safety. I am trying to find out the possible causes of the
> deadlocks, and hope to get answers to the following questions from this
> community:
>
> 1. When running CPE with multiple threads (e.g. multiple pipelines),
> does each thread instantiate its own annotator objects or AE instance,
> or do all threads share the same instances? If the former is true, I
> think I don't have to worry about changing each of the annotators to
> make them thread-safe.

Each thread instantiates its own AE instance. So you don't have to worry
about thread-safety issues within an AE instance, but you still have to
worry about thread-safety for any static data that's shared across
instances. Try to make sure you don't use any static fields (other than
static final Strings or primitive types), and if you absolutely need a
static field, make sure all access to it is synchronized.

> 2. What's the relationship between the CAS Pool Size and the number of
> threads? The document indicates that the number of the processing
> pipelines should be equal to or greater than CAS pool size. I would
> think the opposite should be true. In one of the examples bundled with
> the UIMA-2.2.2 distribution, the pool size was set to 2 while the number
> of pipes was set to 1.
>

You are right, it sounds like the documentation is wrong. Where in the
documentation does it say that? The pool size should be at least as big
as the number of threads, or else you would have idle threads. I don't
think this would cause a deadlock, though.

It is sometimes useful to have 1 more CAS than you have processing
threads, if your CAS Consumers (which run in a different thread) could
benefit from running concurrently with your Analysis Engines.

-Adam
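
P.S. To illustrate the point about static fields, here is a minimal
sketch in plain Java. It does not use the UIMA API: CountingAnnotator,
runDemo and countOf are hypothetical names made up for this example, and
the worker threads only stand in for CPE pipelines. Each thread gets its
own instance, so the instance field needs no locking, while the static
map is shared and is only touched through synchronized methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an annotator; not part of the UIMA API.
class CountingAnnotator {
    // Shared by every instance, and therefore by every pipeline thread.
    private static final Map<String, Integer> COUNTS = new HashMap<>();

    // Per-instance state needs no locking: each thread owns its instance.
    private int processed = 0;

    void process(String token) {
        processed++;
        increment(token);
    }

    int processed() {
        return processed;
    }

    // All access to the static map goes through methods that lock on the class.
    private static synchronized void increment(String token) {
        COUNTS.merge(token, 1, Integer::sum);
    }

    static synchronized int countOf(String token) {
        return COUNTS.getOrDefault(token, 0);
    }

    // Starts `threads` workers, each with its own annotator instance,
    // waits for them to finish, and returns the shared count for `token`.
    static int runDemo(String token, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            CountingAnnotator annotator = new CountingAnnotator();
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    annotator.process(token);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return countOf(token);
    }
}
```

Calling CountingAnnotator.runDemo("token", 4, 1000) should return 4000
as long as "token" has not been counted earlier in the same JVM. Without
the synchronized keyword the unsynchronized HashMap could lose updates
or be corrupted. It is probably also worth keeping such synchronized
sections short and not acquiring other locks inside them, since nested
locking is a common source of deadlocks.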
