On 6/4/07, KANO, Yoshinobu <[EMAIL PROTECTED]> wrote:
# org.apache.uima.flow.ParallelStep is not included in the Apache UIMA
2.1.0 release, is it?


No, it was added after the release was cut.  It will be in 2.2.

Let me explain my problem again, a bit more precisely.
My purpose concerns both latency and throughput.


To reduce latency, we need ParallelStep with truly concurrent
(multi-threaded) processing, as you wrote.
In this case, our purpose is a kind of demonstration.
I understand the current implementation from your explanation,
and that is enough for me as far as the latency problem goes.
I will just wait for the real concurrent implementation to be done
someday.


About the throughput issue, please assume that we have a multi-core/CPU
machine or remote machines as web services.

a. When the resource is multi-core/CPU/node,
does Apache UIMA Flow create a new thread for each AnalysisEngine?
Or is there always a single thread for the entire workflow?


If you are using a Collection Processing Engine, you specify the
number of processing pipelines in the CPE Descriptor's
"processingUnitThreadCount" attribute.  This lets you utilize your
multiple cores.  If you are not using a Collection Processing Engine,
see (b) below.

b. There is a class named "MultiprocessingAnalysisEngine_impl".
Does it mean that it can start processing another CAS before finishing a
previous CAS?
In other words, does it mean that this AnalysisEngine is multi-threaded
and can process two or more CASes simultaneously?


The MultiprocessingAnalysisEngine_impl internally keeps a pool of AEs,
each of which processes one CAS at a time.  Therefore the whole
MultiprocessingAnalysisEngine can process multiple CASes at the same
time.  Note that you don't construct this class directly - instead
call UIMAFramework.produceAnalysisEngine and pass the optional
argument that says how many concurrent requests you need to be able to
process.  See "Multi-threaded Applications" in the UIMA Tutorials and
Users Guides book for more information.
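
As a rough illustration of the pooling idea described above, here is a
small self-contained sketch in plain Java.  It is not UIMA code and not
how MultiprocessingAnalysisEngine_impl is actually written: the Engine
and Pool classes, their methods, and the timeout parameter are all
made up for this example.  It only shows how a fixed pool of
single-threaded workers lets several callers be served at the same time.

```java
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class EnginePoolSketch {

    /** Hypothetical stand-in for one engine that handles one request at a time. */
    static final class Engine {
        private final int id;

        Engine(int id) {
            this.id = id;
        }

        // Pretend analysis; a real engine would process a CAS here.
        String process(String text) {
            return "engine-" + id + ":" + text.toUpperCase(Locale.ROOT);
        }
    }

    /** Hypothetical pool: each engine is lent to at most one caller at a time. */
    static final class Pool {
        private final BlockingQueue<Engine> idle;
        private final long timeoutMillis;

        Pool(int size, long timeoutMillis) {
            this.idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(new Engine(i));
            }
            this.timeoutMillis = timeoutMillis;
        }

        String process(String text) {
            Engine engine;
            try {
                // Wait for a free engine, but give up after the timeout.
                engine = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ex);
            }
            if (engine == null) {
                throw new IllegalStateException("no engine free within timeout");
            }
            try {
                return engine.process(text);
            } finally {
                idle.add(engine); // hand the engine back for the next caller
            }
        }

        int idleCount() {
            return idle.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Pool pool = new Pool(2, 1000); // at most two requests in flight
        Thread[] callers = new Thread[4];
        for (int i = 0; i < callers.length; i++) {
            final String doc = "doc" + i;
            callers[i] = new Thread(() -> System.out.println(pool.process(doc)));
            callers[i].start();
        }
        for (Thread t : callers) {
            t.join();
        }
    }
}
```

With a pool size of 2, at most two of the four caller threads are inside
process at once; the others block until an engine is returned, and the
order of the printed lines can vary from run to run.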

-Adam
