Hi,
On 6/4/07, KANO, Yoshinobu <[EMAIL PROTECTED]> wrote:
> Hi,
> It seems that the current Apache UIMA API does not have a way
> to define concurrent workflows, or in other words a "MultiStep" sort of step.
> Is there any way to make a workflow branch and join again?
There is a ParallelStep, which your custom Flow Controller can return.
However, the runtime does not currently invoke multiple components
concurrently (if you return a ParallelStep it will just choose an
arbitrary serial order). Note that if you are processing a collection
and concerned with getting high throughput, you generally don't need
parallel execution. Parallel execution is usually only important when
you want to reduce the latency of processing an individual CAS.
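
To illustrate the throughput point, here is a minimal sketch in plain Java.
It does not use the UIMA API: documents are plain strings, and
ThroughputSketch, runPipeline, and processCollection are hypothetical names
made up for this example. Each worker runs the whole pipeline serially on
one document, and several documents are in flight at once, so the
collection finishes sooner even though no single document is processed
any faster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for illustration; not part of the UIMA API.
class ThroughputSketch {

    // Runs every step over one document in a fixed serial order,
    // the way one pipeline instance handles one CAS.
    static String runPipeline(String doc, List<UnaryOperator<String>> steps) {
        String result = doc;
        for (UnaryOperator<String> step : steps) {
            result = step.apply(result);
        }
        return result;
    }

    // Processes a collection with several workers. Each task handles a
    // different document; the steps within a document stay serial.
    static List<String> processCollection(List<String> docs,
                                          List<UnaryOperator<String>> steps,
                                          int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> runPipeline(doc, steps)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // collected in input order
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("pipeline failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> steps = new ArrayList<>();
        steps.add(s -> s.trim());
        steps.add(s -> s.toUpperCase());
        System.out.println(processCollection(Arrays.asList(" a ", " b ", " c "), steps, 2));
    }
}
```

Running the steps of a single document in parallel would only pay off
when the latency of that one document matters.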
> There is another issue around concurrent processing.
> In most cases, a CAS does not depend on another CAS;
> a UIM analytic uses only the information in a single CAS when it
> processes that CAS.
> Does Apache UIMA have some sort of flag to indicate inter-CAS dependencies?
> I'm not sure, but can the workflow pipeline process two or more CASes at
> the same time in the current implementation?
An Annotator can only process one CAS at a time. CAS Consumers are
generally used to do collection-level processing, where the CAS
Consumer looks at multiple CASes (one at a time) and builds some
aggregate data structure.
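
As a rough illustration of that collection-level pattern, here is a
minimal sketch in plain Java. It is not the UIMA CAS Consumer interface:
TermCountConsumer is a hypothetical class, and a plain string stands in
for the CAS. The component sees one document per call and keeps its
aggregate state between calls.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a CAS Consumer; not the UIMA API.
class TermCountConsumer {

    // Aggregate structure built up across the whole collection.
    private final Map<String, Integer> counts = new TreeMap<>();

    // Called once per document, never with two documents at once.
    void process(String documentText) {
        for (String token : documentText.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
    }

    // Read after the last document has been processed.
    Map<String, Integer> result() {
        return counts;
    }

    public static void main(String[] args) {
        TermCountConsumer consumer = new TermCountConsumer();
        consumer.process("the cat sat");
        consumer.process("the dog sat");
        System.out.println(consumer.result());
    }
}
```

Note that the consumer only reads each document; it never writes
anything back into it.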
If you need a component that can have multiple CASes at once AND
update them, you can sort of accomplish this with a CAS Multiplier,
but it is tricky. The CAS Multiplier would need to take in multiple
CASes and output multiple *new* CASes, which requires copying data
between CASes.
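
The shape of that workaround can be sketched in plain Java. This is not
the UIMA CAS Multiplier API: BatchingMultiplierSketch and Doc are
hypothetical names, and Doc is a toy stand-in for a CAS. The component
buffers its inputs, and once it holds a full batch it emits new copies
that carry information derived from the whole batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; not the UIMA CAS or CAS Multiplier API.
class BatchingMultiplierSketch {

    // Toy stand-in for a CAS: the text plus one field derived from the batch.
    static final class Doc {
        final String text;
        final int batchSize;

        Doc(String text, int batchSize) {
            this.text = text;
            this.batchSize = batchSize;
        }
    }

    private final int capacity;
    private final List<Doc> buffer = new ArrayList<>();

    BatchingMultiplierSketch(int capacity) {
        this.capacity = capacity;
    }

    // Accepts one input per call. Returns an empty list until the buffer
    // is full, then returns new copies enriched with batch-level data.
    List<Doc> process(Doc input) {
        buffer.add(input);
        if (buffer.size() < capacity) {
            return new ArrayList<>();
        }
        List<Doc> output = new ArrayList<>();
        for (Doc doc : buffer) {
            // Copy into a new object instead of updating the input.
            output.add(new Doc(doc.text, buffer.size()));
        }
        buffer.clear();
        return output;
    }

    public static void main(String[] args) {
        BatchingMultiplierSketch multiplier = new BatchingMultiplierSketch(2);
        System.out.println(multiplier.process(new Doc("first", 0)).size());
        System.out.println(multiplier.process(new Doc("second", 0)).size());
    }
}
```

The copying step is where the real cost lies: with actual CASes,
everything the downstream components need has to be carried over into
the new CAS.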
Can you say more about what you would use this feature for?
-Adam